Developing and Marketing a Client/Server-Based Data Warehouse Copyright 1993 CAUSE From _CAUSE/EFFECT_ Volume 16, Number 3, Fall 1993. Permission to copy or disseminate all or part of this material is granted provided that the copies are not made or distributed for commercial advantage, the CAUSE copyright and its date appear,and notice is given that copying is by permission of CAUSE, the association for managing and using information resources in higher education. To disseminate otherwise, or to republish, requires written permission. For further information, contact CAUSE, 4840 Pearl East Circle, Suite 302E, Boulder, CO 80301, 303-449-4430, e-mail info@CAUSE.colorado.edu DEVELOPING AND MARKETING A CLIENT/SERVER-BASED DATA WAREHOUSE by Michele Singleton, Elizabeth Taylor, Cheryl Bach-Scott, Htay Hla, and Paul Teitelbaum ABSTRACT: Information for decision-making at the University of Arizona has not always been readily available when needed. To provide better service in this area, the information technology center has developed a client/server-based data warehouse accessible from the desktop. A team approach has proved successful in introducing and demonstrating a prototype to the campus community. The University of Arizona is developing a data warehouse in a relational database to be accessed by desktop tools using client/server technology. The warehouse is a repository of data extracted from "legacy" systems and loaded into a relational database. This article provides an overview of how information is acquired for decision support and other management needs at the University, describes the prototype we developed, shares the marketing techniques used to introduce the prototype to the campus, and summarizes some of the issues we have had to deal with in the implementation stage. The context The University of Arizona campus, located in Tucson, is large in both acreage and population. The campus covers 339 acres with 150 buildings, has 4,000 faculty, and serves over 35,000 students each year. Offering some 150 majors in twelve colleges, the University is ranked 19th in the nation for research and development. Over 5,000 staff members provide supporting services to this large and diverse campus environment. Computing services are provided by the Center for Computing and Information Technology, commonly referred to as CCIT. Led by the associate vice president for computing and information technology, CCIT has four major areas of responsibility: decision support, user support, telecommunications, and computer operations. Each area plays a critical role in supporting the University's administrative systems, but it is the decision support analysts and technicians who are responsible for the maintenance and enhancement of the major administrative computing systems. The decision support area comprises five groups: distributed data management, human resources, financial records, student information, and production services. The decision support area provides the support for the three major administrative systems: the student information system (SIS), the personnel services operating system (PSOS), and the financial records system (FRS). The student information system provides detailed information about student admissions, financial aid, registration records, and accounts receivable. The personnel services operating system is responsible for payroll, applicant processing, budget, and employee records. The financial records system maintains all accounting, purchasing, accounts payable, and fixed assets data for the University. Each of these legacy systems is extremely large and contains millions of lines of COBOL code, hundreds of reporting programs, and dozens of screens, combining to support various processes on campus. The application environment is very diverse. SIS and FRS reside on an IBM 3090 with FRS data stored in IDMS/R and SIS data stored in VSAM files. PSOS executes on a PRIME minicomputer with its data in ISAM files. This configuration of multiple hardware platforms and separate, incompatible database management systems complicates the data processing environment and adds a level of difficulty when combining data items that span the systems. Meeting customer information needs These three administrative computing systems serve a myriad of customers located throughout the large campus. The growing information requirements are as diverse as the customers themselves. To support their decision-making process many of these customers have gained expertise in the report-writing tools available to them (e.g., SAS, Z- writer, OLQ, Focus[1]). While these tools satisfy many of their reporting needs, there are many occasions when customers determine a need for information that they cannot provide for themselves. Typically they rely on decision support group staff to provide them with this information. The interaction between the customer and the the decision support analyst is a paper process known as a service request. The service request form allows the customer to identify problems, enhancements, or information requirements. The form is routed to the decision support group where it is assigned to an analyst and then passes through the typical analysis, prioritization, coding, testing, and customer approval steps. The coding, testing, and customer approval phases are often repetitive and time-consuming. These phases may include mailing output from analyst to customer and back again, or scheduling meetings where the analyst and customer can review the output and determine modifications. Once the customer approves the output and signs the service request completion form, the job is scheduled and formally documented. Finally, the job can be placed into the production environment. This entire process of gaining information to make a decision can take anywhere from a few hours to several weeks depending on the complexity of the request, the mainframe loads, and the workload of the analysts. Often the lengthy service request process results in customers making decisions without information that would be available to them in a more dynamic computing environment. The combination of the time it takes to complete a service request and the increasing information demands by customers has resulted in the service request queue stretching far into the future. Additionally, increasing demands for information have resulted in an increased mainframe workload. The mainframes are rapidly becoming saturated; there are not enough computing cycles available to service all computing needs. Seeking a solution In response to the challenge to provide more satisfaction to our customers, CCIT began to explore alternatives to provide better computing services. The available options seemed to be either a costly hardware upgrade to the mainframes or moving applications toward a technology that would provide a more dynamic environment by taking advantage of both new technology and the computing resources residing on customers' desktops. The decision support group began pursuing client/server technology as a means of providing the dynamic computing environment that was needed. We pursued a client/server solution for several reasons. First, the model would make the most effective use of existing computing resources by cooperatively executing a process or transaction on two or more computers. In addition, client/server technology would allow for the migration of computing processes away from the mainframe. The desktop would provide the computing resources for data presentation and manipulation. The resources associated with data management would be provided by a separate machine fine-tuned for that purpose. Finally, the desktop tools would provide a much friendlier interface for our customers and allow them to have control over their own data needs. Developing the prototype A design and implementation team was formed and a pilot project was launched to develop a data warehouse utilizing client/server technology. This project was to be a prototype; its primary objectives were to prove the technology and to provide reporting and inquiry access to the data currently housed in the three legacy systems. The project was named the University Information System (UIS). Early in the project, the UIS implementation team determined that the data warehouse had to be stored in a relational database. The relational model offered many advantages over the older hierarchical and network databases: data could be accessed more easily, the database design was more flexible, and changes to a relational database structure could be effected in minutes, as compared to hours or sometimes days for the network or hierarchical models. As well, structured query language (SQL)--recognized as the standard for accessing relational databases--would provide an effective data access mechanism. Digital Equipment Corporation's Rdb product was chosen as the relational database and a DEC/VAX-4000 chosen as the UIS server. The Digital Educational Software Library/Campus-wide Software License Grant provided the University with the ability to purchase the software at a significant discount. The next step in developing the prototype was determining the data elements to reside in the warehouse. The server did not have enough disk storage to duplicate the volume of data in the three administrative systems. To effectively determine which elements to select for populating the new UIS database, we took an inventory of data elements used to process ad-hoc reports, selecting any element referenced more than once. These selected elements, which currently represent approximately 20 percent of the total data elements in the mainframe administrative systems, comprise the UIS database. With the contents of the data warehouse resolved, the implementation team began experimenting with the many tools available to access the Rdb data warehouse. Today, customers who wish to access the data in the UIS data warehouse may choose from a variety of client tools depending on their needs, the desktop environment, and their familiarity with various products. For example, many of the more popular desktop applications, such as Lotus or Excel, provide add-on tools that allow the application to access data in remote databases.[2] Other tools, such as DATAPRISM and DECquery, provide data access to the remote databases and the ability to export query results for use by other data manipulation packages.[3] Thus, a customer can create a query in a familiar desktop environment, ship the query across the network to the UIS server, retrieve the results, and manipulate the query results at his or her desktop. Figure 1 shows many of the configuration options now available to the UIS customer. The top box in the chart represents the server, showing the Rdb database where the UIS data reside and delineating the various communication protocols under investigation or currently in use. The bottom box in the figure shows the desktop configurations under development or in use. The SQL services and DAL connections are currently supported. Many more desktop, database, and communications products are under investigation. Figure 1: UIS Configuration [FIGURE 1 NOT AVAILABLE IN ASCII TEXT VERSION] As it exists today, the data warehouse provides the beta users with timely, accurate data. As the figure shows, users format their queries on the desktop, using the desktop resources. Queries are transported over the campus Ethernet where they are processed by the server, which returns the data back across the Ethernet to the customer on his or her desktop. The customer may choose to manipulate the data further using a variety of spreadsheets, word processors, or even other database products. To provide timely, accurate data, the data in UIS must be updated from the legacy systems. To achieve this, the appropriate data are extracted from the legacy system databases and then transferred to the UIS machine and loaded into the Rdb database. This process occurs in batch mode with different data being downloaded on varying schedules depending on the type of data and the frequency of change on the legacy system. For example, SIS student demographic data are propagated daily, while financial system transactions are loaded daily and payroll data are updated after each bi-weekly payroll. These timeframes are coordinated by the analysts responsible for the legacy systems. Typically, the time stamps are not a concern for the user; however, data are available regarding the downloads if needed. These data are referred to as "metadata." Currently customers of UIS have access to limited metadata. This information includes the table or element name, a brief definition, the legacy system datasource, and table load date and time stamp. An enhanced metadata system is under development to provide more detailed information. This detail will include an improved definition for each data element, a list of the valid values, information about where the data came from, how the data are used, how the data are entered, reporting use, owner of data, data names from legacy systems (both COBOL and database dictionary names), and the date-time stamp of the last data load for each table. The system was originally developed on a PC and accessed using a Clipper[4] application, but will be migrated to the UIS platform. The data administrator is working with the data stewards, users, and creators of the data to determine the information that will be put into the database. These metadata are critical given the number of data elements that are available in the tables on UIS. Introducing the prototype With the prototype developed and the client/server tools adequately tested, it became evident that we needed a method for introducing this new technology to the campus community. CCIT knew that acceptance by the campus community would be imperative for a successful implementation of any new technology. Campus departments would have to be sold on the new technology, new ideas, and new ways of doing business. Many people felt that a live demonstration of UIS would put the project in the limelight and generate the most interest. Therefore, a team was established that would be responsible for demonstrating the new technology and the prototype. This team consisted of three application analysts, one from each of the application support groups within the decision support area, along with two members of the original implementation team. All members of the "UIS demo team" were very familiar with the user community and understood their concerns and needs. The demo team began developing the format of the presentation. We decided that the demonstration to the customers had to be state-of- the-art and could not consist merely of simple overheads--we wanted to use software on a workstation to show the use of technology. We evaluated and selected presentation software, then developed a slide show consisting of fifteen slides which explained in a logical sequence how business was currently done at the University, and why a new way of accessing and sharing information was needed. This was followed by an introduction to the client/server environment, an overview of the UIS prototype, and the benefits derived from implementation of this new technology. Finally, an introduction to the terminology associated with relational databases and client/server technology and a live demonstration accessing the UIS database were provided. The presentation was designed so that each team member participated in the demonstration by discussing a different slide. During the live demonstration, the teamwork approach was used again, with one person speaking and another running the software. The team presentation method proved to be successful for a variety of reasons. First, it made each member more effective by allowing him or her to focus on particular portions of the presentation. Second, shifting the audience attention from one person to another made the presentation more interesting. Third, one or more persons on the team are familiar and respected by the audience. Having familiar and trusted applications analysts has helped build confidence. Finally, the cross-sectional team within the decision support group helps to communicate a joint effort by all three administrative systems groups. Many people have been involved in the evolution of the presentation. Initially, the team made their presentation to the managers of the decision support group. After this presentation, a critique session was held. This feedback was incorporated, the slide show was modified, and the live presentation was altered. The next group targeted was within the decision support group. This presentation served two purposes: it introduced the staff members to client/server concepts, while allowing them to give feedback to help the team further refine the presentation. Following these first two sessions, each department in CCIT was given a chance to see the presentation and offer suggestions for improvement. After the internal presentations were completed, the team began giving demos outside CCIT. The demo has been presented over fifty times with campus audiences ranging from vice presidents and the provost to departmental systems analysts.[5] Because the audiences were so diverse, the live portion of the demonstration was customized. For example, if the audience was a group of high-level administrators, the live demonstration was tailored to answer general University questions such as, "What is the undergraduate enrollment by residency and class level?" If the audience was a group of business managers the demonstration might contain a budget report for a particular department. After each presentation, a question-and-answer period followed. Often the same issues surfaced. Many of the concerns were anticipated, but required policy or cultural changes for the University. For example: * What server should we move to, and who will pay for it? * How will we use other servers? * Who will pay for Ethernet connections? * What client software should be purchased and who will pay for it? * What desktop hardware should be purchased, and who will pay for it? * Who will install the software and hardware, and who will provide training on it? * Who will troubleshoot problems and provide documentation? * Who will develop queries, and how will the queries be made public? * Who will provide consulting for query development? * Who needs the information? * Should there be data access/security policies, and who will administer them? * Who determines the data to be housed, and who will interpret them? Where we are today The issues and concerns that were raised in the demonstrations as well as other concerns are being addressed in a variety of ways. The most influential effort under way is the University's total quality management project. This project is known as CORe (continuous organizational renewal). The CORe project has developed the following strategic objectives: 1. Implement a system of continuous organizational renewal (CORe). 2. Demonstrate the impact of CORe by supporting projects that lead to improvements in critical processes. 3. Develop an information and analyis system to support planning and decision-making. The third strategic objective will obviously have an impact on the UIS prototype. It is possible that the UIS prototype will be folded directly into that objective, which has been broken into four strategies: 1. Identify the management decisions needed to support the objective. 2. Identify the data needed to support the management decisions. 3. Identify the administrative structure and processes to support the objective. 4. Determine the technology and infrastructure to support the objective. The CORe project has effectively raised the awareness of the need for data for decision-making. Much of the data needed for decision- making reside in the three legacy systems, and now the UIS database. The outcomes from the third CORe strategic objective will help to define the future of the UIS prototype. It is anticipated that CORe will begin to affect the prototype in the next six to twelve months. In the meantime, CORe and other projects have helped to define some of the issues and concerns that have been raised. Hardware and database issues CORe has funded the purchase of a new server for housing institutional decision-making data. This funding, along with a major project to replace the PRIME machine that runs the current payroll system, will allow CCIT to explore new options for hardware and database platforms. Both of these projects are well under way and solutions should be forthcoming shortly. Other servers and Ethernet access Thanks to the continued administrative support for the campus Ethernet (UANet), the past year has seen hundreds of new Ethernet connections and a new awareness on campus of the value of these connections. Regardless of the budget constraints, some departments are finding the funds to connect to the UANet. These connections are also providing access to other servers. Continued support of the UANet will provide more customers with access to data. However, financial constraints have impeded and will continue to impede the growth. Even though many buildings on campus are connected to the Ethernet, those departments that are not in buildings with connections are having a harder time finding the funds to get connected, and thus gain access to the database. Client software purchases Figure 1 shows many of the products that are currently being tested and used by customers of UIS. Recently, a software development team has been formed and charged with providing recommendations for future client application development software purchases. Currently, DECquery is the most popular tool, due to the proliferation of DOS machines on campus as well as the price of the software. However, if a new database and hardware platform are determined, DECquery users would be converted to a new product not yet determined. Desktop purchases Again, budget constraints have limited the purchases of equipment. However, for those departments that are able to find the funds, the user support group of CCIT has developed written recommendations and will assist users in the purchase of desktop hardware. The user support group also provides consulting for the desktop software installation. Even though financial times are tough, we have seen many new purchases on campus and many times these purchases are being made to allow access to the campus Ethernet and to UIS. UIS troubleshooting and query development Until the CORe process helps to determine the next step for the UIS prototype, troubleshooting and query development are supported by the database management group in the decision support area with the assistance of analysts from each of the legacy systems. Most of the beta users develop their own queries, and the database management group is the first line for problems or questions. Any problems or questions that cannot be answered by this group are deferred to the responsible analysts in the legacy systems groups. Though this is not the ideal situation, it works, and UIS continues to be supported in a beta version. Access and security Recently, the Administrative Computing Advisory Committee drafted an open access policy for data on campus. This policy basically states that everyone has access to data that are relevant to their job and any misuse of the data will be handled according to personnel policy. This is a very important document in terms of providing the means to develop a security system attached to the prototype[6] For now, more and more beta users continue to use the new prototype to answer their daily business questions. The prototype has been successful in proving the technology, and with TQM being implemented, we hope the prototype will also be successful in helping people do their jobs more effectively and efficiently. ======================================================================== Footnotes: 1 SAS is a product of SAS Institute Inc., Z-writer is a product of Information Associates, OLQ is a product of Cullinet, and Focus is a product of Information Builders. 2 Lotus is a product of Lotus Development Company, and Excel is a product of Microsoft Corporation. 3 DATAPRISM is a product of Brio, and DECquery is a product of Digital Equipment Corporation. 4 Clipper is a Nantucket product. 5 The demo was also presented to an audience at the 1992 CAUSE Annual Conference. 6 This document is not yet available from the CAUSE Exchange Library. For more information send e-mail to: norin@lan.ccit.arizona.edu or call 602-621-5972. ======================================================================== The authors are all employed in the decision support area of the Center for Computing and Information Technology (CCIT) at the University of Arizona. Htay Hla is the director of the decision support area and was the project leader for the UIS research and development project described in this article. Michele Singleton, a senior application systems analyst, works with the human resources systems; Liz Taylor, a principal application systems analyst, is a project leader on the student system; Cheryl Bach-Scott, a senior management analyst, is the database administrator for the UIS research and development team; and Paul Teitelbaum, a senior applications systems analyst, is a project leader on the financial system. Michele, Liz, Cheryl, and Paul are also all members of the UIS demo team. ************************************************************************