This paper is the intellectual property of the author(s). It was presented at EDUCAUSE '99, an EDUCAUSE conference, and is part of that conference's online proceedings. See http://www.educause.edu/copyright.html for additional copyright information.
A Data Warehouse Solution Using University Consortia and Business Out-Sourcing
Authors
Harold George, Director, Director of Technology & Information
Services|
St. Mary�s University
San Antonio
Texas
Gerard A. Dizinno, Ph.D., Assistant to the President for
Planning and I.R.
St. Mary�s University
San Antonio
Texas
Susan Ramsey, Principal Consultant
The Intellisolve Group, Inc.
San Antonio
Texas
Abstract
The idea of data warehousing to provide information effectively and efficiently within organizations was initially developed in the for-profit business arena. Relatively recently, institutions of higher education have begun to develop data warehousing solutions for their information needs. As a way to develop both a "state of the art" data warehouse, a consortium of higher education institutions joined with two business partners. We believe that the solution arrived at is both economically advantageous and technically desirable - and may be the most effective way for small to midsize institutions to benefit from this technology.
Introduction
What is a Data Warehouse?
Inmon (1993) was among the first to describe the idea of a data warehouse in response to perennial problems associated with institutional data management systems. Typically, institutions have several databases dealing with different aspects of the university�s functioning. The result is a patchwork of systems designed to conduct different business processes, with no connection between them, either functionally or temporally. Yet, decision-makers require information that combines information in ways that make sense to them.
Perhaps the most succinct and clear definition of a data warehouse can be found in the recent monograph by McLaughlin, Howard, Balkan, and Blythe (1998), People, Processes, and Managing Data. In that work, the following citation explains a data warehouse:
"A data warehouse is a collection of data from many systems brought together to support the needs of management. It is a �user friendly� version of the institutional research collection of census-date data sets."(p. 49)
The term "user friendly" is especially important, since developing relational databases combining information from many sources is possible. In our experience, however, those new databases still required special skills in report-building to make the data accessible to most users. McLaughlin, et al. (1998) goes on to explain the four basic problems that a data warehouse addresses: data access, data integration, data availability, and data integrity. These four issues were of great import on our campus and are addressed below.
Data warehouses have the following characteristics (McLaughlin, et al., 1998):
Subject Oriented
Organization is around subjects, like students, faculty members, etc., rather than administrative functions like the payroll, registration, etc.
Integrated
Warehouses bring together information from multiple transactional systems to form one integrated whole.
Nonvolatile
Transactional systems are dynamic/volatile, in that the data can change, literally, from moment to moment. This is valuable for a transactional system, where users need up to the minute data. It is not desirable, however, for most institutional research or management purposes.
Time Variant
The data in a data warehouse are time-stamped. They are not "updated," but remain in the warehouse for an agreed-upon length of time. New data are added, with new time-stamps, at regular intervals depending on the management needs of the organization.
St. Mary�s Context
St. Mary�s University is a moderately sized (about 4,200 students), Catholic university, in the barrio of San Antonio Texas� west side. Our undergraduates are 65% Hispanics (primarily Mexican-American) and female (about 60%). We are the largest Catholic University in the southwest, and have been in existence since 1852.
Approximately 180 full-time faculty members and 300 administrators and support staff work at the university. We, like many private, Catholic, universities are highly tuition-dependent, and budget funding for all requests each year are difficult to fund. Computer and information technology needs are great, especially on the academic side of the university, and staffing in computer-support positions is small. Administrative computing currently has three full-time staff members and two student assistants. These individuals are responsible for all maintenance and upgrades of hardware and software, development of new applications, and orientation/training of administrative system users.
Despite the size of our institution, however, the needs for accurate and timely information for decision-making are similar to those seen on larger campuses. We are required to provide the same level of governmental reporting, we choose to provide data for the same surveys and other requests for information (e.g., U.S. News & World Report) and, more important, we need to make the same smart decisions as our larger "cousins."
The problem that many institutions of our size face is that while we have similar information needs as do larger institutions, we have far fewer resources on campus to provide this information. As noted above, our IT staff is small, our budget is limited, and there are more than enough "basic" tasks to accomplish in each day to keep everyone busy.
Information Needs
The following describes St. Mary�s situation, using the McLaughlin; et al. (1998) schema regarding the four basic issues that data warehouses address.
Data Access
For McLaughlin, et al. (1998), they discuss data access in terms of documentation/definitions, security issues, and hardware/software inconsistencies. The university has a sophisticated transactional database, the CARS Information System, which has, usually, reduced these access problems, but they do exist sometimes.
Data Integration
The CARS information system is based on an Informix relational database, with more than 600 tables and more than 6,000 data elements, related to every aspect of university functioning. It is, we feel, a superior transactional system, but its sheer complexity makes it inaccessible to all but the most sophisticated users. In addition, integration of information across tables also requires a great deal of expertise.
Data Availability
At St. Mary�s, especially over the last five years, we have experienced a dramatic increase in the expressed need for information. Since our transactional system, CARS, requires a high level of sophistication regarding its organization report-writing ability, only few users on campus outside the administrative computing staff are sufficiently expert enough to use the system
Data Integrity
Since data change from moment to moment, reporting different results for the same query in transactional systems is possible for two equally sophisticated users. Thus, one is always faced with the question of, "Do I have accurate results?" associated with every report. An additional issue that we faced at St. Mary�s is that many different individuals engage in data entry without any real "vested interest" in the integrity of the data entered. By "vested interest" we are referring to a situation where everyone involved knows that the accuracy of information provided has serious consequences for decisions that they might make based on that information. If individuals never see the "results" of one�s efforts as reports, analyses, etc., then they never develop this "vested interest."
Development of a Data Warehouse at St. Mary�s
At St. Mary�s, two specific information needs served to proximately stimulate the development of a sophisticated data warehouse. The first of these was the ongoing desire to adequately analyze the patterns of, and reasons for, retention/attrition among our undergraduates. Efforts in the past were limited due to difficulties in using our transactional system for this task. We created a "mini" data warehouse, designed specifically for the purposes of retention analysis, and used successfully for over a year. The success of that system served to stimulate others in administration regarding application of the same type of solution to their own needs.
The second stimulus for development of the current data warehouse came from a series of frustrating attempts to conduct accurate cost analyses for academic departments. Our Vice President for Administration and Finance served to emphatically define the problems and the data required to conduct such an analysis. Because of his involvement, the data warehouse project took on new, larger, dimensions, and this acted to convince many "doubters" that they needed this solution at our university.
How Should Development of the Warehouse Proceed?
For St. Mary�s, once we decided to engage in development of our "mini" warehouse, we were faced with the question of how to do it. Two obvious choices existed: in-house development or outsourcing.
The In-House Option
In trying to decide the answer to this question, the issues of workload and costs were considered. Given the small number of full-time staff in administrative computing who were already more than fully occupied with a host of other tasks, we had two choices if we decided to use in-house development. First, we could take staff members from other, important tasks, and charge them with developing the warehouse. This solution was unacceptable since it would not only diminish service to other areas, but also would require significant training for our staff who were not hired with data warehouse-building skills as part of their job descriptions. Second, we would have to hire at least one new full-time administrative computing staff member, devoted solely to data warehouse development and maintenance. In pricing-out this option, we found that the annual investment for such an individual, including salary, benefits, and other support needs, would be more than $80,000 per year. This appeared very costly, especially for the "mini" system needed at the time.
Outsourcing
The notion of outsourcing key university functions is not new. Food services and bookstores are probably the two most frequently outsourced functions on campuses across the nation. In addition, some university�s are experiencing success in outsourcing residence hall management and financial management (billing, the payroll, etc.). The advantage of outsourcing is that an institution does not directly pay for the salaries, benefits, or training needs of the individuals who are managing the project. The downside can be, of course, is the lack of direct management that results. We can reduce this disadvantage, if not eliminate it, by establishing close connections and communications between the vendor and the university�s administration.
At St. Mary�s we had prior positive experiences in outsourcing small projects within the administrative computing area prior to the data warehouse project. Our business partner in some of those endeavors was The Intellisolve Group, Inc., a local company that was of great help in our data warehouse solution.
The Initial Solution: Retention/Attrition Warehouse
As noted above, a major development toward a comprehensive data warehouse was the development of a warehouse designed, initially, to help in reporting and analysis of retention and attrition. Partnering with Intellisolve, we were able to complete this relatively small application in less than six months, from initial identification of data elements to complete functionality. The first cohort extracted for the warehouse were students in the fall semester of 1997.
As it turned out, this application was useful for more than only retention analyses. The registrar used it for assistance in producing IPEDS data. The office of Planning and Institutional Research also found many ways to use the system. This warehouse was limited to undergraduates, with associated data. Plans were put in place almost immediately to upgrade this portion of the warehouse to include all other students (graduate students and law students). The upgrade of what we now call the "student" component of the data warehouse was completed for the Spring, 1999, semester.
Partnerships
It became apparent to us that to develop our warehouse more fully, we would have to involve not only our business partner, Intellisolve, but other universities and businesses. We believed, and we feel it has been confirmed, that involvement with other partners would not only be cost-effective, but would also result in our being able to take advantage of expertise that we would not otherwise have access to.
Our retention warehouse used software from the Cognos Corporation: Impromptu and Powerplay. We arrived at an agreement with Cognos for a site-license for these software products, which greatly helped in keeping the costs down. These products are desktop solutions that allow access to many individuals. Because they are Windows-based, users familiar with other Windows products more intuitively learn them. This enables us to reduce training costs.
The third partnership component was with CARS, the organization that builds and maintains our transactional system. Seeing the importance of developing a data warehouse solution that was compatible with their system, we partnered with them and they helped us in development.
The last part of our partnership was, perhaps, the most important. This was to form partnerships with other universities. Thus, we could draw upon not only their expertise and experience, but also to profit from hearing the kinds of analyses they contemplated and questions they wanted answered. Or university partners are Moody Bible Institute, in Chicago, Illinois, and Point Loma Nazarene University, in San Diego, California.
The Current Warehouse
Currently, as noted above, we have a complete student warehouse. In addition, we have developed a course-based component. We are developing applications to deal with potential donors (for development purposes) and with potential students (for admissions purposes). The software allows for three "levels" of reporting. First, ad hoc queries can be produced via Cognos’ Impromptu software. Second, on-line analytical processing (OLAP) is possible using Powerplay. OLAP provides for accessible and timely analysis of information based on the development of business "dimensions." These dimensions are aspects of the university that are especially relevant to decision-makers, such as students, time dimensions, geography, courses, etc. The software allows for easy manipulation of rows and columns of the tables produced, substitution and addition of dimensions, and various tabular and graphical output display options. Lastly, the software can be utilized to provide what traditionally would be considered executive information systems (EIS). In this sense, certain important reports are saved and updated automatically for quick and easy viewing and some manipulation by higher-level administration, such as Deans, Vice Presidents, or the President. One can consider these reports as sophisticated "briefing books" for these administrators.
Cost Analysis
As noted above, our initial foray into data warehousing showed to us that the costs of staffing and staff development were not reasonable for our institution. We believe that this is the case for many small to moderately sized institutions whose budgets, especially in technology support, are very tight. So far, the total cost for our system is less than $100,000 - a relatively small amount when one considers that to reach this point on our own we would have needed at least one full-time person over two years (over $100,000). Also, we are now in a much more "flexible" situation with regard to future expenditures. We expect that if we continue out-sourcing that our costs will not be higher than what we experience per year, and very well might be lower per year. Or, we have the option to reconsider our staffing options and perhaps get a part-time individual to do warehouse "maintenance" - a task that Intellisolve currently handles - and use this business partner only for new development options.
Summary
After trying to utilize our transactional system for years in order to assist in management decision-making, St. Mary’s University decided, in 1997, to engage in development of a data warehouse - initially for undergraduate student analyses and reporting. Currently, we have expanded the student component to include all students, and have added a course-based component as well. We are in the process of including donor/potential donor-based and potential student-based components. All of this development has occurred because of our successful partnerships with businesses (Intellisolve, CARS, Cognos) and other institutions (Moody Bible Institute, Point Loma Nazarene University). We believe our solution is cost-effective and allows small to moderately sized institutions the ability to have a sophisticated data warehouse solution.
References
Inmon, W.H. (1993) Building the data warehouse. Boston, Massachussetts: GOAL/QPC.
McLaughlin, G.W., Howard, R.D., Balkan, L.A., and Blythe, E.W. (1998) People, processes, and managing data. Tallahassee, Florida: The Association for Institutional Research