Lessons from a Successful Data Warehouse Implementation

Copyright 1995 CAUSE. From _CAUSE/EFFECT_ magazine, 
Volume 18, Number 4, Winter 1995, pp. 43-50. 
Permission to copy or disseminate all or part of 
this material is granted provided that the copies 
are not made or distributed for commercial 
advantage, the CAUSE copyright and its date appear, 
and notice is given that copying is by permission 
of CAUSE, the association for managing and using 
information technology in higher education. To 
disseminate otherwise, or to republish, requires 
written permission. For further information, 
contact Julia Rudy at CAUSE, 4840 Pearl East 
Circle, Suite 302E, Boulder, CO 80301 USA; 303-939-
0308; e-mail: jrudy@CAUSE.colorado.edu


           LESSONS FROM A SUCCESSFUL 
         DATA WAREHOUSE IMPLEMENTATION
      by John D. Porter and John J. Rome


ABSTRACT: A data warehouse is often the first 
client/server application an organization attempts. 
This was the case at Arizona State University 
(ASU), where such a project brought together 
student, financial, and human resources data in an 
integrated data warehouse. This article discusses 
the project's history and architecture, issues 
faced, and lessons learned after three years of 
work.



To remain competitive in today's business climate, 
an organization needs a foundation of quality data. 
Organizations of higher education need this 
capability as much as Fortune 500 companies. To 
ensure quality data for tactical and strategic 
decision-making, many colleges and universities are 
creating a "data warehouse."[1] A data warehouse is 
a separate store of data extracted from one or more 
production databases to produce an authoritative 
source for decision support. Some critics believe 
data warehousing contributes to an organization's 
information problem by adding yet another source of 
data. However, the success that organizations are 
experiencing with data warehousing proves it is a 
solid business strategy for the 1990s.
     
Building a data warehouse is extremely complex and 
takes commitment from both the information 
technology department and the business analysts of 
the organization. It takes planning, hard work, 
dedication, and time to create a relational 
database that delivers the right data to the right 
user. Arizona State University's data warehouse is 
not a panacea for every data problem, but it is a 
very good start toward a permanent solution.


DATA WAREHOUSING--
POPULAR, BUT NOT NEW
     
Data warehousing is not new. In fact, it is 
reminiscent of an old mainframe concept from the 
mid-1970s: take data out of production databases, 
clean them up a bit, and load them into an end-user 
database. International Business Machines 
Corporation (IBM) first coined the phrase 
"information warehouse" in 1991. IBM's original 
concept was met with skepticism because accessing 
non-relational data stores (such as IDMS[R], IMS[R] 
or VSAM[R]) was too complex and degraded 
operational system performance. Based on these 
experiences, experts now agree that a warehouse 
needs to be a separate data store built with a 
relational database management system (RDBMS). 
Names such as "information factory" or "data 
refinery" surfaced initially, but "data warehouse" 
is now the generally accepted terminology.

Warehouse Definition
     
The most widely recognized definition of a data 
warehouse is, "a subject-oriented, integrated, time 
variant, non-volatile collection of data in support 
of management's decision-making process."[2] 
Subject-oriented means the data warehouse focuses 
on the high-level entities of the business, such as 
students, courses, accounts, and employees. This is 
in contrast to operational systems, which deal with 
processes such as student registration or payment 
of an invoice. Integrated means the data are stored 
in consistent formats (e.g., consistent naming 
conventions, domain constraints, physical 
attributes, and measurements). ASU's operational 
systems have four unique coding schemes for 
ethnicity. In the data warehouse, there is only one 
coding scheme. Time variant means the data are 
associated with a point in time (e.g., semester, 
fiscal year, and pay period). Finally, non-volatile 
means the data do not change once they are entered 
into the warehouse.

Use Increasing
     
In higher education, glimpses of data warehousing 
exist in the file extracts that institutional 
research departments receive or the user reporting 
databases that the information technology 
department creates. Consequently, data warehousing 
is really an old concept with a new name and better 
technology. The data warehouse is likely to become 
the cornerstone of client/server activity in the 
immediate future.[3] So popular is the notion that 
a recent META Group report indicates 90 percent of 
their clients are undertaking warehouse 
initiatives, up from less than 10 percent just a 
year ago.[4] Judging from the number of inquiries 
about ASU's data warehouse, similar trends are 
occurring in many higher education organizations. 
In the business market, analysts estimate the 
industry will grow to $2.1 billion by 1998.[5] Some 
of the major players vying for this money include 
IBM, Hewlett-Packard Company, Oracle Corporation, 
AT&T GIS, Sybase, Inc., and SAS Institute, Inc., as 
well as vendors already established in this market 
such as Prism Solutions, Inc. and Red Brick 
Systems.


ASU'S WAREHOUSE DEVELOPMENT

Development of ASU's data warehouse started in the 
summer of 1992 as a client/server "proof of 
concept" project. Negotiations with RDBMS and UNIX 
workstation vendors resulted in a one-year lease of 
a server for the cost of the annual maintenance 
contract. While getting the warehouse server in 
place, over twenty companies agreed to provide 
complimentary copies of their data access software 
for evaluation. Although many of the access tools 
were in their adolescence at the time, accessing 
data was much easier with these graphical user 
interface (GUI) tools than with the fourth-
generation tools then in use.
     
ASU formed a development team of twelve individuals 
from the data administration and information 
technology departments to build the data warehouse. 
The team selected a representative group of 
business analysts to serve as pilot users to test 
the warehouse and access software. During the next 
few months, the team built a "student" warehouse 
model based on over 200 questions, which the pilot 
users considered difficult or critical to answer 
using current information resources.
     
During 1993, many of the original data warehouse 
team members shifted back to their regular duties, 
leaving a core of about five full-time equivalent 
employees working on the project. That core has 
remained intact, receiving additional help from 
ASU's institutional research office and many of the 
business analysts who are regular users of the 
warehouse. Also, the data administration department 
initiated a formal program to train users on the 
warehouse. To date, there are over 400 trained 
warehouse users, with two to three classes being 
taught each month. The goal is to train 1,000 
employees, approximately 20 percent of ASU's full-
time work force.

Warehouse Architexture
     
ASU's data warehouse resides in a client/server 
environment. As seen in Figure 1,ASU extracts data 
from the mainframe and loads it into a UNIX server 
running an RDBMS. ASU's warehouse server is a 
Sun[R] Sparc 630[TM] with 512 megabytes of memory 
and two processors, running the Sun Solaris 2.3[TM] 
operating system. The RDBMS is Sybase SQLServer 
release 10.x. Users connect through Ethernet to the 
warehouse over ASU's network backbone via 
Transmission Control Protocol/Internet Protocol 
(TCP/IP). The suggested GUI data access tool is 
BrioQuery[TM] from Brio Technology, Inc., which 
runs identically in both the Macintosh and Windows 
environments. Microsoft Access[R] is another tool 
used.

(FIGURE NOT AVAILABLE IN ASCII TEXT VERSION)

The process of using GUI tools to build structured 
query language (SQL) requests and bring the results 
back to a client machine is much different from the 
3270 protocols in a mainframe environment. With 
client/server architecture, once the data are in 
the workstation, users "own" the data, cutting and 
pasting at will into their favorite software (e.g., 
spreadsheet, word processor, graphic tools).

Modeling and Design
     
As business requirements and database technologies 
become more sophisticated, the need for data 
modeling and design increases. ASU uses an "upper" 
computer-aided software engineering (CASE) tool to 
design the warehouse. However, the 
entity/relationship (E/R) diagramming function and 
the object repository are the only features of the 
CASE tool used. The E/R diagramming tool creates a 
graphical representation of the data in the data 
warehouse and automates the creation of data 
definition language (DDL), the technical language 
used to create the warehouse's tables, views, and 
indexes. The object repository ensures consistent 
definitions and characteristics of fields in the 
data warehouse. While an upper CASE tool is not 
imperative in building a data warehouse, it does 
help automate the development process and the E/R 
diagrams produce "road maps" to the data. 
     
Designing a data warehouse is different from 
designing an operational system (see warehouse 
design, Figure 2). The warehouse wants data with a 
high value for decision-making, whereas the data 
content of an operational system is more 
requirements driven. ASU's data warehouse contains 
four primary databases: student, finance, human 
resource, and course. These databases are updated 
weekly, biweekly, monthly, and yearly, depending on 
the data. They also contain "official" data, which 
are captured on the census date and never changed. 
Over time, this results in an official and end-of-
period look to the historical data in the 
warehouse. For example, by the end of the semester, 
the student database provides users official and 
end-of-semester enrollment data. ASU found that 
some data in the warehouse need more frequent 
updating. These data are the valid occurrences of 
data contained in the code tables and the names and 
addresses of ASU's customers. These records are 
updated daily.

(FIGURE NOT AVAILABLE IN ASCII TEXT VERSION)

There are four basic types of tables in ASU's data 
warehouse: data tables, lookup tables, virtual 
tables, and summarized tables. Data tables contain 
raw data, extracted at the unit record level from 
the operational system. Lookup tables are code 
tables, defining the cryptic coding schemes that 
exist in the operational data. Lookup tables save 
space, improve flexibility, and allow the 
description of a code value to change while 
retaining its meaning. Virtual tables are views 
into the warehouse data. Views simplify the user's 
perception of a data warehouse, presenting data in 
a different way or restricting access to certain 
data (e.g., class roster appears as a single table, 
but the data reside physically on multiple tables). 
Last, summarized tables contain summarized data. 
These tables improve response time to frequently 
queried data and will become the foundation for 
ASU's online analytical processing systems (OLAP) 
and executive information systems (EIS).
     
Database design is a creative process. In 
fact, given the same set of requirements, two 
designers usually produce different but acceptable 
solutions. Often, in database design, it is easier 
to just do it than to explain exactly what you did. 
ASU's warehouse team follows the design guidelines 
listed in the sidebar below.

===================================================

SIDEBAR:

        ASU'S WAREHOUSE DESIGN GUIDELINES

  * Identify major subjects or topics as tables in 
    the warehouse
  * Add an element of time to the tables-semester, 
    fiscal year, etc.
  * Appropriately name fields in the tables or 
    views
  * Add derived fields when necessary-calculated 
    age, GPA, etc.
  * Duplicate data to decrease the number of tables 
    that must be joined
  * Exclude extraneous fields found in operational 
    files-"flags," etc.
  * Create logical tables or views for ease of use-
    class roster, etc.
  * Consider security and privacy during design
  * Make sure the data model answers the critical 
    business questions

===================================================

ASU'S ONGOING WAREHOUSE DATA ISSUES

ASU addressed a number of data issues in the 
process of building its data warehouse over the 
past three years, including determining what data 
to collect, how often to update the data, how to 
achieve "officialness," and how to resolve data 
security and privacy issues. ASU's experience is 
that with a data warehouse, these issues are never 
completely resolved.

What Data to Collect
     
A data warehouse must deliver the right data to the 
right people. However, the data warehouse cannot 
deliver all the data people want. People are always 
asking new questions, so predicting what data they 
need is difficult. ASU started by asking users what 
data they wanted or what reports they used. Another 
good starting point is to look at the data going to 
the institutional research department. ASU's 
experience is that once the data warehouse is 
implemented, users quickly let the development team 
know what data they want. 

Update Frequency
     
A data warehouse must deliver the right data at the 
right time, but what is the right time? The answer 
is, "it depends." In ASU's data warehouse, data are 
entered yearly, by census date, monthly, bi-weekly, 
weekly, and daily. By rule, the more often a table 
is updated, the more operational the nature of the 
data. For example, ASU's data warehouse updates 
daily the addresses of students. Many warehouse 
users create labels for student mailings and need 
current address information. Updates to code tables 
occur daily, too. However, ASU tries to limit the 
number of data elements updated daily, since there 
is a cost associated with loading the warehouse. In 
the future, daily updates to ASU's data warehouse 
will "replicate" data in operational systems. 
Replication is an economical industry solution of 
copying data and making them available to users on 
a server.

Official vs. Current
     
Making official numbers available in a data 
warehouse brings consistency and credibility. ASU 
adds official "numbers" to the warehouse to limit 
how much users need to understand the impact of 
timing on the data. To achieve "officialness," an 
organization selects census or "cut-off" dates for 
measuring data. With census dates, there is a 
distinct period of measurement, making historical 
trends much easier and allowing integration across 
systems. For some requests, official numbers are 
better to use (e.g., historical trends), while at 
other times (e.g., for financial decisions) the 
most current data are best. At ASU, both numbers 
are available from the data warehouse. However, to 
simplify user queries, official and current values 
appear in separate databases.
     
Delivering "officialness" in the data warehouse is 
not as easy as it sounds. The programs that extract 
and transform the data from the legacy databases 
must produce numbers that balance with the official 
numbers released by ASU. Since different algorithms 
and extract programs exist, there are often 
differences between the warehouse and official 
University reports. The problem multiplies because 
of ten years of data in the warehouse. Creating and 
validating ten years of official data was 
difficult. ASU found that going forward in time 
when building the data warehouse was easier than 
reconstructing and validating history.

Security and Privacy
     
Security and safeguarding privacy are major 
concerns when building a data warehouse. Security 
in a database means protecting data against 
unauthorized disclosure, alteration, or 
destruction. Granting _select_ (authorization to 
read only) access to tables or views achieves a 
certain level of security in a warehouse. At ASU, 
read-only access to the data warehouse is at the 
database level. This procedure follows an open 
access policy for employees approved by ASU's 
administration in 1993. This policy is based on the 
notion that giving employees access to data and 
holding them accountable is better for the 
organization than withholding the data.
     
Although many RDBMSs support column-level security, 
ASU has not implemented this feature, primarily due 
to the high cost of administering user access. In 
traditional operating systems, tasks or screens 
control access, meaning users only have access to a 
single record or instance of data (e.g., verifying 
admission status of a student). In a data 
warehouse, users have access to a table or set of 
tables in a subject area, which means access goes 
beyond retrieving single records to retrieving 
groups of records.
     
At ASU, the registrar's office is the trustee of 
the student database, the human resources director 
is trustee of the human resources database, and so 
forth. In these databases, read-only access 
excludes access to name and address. To obtain name 
and address information, the data trustee grants 
access to the person database. The user's business 
need determines whether access is granted. Given 
the large number of records in ASU's data 
warehouse, placing name and address in a separate 
database achieves a certain level of privacy. 
Additionally, training classes emphasize the Family 
Educational Rights and Privacy Act (FERPA), and 
users must sign a document stating they will follow 
these policies. In addition, users receive training 
on the appropriate and fair use of information.

ASU'S LESSONS LEARNED

ASU's data warehouse development improved access to 
and the integrity of administrative data, increased 
organizational awareness of data administration, 
and improved the quality of decision-making. 
However, "like a good marriage, a data warehouse is 
not created instantly; it develops only over time, 
with thoughtful goal setting to build a strong 
foundation. And like a nasty divorce, a bad 
warehouse can extract a lot of money, time, and 
energy from the parties involved."[6] ASU has high 
hopes for its marriage. But as with any computing 
project, ASU learned many lessons along the 
development path and hopes other organizations will 
learn from its experience.

===================================================

SIDEBAR: 

               ASU'S LESSONS LEARNED

  *  Develop an enterprise strategy.
  *  Identify a project champion.
  *  Avoid cost justification.
  *  Be ready for technology shortfalls.
  *  Make users aware of costs up front.
  *  Find ways to capture metadata.
  *  Build integrity and integration.
  *  Let the warehouse fill operational gaps.
  *  Invest in training.
  *  Make sure a support structure is in place.

===================================================

Develop an Enterprise Strategy
     
A successful data warehouse requires an enterprise 
strategy; otherwise the data warehouse may fail. 
The first step in establishing this strategy is 
adopting policies promoting sound data management. 
A data warehouse is easier to build and more useful 
to the organization when strong data management 
practices are in place. Before opening the data 
warehouse, ASU adopted an enterprise data access 
policy, determining who received access and what 
type of access a user received. 
     
Policies making the data administration department 
responsible for data integrity and integration of 
the data warehouse and enterprise operational 
systems also were implemented. ASU found these 
policies essential to navigate the warehouse 
project around everyone's data "turf." Also, good 
enterprise thinking must penetrate development 
culture. Users, business analysts, and the 
information technology, institutional research, and 
data administration departments must work together. 
At ASU, strong collaborative working relationships 
exist among all these areas. This culture 
contributed significantly to the success of ASU's 
warehouse implementation.

Identify a Project Champion
     
All computing projects need a champion. ASU's data 
warehouse champion is the data administration 
department. Data administration follows the 
evolution of the data warehouse according to Bill 
Inmon. Inmon says the data administrator's role has 
changed dramatically from managing the data 
dictionary to designing and constructing the data 
warehouse.[7] ASU's data warehouse put the office 
of data administration on the map and brings a new 
awareness of enterprise data to the organization. 
Users do not believe how bad their data are until 
they see them. For example, one college uses the 
data warehouse to verify professional program 
information and correct mistakes on ASU's 
operational systems. However, a data warehouse is a 
double-edged sword for the data administration 
department. Once users access the warehouse, a 
"never-ending" list of enhancements quickly 
appears. At this point, organizations will need to 
commit additional resources for warehouse 
development and support, or other data 
administration functions suffer.

Avoid Cost Justification
     
If possible, avoid the traditional cost/benefit 
analysis in justifying a data warehouse project. 
Since a data warehouse benefits the entire 
organization, ascertaining the full benefits is 
difficult, if not impossible. Fortunately, at ASU, 
a limited demonstration of the warehouse concept 
was enough to sell the project. If a more complete 
cost/benefit analysis were required, the project 
might never have started. In other words, don't 
spend too much time justifying the benefits of a 
warehouse; just start building one! 
     
A data warehouse may be inevitable for most 
organizations, since there is little chance that a 
technical breakthrough will make access to legacy 
data easier or cheaper. The Gartner Group says, 
"Organizations employing a data warehouse 
architecture will reduce user-driven access to 
operational data stores by 75 percent, enhance 
overall data availability, increase effectiveness 
and timeliness of business decisions, and decrease 
resources required by IS to build and maintain 
reports."[8] But how can all these benefits be 
quantified?

Be Ready for Technology Shortfalls
     
Client/server technology is less reliable, secure, 
and timely than its mainframe predecessor. Data 
access tools are just beginning to mature. Networks 
add new layers of complexity, and monitoring 
performance and tuning of servers is imperfect. The 
results are gaps in available technology and 
software, leaving users frustrated and their needs 
unmet. One such example is matching a cohort on a 
desktop machine with the data warehouse. Most query 
and retrieval tools do not support this type of 
function (joining a local table with server table). 
If the tool allows this function, joining data is 
slow, making the match process prohibitive for 
large databases. Allowing users to create tables 
containing the IDs of records being tracked on the 
server solves this problem. However, this solution 
defeats the benefits of client/server technology, 
moving emphasis back to the host machine. ASU's 
experience is that when problems occur with 
client/server technology, no one, including the 
vendor, knows how to solve the problem in a timely 
manner.

Make Users Aware of Costs Up Front 
     
The information technology department and 
technology infusion funding traditionally absorbed 
much of the cost of new technology at ASU. With the 
data warehouse and client/server computing, the 
cost of upgrading hardware and buying software for 
enterprise systems has shifted to the individual or 
department. Employees seeking access to the 
warehouse need to know the cost of connecting. At 
ASU, a "connection checklist" is available, 
detailing all the steps necessary for access. The 
checklist includes information on these items: how 
to obtain a data warehouse account and receive 
access, what PC or Mac and printer to buy, how to 
connect to the network, what software to buy, and 
how to register for training. ASU finds this 
checklist very helpful.

===================================================

SIDEBAR: 

                ASU DATA WAREHOUSE
               CONENCTION CHECKLIST

  *  Data Access Approval
  *  PC or Macintosh
  *  Printer
  *  Ethernet Connection
  *  Communications Software
  *  Data Access Software
  *  Software Installation
  *  Training

===================================================

Find Ways to Capture Metadata
     
One of the more difficult tasks is providing users 
and application developers a good data dictionary 
and source for metadata. Metadata are data about 
the data, including format, encoding/decoding 
algorithms, domain constraints, and definitions of 
the data. There are thousands of data elements, and 
capturing metadata is an endless task. Although 
this process is time consuming, the dividends are 
significant. ASU learned developers are more 
concerned about metadata, while users want data 
definitions. The problem is exacerbated by the fact 
that good metadata and data dictionary tools do not 
exist in client/server technology. ASU found no 
good tools to help solve the metadata problem. This 
is a problem that needs to be solved by the 
client/server industry.

Build Integrity and Integration Capabilities
     
Integrity and integration are important
characteristics of a data warehouse, and the 
characteristic lacking in most operational systems. 
These features give the data warehouse credibility, 
consistency, and real power. When designing these 
capabilities into ASU's data warehouse, the 
development team recognized that the integrity of 
the data varied. In some cases, the development 
team "scrubbed" the data; in other cases, it was 
simply too difficult. ASU's experience is that 
making the data available through the warehouse 
improves data accuracy in all systems. Knowing the 
data are observable in the warehouse is an 
incentive for those inputting data into the 
operational systems to be accurate.
     
The data warehouse also requires data that 
integrate. These are the data that span the high-
level subject areas of the warehouse. At ASU, these 
high-level subject areas are students, financial 
information, human resources, and courses. Examples 
of data that integrate or crosswalk the high-level 
subjects are fiscal year, semester or term, 
department, course, a person's unique ID, and 
account number. Data elements that integrate are 
the very fabric of an operational system. If these 
elements differ in format, domain, or values 
between systems, integrating the data in the 
warehouse is difficult or impossible. When data 
successfully span high-level subject areas, 
building a data warehouse is easier and less 
expensive. ASU's experience is that most 
integration problems need to be solved in the 
organization's operating systems before attempting 
to integrate data on the warehouse.

Let the Data Warehouse 
Fill Operational Gaps
     
Many users tend to look at the data warehouse as 
another administrative system. This phenomenon 
happens since the data warehouse is in relational 
format. While the warehouse can address some of the 
data shortfalls that operational users experience 
("data gaps"), this is not the warehouse's primary 
role. To help our users understand the difference 
between the data warehouse and their administrative 
system, we developed a slide that compares a data 
warehouse to an administrative operational system 
on a variety of dimensions (see sidebar below). 
Every talk or presentation on the data warehouse 
includes this slide underscoring the differences 
between the two. ASU reiterates these differences 
frequently to discourage users from making 
unreasonable requests of the warehouse. However, 
the truth is that ASU's data warehouse plays a 
powerful role in bringing inexpensive, temporary 
solutions to some operational computing shortfalls.

===================================================

SIDEBAR:

                DATA WAREHOUSE VS. 
              ADMINISTRATIVE SYSTEMS

  * data are read-only   * data are updated
  * serves management    * serves operational users
  * "time-fixed" data    * "current value" data
  * "what if" processing * processing is repetitive
  * data driven          * requirements driven
  * response...minutes   * response...seconds


===================================================

Invest in Training
     
Training data warehouse users is critical and pays 
good dividends. In most computing projects, 
management recognizes the need for training, but 
does not always fund training. This is true of 
ASU's data warehouse. With every new database there 
is a need for another training course, complete 
with reference materials. Every enhancement or 
change to the warehouse must be documented and 
communicated to warehouse users. At ASU, the data 
administration department assumed responsibility 
for training and documentation of the data 
warehouse. While training users is essential, it 
distracts from future warehouse development unless 
new resources are allocated.
     
Initial training at ASU focuses on the tool, the 
logic, and the data. While a data warehouse 
supports many different access tools, training with 
one tool reduces a trainee's learning curve. After 
an extensive review of data access tools, ASU chose 
a tool that works in both the Macintosh and Windows 
environment. Logic training is important also 
(e.g., SQL operators, Cartesian join, etc.). While 
this functionality is inherent in most access 
tools, training on query logic avoids many 
questions down the road. Finally, ASU's training 
concentrates on the data, which is usually what 
users understand the least. ASU spends up to 60 
percent of class time training on data, and hopes 
to increase this percentage as users become more 
familiar with access tools and query logic.

Make Sure a Support Structure 
Is in Place
     
While training reduces the number of data warehouse 
questions, a support infrastructure is key to 
handling other support needs. 
     
At ASU, there is an e-mail address where users can 
send their questions or problems. Experts on 
warehouse data, networking, and data access tools 
receive these messages and respond within 24 hours. 
ASU logs responses in a searchable database for 
users to reference in the future. Also, users can 
telephone a central help line that will send an e-
mail message for them. 
     
Second, there is a file transfer protocol (FTP) 
site available for warehouse users. This site 
stores PostScript copies of all documents 
associated with the data warehouse and copies of 
the data models. This is also a site for sharing 
common queries built by users or the warehouse 
team. 
     
Last, there is a Warehouse Users Group. The WUG 
meets monthly to share findings, educate members 
about the data warehouse, and provide feedback to 
the warehouse team. Currently, over seventy-five 
people attend the monthly meeting. The WUG also 
gives warehouse users an opportunity to find a 
"warehouse buddy," so they don't feel alone in 
ASU's world of data.


CONCLUSION
     
After three years of experience, the future of 
ASU's data warehouse is becoming more clear. 
Initially, the warehouse served as a resource for 
accessing information from legacy systems. 
Eventually, the warehouse will serve as a telescope 
into ASU's distributed data stores. Some of these 
data will reside in the data warehouse, while other 
elements will be "viewed" from the RDBMSs where the 
data reside. ASU foresees a time when the telescope 
extends beyond ASU to other organizations with 
common goals, such as the neighboring Maricopa 
County Community College District. The real power 
of the warehouse will be actualized in years to 
come.
     
The data warehouse fills an important data 
administration role in a client/server environment. 
As distributed application developers move further 
away from the central computing core, the data 
elements in the warehouse ensure the integrity of 
the organization's enterprise data. The definitions 
and coding standards in the warehouse are what 
distributed developers follow. The warehouse is the 
"glue" holding enterprise data stores together 
until a mature repository comes along. 
     
The most important contribution of ASU's data 
warehouse is the new focus on data integration. 
While attempting to achieve integration in the 
warehouse, ASU conceived a new data model which 
integrates not only the warehouse, but also the 
administrative systems. By integrating the 
warehouse, ASU obtains more powerful data. By 
integrating the operational systems, ASU gains 
strategic new levels of customer service.
     
The bottom line is that data warehousing is here to 
stay. Warehousing gives organizations the 
opportunity to "get their feet wet" in 
client/server technology, distributed solutions, 
and RDBMS. This is essential for any future 
mission-critical application, making the data 
warehouse a low-risk, high-return investment. The 
question is not simply whether to build a 
warehouse, but when.

================================

FOOTNOTES:

[1] William H. Inmon, _What is a Data Warehouse?_ 
Tech Topic 1, No. 1 (Sunnyvale, Calif.: PRISM 
Solutions, Inc., 1992), p. 1.

[2] Ibid.

[3] Colin White, "Client/Server Obsession," 
_Database Programming and Design_, Special 
Supplement, December 1994, p. 7.

[4] Katherine Hammer, "Will the Data Warehouse Be 
Warehoused?" _Relational Database Journal_, 
September-October 1994, p. 32.

[5] Rosemary Cafasso, "Praxis Forges Data Warehouse 
Plan," _Computerworld_, 10 October 1994, p. 32.

[6] Kim Nash, "Data Warehouses Mature with Help," 
_Computerworld_, 15 May 1995, p. 69.

[7] William H. Inmon, "Winds of Change: A Brief 
History of Data Administration's Amazing Growth and 
Development," _Database Programming and Design_, 
January 1992, pp. 68-69.

[8] Gartner Group, "Data Warehousing," a conference 
presentation on data warehouse, 1994.

***************************************************


John Porter (john.porter@asu.edu) is Director of 
Institutional Analysis and Data Administration at 
Arizona State University. He has seventeen years 
experience in information management and is a 
leader in the field of institutional research. Two 
years ago, Porter became ASU's data administrator 
and assumed leadership for developing the data 
warehouse. Currently, he is prototyping ways to 
build inexpensive online analytical processing 
(OLAP) systems for ASU's executives. 

John Rome (john.rome@asu.edu) is Assistant Data 
Administrator at Arizona State University. Involved 
in data warehousing since its inception, he is the 
project leader and chief architect of ASU's data 
warehouse. An expert in database design, Rome spent 
ten years of a "previous life" as a database 
administrator and programmer/analyst. He is also 
exploring inexpensive online analytical processing 
(OLAP) solutions for executive information systems 
and accessing distributed data stores through the 
World Wide Web.


***************************************************