Data Standardization Means Quality Information from Distributed Systems


Copyright 1992 CAUSE. From _CAUSE/EFFECT_ Volume 15, Number 4, Winter 
1992. Permission to copy or disseminate all or part of this material is 
granted provided that the copies are not made or distributed for 
commercial advantage, the CAUSE copyright and its date appear, and notice 
is given that copying is by permission of CAUSE, the association for 
managing and using information resources in higher education. To 
disseminate otherwise, or to republish, requires written permission. For 
further information, contact CAUSE, 4840 Pearl East Circle, Suite 302E, 
Boulder, CO 80301, 303-449-4430, e-mail info@CAUSE.colorado.edu


              DATA STANDARDIZATION MEANS QUALITY INFORMATION 
                          FROM DISTRIBUTED SYSTEMS
            by Lore Balkan, Gerry McLaughlin, and Bruce Harper

************************************************************************

Lore Balkan is Data Standards Project Coordinator in the Institutional
Research and Planning Analysis office at Virginia Tech, a project
addressing data management quality issues. An active participant in
CAUSE activities, she is also Region 18 president of the Data Processing
Management Association (DPMA).

Gerry McLaughlin is Director of Virginia Tech's Institutional Research
and Planning Analysis office, which provides information and data
management support for the University administration. He was a co-
recipient of the 1989 CAUSE/EFFECT Contributor of the Year Award.

Bruce Harper is a member of the Standardization of Data Elements and
Codes Project team at Virginia Tech as well as the records management
department, where he provides computer support. He is also responsible
for the maintenance and update of University policies and procedures.

************************************************************************

ABSTRACT: Increasing campus-wide interest in accessing and using data 
from distributed databases prompted a close look at data management and
administration at Virginia Tech. To increase the value of administrative
information for University-wide decision-making, the office of
Institutional Research and Planning Analysis has undertaken a project to
standardize data elements and codes across all administrative systems
and to institutionalize an ongoing standardization process to
continually improve the quality of data.


At Virginia Tech, administrative systems have always emphasized online 
user access, primarily to support sourcepoint data capture for 
administrative operational functions. System support groups exist for
each major administrative area. Management information is supplied from
a series of extracts produced by these system support groups from each
operational system. The office of Institutional Research and Planning
Analysis is a primary user of these extracts, integrating data to create
merged extracts for institutional reporting. However, there is
increasing interest across the University in accessing and using this
extracted, merged, and downloaded data for decision support activities.
The inherent challenge of integrating diverse and non-standardized data
from a variety of systems is magnified considerably as demand
continually broadens both the customer base and the scope of information
desired.

This demand has led the Institutional Research office to undertake 
a special project to improve the quality of data through the 
standardization of data elements and codes across all administrative 
systems. This article begins with a brief discussion of the data 
administration and data management principles that formed the basis for 
our data standardization project, then describes the components of the 
project and our progress to date, including lessons we have learned.

From data administration to data management

Standards are agreements on language and procedure to enable clear, 
complete, and efficient communication. Many IS organizations created a 
central data administration department in the early seventies to help 
discern and enforce minimal standards and to ensure that systems could
work cooperatively. Since most of these early systems were developed
internally by a systems development group and usually for one database
management system, the focus was on securing, cataloging, and
standardizing database definitions. A data dictionary was often closely
coupled and integrated with the particular database management system in
use.

Today we are more likely to hear about information resource management 
than data administration. The function now goes well beyond the initial 
database support function of most early data administration departments. 
The following concise and comprehensive data administration mission 
statement proposed by the Data Administration Standards and Procedures 
Working Group of the Data Administration Management Association (DAMA) 
clearly has relevance for the information management issues we face 
today:

  *  To combine activities, standard methods, human resources, and
technology for the central planning, documentation, and management of
data from the perspective of the meaning and value to the organization
as a whole.

  *  To increase system effectiveness by controlling data through
uniformity and standardization of data elements, database construction,
accessibility procedures, system communication, maintenance, and
control.

  *  To provide guidance for planning, managing, and sharing of data and
information effectively and efficiently in automated information 
systems.

This mission is significantly more expansive than earlier mission 
statements. It describes an essential management function to optimize
the information systems resources, even as technology rapidly changes,
users become more diverse and increase in numbers, and the information
support environment continues to become more distributed and complex. It
recognizes that while data administration must be acutely aware of
evolving technology and must be involved in the resulting change
management, the data administration function itself is not driven by
technology. In fact, the reverse is true: effective data administration
focuses on building a stable and quality information resource that makes
it possible for an organization to quickly respond to and take advantage
of continuous technological innovation. Moreover, with people throughout
the institution creating, managing, and disseminating electronic data,
data quality must receive attention throughout the organization. This
attention must go well beyond control activity within the information
systems organization and be understood and embraced by all of an
institution's management. The term _data management_ recognizes the fact
that managers and technicians alike throughout our organizations must
manage data and attend to the quality issues.

In discussing structures for information resource governance, Miselis 
makes the important point that "since information is an institutional 
resource that is developed and used campus-wide, the management 
structure charged with ensuring that there is effective and efficient 
use of computerized information should also be campus-wide."[1] At 
Virginia Tech, guidelines for administrative information resource 
management have been applied as University policy, providing the basis 
and describing campus-wide roles for managing data in a distributed 
environment:

The University is the _data owner_ of all University administrative 
data. University officials, such as the Controller, the Associate Vice 
President for Human Resources, and the Registrar, are _data custodians_ 
for data in their functional area. _Data stewards_ are usually delegated 
the responsibility for daily data maintenance and dissemination
activities by data custodians. Data users are individuals who need
University data in order to perform their assigned duties and are
therefore authorized access by data custodians. The function of applying
formal guidelines and tools to manage the University's information
resource is termed _data administration_ and is performed cooperatively
and collectively by custodians, stewards, and users.[2]

The success or failure of data management depends on management's 
understanding, support, and active participation in these distributed 
data administration roles. Failure to obtain participation will cause
the organization to raise barriers, such as the following:

Those who in the past have written administrative applications tend
to be set in their ways of operation. Hardware and software vendors try
to sell what they have. Operational personnel have numerous individual
pieces of automated support that they want to keep. Managers are
concerned more about making the next deadline than about improving
"their" data for someone else's needs. Everyone is already overwhelmed
with just trying to meet existing commitments with little or no time to
scale another learning curve.[3]

Clearly operational, managerial, and executive personnel must buy into 
the belief that improving data quality is worth the investment of time 
and money. This requires providing a product that has value and can be 
marketed. This product is an integrated toolset that has four sequential 
supporting parts: _people_ who perform activities; _activities_ that 
utilize data; _data_ that are manipulated using tools; and _tools_ that 
assist with creation, reference, update, and deletion of data.[4] The 
entire toolset needs to be engaged to ensure that the institution 
effectively manages its data and thus has quality information. The 
overall efficiency of this integrated toolset in terms of quality 
information systems can quickly justify the costs involved.

In building a supportive distributed environment for data management at 
Virginia Tech, we have focused on data element definitions and code 
standardization. Standards can provide the firm foundation on which to 
build the integrated toolset for quality data management. The essence of 
standardization is the adoption of a common language that enables shared 
understanding and provides capability to integrate multiple data 
sources. As such, it is a never-ending process that continually improves 
the quality of the information resource.

Standardization for quality

The process of continually improving the quality of information begins 
with a set of values. Durell's "Ten Commandments of Data Administration 
Standards" provide an excellent perspective on these values (see last 
page).[5]

The Standardization of Data Elements and Codes Project at Virginia Tech 
represents our second major stride for data management. It is a direct 
follow-up to the development of the guidelines for administrative 
information resource management mentioned earlier. Like the development 
of the guidelines, development of standards is a consensus-building
process and, as such, requires a great deal of coordination, cross-
functional interaction, and cooperation.

The purpose of the standardization project is to increase the value of 
University administrative information by implementing standards for
description, definition, and validation of administrative data. The goal
is to create both the policy and the technical tools to obtain, store,
and manipulate the administrative operational data for University-wide
decision-making. While we focus on the tools, we do so in the context of
the entire toolset. We want our standards to serve as a foundation for
generalized access and use of information from the variety of
administrative data sources by a broad base of users. Moreover, we want
to institutionalize an ongoing standardization process to continually
improve the quality of the data.

We contend that standardized data are quality data with the following 
attributes: (1) assignment of data custodial responsibility; (2) audits 
for accuracy and measures for accountability; (3) systematic edit and 
validation; (4) clear and meaningful usage; (5) consistency over time; 
(6) cross-reference to all occurrences; and (7) accessibility. Each of 
these requires that the institution provide the following support:

People and Activities
1. Data custodial accountability.
2. Data stewardship to ensure proper edit and ongoing completeness and 
accuracy.

Data
3. A single official source of critical entities such as facility or 
department with the list of standard values for key attributes.
4. Standardized data descriptions, definitions, and documentation.

Tools
5. Procedures to retain and successfully use historical data.
6. Query capability for users to identify appropriate data sources and 
procedures to address specific information needs.
7. Ready access to timely and properly secured data by trained users.

The standardization project adopted a goal statement based on the 
Shewhart Cycle[6] to address the above data quality issues. The premise
of this cycle is that never-ending improvement requires movement through
the steps of _Plan_, _Do_, _Check_, and _Act_ (PDCA). Once the prototype 
work (_do_) has been validated (_check_), the resulting improvements 
become a new standard (_act_), and one goes back to _plan_, looking for 
new opportunities to continually improve.[7] This PDCA paradigm is 
reflected in the four steps to meeting the project goal:

To discover, define, document, and apply tools and techniques for 
standardizing University information by:

Step 1: Identifying critical and key University entities and related 
data elements and codes (P)

Step 2: Defining and documenting entities and related data elements and
codes (D)

Step 3: Measuring and verifying data and code quality and integrity (C)

Step 4: Establishing an ongoing process of managing standardized
entities in terms of data element edit, validation, update, alteration,
audit, correction, and distribution (A)

It is important to emphasize that the PDCA cycle is iterative. Moreover, 
for large problems or projects, multiple PDCA cycles must take place 
within a larger cycle. The four steps we have identified for standards 
are, in essence, a prototype methodology commonly used in research and 
development endeavors. Figure 1 shows the data standardization cycle 
feeding a larger quality information cycle.

[FIGURE 1 NOT AVAILABLE IN ASCII TEXT VERSION]

Establishing the standardization project

In keeping with the PDCA cycle, the first six weeks of the 
standardization project were devoted to developing the project plan.
This plan described the current environment, situation, and problems,
and identified the four standardization steps. It also addressed project
staffing with job descriptions, established milestones, and discussed
the major project activities.

Two individuals were assigned full-time to the project. However, the 
project plan clearly noted the need for additional technical support 
from computing center personnel and, even more critically, for project 
support from personnel in the various administrative offices.
Cooperation of supervisors and staff in the operational areas is
critical for making necessary modifications to local operational systems
and adopting procedures to check and adapt to changing information
requirements. It is basically their efforts that ensure enduring quality
of the standardized data and set priorities for continuous improvement.

It was especially important to develop project milestones to establish 
the project's credibility. The first three-month milestone involved 
completing a draft University standard to carry out the first two steps. 
For the second three-month milestone, a particular entity in one system 
area was selected to apply and refine the draft standard for the first 
two steps. The 
result was recommended procedures to continue standardization for other 
critical and key University entities and codes in other system areas. 
This milestone also involved work on the third and fourth steps in the 
cycle. University administrators, data custodians, data stewards, and 
information systems groups were called upon to help coordinate existing 
standards with the emerging University standard and to collaborate with 
prototype applications.

The third milestone identified in the project plan was to refine the 
draft University standard for the third and fourth steps. Again, this 
refinement involved establishing recommended procedures to continue 
standardization beyond the prototype. Other entities in other system 
areas would begin to be identified and prioritized for standardization.

The final milestone projected forward an additional six months. This 
milestone anticipated the promulgation of standardization results by 
providing user interfaces and access to standard definitions and code 
lists using available tools such as data dictionaries, relational 
databases, and an online query to extracts. It is only at this point
that the results of the standardization project take the form of an
obvious product. This milestone also projected the establishment of a
recognized data management function to continue standardization and
coordinate data quality initiatives with the various data custodians and
a user group for each system area.

Standardization activities

The project plan received endorsement when it was distributed and 
presented to the University's executive management. Figure 2 shows the
model, proposed in the plan, where data from operating systems would be
merged together into the Administrative University Data Base that would
then support diverse users. A key component of the planning model was
the repository or the dictionary of information about information. One
critical University entity, facility, stood out as an obvious early
candidate for standardization, in part because it was relatively limited
in scope. Additionally, facilities data had already been recognized as
an area in need of improvement and therefore had high visibility. There
was agreement that it should be the first entity targeted for
standardization. We proceeded to identify major project activity areas
as smaller, manageable "chunks" and prepared a plan for each. These
activity areas each involve engaging or fortifying some part of the
integrated toolset--people, activities, data, and tools.

[FIGURE 2 NOT AVAILABLE IN ASCII TEXT VERSION]

     Work with formal and informal University groups (people)

This activity involves engaging formal and informal groups to provide 
insight and feedback to the standardization process and to serve as a 
vehicle for promoting never-ending data quality improvement. The 
following efforts have been met with encouragement, enthusiasm, and 
responsiveness:

  *  periodic progress reports to an informal management group;

  *  meetings with a data steward group to monitor and review project
developments;

  *  assembly of a facilities focus group to assist with analysis and
implementation of standards for the facility entity;

  *  coordination with census file teams to standardize the census
point-in-time snapshot process and data;

  *  participation with the human resource system requirements team
on a reengineering project;

  *  interaction with the information systems departments regarding
information dissemination strategies; and

  *  roundtable discussions with the Administrative Systems Users
Group (ASUG) to review project strategy.

Additionally, we meet periodically with data custodians and always 
include them when we have major meetings with their data stewards.

     Develop the process and procedures
     for standardization (activities)

The standardization process is embodied in the four steps of the project 
goal. As the project progresses, procedures to address each step are 
documented. A "living" document is created and modified as 
standardization procedures are discovered so they can later be repeated 
for other critical and key University entities. The analysis process for 
standardization is broken down into decision points with checklists and
criteria for evaluation.

For identification of the critical and key University entities, the 
first of the four steps, we rely heavily on the already approved 
_Guidelines for University Administrative Information Resource
Management_. The _Guidelines_ list the criteria for data inclusion in 
the logical Administrative University Data Base (AUDB):

  *  It is relevant to planning, managing, operating, or auditing major 
administrative functions.

  *  It is referenced or required for use by more than one 
organizational unit. Data elements used internally by a single 
department or office are not typically part of the AUDB.

  *  It is included in an official University administrative report.

  *  It is used to derive an element that meets the criteria above.
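
The criteria above can be read as a simple decision rule. The following
Python sketch is illustrative only: the _Guidelines_ as quoted here do
not say whether the criteria combine with "and" or "or," so the sketch
assumes that meeting any one of them is sufficient. The class and field
names are hypothetical and do not come from any University system.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class DataElement:
    """Hypothetical description of one data element (not a University schema)."""
    name: str
    # Criterion 1: relevant to planning, managing, operating, or auditing
    # a major administrative function.
    supports_major_function: bool = False
    # Criterion 2: the organizational units that reference or require it.
    using_units: Set[str] = field(default_factory=set)
    # Criterion 3: appears in an official University administrative report.
    in_official_report: bool = False
    # Criterion 4: elements that are derived from this one.
    used_to_derive: List["DataElement"] = field(default_factory=list)


def meets_direct_criteria(element: DataElement) -> bool:
    """Check the first three criteria, combined with OR (an assumption)."""
    return (
        element.supports_major_function
        or len(element.using_units) > 1
        or element.in_official_report
    )


def belongs_in_audb(element: DataElement) -> bool:
    """Apply all four criteria; the fourth looks one derivation step ahead."""
    if meets_direct_criteria(element):
        return True
    return any(meets_direct_criteria(d) for d in element.used_to_derive)
```

Under this reading, an element used by a single office is excluded
unless it feeds a derived element that qualifies on its own.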

As an entity is standardized, an official source or "entity master 
table" is either identified or created. At minimum, it must contain:

(1) key data elements--those variables that provide validation
and translate capability, and

(2) AUDB data elements--those variables that are required to answer 
University-wide questions about an entity that should be generally 
available to management, possibly via extracts.

There are four key data elements that should have an official set of 
valid values for any standardized entity. They are: (1) a standard coded 
representation across data sources; (2) a standard long name; (3) a 
standard short name; and (4) a standard abbreviation. One or more of 
these four key data elements should be included as an attribute of the 
standardized entity wherever it appears, thus paving the way to 
integrate information across distributed systems.
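
As a minimal sketch of this idea, an entity master table can be modeled
as a lookup keyed on the standard code, with the other three key data
elements available for translation. The Python below is an illustration
under assumptions: the class names, the sample codes, and the sample
facility names are invented for the example and are not taken from the
project's actual tables.

```python
from dataclasses import dataclass
from typing import Dict, Iterable

# The four key data elements named in the text.
KEY_FIELDS = ("code", "long_name", "short_name", "abbreviation")


@dataclass(frozen=True)
class MasterEntry:
    code: str          # standard coded representation across data sources
    long_name: str     # standard long name
    short_name: str    # standard short name
    abbreviation: str  # standard abbreviation


class EntityMasterTable:
    """Single official source of valid values for one entity (hypothetical)."""

    def __init__(self, entity: str, entries: Iterable[MasterEntry]) -> None:
        self.entity = entity
        self._by_code: Dict[str, MasterEntry] = {}
        for entry in entries:
            if entry.code in self._by_code:
                raise ValueError(f"duplicate code {entry.code!r} for {entity}")
            self._by_code[entry.code] = entry

    def is_valid(self, code: str) -> bool:
        """Validation: is this code in the official list?"""
        return code in self._by_code

    def translate(self, code: str, to: str = "long_name") -> str:
        """Translation: map a code to one of the other key data elements."""
        if to not in KEY_FIELDS:
            raise ValueError(f"unknown key data element {to!r}")
        return getattr(self._by_code[code], to)


# Invented sample values, for illustration only.
facilities = EntityMasterTable("facility", [
    MasterEntry("F001", "Example Hall", "Example", "EXH"),
    MasterEntry("F002", "Sample Parking Lot", "Sample Lot", "SPL"),
])
```

Any file that carries the code can then be validated against the table
and joined to any other file that carries the same code, which is what
makes integration across distributed systems practical.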

Prior to creating a prototype of a standardized facilities entity master 
table, the following standard definition for facility was developed:

A University _facility_ is a building, structure, site, or parking lot 
used by Virginia Tech. A _building_ is a roofed structure for permanent 
or temporary shelter. A building must be attached to a foundation, be 
roofed, be serviced by a utility in addition to lighting, and undergo 
regular maintenance. A facility that does not meet these criteria is 
considered a _structure_ and defined simply as something that is 
constructed. A _site_ is an identifiable location. An example of a site 
is the "drill field." Parking lots are special sites that warrant a 
unique category. A _parking lot_ is an identifiable and designated area 
for parking vehicles.
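
One way to check that a definition like this is unambiguous is to write
it as a decision rule. The Python sketch below is an interpretation, not
part of the standard: it assumes that the building criteria apply only
to constructed facilities and that parking lots, as "special sites," are
not themselves constructed objects. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FacilityTraits:
    """Observable traits of a facility (hypothetical field names)."""
    constructed: bool                      # something built, not just a location
    attached_to_foundation: bool = False
    roofed: bool = False
    utility_beyond_lighting: bool = False  # serviced by a utility besides lighting
    regularly_maintained: bool = False
    designated_for_parking: bool = False


def classify_facility(traits: FacilityTraits) -> str:
    """Return 'building', 'structure', 'parking lot', or 'site'."""
    if traits.constructed:
        is_building = (
            traits.attached_to_foundation
            and traits.roofed
            and traits.utility_beyond_lighting
            and traits.regularly_maintained
        )
        # A constructed facility that misses any building criterion
        # falls back to the broader 'structure' category.
        return "building" if is_building else "structure"
    # Non-constructed facilities are sites; parking lots get their own category.
    return "parking lot" if traits.designated_for_parking else "site"
```

Writing the rule this way surfaces edge cases (a parking garage, for
instance) that the focus group would need to settle explicitly.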

The next step was concentrated work with staff in the Facilities 
Planning and Construction office to refine and implement standardized 
facilities information, distribute the established standard and
companion data definitions for the facility entity, replace existing
facilities information with standardized facilities information, and
turn over data custodial responsibility for standardized facilities
information to Facilities Planning and Construction.

     Assess project progress (activities)

At each milestone, or every three months, a management report on project 
progress is prepared and presented. Feedback is solicited and current 
activity and direction re-evaluated. If unforeseen obstacles surface, a 
plan to address the problem is formulated. It is worth noting that so 
far these "obstacles" generally call our attention to a necessary step 
or interface that we have overlooked and must address as part of the 
standardization process. They serve as opportunities to improve the 
quality of the process.

     Develop data definition and documentation standards (data)

Data definitions, descriptions, and documentation are created for each 
data element in an entity master table. A second "living" document 
outlines a standard for data definition and documentation. In addition 
to identifying the particular descriptive information that must be
supplied, this document also includes a standard naming convention to
provide a common language for referencing and relating like data
elements. The evolving standard has also been applied to the census
files produced in Institutional Research, is being used by several users
to define their local systems, and has potential for use as a
requirements analysis tool for reengineering projects.

     Develop data quality assurance methods and measures (data)

To check results of the standardization process, measurements of data 
accuracy must be developed. Moreover, these measures must be systematic 
and eventually automated so ongoing quality assurance can take place on 
all information sources containing standardized data. Techniques for 
measuring data quality across distributed systems must be identified and 
procedures for testing and making corrections established and 
documented. An important step already taken is creating a baseline, in 
the form of a facilities entity master table, to assess the quality of 
facilities information in a variety of existing files. Finally, as there 
are new occurrences of an entity, codes must be assigned, all 
standardized data sources updated, and users notified. The project is 
working with data custodians to establish users groups and internal 
procedures to address these change management issues.
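
A simple form of such a measure is the share of records in an existing
file whose code appears in the entity master table. The Python sketch
below illustrates that one measure only; the function name, the record
layout, and the report fields are assumptions made for the example, not
the project's actual procedures.

```python
from typing import Any, Dict, Iterable, List, Set


def audit_codes(
    records: Iterable[Dict[str, Any]],
    field_name: str,
    valid_codes: Set[str],
) -> Dict[str, Any]:
    """Compare one coded field in a file against the master list of codes."""
    rows: List[Dict[str, Any]] = list(records)
    exceptions = []
    for row_number, row in enumerate(rows, start=1):
        value = row.get(field_name)  # a missing field counts as an exception
        if value not in valid_codes:
            exceptions.append({"row": row_number, "value": value})
    total = len(rows)
    valid = total - len(exceptions)
    return {
        "total": total,
        "valid": valid,
        # Accuracy is undefined for an empty file, so report None.
        "accuracy": valid / total if total else None,
        "exceptions": exceptions,  # rows to route back for correction
    }
```

Run on a regular schedule against each file that carries the
standardized code, a report like this gives both a trend figure and a
work list for the responsible data steward.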

     Develop retention procedures for historical data (tools)

The valid lists of values for the attributes of a standardized entity 
must be available over time, and for given points in time. Therefore, 
another important tool for ongoing quality information is a set of 
procedures to capture and store historical snapshots of all standardized 
data. This capture must occur on a regular documented cycle and include 
a copy of the data definitions current at the time of the snapshot. 
Census snapshots must also capture the related code validation lists 
current at the time of the snapshot. In working with the data custodians 
and data stewards to develop these procedures it is usually necessary to 
clarify the following: The procedures to capture historical standardized 
data do not replace other requirements to maintain historical 
transaction data from source systems for audit purposes. They also do 
not replace standard backups that the operational offices must do for 
disaster recovery contingencies.
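
The requirement that a snapshot carry its own definitions and code lists
can be sketched as follows. This Python example is a minimal
illustration, assuming in-memory data and invented function names; a
real procedure would write to durable storage on the documented cycle.

```python
import copy
from datetime import date
from typing import Any, Dict, List


def capture_snapshot(
    as_of: date,
    rows: List[Dict[str, Any]],
    definitions: Dict[str, str],
    code_lists: Dict[str, List[str]],
) -> Dict[str, Any]:
    """Freeze the data together with the definitions and codes then in force."""
    # Deep copies keep later changes to the live tables out of the snapshot.
    return {
        "as_of": as_of,
        "rows": copy.deepcopy(rows),
        "definitions": copy.deepcopy(definitions),
        "code_lists": copy.deepcopy(code_lists),
    }


def snapshot_as_of(snapshots: List[Dict[str, Any]], when: date) -> Dict[str, Any]:
    """Return the most recent snapshot taken on or before the given date."""
    eligible = [s for s in snapshots if s["as_of"] <= when]
    if not eligible:
        raise LookupError(f"no snapshot on or before {when.isoformat()}")
    return max(eligible, key=lambda s: s["as_of"])
```

Because each snapshot is self-describing, a report run years later can
validate and label historical values with the code list that was
current when the data were captured.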

     Implement repository and data distribution (tools)

A primary data management tool is the repository. It consists of data 
dictionaries for standardized elements with data descriptions and 
definitions to provide both high-level and detail documentation, cross-
reference multiple data sources, and support catalog type query, i.e.,
"what is available?" We have designed and developed this repository
based on the Information Resource Dictionary System (IRDS) standard,
ANSI X3.138-1988. The repository rationale, standards, and functions are
discussed in the data definition and documentation "living" document
mentioned earlier. The system is set up so a user can access a data
element dictionary for a particular data source and view a list of all
the data elements (which also serves as a table of contents), look at a
short description of individual elements, or look at the full data
processing description of an element.
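
The three views described above (element list, short description, full
description) and the cross-reference across data sources can be sketched
in a few lines. The Python below is a toy model for illustration; it
does not implement IRDS, and its class and function names are
hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class ElementEntry:
    name: str
    short_description: str
    full_description: str  # the full data processing description


class DataElementDictionary:
    """Dictionary of the data elements in one data source (hypothetical)."""

    def __init__(self, source: str, entries: Iterable[ElementEntry]) -> None:
        self.source = source
        self._entries: Dict[str, ElementEntry] = {e.name: e for e in entries}

    def __contains__(self, name: str) -> bool:
        return name in self._entries

    def contents(self) -> List[str]:
        """List of all elements; doubles as a table of contents."""
        return sorted(self._entries)

    def short_description(self, name: str) -> str:
        return self._entries[name].short_description

    def full_description(self, name: str) -> str:
        return self._entries[name].full_description


def sources_containing(
    dictionaries: Iterable[DataElementDictionary], name: str
) -> List[str]:
    """Catalog query: which data sources carry this element?"""
    return [d.source for d in dictionaries if name in d]
```

The catalog query is what answers the "what is available?" question
without requiring the user to know each source system in advance.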

Another important "tool" receiving attention from the project involves 
the method for disseminating standardized information along with 
standardized data definitions and descriptions. This effort is primarily 
one of communicating requirements to the University's information 
systems personnel. The objective is to be positioned to deliver 
standardized information in a consistent manner to a broad base of users 
operating in a variety of computing environments, using both client-
server and mainframe technology. Meanwhile, the project has developed a 
prototype that gives users easy access to data element dictionaries, 
allows them to browse the facilities master table, and other master 
tables as they are developed, and provides the ability to read, copy, or 
print several standard reports from a master table.

Continuous improvement equals progress

The future will find us intensifying our involvement with others to 
extend and refine the standardization process. The project team will 
make itself available to campus groups and organizations that seek 
expertise and assistance with implementing standards in existing
systems. Additionally, we will continue to work with our data custodians
to develop and define ongoing data management functions and update job
descriptions accordingly.

Considerable progress has been made on standardizing the facility 
entity. We continue to refine the standardization process, collecting 
and disseminating standardized data definitions and standardized data
for the facility entity and census snapshots from other systems,
promoting compliance with standards in all data sources, and assisting
reengineering task forces with data requirements definitions. Our
primary focus is also shifting somewhat to the technical aspects of
distributing standardized information and documentation to users so they
can better manage their local functions and data and make decisions
based on quality information.

Lessons learned

It is important to take time to consider what has been learned when 
evaluating progress. As a result of our work on the facility entity, we
know considerably more about the process of standardization and are able
to move more swiftly as we take on other entities. Additionally, we see
improvement not only in the entities on which we have focused, but more
importantly, in the expanding campus awareness of the merits of
standardization. We are observing greater understanding of the
importance and specifics of conscientious data management at all levels
of the organization. We have learned the following:

  *  If the distributed system support personnel do not find improving 
the quality of their data to be rewarding, then it will not happen. 
Benefits include the opportunity to expand and exercise their 
professional skills, more effective use of technology to make their 
lives easier, better service to their users, and recognition for their 
visible improvements in terms of both end-user product and end-user 
support. Modifying their job descriptions to reflect their data 
management responsibilities and data administration activities is also 
effective motivation.

  *  From the typical user's viewpoint, occasional incorrect coding is 
not as serious a problem as the symptom, i.e., the inability to 
understand the meaning of data elements or the use of unstable or 
unknown criteria for including particular values in a database.

  *  The warm, fuzzy feeling of "making progress" must be augmented and 
supported by setting and meeting visible milestones to maintain resource 
support and morale.

  *  Success is very dependent on the project team's effective project 
management skills, as well as its ability to assemble and engage groups 
for both input and support. These skills should be reinforced and 
refined.

  *  Team members must have access to custodians and have at least some
champions in the upper administration. This is helped by treating
standardization as a cross-functional project and stressing the
importance of accountability and consensus building. Build on the
concerns and successes of your champions.

  *  Core team members should be physically located together and there
should be a minimum of two people dedicated full-time to the project.
This promotes the sharing of both technical and managerial skills, helps
keep focus on the project goals, and provides a degree of shelter for
coping with the inevitable frustrations of this type of project.

  *  A focus on quality and never-ending improvement gets everyone's
attention and shifts the focus from "why are we doing this?" to "what
can we do, how, and when?" Endless debates and the telling of "war
stories" are replaced with constructive discussions that purposefully
build on recognition of those things that are already "better than they
were." Looking back at "the way we have always done it" wastes
everyone's time.

  *  Credit for all improvements and successes should be spread as 
widely as possible. Individual ownership of project results fails to
acknowledge the importance of a broad base of involved and empowered 
"team players" necessary for enterprise-wide data management.

  *  There is no such thing as a tiny improvement. Every improvement is 
a "breakthrough," paving the way for the next ... and the next ... and 
the next. The PDCA cycle works well, providing prototypes that both 
demonstrate incremental improvement and serve as a baseline for 
continued improvement.

  *  There is still much we don't know. We have benefited greatly from 
the "lessons learned" and shared by other institutions and we recognize 
many commonalities. We will continue to seek out opportunities to learn, 
and to learn how to enjoy learning by sharing, as we all strive for 
higher quality information.

Looking ahead and stepping forward

As expected, more progress has been made thus far with the people and 
activity parts of the toolset than with the data and tools parts. 
Clearly, the prototype work we have done positions us to progress more 
quickly in successive system areas. As mentioned earlier, we also 
anticipate additional technical support to assist with appropriate tool 
selection and development to support standardization, data quality 
assurance, and, ultimately, readily accessible decision support systems 
with quality data.

Future activities will include those already begun by the project. 
However, they will continually be expanded to include more of the 
University community addressing more of the total information resource.
Also, greater attention will be paid to measuring improvement as the
breadth of improvement expands. As our efforts bring forth end-user
decision support products, there will likewise be new activity to
support the users of those products with training, consulting, meeting
with user groups, and continual product enhancement. This links directly
with the more traditional institutional research function.

Much of the groundwork has already been laid for each part of the 
integrated toolset necessary for distributed data management. The first 
sequential part of the toolset is _people_. This is the foundation we 
must continue to fortify. We do this largely by sharing our vision, 
stepping ahead, and then expanding our vision as we learn more. We are 
seeing our earliest vision become reality. We will continue to step 
ahead, inviting more people to become involved in formulating the 
expanding vision. Our strategy is to:

  *  Influence data custodians to include the data stewardship 
activities in their system support job descriptions.

  *  Assemble formal and informal University groups and focus their 
attention on data management obligations that should be part of their 
mission and purpose.

  *  Support development of a high-level University data model to show 
interfaces between major systems and point out priorities for future 
standardization work.

  *  Develop a prototype structure for the data management function that 
addresses coordination of data selection, capture and storage, 
manipulation, and delivery necessary to support management reporting 
requirements.

We realized from the beginning that there was no "quick fix" to the 
apparent data quality problems. We also realized that our plan, in 
total, assumed a cultural change that would take time. Nevertheless, we
set out with high expectations that we could make improvements and we
have been successful. Keeping our goals before us every step of the way,
we have tackled the job, one chunk at a time. We believe the future
holds only more of the same, which includes sharing credit and
celebration for each and every incremental improvement along the way.

************************************************************************

The Ten Commandments of Data Administration Standards

1. The first rule is that there are exceptions to every rule. No
standard is applicable in every situation. However, the DA staff must
not allow exceptions to become the norm.

2. Management must support and be willing to help enforce standards. If
standards are violated, management must assist in assuring that the
violations are corrected.

3. Standards must be practical, viable, and workable. Standards must be
based upon common sense. The less complicated and cumbersome the
standards, the more they will be adhered to. Keep standards simple.

4. Standards must not be absolute; there must be some room for
flexibility. While some standards must be strictly adhered to, most
standards should not be so rigid that they severely restrict the freedom
of the data designer.

5. Standards should not be retroactive. Standards are to control and
manage present and future actions--not to undo and redo past actions. In
most cases, standards enacted today cannot apply to data design that
began several months ago.

6. Standards must be easily enforceable. To achieve this, it must be
easy to detect violations in standards. The more the process of auditing
for the compliance of standards can be automated, the more effective
will be the standards themselves.

7. Standards must be sold, not dictated. Even if upper management
wholeheartedly supports DA standards, the standards must be sold to
employees at all levels. DA must be willing to advertise the standards
to all employees and to justify the need for such standards. DA
standards demand that programmers and analysts change the way they
design data. Any lasting and meaningful change must come from the
employees themselves.

8. The details about the standards themselves are not important--the
important thing is to have some standards. DA must be willing to
compromise and negotiate the details of the standards to be enacted.

9. Standards should be enacted gradually. Do not attempt to put all DA
standards in place at the same time. Once standards are enacted, begin
to enforce them, but do it gradually and tactfully. Allow ample time for
the non-DA staff to react and adjust to new standards. The
implementation of standards must be an evolutionary, rather than a
revolutionary, process.

10. The most important standard in data administration is the standard
of consistency--consistency of data naming, data attributes, data 
design, and data use.
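
Commandments 6 and 10 together suggest that naming consistency is a
natural candidate for automated auditing. The fragment below is a
minimal sketch of such a check in Python, not a description of any tool
used in the project. The naming rule it enforces (upper-case words
joined by underscores, ending in an approved "class word"), the list of
class words, and the function name audit_names are all hypothetical
and chosen only for illustration.

```python
import re

# Hypothetical standard: names are upper-case words joined by
# underscores, and the last word is an approved "class word".
# Neither rule comes from the article; both are assumptions.
APPROVED_CLASS_WORDS = {"CODE", "DATE", "NAME", "AMOUNT", "ID"}
NAME_PATTERN = re.compile(r"^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$")


def audit_names(names):
    """Return (name, reason) pairs for names that break the rules."""
    violations = []
    for name in names:
        if not NAME_PATTERN.match(name):
            violations.append(
                (name, "not upper-case words joined by underscores"))
        elif name.rsplit("_", 1)[-1] not in APPROVED_CLASS_WORDS:
            violations.append(
                (name, "does not end in an approved class word"))
    return violations


if __name__ == "__main__":
    # Sample element names, invented for this example.
    sample = ["STUDENT_ID", "enrollDate", "COURSE_TITLE", "TERM_CODE"]
    for name, reason in audit_names(sample):
        print(f"{name}: {reason}")
```

Run against a data dictionary extract, a check of this kind reports
only the exceptions, which leaves the DA staff free to decide case by
case whether a flagged name is a genuine violation or one of the
permitted exceptions described in Commandment 1.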

========================================================================

Footnotes:

1 K. L. Miselis, "Organizing for information resource management," in
J. B. Presley, ed., Organizing Effective Institutional Research Offices, 
New Directions for Institutional Research, Volume 66 (San Francisco:
Jossey-Bass, 1990), pp. 59-70.

2 L. Balkan and P. Sheldon, "Developing Guidelines for IRM: A Grassroots 
Process in a Decentralized Environment," CAUSE/EFFECT, Summer
1990, pp. 25-29.

3 G. W. McLaughlin and J. S. McLaughlin, "Barriers to information use: 
The organizational context," in P.T. Ewell, ed., Enhancing Information 
Use in Decision Making, New Directions for Institutional Research, 
Volume 64 (San Francisco: Jossey-Bass, 1989), pp. 21-34.

4 "A repository (IBM repository on its way)," Computerworld, 6 February 
1989, pp. 87-90.

5 W. R. Durell, Data Administration: A Practical Guide to Successful 
Data Administration (New York: McGraw-Hill, 1985), pp. 31-32.

========================================================================
