Campus-Wide Information Systems: User Publication of Documents

Copyright 1993 CAUSE. From _CAUSE/EFFECT_ Volume 16, Number 4, Winter 1993. Permission to copy or disseminate all or part of this material is granted provided that the copies are not made or distributed for commercial advantage, the CAUSE copyright and its date appear, and notice is given that copying is by permission of CAUSE, the association for managing and using information resources in higher education. To disseminate otherwise, or to republish, requires written permission. For further information, contact Julia Rudy at CAUSE, 4840 Pearl East Circle, Suite 302E, Boulder, CO 80301 USA; 303-939-0308; e-mail: jrudy@CAUSE.colorado.edu

CAMPUS-WIDE INFORMATION SYSTEMS: USER PUBLICATION OF DOCUMENTS

by Beth Ardoin and William Weems

ABSTRACT: Two years ago, the Office of Academic Computing at the University of Texas-Houston Health Science Center began to implement a campus-wide system for the dissemination of information within the University. In the last eighteen months that system has been tied into a world-wide system via the Internet. This article explains the growth of that system and how the placement of information into the system is managed.

During the last six years, the University of Texas-Houston Health Science Center has developed an open-systems network, based on the Transmission Control Protocol/Internet Protocol (TCP/IP), that currently interconnects over 3,300 computers. This network continues to expand at a rate of about 200 nodes per month and is part of the larger Texas Medical Center Network that includes over 5,300 Internet nodes.

The goal is to create an integrated information system that maximizes the sharing of all computer-based information. This system must provide information from local as well as international sources directly to individuals as needed, and must evolve as technology and standards change.
It must also allow information to be easily added to the system by its users.

The University's TCP/IP-based network began in response to federal initiatives designed to standardize on a single transport protocol for research funded by the National Science Foundation and the National Institutes of Health. Concurrently, most UNIX workstations that were commercially available were being shipped with Ethernet ports and TCP/IP software. We began to realize that it might be possible to use the basic TCP/IP protocols to provide a set of common information services for virtually any type of computer. These services included the simple mail transfer protocol (SMTP) for electronic mail, "telnet" capability for terminal access to remote hosts, the file transfer protocol (FTP) for the exchange of ASCII (American Standard Code for Information Interchange) and binary files, and the network file system (NFS) for client/server-based file exchange.

As TCP/IP-based local area networks (LANs) started to appear within the Texas Medical Center, the University of Texas Office of Telecommunications provided links for these LANs to become a part of the Internet. This access allows the same protocols for information sharing that had been used locally to be used in accessing the Internet.

With several existing LANs and wide-area networks on campus and the promise of more, it was obvious that there was a demand by students, faculty, and staff for Internet access. A number of administrators and users were realizing the feasibility and convenience of "posting" documents on the Internet rather than dispersing the information using traditional methods. Thus, the idea of implementing a campus-wide information system (CWIS) emerged.

Homegrown CWIS implemented

To meet the immediate demand for the distribution of ASCII-based documents, the Office of Academic Computing (OAC) developed an in-house CWIS consisting of script files forming a menu-based system.
The system was accessible only through computer terminals, either via an anonymous telnet session or dial-up phone lines. The menus and associated processes were "hard-coded" using UNIX script commands, and were not standardized. Protocols such as FTP, REXEC, and NFS were driven by script or C programs to automatically obtain information from other Internet nodes and incorporate it into ASCII files for distribution. However, no attempts were made to integrate our CWIS, as a whole, with other campus systems.

The initial CWIS clearly demonstrated to the campus community that a distribution system that primarily delivered ASCII text had valuable potential. In particular, it was recognized that (1) users would always know where to look for information, (2) desired information could be accessed from any Internet node, (3) updated information would be immediately available, and (4) publication costs would be drastically reduced.

Although these strong points were appreciated, the system definitely had shortcomings. It suffered in that (1) it could not be efficiently maintained, (2) it did not adhere to a user interface that was standard for similar information systems, (3) its existence was not readily visible to the world, (4) it provided no standard method to transparently access other functionally similar information resources, and (5) it was not a client/server-based system that coupled the resources of personal computers with those of information servers.

WAIS introduced

As these advantages and limitations were being considered, the Wide Area Information Servers (WAIS) system was developed by Thinking Machines Corporation, Apple Computer, Dow Jones, and KPMG Peat Marwick. This client/server-based system permits clients on both personal and larger computers to access text, images, voice, or formatted documents located on WAIS servers.
As a searching tool, WAIS makes finding relevant text in a document easy, because the WAIS search engine, not the user, does the work. The WAIS indexing process categorizes the words and makes every word of the text searchable, with the exception of articles, conjunctions, and similar common words. Combinations of words can even be used. When a phrase is submitted for search, the WAIS search engine first returns the closest matches it can find to the entire phrase, then lists all references that contain each of the individual words.

While WAIS is an excellent tool for searching if you are looking for specific data, the system of WAIS servers also has weaknesses. For example, the list of WAIS sources has grown by leaps and bounds. Although there is a list of registered sources, no hierarchical menu exists that groups the sources by subject. Also, you cannot just peruse the text without first trying a word search. If you just want to read a small portion of the text, WAIS is not the tool to use.

For the University, WAIS became a dynamic tool. The following are a few examples of how the WAIS system has demonstrated its usefulness.

* The University's Office of Information Services (which oversees administrative systems) works with the Human Resources department to pull together information on all employees, including names, departments, home and work addresses, phone numbers, and so forth. This information, used with the WAIS tool, allows anyone with Internet access to search the campus phone book to locate an employee or employees by name or other descriptive information.
* The Office of Research Services gathers information on all active labs, research being done, papers written, animals and chemicals used, and so forth, and compiles this information into the Catalog of Research Expertise. With the WAIS system, researchers on or off campus can find UT-H labs working with any variety of chemicals, techniques, or animals.
* The Purchasing and Receiving departments have used a similar process to help researchers and staff locate hazardous material numbers.

As a stand-alone system, however, WAIS does not provide a basic, integrated CWIS. In particular, the protocol does not provide for a standard user interface that appropriately organizes documents and services so that they can readily be browsed according to subject. The protocol also does not specify a standard, intuitive method for adding available sources to WAIS clients.

To partially resolve these problems, text- and X-windows-based WAIS clients were implemented on the host supporting our initial CWIS. This approach allowed users to be directed by the CWIS menus to appropriate WAIS sources. However, it negated the important advantages associated with having clients running on personal computers so that information is directly and transparently delivered to users' desktops.

Gopher protocol adopted

Since the demand for a more functional, expanded CWIS continued to grow, we systematically began to investigate the efforts of other universities and groups to develop a standard CWIS protocol. As a result of this investigation, we concluded that the Gopher protocol for document distribution developed by the University of Minnesota would be used to implement a new CWIS for the Health Science Center.

This protocol was selected for a number of reasons. For one, users see the world-wide networked system of Gopher servers as a single hierarchical system of documents, directories, full-text search tools, and other services. Second, it is a client/server-based system that enables users to easily and transparently access servers anywhere on the Internet and functionally transfer data to their personal workstations. Third, it incorporates a well-defined set of standards for defining data types, establishing connections between servers, transferring data, linking menu entries to data files, etc.
Fourth, it is becoming a de facto standard for colleges, universities, and other organizations that utilize the Internet for the rapid exchange of digital information. Finally, Gopher is a recommended protocol for document distribution within the NSF implementation plan for the Interagency Interim NREN.

Our Gopher-based CWIS was activated in March of 1992. Users were immediately impressed with the system's ease of use and simple design. Gopher requires virtually no training for access or use of the system. One of its best selling points is its connectivity. Gopher servers, once listed by the Minnesota Gopher, are all capable of being interconnected. To move from one Gopher server to another, the user merely chooses a different topic. The addressing and other connectivity issues are all invisible to the user. It is as though any information available via any Gopher server resides on the user's personal computer. Files can be mailed, saved to disk, or printed by an average user, without training.

Converting our CWIS to Gopher necessitated the creation of an editorial staff and new document publishing tools.

Editorial staff function required

Although the OAC had only seven staff members, training and experience among this group were diverse. Two had newspaper or yearbook experience (layout/design), four were programmers, one was a systems expert in UNIX, three had expertise in PCs, one was a Mac expert, several had integration experience, and three had teaching degrees. With these overlapping skills, we were able to draw on the strengths of each to implement the Gopher server.

A team was formed for that purpose, made up of an "editor-in-chief," who headed the team and oversaw all work, and other staff who had responsibility for various pieces of the whole. It was imperative that the group work as a team, rather than each holding tightly to his or her own piece of the project. In addition to the editor, the team included programming and design functions.
With such a team, we expected the end product would be usable, effective, and aesthetic.

The assistant to the OAC director took on the responsibility of editor-in-chief, because of her background in publication, while the director assisted in editing and creating the menu structure. We felt it was more important for the editor-in-chief to have strong communications and information skills than to have programming skills. This helped avoid the possibility that the end product would be geared only toward computer-proficient users. The CWIS had to be easy to use to attract non-traditional computer users.

The editor-in-chief became responsible for gathering information for inclusion, assisting information providers in preparing their information properly, creating the overall vision of the CWIS, and structuring the menu. She also found that creating handouts of instructions for information providers and teaching 1-1/2-hour courses on how to access the CWIS greatly increased use and interest, so these were added to her responsibilities. The computer category on the CWIS was structured and is still maintained today by OAC's assistant director, who heads the programming staff.

Once the Gopher server was up, the editorial staff (consisting primarily of the editor-in-chief from the development team and the director) assumed several ongoing responsibilities. Primary among them is maintaining a state-of-the-art electronic publishing environment. This requires the staff to evaluate and implement document distribution protocols, publishing servers, and reader clients. Appropriate software tools to support publishing activities are also developed as needed. These tools are requested by the editorial staff and developed by the programming staff.

A second responsibility is the layout and production of the CWIS. This includes assisting the information providers in defining appropriate text for distribution, as well as the menu and submenu categories.
When campus groups become interested in publishing new information on the CWIS, some editing requirements apply. While we agree to hand-hold the information providers in formatting the documents for publication, it is important to note that the responsibility for producing ASCII documents lies in their hands, not the OAC staff's. As a starting point, a list of standards was created (see sidebar for a sample of some of these). When ASCII documents are appropriately formatted, the editorial staff assists in creating or modifying existing menus and assigning menu ownership. Owners of information are often provided with appropriate tools that enable them to self-publish their material via the CWIS. If necessary, the staff will enter the information for the owner until tools can be created.

The editorial staff is also responsible for monitoring the overall look and feel of the CWIS. These activities include (1) confirming that menu pointers to remote information sources remain functional, (2) ensuring the general availability and timeliness of documents, (3) checking the functionality of WAIS sources, and (4) guaranteeing adherence to certain basic standards. These standards include conformance to the formatting guidelines for ASCII text, the inclusion of ownership information within documents, conformance to copyright law, and the general readability of documents.

Finally, the editorial staff has had to assume certain public relations responsibilities. The "reader" base had to be expanded by advertising, assisting users, analyzing usage trends, and responding to user suggestions. The publishing service also had to be actively promoted by contacting possible information providers, demonstrating both user and publishing aspects to various individuals, and responding to queries about the publishing capabilities of the CWIS.
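
The first monitoring task listed above, confirming that menu pointers to remote sources remain functional, lends itself to partial automation. The following is a minimal sketch in Python, not a tool described in this article: it assumes the standard Gopher menu line layout (a one-character item type followed by the display text, then the selector, host, and port separated by tabs), and the names MenuEntry, parse_menu, and pointer_is_alive are hypothetical.

```python
import socket
from typing import List, NamedTuple


class MenuEntry(NamedTuple):
    item_type: str  # one-character Gopher item type, e.g. "0" file, "1" directory
    display: str    # text shown to the reader in the menu
    selector: str   # string sent to the server to request the item
    host: str
    port: int


def parse_menu(text: str) -> List[MenuEntry]:
    """Parse a Gopher menu listing into entries, skipping malformed lines."""
    entries = []
    for line in text.splitlines():
        if line == ".":  # a lone period marks the end of the listing
            break
        fields = line.split("\t")
        if len(fields) < 4 or not fields[0]:
            continue
        try:
            port = int(fields[3])
        except ValueError:
            continue
        entries.append(
            MenuEntry(fields[0][0], fields[0][1:], fields[1], fields[2], port)
        )
    return entries


def pointer_is_alive(entry: MenuEntry, timeout: float = 10.0) -> bool:
    """Return True if the remote server accepts the selector and sends any data.

    This only shows that something answered; it does not judge whether the
    returned document is the one the menu promises.
    """
    try:
        with socket.create_connection((entry.host, entry.port), timeout=timeout) as conn:
            conn.sendall(entry.selector.encode("latin-1", "replace") + b"\r\n")
            return bool(conn.recv(1))
    except OSError:
        return False
```

A periodic job could fetch each menu, call parse_menu on it, and report every file ("0") or directory ("1") entry for which pointer_is_alive returns False, leaving the judgment about timeliness and quality to the editorial staff.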
Publishing tools developed

Publishing tools have been developed in ANSI C and Bourne shell scripts to enable users to directly publish information from either their desktops or from mainframe hosts.

One of our publishing tools is called Mailman. Mailman was developed to permit users to publish ASCII documents directly, via e-mail. The process begins when a document is created for publication. Before publishing, the document must be converted to ASCII format. Then it can be mailed to the user "Gopher" (or another specified userID) on a computer that is running the Gopher server daemon. (The Gopher daemon is a utility program running in the background on the server, awaiting queries from Gopher clients.) Mail addressed to this designated userID that is received by the host is sent to the Mailman program.

Mailman checks to determine if the person sending the message from that particular host is in a list of approved user/host pairs with permission to remotely publish information. If the userID at the specified host/domain is included on this list, Mailman checks the subject field of the message to determine where in the hierarchical Gopher system the information is to be placed. In other words, the Mailman program checks to see that the userID, host/domain name, and subject heading all match. If all checks out, the ASCII file is published.

As an added safeguard for quality, the Mailman program then sends e-mail to the editor-in-chief, with information on which files have been updated. This allows for checking of quality and format. A diagram of this process is shown in Figure 1. (Figures not available electronically)

A second publishing tool created by the programming staff, MakeWAIS, was produced to convert information from mainframe computers. When a report is generated, either at set intervals or on demand, it is converted to an ASCII text file from the appropriate database, located on the mainframe computer.
The ASCII file is then retrieved by a UNIX-based computer, using the FTP protocol. Once the data are on the UNIX server, the MakeWAIS tool indexes the data into a full-text WAIS source. As on all WAIS servers, the file is then completely searchable.

When MakeWAIS creates a WAIS index, the program produces several files. These files need to be moved to proper areas in the Gopher server. The MakeWAIS tool moves all of these files. If the information becomes obsolete, the CleanWAIS tool retrieves all of the various files for that database and removes them from the system. This process is illustrated in Figure 2.

These tools, plus the UT-Houston Gopher client (Ugopher), are available for download in the anonymous FTP archives on: oac.hsc.uth.tmc.edu

As the system develops further, new tools will need to be created to help users publish with ease. It is our opinion that any new tool requiring an entire pamphlet of instructions should not be given to a user, but rather thrown into file 13. The Gopher protocol is an elegant model of ease and usability that should be copied.

Moving forward

The University's campus-wide information system changes daily and will continue to do so. In the eighteen months we have used these tools for our CWIS, we have found that interest has blossomed.

Small classes of ten to twenty employees are held regularly on campus to introduce the system. We feel it is important to show the system to all levels of employees in the actual environment where they will use it. Also, the small classes allow for interaction and questions, spawning new interest. We hope with each class given that one or two more people will find the system an aid to their jobs, and that they will identify more information to include in the CWIS.

The one-on-one work with information providers is still necessary. Although it means that the CWIS staff need to learn several systems, in this age of rapid technological change, this is seen as an asset.
Since more and more of our interaction is done via e-mail, these meetings allow the OAC staff and the information providers to match faces and voices with e-mail addresses. This, we think, greatly enhances the team feeling. We have also added an additional staff person to upload information and assist in the day-to-day office work, to make more time available for classes, hand-holding, and interest-building.

The UT-Houston campus-wide information system is still growing and developing; we are still adding information to the CWIS via Gopher and will do so for some time. At the same time, we have already begun building a sister system in World Wide Web ... but that is the subject of another article.

******************************************************************

UT-H Health Science Center's Guidelines for Preparing Text for the CWIS

Avoid the use of tabs; instead use spaces. ASCII tabs are not converted to a constant number of spaces by all display devices. Thus, the use of ASCII tabs often causes text to be displayed in a jumbled fashion.

Delete bullets or other special symbols such as Greek characters and/or replace with standard ASCII characters. Character-based terminals can only display ASCII symbols.

Single-space lines of text, but double-space between paragraphs. This maximizes the amount of text that can be displayed per screen while preserving readability.

Use a non-proportional font, and limit line length to a maximum of 78 characters per line. If all characters do not take the same amount of space, it is difficult to ascertain whether each line has fewer than 80 characters. Lines that exceed the 80-character limit will either wrap improperly or run off the screen.

Place headings in capital letters and text in upper and lower case. This enhances the identification of the beginnings of thoughts and divisions of chapters, sub-chapters, etc.

Place identifiers around headings (dashes or equal signs).
This enhances the readability of the material and prepares the text for conversion to a WAIS index, if desired.

Prepare text in a neat, orderly, and easy-to-read format for both paper and electronic copies. For first drafts we suggest submitting both a hard copy and an electronic copy, in case there are errors in the formatting.

******************************************************************

Beth Ardoin is the Administrative Assistant for the Office of Academic Computing at the University of Texas-Houston Health Science Center. Her background is in education and publishing. She is editor of the campus-wide information system and directs the distribution of information on the American Physiological Society Gopher on the Internet. She has a bachelor's degree in education and communications from the University of Southwestern Louisiana.

William Weems is Director of the Office of Academic Computing (OAC) at the University of Texas-Houston Health Science Center. The OAC coordinates the academic computing efforts of the six schools that make up the University of Texas-Houston. He has played a major role in developing the Texas Medical Center network, which includes forty-one separate institutions. Dr. Weems holds a Ph.D. in physiology and biophysics from the University of Illinois.

************************************************************************
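
As a footnote to the sidebar guidelines above, the purely mechanical rules (no tabs, standard ASCII characters only, lines of at most 78 characters) could be checked automatically before a document is submitted. The following Python sketch is an illustration only and is not one of the OAC tools; the function name check_text and the wording of its messages are assumptions, and rules that call for editorial judgment, such as heading style and overall neatness, are deliberately left out.

```python
from typing import List

MAX_LINE_LENGTH = 78  # the maximum line length stated in the sidebar


def check_text(text: str) -> List[str]:
    """Return one message per guideline violation found in the document."""
    problems = []
    for number, raw_line in enumerate(text.split("\n"), start=1):
        line = raw_line.rstrip("\r")  # tolerate files with CR-LF line endings
        if "\t" in line:
            problems.append(f"line {number}: contains a tab; use spaces instead")
        # Anything outside printable ASCII (tabs are reported separately above).
        if any(ord(ch) > 126 or (ord(ch) < 32 and ch != "\t") for ch in line):
            problems.append(
                f"line {number}: contains a non-ASCII or control character"
            )
        if len(line) > MAX_LINE_LENGTH:
            problems.append(
                f"line {number}: {len(line)} characters; limit is {MAX_LINE_LENGTH}"
            )
    return problems
```

An information provider could run such a check on a draft and correct whatever it reports; an empty result means only that the mechanical rules are met, not that the document is ready for publication.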