Getting the Right Fit: Institutional Downsizing Without Capsizing

Copyright CAUSE 1994. This paper was presented at the 1994 CAUSE Annual Conference held in Orlando, FL, November 29-December 2, and is part of the conference proceedings published by CAUSE. Permission to copy or disseminate all or part of this material is granted provided that the copies are not made or distributed for commercial advantage, that the CAUSE copyright notice and the title and authors of the publication and its date appear, and that notice is given that copying is by permission of CAUSE, the association for managing and using information resources in higher education. To copy or disseminate otherwise, or to republish in any form, requires written permission from CAUSE. For further information: CAUSE, 4840 Pearl East Circle, Suite 302E, Boulder, CO 80301; 303-449-4430; e-mail info@cause.colorado.edu

GETTING THE RIGHT FIT: INSTITUTIONAL DOWNSIZING WITHOUT CAPSIZING

Timothy J. Foley
Lehigh University
Bethlehem, PA 18015
E-mail: tjf0@lehigh.edu

ABSTRACT

Downsizing and rightsizing are buzzwords that have gained wide acceptance in the current computing literature. Can mainframes be replaced with high-performance clusters of workstations? That is a question many computing center directors are asking as high-end workstations eclipse the capabilities of traditional mainframes. At Lehigh, we have found that the answer, at least for us, is "Yes". Lehigh University has undergone a dramatic change in its computing environment over the last three years. Starting in 1991, Lehigh eliminated its three academic mainframes, introduced more than 150 workstations, installed compute and file servers, and deployed software to make this combination act as a unified computing system. The result has been much greater computing power, all financed from existing funds.

LEHIGH OVERVIEW

Lehigh University is an independent, academically selective, comprehensive university (4500 undergraduates and 2000 graduate students) which has been described by Lehigh's president as "small enough to be personal, yet large enough to be powerful". Lehigh has four colleges (Arts and Science, Engineering and Applied Science, Business and Economics, and Education) as well as 35 research centers and institutes.

In 1990, Lehigh's primary computing environment consisted of a CDC Cyber Model 180 purchased in 1985, a DEC Vax 8530 purchased in 1987, and two IBM 4381s purchased in 1986 and 1988. The IBM mainframes provided administrative support and support for Lehigh's Campus-Wide Information System (CWIS). Lehigh had also implemented a small workstation site with Sun workstations and provided over 300 microcomputers in public sites. By 1994, this environment had been transformed into a mainly Unix environment based on IBM RS/6000s running AIX, with three of the existing mainframes removed and over 150 RS/6000s placed in public sites.

Lehigh has installed three distinct server clusters to serve the requirements of the user community. One cluster of RS/6000s has been designated as compute servers and consists of two IBM RS/6000s, models 990 and 580. Another cluster is designated as network servers, which support the needs of our CWIS and consist of IBM RS/6000 models 990, 980, TC10, and 560.
Three RS/6000s have also been designated as AFS file servers to provide file services for the 150 distributed workstations. Along with this increased computing power, Lehigh's microcomputer support requirements have continued to grow, with over 350 public machines and thousands of other microcomputers appearing on campus. It should also be noted that some administrative applications have been moved to RS/6000s, though most of them are still running on the remaining IBM 4381.

Lehigh's high-speed networking activities have also been an integral part of our downsizing plans. In 1985, Lehigh installed a totally digital, non-blocking PBX providing voice and low-speed data connections (19.2 Kbps). Data connections were provided everywhere, including classrooms and student living units. Lehigh has since expanded its networking capabilities to include a high-speed fiber optic backbone supporting FDDI data rates of 100 Mbps. All of the Computing Center's sites are connected to the high-speed backbone, along with most of the campus buildings. Most of the subnets connecting to the backbone are 10Base-T Ethernets, with the exception of the public sites; these were the first sites networked and were wired with thin Ethernet. Connections to the residence halls were started in the fall of 1993, with plans for completion by fall 1995.

CATALYST FOR CHANGE

The major catalyst for changing Lehigh's computing environment was the development of a five-year strategic plan for computing and communications. This plan was developed in the spring of 1990 and finalized in 1991. The plan called for the removal of all academic mainframes and specified a transition to a Unix environment. At that time, the Computing Center was supporting the following operating systems: CDC NOS/VE, DEC Vax VMS, IBM VM/VSE, IBM VM/MUSIC, and the Apple Macintosh and MS-DOS/Windows environments.

The plan recommended a phased approach, with the removal of the academic mainframes and an upgrade to the administrative IBM 4381. Administrative applications were to be migrated to the workstation environment at a slower pace than the academic applications. The Vax 8530 and CDC Cyber 180 were to be replaced by a compute server; RISC, vector, and parallel machines were investigated as possible replacements. Another key aspect of the plan was the requirement to implement a common database throughout the campus. For this, the Oracle database was chosen and was used as the basis for replacing the CWIS running under the MUSIC operating system on the IBM 4381 mainframe. Other key components of the plan were the funding of distributed servers for academic departments (with research departments providing their own funding) and the financing of all these changes with existing funds.

IMPLEMENTATION STRATEGIES

A typical university implementation strategy was taken to plan for the removal of Lehigh's mainframes (i.e., form a committee with an "interesting" acronym). The acronym chosen was CINC (Computer Intensive Needs Committee), pronounced "sink", to enable Lehigh to downsize without capsizing. Often during this process we have had "CINCing" feelings, so the acronym seemed very appropriate. The committee was composed of three members from the four colleges and the Computing Center.

The first task of the committee was to survey our existing users. Users were queried on software applications, current satisfaction with our systems, possible conversion problems, and the characteristics they felt were needed in a new system.
The survey indicated that 84% of the mainframe usage was research related. Figure 1 illustrates what users perceived as the major limitations of the current machines, with response time being the largest problem. Figure 2 shows that users wanted at least 10 times the power of the CDC Cyber, with computing speed and better graphics rated as very important features.

[FIGURES NOT AVAILABLE IN ASCII TEXT VERSION]

Besides surveying the users of the system, the CINC committee also held an open faculty meeting (with 44 computing-intensive users attending) to discuss the five-year plan, the current hardware options, and users' current needs. The result of these activities was a report prepared for the Computing Center Advisory Committee (CCAC). The report stated that the current mainframes were saturated, that researchers had already begun moving to a workstation environment, and that high-speed network expansion was critical to the needs of computing-intensive users. The final recommendation was that, since about 90% of Lehigh's computing power was being consumed by 10% of the users, Lehigh should attempt to provide a computing solution to satisfy this 10%.

The development of a request for proposals (RFP) was the next step in the process of determining an appropriate replacement system. Eleven vendors were originally contacted, representing RISC, massively parallel, and vector architectures. Ranking of vendors was based on system capacity, application software, system management, end-user software, and three-year costs. Software availability was a major requirement and eliminated three of the vendors. Of the remaining vendors, seven responded to the RFP and were asked to run a set of representative benchmarks. The benchmarks were developed from representative user jobs and ported by the vendors to run on their systems.

The choice of vendors was narrowed to three (HP, CDC, and IBM) based on both the benchmark results and our rankings. During this time, negotiations were also occurring concerning a large acquisition of workstations. Because the benchmarks were inconclusive when comparing the top three vendors, the opportunity to work on a development project with IBM and the fact that IBM offered the best deal financially were the key factors in Lehigh's final decision to go with IBM and its RISC platform as our replacement solution.

MAJOR TRANSITION ISSUES

CWIS Migration

One of the first issues in replacing Lehigh's mainframes was the migration of Lehigh's campus-wide information system, which had initially been developed on an IBM 4381 running the MUSIC operating system. In 1989, it was decided to move to another platform, and work was begun to port the software. A major challenge was designing the system to remain consistent with our existing interface while planning a move to a client/server platform and an eventual graphical hypertext interface. The IBM 4381 system, which utilized flat ASCII files accessed through a control file, was replaced with a distributed database model using the Oracle database management system on a cluster of RS/6000s (a minimal sketch of this shift, using hypothetical names, follows below).

Lehigh's CWIS provides the following applications to a user base of over 7000 active accounts: electronic mail, bulletin board and conferencing facilities, access to national and international networks, on-line forms and survey processing, fax delivery and retrieval, and an access point for library services. This system is widely used by the campus, with 95% of the community using the system on a regular basis.
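As an illustration of the data model change described above, the short sketch below contrasts the old flat-file approach, in which a control file mapped each topic to an ASCII file, with the relational approach, in which topics become rows that any client (text menus, gopher, or Mosaic/WWW) can query. The table, column, file, and topic names are hypothetical, and the example uses Python with an in-memory SQLite database purely so it is self-contained; the production CWIS used Oracle on the RS/6000 cluster.

    import sqlite3

    # Old model (illustrative only): each CWIS topic lived in a flat ASCII file,
    # and a control file mapped topic names to file paths.
    control_file = {"news": "/cwis/news.txt", "events": "/cwis/events.txt"}

    def old_lookup(topic):
        # The menu system read the whole file and formatted it for display.
        with open(control_file[topic]) as f:
            return f.read()

    # New model (illustrative only): topics become rows in a relational table,
    # so any client can ask for exactly the items it needs.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cwis_items (topic TEXT, title TEXT, body TEXT)")
    conn.execute("INSERT INTO cwis_items VALUES ('news', 'Registration dates', '...')")

    def new_lookup(topic):
        # Fetch only the rows for one topic; topic coordinators maintain their own rows.
        return conn.execute(
            "SELECT title, body FROM cwis_items WHERE topic = ?", (topic,)
        ).fetchall()

    print(new_lookup("news"))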
Lehigh's goal of decoupling the user interface from the actual database was accomplished to some degree, and recent developments have allowed the client portion of the interface to run using Mosaic and the World-Wide Web. Another important design feature is the concept of "distributed services", encompassing both the management and the location of databases and applications. Data management activities are the responsibility of individual topic coordinators; topics on the CWIS are fed and nurtured by, and are the responsibility of, the information providers. Another aspect of distributed services is the ability to access designated hosts or data directories. Distributed applications are also being explored to allow selected applications to run transparently on another host. The overall goal is to move from a tightly coupled cluster supporting all available services to a more diffuse system, allowing data and processing power to be distributed to the most appropriate location.

Software Identification and Licensing

Software identification was first done by surveying users and contacting vendors about the availability of software. Committees were also formed to identify software which would increase the functionality of the workstations in the distributed environments. Reports were prepared on desktop environments, graphics/CAD software, text processing, databases and spreadsheets, mathematical software, scientific libraries, and statistical software. These reports resulted in obtaining a number of attractive site and floating licenses. The major site licenses obtained were Maple, NAG, and BMDP. Floating licenses were obtained for Matlab, Island Graphics (Write, Paint, and Draw), AutoCAD, WordPerfect, and Lotus. Another major agreement that made the IBM RS/6000 platform very cost effective was IBM's HESC (Higher Education Software Consortium) program, which dramatically reduced Lehigh's operating system costs. Lehigh's overall software costs for the CDC Cyber and DEC Vax were $150,000/year, while they are only $130,000/year for the RS/6000s. This is in contrast to the first estimates, which had put the software costs in the distributed environment at over $250,000/year. Figure 3 illustrates that Lehigh's major savings have been in the area of operating system costs, while application costs have been very similar.

[FIGURE NOT AVAILABLE IN ASCII TEXT VERSION]

User Training and Documentation

The overall transition to the new environment required a great deal of additional staff and user training and a revamping of practically all the existing documentation. User training was accomplished in a variety of ways, with special seminars developed to deal with users' transition problems. These were hands-on sessions which dealt with the new operating environment as well as with the special conversion problems that needed to be addressed in moving from the NOS/VE and Vax operating systems to Unix. The move from MUSIC to Unix was easier since we kept the same interface in moving from the IBM 4381 to the RS/6000s. Figure 4 illustrates the growth of our Unix-related seminars over this timeframe.

[FIGURE NOT AVAILABLE IN ASCII TEXT VERSION]

User documentation was, and still is, a big issue that needs to be addressed in the workstation environment. Essentially all existing documentation had to be revised or rewritten during this transition period.
The Computing Center investigated a number of tools to provide on-line documentation, such as IBM's InfoExplorer, Unix man pages, and possibly creating a searchable WAIS server. Initially, a simple text help file was placed on-line which listed all available commands and where to go for help on running them. The Computing Center has also started to provide all its documentation on WWW; User's Guides, Technical Bulletins, and seminar handouts are currently being converted into HTML documents.

Tape, Program, and File Conversion

The conversion of tapes, programs, and files presented another interesting problem for the Computing Center. Hundreds of tapes resided in the machine room, and many of them had been in the tape library for years. Each tape user was sent a list of his or her tapes and was also contacted by students hired to work over the summer to assist with tape conversions. This process went better than expected, and many users determined that the data they had stored for years was really not worth saving. Program conversion was handled by providing hands-on conversion training sessions and by individual consultation. Back-up sites were arranged to assist with tape and program conversion for users who had problems getting the conversion done. Files were migrated automatically for CWIS users, while Cyber and Vax users issued a command that transferred their files to individual Cyber and Vax directories on the compute server cluster.

Hardware Maintenance

A major cost issue associated with distributing hundreds of workstations throughout the campus is how to maintain these devices. The cost to provide reasonable maintenance for all of these devices was double our existing budget. Vendors were contacted and proposals were received from each. After analyzing the costs, it was decided to provide self-maintenance through our Computer Store with a parts contract from a parts supplier. Critical machines such as the compute servers, AFS file servers, and CWIS servers were kept on vendor maintenance. Figure 5 shows that overall hardware maintenance costs have been reduced by over $30,000/year.

[FIGURE NOT AVAILABLE IN ASCII TEXT VERSION]

Distributed Support Issues

Once the university entered the agreement to receive the 150 RS/6000 workstations, space needed to be found to house them. Departments made proposals outlining how they would use the workstations, the space they had to house them, and their software requirements. This resulted in the creation of 12 semi-public sites which were to be available to the public when not in use by the departments. Procedures were established for providing support for these sites, with each site having a department contact and Computing Center support personnel from Systems Programming, User Services, and Operations. Meetings were initially held to establish the guidelines for supporting these sites, and software requests were directed to Lehigh's existing software committee. A minimal set of documentation was provided for each site, with the emphasis placed on using on-line documentation for most tasks.

SUCCESSES AND PROBLEMS

The CWIS migration turned out to be a success and a problem at the same time. The transition to the new environment went very smoothly, with usage growing to new levels. The problem has been that the continual growth has put added strain on the system, which has been expanded to three servers and now exceeds 10,000 logins a day (see Figure 6).
[FIGURE NOT AVAILABLE IN ASCII TEXT VERSION]

The redesign of the system utilizing Oracle did make the transition to gopher and WWW much easier. Our current implementation allows the CWIS to be accessed through WWW with authentication, but most of the campus is still accessing the system in text mode using TCP/IP or serial protocols. Microcomputer POP mail programs and workstation mail are encouraged as users become connected to the backbone network. It is hoped that some of this, along with the transition to Mosaic/WWW, will reduce the load on the CWIS servers.

The implementation of the compute servers and the workstations throughout campus has dramatically increased computing usage. CPU usage was compared for the month of May 1991 and the month of May 1993; it had increased from 63,000 CPU minutes to over 1,800,000 CPU minutes, nearly a thirtyfold increase. Many researchers have been able to complete tasks that were not possible in the past on the existing mainframes. Another part of the equation, however, is that the 90/10 rule for usage referred to previously has changed to a 95/5 rule, with 5% of the users consuming 95% of the CPU (see Figure 7; a short sketch of how such a concentration figure is computed from accounting data appears at the end of this section).

[FIGURE NOT AVAILABLE IN ASCII TEXT VERSION]

Some reasons for this will be discussed in the next section, which shows that the nature of the work being carried out on the workstations has changed from what was previously done on the mainframes. Another success, and also a problem, has been the use of the AFS file system, which has provided a common userid and password for all of our systems along with support for distributed files and security features. However, some serious AFS reliability problems were encountered during the last few years, which have been very frustrating to our users.

The implementation of the distributed environment has also dramatically increased our disk and tape management problems. As workstation use has grown, disk requirements have increased at a much higher rate than in our previous mainframe environment. System backups are taking an inordinate amount of time, and a project is currently underway to examine hierarchical backup systems. During the transition, our overall disk capacity went from 35 gigabytes to over 100 gigabytes without any sign of this demand stabilizing.

User Survey Summary

To determine the extent of usage and also to find out how users felt about the new environment, a survey was distributed to all workstation users. The survey results indicated that 54% of the respondents were from the Engineering College, 31% from the Arts and Science College, and the remainder from the Business and Education colleges. The results showed that most users were satisfied with what they could do on the workstations. Users were queried on their satisfaction in 16 areas relating to the new environment; in all areas there was a surprising level of satisfaction with the workstation environment (see Figure 8).

[FIGURE NOT AVAILABLE IN ASCII TEXT VERSION]

The lowest-rated factor was response time, with 68% of the users satisfied or very satisfied. The survey also found that there was a major shift in how the new machines were being used. In the past, the major use of our mainframes was for research; in the distributed environment, only 33% used the systems primarily for research, while 54% used them for communication purposes and 10% for course use.
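The 95/5 figure cited above comes from Lehigh's accounting data, which is not reproduced here, but the arithmetic behind such a figure is straightforward: sort per-user CPU totals from heaviest to lightest, accumulate down the list, and note what fraction of users it takes to reach 95% of all CPU minutes. The sketch below, in Python, uses invented user names and minute counts solely to make the example runnable.

    # Invented accounting totals: one heavy user and nineteen light users.
    cpu_minutes = {"heavy_user": 1_710_000}
    cpu_minutes.update({"user_%02d" % i: 4_000 for i in range(19)})

    def share_of_users_for(cpu_minutes, target=0.95):
        # Fraction of users (heaviest first) needed to reach `target` of total CPU time.
        totals = sorted(cpu_minutes.values(), reverse=True)
        grand_total = sum(totals)
        running = 0
        for count, minutes in enumerate(totals, start=1):
            running += minutes
            if running >= target * grand_total:
                return count / len(totals)
        return 1.0

    print(f"{share_of_users_for(cpu_minutes):.0%} of users consume 95% of CPU time")

Run against real per-user accounting totals, the same loop reports how concentrated usage is in any given month, which is how a shift from a 90/10 to a 95/5 pattern can be tracked over time.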
KEY FACTORS AND SUMMARY

User involvement was a major factor in accomplishing the transition to a distributed environment. It was crucial to keep the user community informed and involved in the decision-making process. Lehigh was also under strict financial constraints, so everything had to be done within the context of existing budgets and the reallocation of funding from the former mainframe budgets. The creation of an RFP and user benchmarks also helped to clarify computing needs; the benchmarks helped in eliminating some vendors and in determining that there was little distinction between the performance of the three top companies. Another key factor was the establishment of very aggressive timelines and goals. Often these seemed unreasonable, but they were driven by our financial constraints. Finally, a willingness to compromise by taking part in a development project allowed this transition to be made within our existing budgets.

In summary, Lehigh was able to remove three mainframes within a nine-month timeframe, deploy 150 workstations in 18 months, and increase its overall computing power by a factor of 500, all financed from existing funds. Users have access to more computing power, better interfaces, and more advanced software, thanks in large part to the savings realized from eliminating the overhead and maintenance on the former mainframes.

Administrative Computing is still running most of its applications on the IBM 4381. Admissions and Development software are running on RS/6000s, but in general the administrative transition is going to take considerably longer than the academic transition. Microcomputers also remain a very major factor in Lehigh's computing structure; they have not been reduced and replaced by workstations, as was suggested in our five-year plan. A new five-year plan is currently under construction which will stress the enhancement of the new computing environment along with goals to further incorporate this technology into the educational environment of Lehigh's students.