This paper was presented at the 1997 CAUSE annual conference and is part of the conference proceedings, "The Information Profession and the Information Professional," published online by CAUSE. The paper content is the intellectual property of the author. Permission to print out copies of this paper is granted provided that the copies are not made or distributed for commercial advantage and the source is acknowledged. To copy or disseminate otherwise, or to republish in any form, print or electronic, requires written permission from the author and CAUSE. For further information, contact CAUSE at 303-449-4430 or send e-mail to [email protected].


A Partnership Approach to Funding Research Computing

Donald F. McMullen, Ph.D.

Advanced Information Technology Laboratory

University Information Technology Services

Indiana University, Bloomington

 

Providing facilities and support for excellent computational research is a costly matter. Characteristics of high-end computing that make it costly include computing equipment at >$1M/box or $50-100K per user, complete equipment turnover about every two years, large amounts of mass storage, high bandwidth network access, complex system administration tasks, and deep expertise in end-user support. Most college and university computing centers don't have budgets to meet these exceptional needs. Given the critical role that simulation and computation play in research in many disciplines, the problem remains how to provide high-end facilities that significantly extend the reach of one segment of a university's computing community without creating inequities in overall funding for instructional, general academic, and administrative computing. We present a model that emphasizes partnership and leveraged funding to provide appropriate hardware and support infrastructure for building centralized research computing facilities. Key elements of this model include good communication with end-user groups, correct timing, and solid partnerships with research support units in the university, external funding agencies, and vendors.

 

Introduction

Research computing at Indiana University spans a range as diverse as the interests of the faculty. As a starting point let us define research computing as computing done in support of or related to the research activities of faculty members, with little relevance to other segments of the university computing community. Indeed this is an impure definition because there is a continuum between teaching and research and considerable overlap in tools, methodologies and content. However, it is still a useful definition to get a handle on the "high end" of research computing, the expensive equipment and facilities. Components of research computing nominally include cycles, storage, networking, graphics and visualization facilities, and massive storage systems.

Several areas fall rather obviously into the research computing category:

Computational science is a critical element of research in many disciplines and together with theory and physical experiment forms the basis of the scientific method. As theories become more complex, the processes of predicting observations and planning experimental campaigns increasingly depend upon the use of computers. Econometric models now vie in computational complexity with problems in the physical sciences. A consequence of developing more complex and realistic models to test theory is that ever increasing CPU "horsepower" is needed to keep these researchers competitive in their fields and to maintain the university's reputation as a first class research institution.

Visualization is a required tool in most computationally intensive research programs. The goal of visualization is to leverage existing scientific methods by providing new scientific insight through visual methods. Computational models are so complex that tables of numbers are not sufficient to understand the results of a calculation--it is often necessary to create a visual representation of the results. This is true not only to build a Gestalt of the calculation but also to easily detect anomalous or interesting results. Good visualizations are difficult to construct and require considerable expertise, high-end graphics equipment, and software to generate.

Library information systems contain pointers to published information, electronically published literature of a scholarly nature, and digital audio, image and video media. The library is the heart of the university and the systems used to store, catalog and find library materials are critical to the mission of the institution and the academic careers of the faculty. Storage and delivery requirements associated with digital libraries are driving the development of massive storage systems and networks.

Large data sets from, for example, satellites and demographic surveys are becoming increasingly important in earth sciences and social science research. Data taken from one severe weather event may approach a terabyte (10^12 bytes) in size, and telemetry, photometry, and photography from satellites are continually being acquired. The problems of storing and managing huge amounts of data are the focus of several research efforts, most notably at NASA and in the Department of Energy. Similar problems loom in the health care arena as the technological possibility of easily available patient records including radiology imaging and body telemetry creates the demand to make such data sets accessible. Typical institutional data storage needs for R1 institutions, hospitals and government laboratories are projected to be in the petabyte (10^15 bytes) range by the end of the decade.

Collaboration technologies are becoming increasingly important as the climate at research funding organizations turns toward supporting groups rather than individuals. These collaborations are increasingly composed of researchers at geographically disparate locations and need conferencing facilities, shared workspaces, common file systems, and shared access to computing resources (CPUs, archival filestores, etc.). Collaboration technologies are built upon networking, high-end visualization, and massive storage facilities.

Network infrastructure is a critical component of all of the above. Network bandwidth and connectivity to regional and national networks are critical to the success of individual and collaborative efforts. High bandwidth campus networks and external connections are important prerequisites for digital library access and scientific collaboration.

Support for research computing at Indiana University comes from many sources including (internally) budget lines in the central computing organization, school and departmental funds, the university's sponsored research office, and (externally) grants and contracts. Traditionally we have allocated a certain portion of our central computing dollars for research computing and have used those funds to purchase large, general purpose shared resources. These resources are used extensively and are under constant demand to expand, a demand that cannot always be met.

In recent years we have been experimenting with a funding model that involves matching priorities between the university's computing organization and the sponsored research unit. This effort emphasizes the construction of faculty collaborations that require large capital outlays for computing equipment which could be supported by external funding agencies. Proposals from these collaborations to external agencies are leveraged by a portion of the dollars that would have been spent on general research computing by the computing organization and supported by the sponsored research unit through additional matching funds as needed to meet agency requirements.

 

What are the issues in research computing?

Considerable change has occurred over the last fifteen years in the way scientific inquiry is conducted. With increased complexity in theory comes the need for high performance computing to test theory, analyze observations and experimental results, and predict the results of future experiments. The mathematician Richard Hamming has said, "The purpose of computing is insight, not numbers." Indeed scientific visualization provides this insight. Visualization has become an indispensable adjunct to scientific computing, allowing researchers to see and understand their data in ways not possible before. With simulation as the "third pillar" of science it is clear that excellence in science, scholarship and creative work requires a commitment to continuous provision of excellent computing resources and infrastructure. Unfortunately, other characteristics of the tools of research computing include high capital cost, rapid obsolescence, and the expertise needed to support high performance computing equipment and software, creating a challenging conundrum.

A major direction in national computing and networking priorities was set by the High-Performance Computing Act of 1991, which established the High Performance Computing and Communications (HPCC) program as a coordinated federal effort "to help ensure the continued leadership of the United States in high-performance computing and its applications..." The National Science and Technology Council's (NSTC) Committee on Computing, Information, and Communications (CCIC) has assumed oversight of R&D programs conducted by twelve Federal departments and agencies, and coordinates research in cooperation with U.S. academia and industry. This high level organization of HPCC activities suggests that aligning the university's research computing objectives and efforts with national priorities will increase both the relevance of the underlying science done to the larger academic and industrial community and the probability of obtaining external support for the required infrastructure.

Change has also occurred in the way scientists work together. The need for much increased coordination in national research programs and fiscal responsibility in the allocation of Federal research money has created a new, collaborative view of science. Funding agencies are actively rewarding collaborative research efforts. The primary implication of group science is the increased need for the availability of high-speed networks and network based resources including high performance computers, global file systems, and improved collaboration technologies. These include shared whiteboards, video teleconferencing, and tools to facilitate data sharing, code development and writing by geographically dispersed groups.

Beyond the intrinsic importance of the research supported by advanced infrastructure we can identify several other interests within the university for support of research computing as follows:

* The sponsored research office's interest in acquiring external funding for faculty research and university development projects.

* The university's need to attract and retain excellent faculty.

* The responsibility the university has to provide the best possible learning environment so that students are prepared to meet the challenges at the leading edges of their fields.

Funding models

An ecology of funding models for supporting the equipment and personnel requirements of high-end research computing is in use at Indiana University. Roughly speaking, these are rooted in the organization of the university and include support from the departmental or school level, the central computing organization, the sponsored research office in the form of seed grants, and external granting sources.

The first of these, total funding by the faculty member's school or department, emphasizes focus; that is, the result will be closer to meeting the specific needs of the researcher, but at the cost of potentially lower levels of funding and therefore project scope. The second, support from the general operations of the central computing organization, exchanges focus, or the precise relevance of the equipment and support provided, for a broader set of computing resources.

A third path, leveraging sponsored research office seed grants by individuals or small groups of researchers to obtain other internal funding, provides overall higher levels of funding for the project at hand and results in acquisition of the exact set of resources needed for the project. Typically, computing equipment acquired by this mechanism remains operationally with the researcher or group in their home department. This frequently ends in increased support requirements for that department, and in some cases the cost of ownership can be staggering. In one particularly pathological case of leveraged funding gone bad, the acquisition of one somewhat esoteric computer by a senior faculty member required nearly three-quarters of the departmental computer support technician's time to maintain. Needless to say, this was not a win for the researcher's department and undoubtedly resulted in an overall lowering of productivity for the entire department. The politics of these arrangements are somewhat one sided to the advantage of the grantee, and departments as a whole cannot plan well for these kinds of acquisitions.

Still, this method of using seed money grants is used broadly and correctly at our university as a method for quality improvement in individual research programs and as a faculty retention mechanism. The idea of leveraged or shared funding of specialized computing resources can be elaborated to include the use of alliances within the university to provide the necessary ingredients (i.e. funding, environmental support, operational support, administration, training, and end-user consulting) in a carefully planned manner. The relevance to the success of a project of having the right funding and people at the right times cannot be overstated. The construction and operation of these kinds of alliances in the university context is a nontrivial learned organizational behavior. However, once there is an environmental expectation that a diverse set of units in the university can work together to provide resources for a research or development project, the stage is set for the emergence of powerful synergies. We believe that an environmental expectation of being able to build intra- and inter-institutional partnerships to support individual research computing projects is a valuable strategic asset.

The partnership model

As in many research institutions, academic research computing at Indiana University grew out of the needs of a few hardy pioneers who constructed the IU academic computing center of the 60's and 70's with strong and productive R&D partnerships with IBM and Control Data Corporation. Declines in local support for research computing led to a crisis in the late 80's that resulted in a commitment by the Dean for Academic Computing to support high-end research computing by establishing core support groups outside the central computing organization. The charge to these groups was to assist in supporting leading-edge technology, statistical computing, and later humanities computing. One of the environmental factors realized as strategically important by the faculty group who precipitated this change was the need to free them and their graduate students from technology exploration, transfer and support so they could focus on doing the best possible science.

Within the central computing organization we have traditionally allocated a certain portion of our computing dollars for "research computing" and have used those funds to purchase large, general purpose shared resources (e.g. CPUs, rotating storage, and backup devices). Selection of these resources has either been left to the computing center or made by committee based on recommendations from the computing center. These resources are used extensively and are under constant demand to expand. This demand cannot always be met because of the need to develop and maintain instructional, administrative and networking facilities and services.

Recently we have been experimenting with a partnership funding model that involves matching priorities between the university's computing organization, the sponsored research office, and faculty members' large scale computing projects. This effort emphasizes the construction of faculty collaborations that require large capital outlays for computing equipment and which could be supported by external granting agencies. Proposals from these collaborations to external agencies are leveraged by a portion of the dollars that would have been spent on general research computing by the computing organization, and supported by the sponsored research unit through additional matching funds as needed to meet granting agency requirements.

If the bid for external funding is successful, we are able to combine this external funding with computing and sponsored research dollars to create a facility for the project which neither the computing organization nor the individual researcher's grants could provide alone. The facility has a clear purpose and an established user base representing a very efficient leveraged use of both computing and research support funds. Since these projects come with a very interested and involved user base, decisions about continuing or expanding facilities are straightforward. If a project has broad enough utility and appeal that it attracts additional external funding, then the central computing organization's financial commitment to the project continues and the facility is maintained and expanded. Otherwise, the computing center's research computing budget is used to leverage other more viable projects.
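
The continue-or-redirect rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any actual Indiana University budgeting procedure: the `Project` fields, the `review_projects` function, the project names, and the dollar amounts are all hypothetical. The sketch keeps the computing organization's commitment in place for projects that attracted additional external funding and totals the dollars freed from the rest.

```python
from dataclasses import dataclass


@dataclass
class Project:
    name: str
    center_commitment: int      # dollars the computing organization has committed
    new_external_funding: int   # additional external dollars attracted this cycle


def review_projects(projects):
    """Apply the continue-or-redirect rule to a list of projects.

    Returns the projects whose central commitment continues and the
    total dollars freed for leveraging other projects.
    """
    continued = [p for p in projects if p.new_external_funding > 0]
    freed = sum(p.center_commitment for p in projects if p.new_external_funding <= 0)
    return continued, freed


# Hypothetical projects and amounts, for illustration only.
projects = [
    Project("distributed memory facility", 150_000, 400_000),
    Project("visualization lab", 80_000, 0),
]
continued, freed = review_projects(projects)
print([p.name for p in continued], freed)
```

In practice the decision would also weigh the breadth of the user base and the project's utility, as the text notes; the single numeric test here is a deliberate simplification.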

Vendor partnerships, cooperative research and development arrangements, cooperative pricing, etc. can be a powerful addition to university/government research grant arrangements. Most federal grant programs look favorably on projects that involve academia/government/industry partnerships. Having established relationships with vendors for this purpose can be extremely useful to support these collaborations. We will leave it as an assumption of the model that the computing organization and sponsored research office are necessary and sufficient to partner with faculty groups to fund, build and run research computing facilities.

The concept of partnership at many levels is key to the success of this model. Faculty must collaborate to form a joint set of capital equipment needs around a set of externally fundable projects. This faculty collaboration then partners with the computing organization and sponsored research unit to create a package that is attractive to external funding agencies based on matching funds and other collateral support. Issues related to operational and administrative support, and end-user consulting are worked out in the context of the intramural partnership, with the supplier of these functions in full agreement with the users as to what services will be provided and how they will be paid for. If the project involves development or expansion of the facility over time, then partnerships with vendors are essential, particularly to ensure the level of technical support required to make the project succeed and to off-load some of the local support burden. With a successful project comes a built-in expectation that future partnerships can be made.

The advantages of this model over the other types of arrangements for supporting research computing mentioned earlier are significant in the following ways:

Leverage. Every funding partner's investment in the project is leveraged and the value delivered is greater than any single partner's contribution. This is particularly true when external grant funding can be obtained by using the pooled resources as a university matching contribution.

Buy-in. The faculty group, computing center, and sponsored research office should already have some level of bias toward collaboration. The possibility of participating in a high visibility project may exploit this bias. Vendors of high end computing and communications equipment usually have mechanisms to partner with universities and a willingness to engage in high visibility projects.

Efficiency. The model is opportunistic in its use of existing resources such as support staff in new and creative ways.

Focus. It produces exactly what facilities and services are needed to support the user group, but only when needed.
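
The leverage advantage listed above can be made concrete with a small calculation. The following Python sketch is illustrative only: the partner names and dollar amounts are hypothetical and are not figures from any Indiana University project, and `leverage_ratios` is a name introduced here. For each funding partner it computes the ratio of the total pooled budget to that partner's own contribution.

```python
def leverage_ratios(contributions):
    """Return total pooled funding and each partner's leverage ratio.

    contributions maps a partner name to the dollars it contributes.
    A partner's leverage ratio is the pooled total divided by its own
    contribution; a ratio above 1 means the facility is worth more than
    that partner could have bought alone.
    """
    total = sum(contributions.values())
    ratios = {
        name: total / amount
        for name, amount in contributions.items()
        if amount > 0  # skip partners contributing only in-kind support
    }
    return total, ratios


# Hypothetical amounts, for illustration only.
example = {
    "computing organization": 300_000,
    "sponsored research office": 200_000,
    "external grant": 500_000,
}
total, ratios = leverage_ratios(example)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
```

With these assumed figures every partner sees a ratio above 1, which is the sense in which each investment is leveraged; the internal partners benefit most when the external grant is large relative to the university match.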

The requirements and potential problems are also numerous. Putting together intra- and then extramural partnerships to support large research computing projects requires a great deal of organization, communication and cooperation between computing center staff, sponsored research officers, and faculty. This raises the nontrivial question of who should lead and with what level of authority. At IU we have found that a viable solution is to have large scale faculty driven research computing projects led by the CIO's office. The gravitas of this office makes arrangements with the sponsored research office straightforward. Other requirements to make the partnership model work include the following:

* Continuous search for external funding for research computing required by some unit in the university

* Long lead times needed to prepare grant proposals

* Long-term, deep commitments by vendors to pricing and support

* External relationships with organizations that can provide expertise by technology transfer such as national labs and computing centers.

Significant problems in a project can occur if external funding is not secured, so an important step in the development of the partnership is to determine what will happen if certain conditions aren't met. A clear, shared road map is essential both for the project at hand and to preserve good will in case of partial failure.

Another problem arises from failing to keep up "momentum" in developing facility requirements for joint faculty research computing projects, and in identifying possible external funding opportunities to match with these projects. A third problem stems from the need to maintain a consistent vision in the face of too many opportunities. A shared vision among partners of what research computing facilities should be developed is needed and each new opportunity must be tested against the goals of that shared vision. At Indiana University this shared vision is based on a set of attributes for research computing facilities developed jointly by the computing center and faculty. We support both shared and distributed memory high performance multiprocessors, provide mass store for all research machines, and support a global file system.

Case studies at Indiana University

Several cases serve to illustrate the funding models for research computing outlined above and demonstrate how we arrived at the partnership model. As we worked on the projects listed below, positive relationships between research computing faculty, the computer center, and the CIO's office grew beyond the usual user/computer center basis. We believe the sense of collaboration which grew up among these groups contributed in a central way to the success of the latest project.

Case 1: Baseline academic research computing through the computing center.

The Research and Academic Computing Group within the computing center provides general purpose cycles to the university community for instruction and research. In the 80's these were VAX/VMS cycles, but during a transition in the early 90's these became "commodity" Unix cycles. Some provisions are made for the installation of packages for instruction or research, but in general, support is provided through on-line help and phone consulting. The emphasis is on vanilla Unix computing. Support for non-standard software is focused on the individual or group who requested its installation. Baseline funding in the computing center's budget provides and maintains these facilities, and oversight is through a broad-based faculty committee which represents a cross section of the university computing community. The Research and Academic Computing Group currently manages nearly 40,000 accounts in this service and support environment.

Case 2: An intramural partnership for exceptional computing needs.

Clearly the previous model does not support the exceptional computing needs of scientists. In the early 90's a small group of faculty from several science departments approached the technology exploration group in the office of the Dean for Academic Computing with an opportunity to seed a high performance computing project that had significant potential to generate further grant opportunities. The project involved acquiring, installing and running a high-end distributed memory machine that would support a range of computational science projects in computer science, physics, geophysics, and astronomy. The novel feature of this opportunity was that by acquiring the high performance computing facility, members of the small faculty group would be able to make significant contributions to the national collaborations in which each participated, thus bringing attention to the university as a center for computational excellence in a number of disparate fields.

The faculty group was able to enlist the support of the computing center, the technology exploration group, and the sponsored research office. A vendor was selected based both on the type of work to be done and on the vendor's willingness to view the project as an R&D partnership with the faculty user group and the university. Funding was provided through a three way venture between the faculty group's dean, the computing center, and the sponsored research office with the understanding that the project had significant potential to generate future grant income. The computing center housed the acquisition and provided operational support. The technology exploration unit funded personnel to provide the in-depth consulting expertise needed for users to realize the full potential of the machine, since its characteristics fell outside the norm of a vanilla Unix programming environment.

The project was a success in terms of the science done and served as the basis for several large grants for related work. The initial funding and support commitments for the project required something of a leap of faith by the supporting partners. Fortunately, the principal scientists in the research group were already top producers in their fields and considerable "chemistry" existed in the group as a whole.

Case 3: Developing a partnership for research computing with intramural and extramural components.

Encouraged by the results of the smaller intramural partnership, the core group of three faculty researchers who participated in Case 2 enlisted other researchers. The group grew to fifteen researchers in eight departments with a collective set of needs that fit neatly with the aims of a computational science program in the National Science Foundation. The basic objective of the project was to build a computing environment that "enables research advances in computational astronomy, chemistry, geology, mathematics, physics, and lighting design by providing an infrastructure that allows application scientists to move their computations from PCs and workstations up to geographically distributed arrays of supercomputers without massive investments in programming time at the upper end." The facility to be developed included a high performance computer capable of producing world class results in the researchers' fields. Furthermore, the current project should set the stage for future sponsored research opportunities that would be needed to upgrade and expand the facility. The alliance should be developed as a joint venture, with the expectation that the project would not be a one-time deal, but the basis for an ongoing concern.

Through the combined efforts of a faculty member and staff in the CIO's office a substantial joint venture was assembled which included: the research programs of the fifteen faculty members; a proposal for funding to the NSF; agreements for matching funds from the CIO's research computing budget and sponsored research office; and agreements to provide environmental, operational and consulting support from the computing center. The proposed facility, expected to be the largest shared memory machine of its kind at a university, would move the computing center's largest users off existing central Unix machines, freeing up considerable resources and taking some pressure off the need to expand the baseline Unix research facilities.

The following actions were taken in the process of developing the partnership and proposal:

* A set of computing needs associated with a group of research programs was identified by a faculty member, who brought the needs forward to the CIO's staff. An expectation that this would produce results had already been established by the project mentioned in Case 2. Also established was knowledge that the principals involved in the effort have compatible aims and interests.

* An external funding opportunity that would provide capital to meet the needs was identified.

* Commitments for matching support were obtained from the CIO's office and the sponsored research office with the recommendation of the CIO.

* The faculty group prepared the research portion of the proposal. The computing center and CIO's staff developed a support plan that included having the computing center house and maintain the facility. The support plan included detailed assessments of staff time and physical resources needed to accomplish the goals of administering the system and providing specialized consulting to the user base about its use. The latter were put up by the computing center as its contribution to the project.

* A Request for Proposals (RFP) based on requirements prepared by the faculty research group and the computing center support team was written, then issued by the university.

* Responses to the RFP were jointly evaluated by both faculty and support teams and a vendor was selected that met the criteria of both groups. Beyond basic suitability for the proposed research and cost, key elements considered were the vendor's willingness to partner with the university to provide end-user training and consulting and to extend the pricing structure for this acquisition to cover future enhancements to the system.

* The budget was prepared and matching funding requirements were agreed to by the CIO and sponsored research office.

* The proposal was submitted; then it was funded.

* A joint faculty/computing center implementation team developed operating procedures and policies, then worked with the vendor to install the system and inaugurate the service.

* A support team consisting of computing center personnel and vendor employees operates and maintains the facility. Feedback from the faculty research team is used to fine tune operational policies and procedures.

Conclusions

The entire process in Case 3, from collecting requirements to starting service, took approximately a year. Despite the long lead time, building a partnership between users, local support staff, internal and external funding entities, and vendors resulted in a facility that was much more capable than what could have been obtained through the principal researchers' efforts alone. In addition, the overall quality of the resulting facility, in terms of accessibility, usability and expandability, is greatly improved over a single-investigator or small group departmental effort.

Other subsequent partnerships in research computing have resulted in a nascent high performance computing strategy for the university, acquisition of a third supercomputer class machine, and connection to the NSF's vBNS high speed research network. Extended research efforts in computing and networking between IU and national high performance computing laboratories have proven to be a valuable source of technology transfer into the university. These connections to a larger research computing community have yielded valuable insight into how the partnership model may be extended to include other institutions in a regular manner.

