This article was published in CAUSE/EFFECT journal, Volume 22 Number 4 1999. The copyright is shared by EDUCAUSE and the author. See http://www.educause.edu/copyright for additional copyright information.
Middleware: The Second Layer of IT Infrastructure
by Kenneth J. Klingenstein

A second wave of network invention and deployment has begun. Riding on top of the physical Internet protocol (IP) connectivity that drove the first generation of networking, this effort adds "logical" connectivity to the networked infrastructure. Users connecting to this network will find new services that provide them a persistent presence, dispense personal agents to do work, allow complete mobility and security, and enable remarkable applications. This second wave has the potential to be as consequential as the first layer of internetworking has proven to be.
The term "middleware" is defined by one's point of view. Many interesting categorizations exist, all centered around sets of tools and data that help applications use networked resources and services. Some tools, such as authentication and directories, are in all definitions of middleware. Other services, such as coscheduling of networked resources, secure multicast, object brokering, and messaging, are the particular interests of certain communities, such as scientific researchers or business systems vendors. This breadth of meaning is reflected in the following working definition: Middleware is "the intersection of the stuff that network engineers don't want to do with the stuff that applications developers don't want to do."
Middleware has emerged as a critical second layer of an enterprise IT infrastructure, sitting on top of the network level. The need for middleware stems from the rapid growth in the number of applications, in the customizations within those applications, and in the variety of locations from which we access those applications. These and other factors now require that a set of basic data and services be moved from their multiple instances within individual applications or network elements into a centralized institutional offering. This central provision of service eases application development, increases robustness, assists data management, helps users deal with complexity, and provides overall operating efficiencies.
Interoperable middleware between organizations is a particular need of higher education. Researchers need to have their local middleware work with that operated by national scientific resources such as supercomputing centers, scholarly databases, and federal labs. Core administrative systems can exchange common information and access through middleware. Many of the advanced applications that will transform instructional processes depend on middleware to function. Significantly, the fact that higher education is fractal in structure creates markets that need interoperable standards and products. The result is a heterogeneous and dynamic environment that often presages how technologies will be used in the wider world. As a Microsoft leader said recently, "We've learned the strategic importance of testing our products in higher education. If they work there, they can work anywhere."
Taxonomy
A rough taxonomy of the components of middleware can be drawn. At the center is a set of core functionalities that seem to be required by the rest of middleware services. Those functionalities include identifiers, authentication, directory services, and authorization services. The challenges in providing these services are as much political as they are technical; many of the hardest issues involve the ownership and management of data in the complex world of higher education.
Identifiers. An identifier is generally a character string that maps a real-world subject to a set of computerized data. E-mail addresses, netIDs, LAN accounts, and bar codes are examples of identifiers. Identifiers were simple when each person had exactly one. Now identifiers apply beyond people to objects such as printers and applications and to groups of subjects. Typically a real-world subject will have several identifiers. Thus the relationships among a subject's identifiers and the policies associated with the assignment of identifiers become important issues.
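The one-subject, many-identifiers relationship can be pictured as a small registry sketch. This is an illustrative model only; the subject, identifier types, and values below are invented, and a real institutional registry would sit in a database or directory rather than in memory.

```python
# Hypothetical sketch: one real-world subject carrying several identifiers.
# All names and identifier types here are illustrative, not institutional.

class Subject:
    def __init__(self, registry_id):
        self.registry_id = registry_id   # internal key, never reassigned
        self.identifiers = {}            # identifier type -> value

    def assign(self, id_type, value):
        """Policy decisions (uniqueness, reassignment) would be enforced here."""
        self.identifiers[id_type] = value

    def lookup(self, id_type):
        return self.identifiers.get(id_type)

person = Subject("S-000123")
person.assign("netID", "jdoe")
person.assign("email", "[email protected]")
person.assign("barcode", "0042-7719")

print(person.lookup("email"))   # the subject's e-mail identifier
```

Keeping an internal registry key distinct from the visible identifiers is one way to let any single identifier change without severing the subject's other mappings.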
Authentication. Authentication is the process of verifying a legitimate use of an identifier. The traditional approach of login and clear text password is far too insecure and inflexible for the variety of ways that clients need to authenticate to servers. LDAP-based tools, Kerberos, and X.509 are modern alternatives.
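The weakness of clear text passwords can be illustrated with a toy challenge-response exchange, in which the client proves knowledge of a secret without ever transmitting it. This is a deliberately simplified sketch, not how Kerberos, LDAP, or X.509 actually work; the password and scheme are invented for illustration.

```python
# Toy challenge-response sketch (illustrative only; production systems use
# Kerberos, LDAP-based binds, or X.509, not this hand-rolled scheme).
import hashlib
import secrets

# The server stores a digest of the password, never the clear text.
stored_secret = hashlib.sha256(b"correct horse").hexdigest()

def challenge():
    """Server issues a fresh random nonce for each authentication attempt."""
    return secrets.token_hex(16)

def respond(password, nonce):
    """Client combines its secret with the nonce; the password never
    crosses the wire in the clear."""
    h = hashlib.sha256(password.encode()).hexdigest()
    return hashlib.sha256((h + nonce).encode()).hexdigest()

def verify(response, nonce):
    expected = hashlib.sha256((stored_secret + nonce).encode()).hexdigest()
    return response == expected

n = challenge()
print(verify(respond("correct horse", n), n))   # True for the right secret
```

Because the nonce changes on every attempt, a captured response cannot simply be replayed, which is one of the flexibility gains over a static clear text login.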
Directory services. Much of the information about real-world subjects needs to be contained in a general-purpose, high-performance directory server that can respond to application or network requests for information. There are substantial technical and political issues in the development and operation of a directory service. Technically, determination of the elements of the directory (the schema), the ways of addressing the elements (the namespace), and operational issues such as replication and partitioning need to be considered. Applications must be reengineered to use the directory. Policy issues include ownership of data, feeds into and out of the directory, and setting permissions to read and write data.
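The schema/namespace distinction can be made concrete with a minimal in-memory sketch. This is an assumption-laden toy, not LDAP: the distinguished-name-style keys, attribute names, and entry values are all invented, and a real service would also handle replication, partitioning, and access control.

```python
# Minimal directory sketch. A deployed service would be an LDAP server;
# the schema, attributes, and namespace below are purely illustrative.

directory = {
    # Namespace: entries are addressed by distinguished-name-like keys.
    "uid=jdoe,ou=people,dc=example,dc=edu": {
        # Schema: the set of attributes an entry of this type may carry.
        "cn": "Jane Doe",
        "mail": "[email protected]",
        "telephoneNumber": "+1 303 555 0100",
    },
}

def search(base, attribute):
    """Answer an application's request for one attribute of one entry.
    Read permissions on the attribute would be checked here."""
    entry = directory.get(base)
    return entry.get(attribute) if entry else None

print(search("uid=jdoe,ou=people,dc=example,dc=edu", "mail"))
```

Reengineering an application to use the directory amounts to replacing its private copy of such data with calls like `search`, which is where the central-provision efficiencies come from.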
Authorization services. An important subset of the information about a real-world subject is what it is permitted to do. Authorization can range from allowing access to refined controls of a remote electron microscope to permissions to place purchase orders at a specified level on an institutional account. Defining these rules, including means to delegate or reassign authority on a temporary basis as well as delivering this information to applications, is one of the challenges in this newly emergent area.
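The purchase-order example, with temporary delegation, can be sketched as a small rule set. Everything here is hypothetical: the users, the dollar limit, and the day-numbered expiry are invented stand-ins for the institutional policies the article describes.

```python
# Illustrative authorization sketch with temporary delegation; the rules,
# names, and limits are hypothetical, not drawn from any real system.

permissions = {
    "jdoe": {("purchase_order", 5000)},   # may place POs up to $5,000
}
delegations = {}                          # delegate -> (grantor, expiry day)

def delegate(grantor, delegate_to, until_day):
    """Temporarily reassign the grantor's authority to another identifier."""
    delegations[delegate_to] = (grantor, until_day)

def may_purchase(user, amount, today):
    grants = set(permissions.get(user, set()))
    held = delegations.get(user)
    if held and today <= held[1]:         # delegation still in effect?
        grants |= permissions.get(held[0], set())
    return any(action == "purchase_order" and amount <= limit
               for action, limit in grants)

delegate("jdoe", "asmith", until_day=30)
print(may_purchase("asmith", 2500, today=10))   # True while delegated
print(may_purchase("asmith", 2500, today=45))   # False after expiry
```

Even this toy shows why delivery to applications is hard: every application that honors purchase orders must evaluate the same delegation rules consistently, which argues for a central authorization service.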
At the boundary of the network and middleware layers lie a number of services that may well be classified as "middleware-based networking" or "network-oriented middleware." Such services include (1) secure multicast, the ability to extend fledgling multicast efforts to permit secure, network-layer control over who may join a multicast session, and (2) bandwidth brokering, a service that securely allocates quality of service to various applications and users within an institution or organization. Typically, these services require, in turn, core middleware such as identifiers, authentication, and directories to operate.
Building on the core middleware services are a number of burgeoning sectors of application-oriented middleware. A rough grouping of such middleware would include:
Support for instructional computing. Higher education needs a variety of open protocols and implementations that allow students to access their bookmarks and aliases from any location, as well as ways for institutional and multiorganizational file systems to enable sharing and collaboration. Similarly, the metadefinitions of the Instructional Management System (http://www.ims.org) will permit educational software to interoperate and work off of a shared base of data.
Support for research computing. Efforts are under way to transform scattered national computational resources into a coherent grid (see http://www.globus.org/), providing researchers consistent access across a variety of architectures; permitting coscheduling of resources; and coupling data, networking, and computing together.
Support for administrative computing. The new generations of business systems have loosely coupled components that depend on a common applications infrastructure, which provides services such as object brokering for component requests, message handling between components, and monitoring of transactions.
Designing and deploying core middleware
The deployment of identifiers, authentication, and directories for an institution should be done as a coherent fabric that spans the institution and interoperates with similar deployments at other institutions. The institutional fabric will likely not be the only instance of middleware on campus. When departments, for unique needs, construct their own identifiers and directories, the institutional infrastructure will serve to anchor, populate, and integrate local services. While interoperability of these components between campuses is not a strict requirement, the benefits of a consistent national approach are compelling. Inter-institutional resource sharing of libraries, scientific instruments, and administrative applications will be enabled through a consistent national deployment.
Design is complicated by the relationships among the core middleware components. The process must include not only competing technical goals, but also policy goals and campus politics. Tradeoffs arise from the relationships within and among identifiers, authentication, and directories.
Some identifiers have associated authentication mechanisms. For example, the "netID" category of identifier usually has a password authentication process. There are also mappings between identifiers; given one's netID identifier, it is often necessary to get the associated e-mail address of the netID holder, or perhaps his or her LAN account name. And increasingly there are mappings between authentication processes. For example, one may want to use a Kerberos password to move an X.509 key from a central store to a local machine. Similarly, there are connections between directories and identifiers/authentication. Identifiers serve as the primary indices into directories. Directories need security to permit and protect user access and hence require some form of authentication process themselves.
These relationships fundamentally compound design issues. For example, locally unique identifiers can be problematic in global directory schema. It is desirable that a compromise of the security system of the directory not compromise other authentication processes. Identifiers should be chosen to facilitate authentication mechanisms. If two identifiers are linked (say an e-mail address and an account name), then the corresponding authentication processes (an authenticated e-mail transmission and an account password) should be consistent, in operation and in policy. This tangle of issues suggests the need for a holistic approach to the design of a coherent set of core enterprise middleware.
Still, with solid up-front thought and strong political leadership, the deployment of a robust and extensible service offering in identifiers, authentication, and directories is fairly straightforward. Capital costs are not overwhelming as they may be for physical networks. Operational costs are also reasonable, with the prospects for real savings over the distributed expenditures that would occur without an enterprise service.1
The holy grail
While the benefits of identity management, secure authentication, and comprehensive directories are substantial in their own right, they are also the leverage to apply against the hardest, and potentially the most rewarding, element of core middleware: authorization services. Authorization will be the basis of workflow, drive permissions for accessing networked resources, allow us to control and delegate electronic responsibilities, and serve as the basis for future administrative applications. Through authorization, researchers will have access to scientific resources and advanced networking features. It will allow us to convert our complex legal policies into automated systems in an easily scalable fashion. But while the steps to deploy identifiers, authentication, and directories are understood today, the approaches to authorization are much less clear.
Typically, authorization indicates what operations an identifier, properly authenticated, is permitted to do with a networked object or resource. At its simplest, authorization is the next generation of "ACLs," the read/write/execute controls that are embedded in file systems. As the sophistication and variety of networked resources increase, so too must the authorization mechanisms. There are many challenges associated with authorization, including:
- Where to store the privileges or authorization characteristics
- How to transport those characteristics from the storage location to applications that need them
- How to ensure consistent meaning and validity to values associated with those privileges
- How to effectively express the sophisticated and diverse characteristics implicit in policies and trust in a machine-processable list of attributes
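The last challenge above, expressing policy as a machine-processable list of attributes, can be sketched in miniature. The attribute names, values, and resource rule below are invented for illustration; real attribute schemas would be standardized across the institutions that share the resource.

```python
# Hedged sketch: a policy rendered as machine-processable attributes.
# Attribute names and the microscope rule are hypothetical examples.

attributes = {
    "affiliation": "faculty",
    "department": "physics",
    "instrument-access": ["electron-microscope"],
}

def authorized(attrs):
    """Resource-side rule: who may operate the remote microscope.
    Consistent meaning of each attribute's values is assumed, which is
    exactly the inter-institutional agreement the article calls hard."""
    return ("electron-microscope" in attrs.get("instrument-access", [])
            and attrs.get("affiliation") in ("faculty", "staff"))

print(authorized(attributes))   # True for this attribute set
```

Note that the hard questions from the list (where these attributes are stored, how they travel to the resource, and how their values stay consistent) are all outside this snippet; the snippet only shows the final, easy step of evaluating them.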
Several possible vehicles exist to address the above challenges. Directories can store authorization information. Within authentication processes, one can carry authorization information. Identifiers may be extended to contain authorization. Stand-alone databases can supply additional information. There will be much work done in the near future to build a greater understanding of which tools are best suited to address particular authorization requirements.
The road ahead
We are starting the next stage of building the national "information superhighway." While there are many similarities to the process of constructing national IP connectivity, there are distinctions as well, reflecting the greater policy issues that arise as we move from automating physical systems to automating human systems. In the first stage, we put machines on the network. Now we are putting people on the Net. The work is consequential. There are payoffs (such as enabling remarkable new applications) and pitfalls (from protection of privacy to loss of perspectives). With excitement, and care, we begin.
Endnote
1 For information about the Internet2 Middleware Initiative, see http://www.internet2.edu/middleware/; for information about Net@EDU's PKI Working Group, see http://www.educause.edu/netatedu/.
Kenneth J. Klingenstein ([email protected]) is project director, Internet2 Middleware Initiative, and chief technologist at the University of Colorado at Boulder.