Motivation

Since the debut of the first usable computer almost half a century ago, the world has witnessed dramatic improvements in computer and communications technologies. In fact, just within the past decade, networks of workstations with bit-mapped displays have replaced large time-shared mainframes as the typical computing environment. User interfaces are no longer strictly text-oriented, but generally contain a great deal of graphics, audio, and video. Memory and secondary storage capacities of machines, as well as network bandwidths, are vastly greater than those of ten years ago. We conjecture that this trend of technological advances will continue for the next decade, and that network bandwidths will reach or exceed several gigabytes/second, processors in small computers will run at clock rates in excess of several hundred MHz, and both primary and secondary storage will be available in quantities many orders of magnitude larger than what is available today.

We envision that ten years from now most homes will be equipped with powerful, inexpensive machines that will be as integral a part of our lives as present-day telephones and televisions. Similarly, a global computing network will have evolved to the point that virtually all machines throughout the developed world will be linked by high-speed optical networks, and users will communicate with each other and access essentially all of their desired information over such a network. The types of applications supported in such an infrastructure will become far more ambitious as the power of the machines and the bandwidth of the communication links increase manyfold.

These technological advances will profoundly affect the traditional views of operating and database systems, network protocols, and even data acquisition and processing techniques, thereby requiring the computer science community to rethink its major research directions. The theoretical foundations that were adequate for uniprocessors or a small number of tightly coupled multiprocessors (semantics based on the "state" of a system, or a complexity theory based on lock-step synchronization of multiple processors) will have to be revised to account for large ensembles of processors communicating asynchronously to solve a single problem. To identify the research directions precisely, we propose a gross categorization of the applications into three major classes:

Class I - information retrieval: Applications in this class will support efficient navigation and browsing through large-scale data repositories, as well as retrieving information objects for clients over networks.
Examples: digital libraries of satellite imagery, databases of patient records.
Class II - information retrieval and processing: Applications in this class will be founded on methods for synthesizing meaningful information from large, complex data sets (containing textual, numeric, and image data) through simulations, and then visualizing the results through the use of interactive graphics and imagery.
Examples: scientific visualization, production planning in manufacturing systems.
Class III - information retrieval and processing with real-time interactivity: Applications in this class can be characterized by the need to carry out distributed simulations, often involving real-time constraints and user interaction within realistic virtual environments.
Examples: education and training systems for firefighters and natural disaster rescue teams, battlefield command and control systems.

As may be evident, Class I applications are characterized by the need to support storage and retrieval of massive amounts of data in multiple forms (text, image, video, animation). The processing requirements, however, may be quite elementary. Class II applications, on the other hand, combine a massive amount of computation with information storage and retrieval, and present the results to the user in a form in which they can be easily visualized. Typically, applications in this class do not need real-time response, nor is real-time interaction with the user a prerequisite (though achieving reasonable response times may often be required to facilitate scientific experimentation). Class III applications, in addition to requiring support for retrieving and/or synthesizing information objects, impose real-time constraints on computation and communication.

We believe that the above classification characterizes the evolution of applications that will become pervasive in the information infrastructures of the future. Moreover, at the core of these application classes lies fertile ground for research and development, ranging from the design of algorithms (for efficient storage and retrieval, data compression, information transport, network management, etc.) to language and operating system design, and tools and methodologies for application development. Consequently, attaining a breakthrough in designing and evaluating system architectures for information management systems of the future will require orchestrating a coherent and comprehensive effort that integrates expertise in information storage and retrieval, database systems, computer graphics, parallel computing, interactive and real-time distributed simulations, networking protocols, and systems engineering. The synergy among the research interests of the participants in this proposal provides an excellent opportunity for pursuing this research in the Department of Computer Sciences at the University of Texas at Austin.

Questions or comments to <cise@cs.utexas.edu>