To achieve this goal, we propose to design and implement a multimedia server for the storage and retrieval of objects from disk at multiple resolution levels. We also plan to extend conventional database technology (both relational and object-oriented) to support multiresolution data types. Together, such multimedia storage servers and the multiresolution data model will lay the foundation for future information archival and management systems.
A multiresolution information system will be at the core of a wide variety of applications, whose users will range from scientists knowledgeable about the data to end users who are neither experts in the data they require nor experienced at obtaining such information from distributed sources over a network. Enabling a novice user to operate a system of this sophistication requires major advances in query processing techniques. Most conventional query interpreters are quite rigid; even small mismatches between the terms of a query and the relations of the database can cause the query to fail. Although ``content-based accessing'' can use knowledge bases of domain terms to reformulate queries, building these knowledge bases efficiently is problematic. Query interpreters would be more flexible if they processed natural language; although natural-language parsers have been built for portions of English, however, current methods are extremely inefficient and error-prone. In what follows, we discuss the specific ideas to be investigated in this component of our research.
Multiresolution Information Management
In databases containing multimedia data, such as the image data collected by NASA's Earth Observing System, the enormous size of many individual data items makes it computationally infeasible to examine each data item returned by a query in detail. Systematic support for representing such voluminous data at various levels of detail is one powerful approach for alleviating this problem. We call systems that provide such support multiresolution or MR systems.
In general, if X is a low resolution version of Y, then X requires less storage space, less computation for processing, and less network bandwidth for transmission than Y. Multiresolution systems exploit this relationship between multiple resolution levels of data, improving performance by processing lower-resolution data whenever possible. In most conventional information retrieval systems, a query has a single, well-defined result. In the multiresolution model, by contrast, a query can elicit responses at various resolutions, or even a sequence of responses at increasing resolutions, each of which approximates the final result of the query and improves upon the previous response.
The meaning of data, and hence the notion of resolution, is always application-dependent. For example, consider the use of a histogram as a low resolution version of a set. The number of bins determines the resolution of the histogram. When there are few bins and many data objects are lumped together, the histogram is a low resolution version of the set. When there are many bins, the histogram provides a great deal of useful information about the set, and hence constitutes a higher resolution version. At the highest possible resolution level, each bin contains exactly one object, thereby completely defining the membership of the set. Images can also be encoded at various resolution levels, for instance along the chroma (i.e., bits per pixel) or spatial (i.e., pixels per image) dimensions. In general, multiresolution data can be viewed as a partially ordered set of total (full resolution) and partial (lower resolution) data elements describing some domain at varying levels of detail.
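To make the histogram example concrete, the following sketch (ours, purely illustrative) summarizes a single set at several resolutions by varying the bin count; once each bin holds at most one element, the histogram pins down the membership of the set exactly.

\begin{verbatim}
# Sketch: a histogram as a low-resolution view of a set of reals.
# The bin count acts as the resolution parameter.

def histogram(data, num_bins, lo, hi):
    """Summarize `data` (reals in [lo, hi)) with `num_bins` equal bins."""
    width = (hi - lo) / num_bins
    bins = [0] * num_bins
    for x in data:
        i = min(int((x - lo) / width), num_bins - 1)
        bins[i] += 1
    return bins

data = [1.2, 2.7, 3.1, 4.4, 5.0, 5.8, 8.9]

for num_bins in (2, 4, 8):          # increasing resolution
    print(num_bins, histogram(data, num_bins, 0.0, 10.0))
# With enough bins that each holds at most one element, the
# histogram defines set membership completely (full resolution).
\end{verbatim}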
In an implementation of a multiresolution database, the database programmer will be required to define domain-specific MR primitive types. Similarly, the system designer will be required to provide application-dependent functions that produce low-resolution data from high-resolution data. A data definition language that supports both the specification of these functions by the database programmer and their use by the DBMS and its query evaluator is important in supporting this capability. Furthermore, the practicality of a multiresolution type is also application-dependent. We are applying our multiresolution model to a variety of application domains, including the storage of EOS data and the development of a synthetic environment database for battlefield simulation exercises.
If DBMSs are to fully address the needs of this growing class of applications, they must be extended to retrieve data at multiple resolutions conveniently. The relational data model, the most commonly used data model in modern commercial database systems, is built from primitive data types, tuples, and sets of tuples (relations). Constructing partial orders based on the information content of primitive types and tuples is straightforward. A similar means of constructing a multiresolution partial order over sets (or relations) is required, because sets are essential to most data models, including the relational model. Our approach to this problem is called the sandbag.
The sandbag approximates a set B of total elements from a primitive data type upon which a total order is defined, by using partial data elements drawn from a partially ordered extension of that type. Each partial data element d is mapped into a range that provides lower and upper bounds on the number of elements in B that are approximated by d. For instance, consider a set B of total elements from the partially ordered domain of real intervals, where an interval [a,b] is said to be above an interval [x,y] if $x \le a$ and $b \le y$. Suppose that the number of elements of B above the partial data element d = [2.5,6.0] is 2. A sandbag describing B might provide a lower bound for d of 1 and an upper bound of 3; these bounds are consistent with B, but they are not the tightest consistent bounds, which are a lower bound of 2 and an upper bound of 2. This capacity for imprecision is essential to the sandbag. We have defined operations for extracting bounds information from a sandbag, a dictionary to support the operations of insertion, deletion, and membership testing, and a means of enumerating the elements contained in the dictionary.
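The following sketch illustrates our reading of the definitions above; the representation and operation names are invented for exposition and are not the actual data structure. Partial elements are real intervals, total elements are points (degenerate intervals), and each stored interval carries possibly loose lower and upper bounds.

\begin{verbatim}
# Sketch: a sandbag over the domain of real intervals.
# An interval [a, b] is "above" [x, y] iff x <= a and b <= y;
# a total element r is the degenerate interval [r, r].

class Sandbag:
    def __init__(self):
        # maps a partial element (x, y) to (lower, upper) bounds on
        # the number of set members above it
        self.entries = {}

    def insert(self, interval, lower, upper):
        assert lower <= upper, "bounds must be consistent"
        self.entries[interval] = (lower, upper)

    def bounds(self, interval):
        """Bounds on |{ b in B : b above interval }|; may be loose."""
        return self.entries.get(interval)

def above(total, interval):
    x, y = interval
    return x <= total <= y

B = [3.0, 4.5, 7.2]                  # the set being approximated
bag = Sandbag()
bag.insert((2.5, 6.0), 1, 3)         # loose but consistent: truth is 2
bag.insert((7.0, 8.0), 1, 1)         # exact: only 7.2 lies above

exact = sum(above(b, (2.5, 6.0)) for b in B)   # -> 2
lo, hi = bag.bounds((2.5, 6.0))                # -> (1, 3)
assert lo <= exact <= hi
\end{verbatim}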
A relation is a set of tuples. We call a sandbag over multiresolution tuples a multiresolution relation. To extend relational databases to include multiresolution data, we first construct a relational-like algebra by defining analogs for each relational operator that take sandbags as arguments and produce a sandbag as a result. We have defined these operations so that they are strictly more general than, and completely consistent with, the familiar relational operators.
The ideas of multiresolution, progressive refinement, and the sandbag are also applicable to object-oriented data models. We are taking this approach to develop a multiresolution DBMS for the EOS application.
Defining the sandbag construct and developing the multiresolution data model are the first steps in building a complete multiresolution database system. We plan to continue our work by (1) developing and validating the practicality of algorithms for constructing sandbags and the analogs of relational operators for manipulating them; (2) creating a framework for incorporating multiresolution storage structures into a database management system; (3) developing query language extensions to specify resolution/performance tradeoffs; and (4) developing a framework for query optimization that can successfully meet those specifications.
Designing Large-Scale Multimedia Storage Servers
The fundamental problem in developing multimedia storage servers is that images, video, audio, and other similar forms of data differ from numeric data and text in their characteristics (e.g., format and size), and hence require entirely different techniques for their organization and management. Two important issues in designing a large-scale distributed multimedia server are: (1) how should compressed media streams be stored on disk arrays? and (2) how can multiple clients be serviced simultaneously by a multimedia server?
Efficient storage on disk arrays
Most analytical and simulation models for analyzing the performance of synchronous and asynchronous disk arrays have presumed a conventional workload (i.e., successive requests generated by the same process are assumed to be independent of each other). However, since audio and video playback is inherently sequential, the set of disks accessed by a client request is related to the set accessed by the previous request from the same client. Additionally, because clients request different types of media information, the amount of information retrieved from the server may vary from one client to another. We propose to capture these characteristics by developing analytical and simulation models for multidisk multimedia servers, and to answer questions such as: How should audio and video streams be stored on disk arrays? Should the disk array be partitioned into several sub-arrays, each containing a different type of media information?
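As a simple illustration of why the conventional independent-request assumption fails, the sketch below (with an assumed round-robin striping layout and invented parameters) shows how the disks touched by a client's next request are fully determined by those touched by its previous one.

\begin{verbatim}
# Sketch: round-robin striping of a media stream across D disks.
# Successive requests from one client walk the array in order, so
# the disks accessed by request k determine those of request k+1.

D = 8               # disks in the array (assumed)
BLOCK = 64 * 1024   # stripe unit in bytes (assumed)

def disks_for_request(first_block, nbytes):
    """Disks touched when reading `nbytes` starting at `first_block`."""
    nblocks = -(-nbytes // BLOCK)    # ceiling division
    return [(first_block + i) % D for i in range(nblocks)]

# A client playing back sequentially, 192 KB per service round:
pos = 0
for rnd in range(3):
    nbytes = 192 * 1024
    print("round", rnd, "disks", disks_for_request(pos, nbytes))
    pos += -(-nbytes // BLOCK)
# -> [0,1,2], then [3,4,5], then [6,7,0]: the disk set shifts
# deterministically from round to round, anything but independent.
\end{verbatim}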
After developing an adequate model of the storage of heterogeneous objects, we propose to extend it to incorporate multiresolution media streams. By tailoring the retrieval of media streams from disk to the desired resolution level, a multimedia server need transfer only as much data as necessary, thereby improving the utilization of both server and network resources. Recently, Chiueh and Katz have presented a technique for storing, on disk arrays, video streams encoded using multiresolution compression, and their results are encouraging. We expect to extend their technique to the storage of mutually synchronized components of a multimedia object, each of which may be encoded using multiresolution compression techniques. These techniques will exploit the inherent redundancy in multiresolution-encoded media streams to provide continuous service even in the presence of isolated disk failures.
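The following sketch conveys the flavor of resolution-aware retrieval under a hypothetical layered layout (not Chiueh and Katz's actual scheme): each frame is stored as subband layers, and a request at resolution r reads only the first r+1 layers from disk.

\begin{verbatim}
# Sketch: layered storage of a multiresolution stream. Each frame is
# stored as subband layers 0..L-1; layer 0 is the coarsest. A client
# at resolution r needs only layers 0..r, so less data leaves the disk.

LAYER_BYTES = [8_000, 16_000, 40_000]   # per-frame layer sizes (assumed)

def bytes_to_read(resolution):
    """Bytes fetched per frame when serving levels 0..resolution."""
    return sum(LAYER_BYTES[: resolution + 1])

for r in range(len(LAYER_BYTES)):
    print("resolution", r, "->", bytes_to_read(r), "bytes/frame")
# A hand-held client at resolution 0 costs 8,000 bytes/frame of disk
# and network bandwidth; a workstation at full resolution costs 64,000.
\end{verbatim}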
Finally, since one of the key requirements of a multimedia server is availability, information objects may be judiciously replicated and distributed among several storage servers. Such replication schemes will not only provide high reliability, but will also promote responsive access. For each information object, the precise replication and distribution strategy will depend on the storage costs at the servers (which may differ from one server to another), the cost of transmission on the network (which may depend on the network configuration), the workloads at the servers and the network, and the pattern of user requests. We plan to investigate this dynamic optimization problem (i.e., determining when, where, and for how long an information object should be stored) so as to distribute the workload uniformly among the storage servers. Given such a replication and distribution strategy, fault tolerance and responsive access will be achieved by developing intelligent directory services which, given the current state of the system (i.e., the load and the availability of servers and network), can determine the optimal set of servers from which the requested information objects should be retrieved.
Algorithms for providing real-time guarantees to clients
Digital audio and video streams convey appropriate meaning only when presented continuously in time. Therefore, a multimedia server must ensure that they are retrieved from disk at their real-time playback rates. Based on their requirements, clients of a multimedia server can be classified as either intolerant or tolerant. Whereas intolerant clients demand that their playback requirements be strictly met throughout the service period, tolerant clients can accept brief distortions or loss of information (especially if doing so is reflected in a reduced cost of service). All of the existing algorithms for servicing multiple clients simultaneously provide deterministic service guarantees to each client, ensuring that the continuous playback requirements of all clients are strictly met for the entire service duration. The worst-case assumptions about the overhead of retrieving a sequence of media blocks from disk that characterize such deterministic admission control algorithms, however, may needlessly constrain the number of clients serviced simultaneously, leading to severe under-utilization of server resources. Hence, to enhance the scalability of the storage server, a multimedia server must employ an admission control algorithm that optimizes the utilization of server resources by exploiting the statistical variation in the access times of media blocks from disk, as well as the variation in playback rate requirements induced by variable-rate compression techniques.
As a first step towards achieving this goal, we are developing an observation-based admission control algorithm, in which a new client will be admitted for service only if extrapolation from the current measurements of the server's performance indicates that the service requirements of all the clients can be met satisfactorily. A multimedia server that employs such an observation-based approach is said to provide predictive service guarantees to clients. Our preliminary analysis of such an approach has indicated that, compared to its deterministic counterpart, the observation-based admission control algorithm can serve twice as many clients simultaneously. Encouraged by these results, we propose to explore the continuum between deterministic algorithms (which provide strict performance guarantees) and observation-based algorithms (which offer fairly reliable service, but no absolute guarantees), and develop admission control algorithms that provide statistical service guarantees to each client. Furthermore, we propose to develop efficient disk scheduling algorithms that minimize the average access time of media blocks from disk and algorithms for enforcing the quality of service requirements of various clients.
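A minimal sketch of the observation-based idea, with the extrapolation rule, safety margin, and all timing figures invented for illustration: a new client is admitted only if the measured, rather than the worst-case, per-client retrieval time, extrapolated to one additional client, still fits within the service round.

\begin{verbatim}
# Sketch: deterministic vs. observation-based admission control.
# A service round of length T must cover one block retrieval per client.

T = 1.0                    # round length in seconds (assumed)
WORST_CASE = 0.05          # worst-case seek+rotate+transfer per block
measured = [0.021, 0.024, 0.019, 0.023]   # recent observed times

def admit_deterministic(nclients):
    return (nclients + 1) * WORST_CASE <= T

def admit_observed(nclients, safety=1.2):
    # extrapolate from measurements, with a safety margin (assumed)
    avg = sum(measured) / len(measured)
    return (nclients + 1) * avg * safety <= T

n = 0
while admit_deterministic(n):
    n += 1
print("deterministic admits", n)       # -> 20 clients
m = 0
while admit_observed(m):
    m += 1
print("observation-based admits", m)   # -> 38, roughly twice as many
\end{verbatim}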
Whereas the design of transport protocols for textual and numeric information objects has been the focus of research for more than a decade, protocols for digital audio and video communication remain much less well understood; hence, a key objective of our research is to develop protocols optimized for such communication. Additionally, these protocols will address the requirements imposed by heterogeneity in networking environments (ranging from high-bandwidth ATM networks to wireless communication environments) as well as heterogeneity in the computing infrastructure at client sites (ranging from hand-held devices to powerful workstations). The suite of protocols that we propose to develop will be capable of efficiently transmitting high resolution multimedia objects to powerful workstations over high-speed networks as well as low resolution multimedia objects to hand-held devices over wireless networks. These protocols, when coupled with the multiresolution storage of media streams, will enable the server to dynamically trade image resolution for efficient utilization of network resources.
The second main objective of this component of our research is to perform scalability analysis by carrying out extensive performance evaluation of the transport protocols (by assessing the overhead in various parts of kernel and driver code) and the ATM switching hardware. Whereas the former task will assist us in refining the design and implementation of transport protocols, the latter will be used to develop a hardware architecture for next generation switches. In what follows, we discuss the specific ideas to be investigated in this component of our research.
Techniques for Managing Multimedia Communication (Lam, Vin)
Digital video and audio require stringent network performance guarantees (in terms of bandwidth, end-to-end delay, loss, asynchrony, etc.). The mechanisms for providing these guarantees, however, are dependent on the network. In this section, we describe techniques for providing QoS guarantees to each media communication channel over high and low bandwidth networks.
Scheduling and Flow Control in High Bandwidth Networks
In the case of high-bandwidth packet-switching networks, efficient utilization of network resources requires each transmission link to be multiplexed among the many traffic flows belonging to different sessions. However, since multimedia applications impose stringent performance requirements, the network must ensure that this multiplexing of resources does not violate the QoS requirements of any client.
If a packet switched network employs the first-come-first-served (FCFS) scheduling discipline to manage multiplexed flows, the service received by a particular client is necessarily impacted by the behavior of other traffic flows that share the same link. Consequently, the FCFS servicing policy is inadequate for providing QoS guarantees to multimedia applications. To provide some or all of these guarantees without giving up the flexibility of statistical multiplexing, a variety of rate-based service disciplines have been proposed. For a network employing one of the rate-based disciplines to guarantee performance, network users are assumed to generate traffic according to a traffic specification. Each rate-based discipline assumes that the traffic specification is satisfied by every traffic source either voluntarily, or through the use of flow controllers at network entry points. If a source does not satisfy the traffic specification, the network is not obliged to provide the performance guarantees.
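For concreteness, the following sketch shows the flavor of such a flow controller at a network entry point, using a token-bucket traffic specification; the (rate, burst) parameters are illustrative, and a real network would negotiate them per session.

\begin{verbatim}
# Sketch: token-bucket flow control at a network entry point.
# A source conforming to (rate, burst) never outruns the bucket;
# a misbehaving source has its excess packets delayed or dropped.

class TokenBucket:
    def __init__(self, rate, burst):
        self.rate, self.burst = rate, burst   # tokens/sec, max tokens
        self.tokens, self.last = burst, 0.0

    def conforms(self, now, packet_size):
        # refill tokens for the elapsed time, capped at the burst size
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if packet_size <= self.tokens:
            self.tokens -= packet_size
            return True            # within specification: forward
        return False               # violation: delay or drop

tb = TokenBucket(rate=1_000_000, burst=64_000)   # 1 Mb/s, 64 kb burst
print(tb.conforms(0.00, 48_000))   # True: within the initial burst
print(tb.conforms(0.001, 48_000))  # False: bucket not yet refilled
\end{verbatim}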
Our research is concerned with the transport of video, as well as other types of traffic, in packet-switching networks. Our approach is motivated by two recent findings of our research project. First, we studied the characteristics of compressed digital video encoded in accordance with MPEG, a recently published standard of the International Organization for Standardization (ISO). From that study, it is clear that none of the traffic specifications assumed by existing rate-based disciplines is adequate for representing variable bit-rate (VBR) encoded video streams; in particular, the packet generation rate of a VBR video source changes substantially and frequently. Second, we recently discovered and proved a delay guarantee for the Virtual Clock algorithm. It has the desirable property that the guarantee offered to a particular traffic flow is independent of the behavior of the other traffic flows sharing the same server; i.e., it is unaffected by the presence of aggressive or misbehaving sources.
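To fix ideas, here is a simplified sketch of the Virtual Clock discipline to which our delay guarantee applies (per-flow stamping only, omitting packetized-server details; the flows and rates are invented): each arriving packet is stamped, and the link always transmits the queued packet with the smallest stamp.

\begin{verbatim}
import heapq

# Sketch: Virtual Clock scheduling. Each flow i reserves rate_i; packet
# stamps advance by size/rate_i, so a flow that sends faster than its
# reservation only pushes its own stamps into the future -- other
# flows' guarantees are unaffected.

class VirtualClock:
    def __init__(self, rates):
        self.rates = rates                 # flow id -> reserved bytes/sec
        self.vc = {f: 0.0 for f in rates}  # per-flow virtual clock
        self.queue = []                    # (stamp, flow, size)

    def arrive(self, now, flow, size):
        self.vc[flow] = max(now, self.vc[flow]) + size / self.rates[flow]
        heapq.heappush(self.queue, (self.vc[flow], flow, size))

    def transmit(self):
        return heapq.heappop(self.queue)   # smallest stamp first

vc = VirtualClock({"video": 6e6, "data": 1e6})
vc.arrive(0.0, "data", 1500)     # an aggressive source bursts...
vc.arrive(0.0, "data", 1500)
vc.arrive(0.0, "data", 1500)
vc.arrive(0.0, "video", 9000)    # ...but the video flow's stamp is
for _ in range(4):               # unaffected and it is served early
    print(vc.transmit())
\end{verbatim}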
Using these results, we have designed an algorithm, called Burst Scheduling (where a burst refers to a sequence of packets containing an encoded frame), for scheduling the transmission of video flows in a packet switched network. We have also defined an architecture for Burst Scheduling networks. We have shown that such networks provide end-to-end delay guarantees for packets and delay jitter guarantees for bursts. More significantly, these guarantees have the desirable property that guarantees to a video flow are independent of the behavior of other traffic flows sharing the network.
Many problems remain in designing Burst Scheduling networks that run efficiently and reliably while providing performance guarantees. To address these problems, we will investigate: (1) demand assignment protocols for rate reservation at each channel to exploit our new model of video flows; (2) flow control of data traffic between packet switches and between source and destination; (3) buffer management of flow queues for different types of traffic in a packet switch; (4) the interaction between buffer management, flow control, and demand assignment; and (5) refinement of the scheduling algorithm, architecture, and protocols to make Burst Scheduling networks tolerant of more types of failures (e.g., source regulators) and of user misbehavior (e.g., malicious users); some error control is needed to support fault tolerance.
Delivering Multimedia Objects Over Low Bandwidth Networks
In the simplest case, a multimedia object can be presented to a client across a low bandwidth network by transmitting and buffering the entire object at the client site before initiating the presentation. Although relatively straightforward to implement, such a scheme limits the size of the object being delivered to the size of the buffer space available at the client site. Furthermore, since the buffering process itself may be highly time-consuming, such an approach may yield unacceptable initiation latencies.
To address this limitation, we propose to develop techniques for scheduling the transmission of multimedia objects to client sites so as to initiate the presentation as quickly as possible while minimizing the buffer space requirement. Clearly, if the bandwidth available on the network exceeds the peak data rate requirement of an object, the presentation can be initiated without any buffering. However, if the dynamic variation in playback rate requirements, resulting from the structure of the multimedia object, occasionally requires a higher data transfer rate than the available bandwidth, then the server may be required to employ time-shifting policies. Such policies exploit periods of low channel utilization to deliver media information prior to its presentation time. Notice that, whereas compression techniques reduce the peaks in data transfer requirements, time shifting smooths the bit rate requirement of multimedia objects. We propose to develop time-shifting policies that derive transmission schedules ensuring jitter-free presentation while minimizing the buffer space requirement (and hence the initiation latency) at the client sites. We also propose to extend such policies to hypermedia objects, through which users may navigate interactively.
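As a sketch of such a schedule computation (a simplified smoothing calculation over invented frame sizes and link rate), the code below works through each frame's deadline to find the smallest initiation latency that keeps playback jitter-free over a constant-rate link.

\begin{verbatim}
# Sketch: computing the minimum initiation latency for jitter-free
# playback over a constant-rate link, by time-shifting data ahead of
# its presentation time. Sizes and rate are illustrative.

frames = [30, 80, 20, 120, 40, 30]    # KB per frame (variable bit rate)
rate = 50                             # link capacity, KB per frame slot

def min_latency(frames, rate):
    """Smallest startup delay L (in slots) such that, for every k,
    the bytes delivered by frame k's deadline cover frames 0..k."""
    need, total = 0.0, 0
    for k, size in enumerate(frames):
        total += size                 # bytes needed by deadline L + k
        need = max(need, total / rate - k)
    return need

L = min_latency(frames, rate)
print("start playback after", L, "slots of prefetching")   # -> 2.0
# The 120 KB frame exceeds the 50 KB/slot link, but prefetching
# during the low-rate frames smooths the demand out.
\end{verbatim}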
Finally, given the multiresolution nature of the information objects, we will investigate techniques for progressive transmission of multiresolution objects from the server to client sites. These techniques will enable the storage server to trade image resolution for network load dynamically. In fact, we conjecture that such techniques, when coupled with the protocol for negotiating the quality of service desired by each client, will lay a foundation for addressing the inherent heterogeneity in computing and communication networks.
Encoding Techniques and Error Control
In high bandwidth networks (e.g., ATM), multiplexing of channels may occasionally result in loss of packets due to congestion. Thus, an important component of transport protocols for multimedia communication over high bandwidth networks is error control. Because most conventional transport protocols assume that the user traffic must be delivered without loss, they employ Automatic Repeat reQuest (ARQ) techniques (that involve retransmission of lost or corrupted data) for ensuring reliable communication. However, the inherent loss tolerances of audio and video, coupled with their stringent real-time requirements, render ARQ techniques ineffective for digital multimedia communication.
To address this limitation, we propose to develop protocols that avoid retransmissions through proper encoding of multimedia objects. We propose to achieve this goal by encoding multimedia objects so as to minimize the impact of each packet loss, or by employing forward error correction (FEC) techniques that enable the receiver to recover the lost pieces of information at the destination. In the case of video, attaining the former objective will require images to be encoded such that adjacent pixels are distributed among several packets, and transmitted such that packets containing adjacent pixels are spaced out in time. This will involve scrambling an image just prior to compression or transmission. Such techniques will enable a client to reconstruct the image by exploiting the redundancy in the remaining image data, even when a sequence of packets is lost during transmission. Although conceptually elegant, such encoding techniques may decrease the degree of compression, and hence may adversely affect the overall performance of the image transfer protocol. We propose to analyze these antagonistic effects and derive encoding techniques that enhance the loss tolerance of images without incurring any significant degradation in compression efficiency. As a first step towards achieving this goal, we have developed a transmission system, referred to as Dual Stream JPEG (DSJ), that utilizes layered encoding and prioritized transmission techniques to provide resilience to isolated loss of media information, implements post-compression scrambling to distribute the effects of bursty loss, and conforms to the JPEG image compression and ATM communication standards. With only a 5% increase in image size, DSJ achieves barely perceptible image degradation even at loss rates as high as 10%.
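The following toy interleaver (ours; far simpler than DSJ) conveys the scrambling idea: adjacent pixels are spread across packets, so a lost packet removes an evenly dispersed, and hence interpolable, subset of the image rather than a contiguous hole.

\begin{verbatim}
# Sketch: interleaving pixels across packets so a lost packet costs
# scattered pixels (recoverable by interpolation from neighbors)
# rather than a contiguous hole in the image.

def interleave(pixels, npackets):
    """Packet p carries pixels p, p+npackets, p+2*npackets, ..."""
    return [pixels[p::npackets] for p in range(npackets)]

def deinterleave(packets, npackets, total, fill=None):
    out = [fill] * total
    for p, pkt in enumerate(packets):
        if pkt is None:
            continue                   # this packet was lost
        for i, px in enumerate(pkt):
            out[p + i * npackets] = px
    return out

pixels = list(range(16))               # a toy 16-pixel scanline
packets = interleave(pixels, 4)
packets[1] = None                      # lose one packet in transit
print(deinterleave(packets, 4, 16))
# -> [0, None, 2, 3, 4, None, 6, 7, ...]: every lost pixel has intact
# neighbors, so the client can reconstruct a close approximation.
\end{verbatim}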
In addition to such implicit FEC techniques, we propose to investigate explicit FEC techniques, in which redundant information (e.g., parity bits) is transmitted with the original traffic such that the images can be reconstructed at the receiver even if some of the traffic is lost or corrupted by errors. However, for explicit FEC to be efficient, the amount of redundant information transmitted with the original data must be very small: although FEC techniques help recover part of the losses, the additional traffic introduced by the redundant FEC information increases the overall load, and hence may make the loss rate even worse. Furthermore, if losses are highly correlated, FEC will be far less effective than when the losses are evenly dispersed. We propose to investigate explicit FEC techniques for multimedia communication and to develop policies for combining implicit and explicit FEC techniques to enhance the performance of transport protocols.
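A minimal sketch of explicit FEC, using simple XOR parity over groups of k packets (illustrative, not a proposed code): one redundant packet per group lets the receiver reconstruct any single lost packet at a cost of 1/k added traffic, and is defeated when two correlated losses fall in the same group.

\begin{verbatim}
# Sketch: explicit FEC with one XOR parity packet per group of k.
# Any single loss within a group is recoverable; two correlated
# losses in the same group are not.

from functools import reduce

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def add_parity(group):                     # k data packets, equal length
    return group + [reduce(xor, group)]

def recover(received):                     # exactly one entry is None
    lost = received.index(None)
    rest = [p for p in received if p is not None]
    return reduce(xor, rest), lost

group = [b"AAAA", b"BBBB", b"CCCC"]        # k = 3: 33% overhead
sent = add_parity(group)
sent[1] = None                             # one packet lost in transit
data, where = recover(sent)
print(where, data)                         # -> 1 b'BBBB'
\end{verbatim}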
Transport Protocol Design for Multimedia Services (Emerson, Gouda)
Past efforts in transport protocol design have been concerned with reliable delivery of data, reliable protocols for connection management, flow control, and multiplexing. Supporting a wide range of multimedia services, however, will require the development of a richer class of transport protocols. For instance, transport protocols for multimedia services will be required to support synchronization as well as various group communication primitives. We envision that our research in this area will lead to a family of protocols, each of which may be derived by refining a basic protocol, by instantiating a parameterized protocol or, more generally, by instantiating a protocol graph. This will enable the network to employ different protocols customized for the needs of different media streams and to parameterize them by user-specified QoS requirements. In this section, we first describe the design of one such transport protocol and then present a formal model for verifying the correctness of such protocols.
Inception-Time Protocols for Multimedia Communication
Recently, we have begun designing a family of transport protocols, referred to as inception-time protocols, for multimedia communication. In these protocols, each message is assigned as its sequence number the time at which the message is first sent (referred to as its inception time), rather than a simple message number. The protocols assume feed-forward, rather than the customary feedback, control: messages are sent from the sending process to the receiving process, but there are no acknowledgment messages from the receiving process to the sending process, except in rare circumstances. Moreover, the protocols support ordered and causal broadcasts, as well as unicast message transport.
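The following sketch shows the stamping and ordering mechanism in its simplest form (our simplification; it omits the rare-case acknowledgments and the broadcast primitives): the sender stamps each message with its inception time, and the receiver releases messages in stamp order once they can no longer be overtaken.

\begin{verbatim}
import heapq, time

# Sketch: inception-time sequencing. Each message carries the time it
# was first sent as its sequence number; no acknowledgments flow back.

class Sender:
    def send(self, payload):
        return (time.time(), payload)      # (inception time, data)

class Receiver:
    def __init__(self):
        self.pending = []

    def arrive(self, msg):
        heapq.heappush(self.pending, msg)  # network may reorder

    def deliver_up_to(self, horizon):
        """Release, in inception-time order, all messages whose stamp
        precedes `horizon` (e.g., now minus the max network delay)."""
        out = []
        while self.pending and self.pending[0][0] <= horizon:
            out.append(heapq.heappop(self.pending))
        return out

s, r = Sender(), Receiver()
m1, m2 = s.send("frame-1"), s.send("frame-2")
r.arrive(m2); r.arrive(m1)                 # arrive out of order
print(r.deliver_up_to(time.time()))        # delivered in stamp order
\end{verbatim}

The real-time stamps that drive the ordering are the same ones that, as noted below, can be exploited for clock and media synchronization.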
We believe that such inception-time protocols are naturally suited for multimedia communication over high bandwidth ATM networks. Specifically, given the traffic characteristics of each media source and the QoS requirements of a communication channel, lower and upper bounds on the time interval between successive transmissions of packets (required by the inception-time protocols) can be computed easily. Furthermore, since successive messages transmitted using the protocol contain real-time stamps, they can be used to achieve clock and media synchronization. Finally, inception-time protocols support a large variety of broadcast primitives, which are essential in many multimedia applications. To demonstrate feasibility and effectiveness, we propose to implement and evaluate inception-time protocols for managing multimedia communication over the proposed ATM network.
Formal Design and Verification of Multimedia Protocols
Multimedia protocols typically comprise many geographically distributed components running in parallel and coordinating their activities through subtle synchronization schemes. Because of the high degree of nondeterminism inherent in such systems, validation techniques based on testing are highly inadequate. Thus, to ensure correct and reliable performance of such systems, the use of formal methods for specifying and verifying correct behavior becomes essential.
Since multimedia protocols exhibit ongoing reactive behavior, a formalism such as temporal logic, designed for describing and reasoning about change over time, is especially well suited to this task. Moreover, such protocols typically exhibit a high degree of ``data independence'', meaning that their synchronization and coordination behavior is largely independent of the data content (e.g., image frames) of the messages. Thus, multimedia protocols can be modeled, at a useful level of abstraction, as finite-state concurrent systems. This in turn makes it possible, in principle, to mechanically verify their correctness with respect to a given temporal logic specification using a model checking algorithm, which decides whether the global state transition graph of the system defines a model of the temporal logic formula. For a number of useful temporal logics, extremely efficient model checking algorithms are available, and this method of verification is quite effective for protocols that can be realized with a few million states.
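To make the method concrete, here is a toy explicit-state checker for an invariant (a safety property, the simplest kind of temporal specification) over an invented two-process lock protocol: breadth-first search enumerates the global state graph and tests the property at every reachable state.

\begin{verbatim}
from collections import deque

# Sketch: explicit-state model checking of an invariant over the global
# state graph of two processes sharing a lock (a toy protocol).
# States are (pc0, pc1, lock); pcs: 0 = idle, 1 = trying, 2 = critical.

def successors(state):
    pcs, lock = state[:2], state[2]
    for i in (0, 1):
        p, q = list(pcs), lock
        if p[i] == 0:
            p[i] = 1                         # start trying
        elif p[i] == 1 and q == 0:
            p[i], q = 2, 1                   # acquire lock, enter CS
        elif p[i] == 2:
            p[i], q = 0, 0                   # leave CS, release lock
        else:
            continue                         # blocked: lock is held
        yield (p[0], p[1], q)

def check(init, invariant):
    seen, frontier = {init}, deque([init])
    while frontier:
        s = frontier.popleft()
        if not invariant(s):
            return s                          # counterexample state
        for t in successors(s):
            if t not in seen:
                seen.add(t); frontier.append(t)
    return None                               # invariant holds

mutex = lambda s: not (s[0] == 2 and s[1] == 2)
print(check((0, 0, 0), mutex))                # -> None: mutual exclusion
\end{verbatim}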
However, a serious drawback to this approach is the state explosion problem. In a network with 100 processes, each with just 10 local states, the associated global state graph could have as many as $10^{100}$ global states, and explicitly storing the state graph on any computer system implementing the model checker is out of the question. Thus we plan to identify important cases where it is possible to avoid state explosion. One promising possibility is the use of abstraction: replace a large state machine with a more abstract, much smaller state machine that satisfies the same correctness properties. For example, recent work shows that for certain ring networks, where processes communicate by a token signal, a ring with n components, for arbitrarily large integers n, satisfies the same temporal properties as a ring of size 6.
It is also possible to exploit symmetry in concurrent systems. A system with many isomorphic components or processes contains a great deal of redundancy: processes 1, 2, ..., n may be identical except for their names and exhibit identical behavior, so permuting the names of the processes has no real effect on the behavior of the system as a whole. This idea can be formalized profitably in terms of the symmetry of the global state graph of the system, characterized essentially by its group of (labeled-graph) automorphisms, which map each state to a state that is the same except that the processes may be re-indexed. For many temporal properties, model checking over the original, intractably large state graph can be efficiently reduced to model checking over a much smaller, symmetry-reduced quotient graph.
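A sketch of the reduction for the fully symmetric case, where every permutation of process indices is an automorphism (the system and the state counts are invented): each global state is canonicalized by sorting its local-state vector, so all states in an orbit collapse to one representative.

\begin{verbatim}
# Sketch: symmetry reduction for n identical processes. Permuting
# process indices is an automorphism of the state graph, so we explore
# one canonical representative (the sorted local-state vector) per orbit.

from itertools import product

n, local_states = 4, (0, 1, 2)        # 4 identical processes (assumed)

def canonical(state):
    return tuple(sorted(state))       # orbit representative

full = set(product(local_states, repeat=n))
quotient = {canonical(s) for s in full}

print(len(full), "global states")     # 3^4 = 81
print(len(quotient), "orbits")        # C(3+4-1, 4) = 15 representatives
# Checking a symmetric temporal property over the 15-state quotient
# graph is equivalent to checking it over all 81 states.
\end{verbatim}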
Finally, there is the promising new approach of symbolic model checking using BDDs (Binary Decision Diagrams), a special type of deterministic finite-state automaton. In many applications, in particular those involving the highly regular global state transition graphs of computer hardware circuits, the symbolic BDD representation turns out to be of size polynomial in the number of state variables while the state graph itself is of exponential size. Since a model checking algorithm can be implemented directly on the BDD representation of the graph, this makes it possible to do efficient model checking of many hardware circuits whose state graphs contain $10^{100}$ states or more.
BDD-based symbolic model checking has revolutionized hardware design and verification. Empirical evidence suggests, however, that BDDs are much less effective at succinctly representing the state graphs of various types of computer software, including protocols, which seem to lack the necessary regularity of structure. Thus, we propose to investigate alternative symbolic representations of state graphs for use in model checking multimedia protocols and related software.
Since BDDs are a special type of automaton, viz. deterministic finite-state automata (DFAs), we plan to investigate the use of other automata-theoretic, language-theoretic, and logical formalisms to encode the state transition relations of concurrent systems. Possibilities that suggest themselves include alternating finite-state automata, which can be doubly exponentially more succinct than DFAs and are closed under boolean operations, and WS1S (the weak monadic second-order theory of one successor), which is closed under boolean operations and can be non-elementarily more succinct than DFAs. The drawback is that testing nonemptiness or, equivalently, satisfiability may be expensive in general. But there could be enormous savings in space complexity, which empirically seems to be more crucial than time complexity in automatic verification applications.
Most importantly, determining the practical utility of any such alternative symbolic representation scheme is purely an empirical task. By analogy, judging from the theory literature alone, one would think that BDDs were worthless for applications, since checking very simple graph properties with BDDs is PSPACE-hard; only after BDD packages were implemented and exercised was it seen that they work very well on applications such as hardware. Similarly, we argue that the CISE facility could provide an essential testbed for empirically evaluating novel and potentially useful alternatives to BDDs. In turn, viable alternative symbolic representations of state graphs, and corresponding implementations of model checking algorithms, would greatly facilitate the formal design and verification of large-scale multimedia protocols.