Indexing Multidimensional Data

S. Mehrotra*
University of Illinois

We are exploring efficient storage and retrieval techniques for multidimensional data. Developing such mechanisms is becoming increasingly important because of the emergence of applications like CAD databases, medical image systems, and multimedia databases. Specifically, we are exploring indexing and efficient query-processing techniques over multidimensional data. We are also developing techniques for concurrent access to, and failure resilience of, multidimensional data structures; these issues must be resolved before any of the developed indexing techniques can be integrated into a database management system.


Concurrent Text Retrieval Systems


S. Mehrotra*
University of Illinois

With the explosive growth of the Internet and the World Wide Web, and the increasing demand to retrieve documents over the net based on their contents, it has become imperative to develop effective techniques for text retrieval. At the center of these retrieval systems is a full-text index that accelerates retrieval of documents based on the presence or absence of keywords and their proximity to each other. In this project, we are examining text retrieval techniques that are aimed primarily toward search over the Internet. Specifically, we are developing effective techniques to support concurrent operations over the text index.
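
As an illustration of the kind of full-text index described above, the following is a minimal sketch of a positional inverted index that answers presence, absence, and proximity queries. The class and method names (PositionalIndex, containing, near) are hypothetical, tokenization is plain whitespace splitting, and the sketch is single-threaded; it does not show the concurrency techniques this project is developing.

```python
from collections import defaultdict


class PositionalIndex:
    """Toy full-text index: term -> {doc_id -> [word positions]}."""

    def __init__(self):
        self.postings = defaultdict(dict)

    def add(self, doc_id, text):
        # Record the position of every word so proximity can be checked later.
        for pos, term in enumerate(text.lower().split()):
            self.postings[term].setdefault(doc_id, []).append(pos)

    def containing(self, term):
        """Set of documents in which the keyword is present."""
        return set(self.postings.get(term.lower(), {}))

    def near(self, a, b, window):
        """Documents where a and b occur within `window` words of each other."""
        pa = self.postings.get(a.lower(), {})
        pb = self.postings.get(b.lower(), {})
        hits = set()
        for doc_id in pa.keys() & pb.keys():
            if any(abs(i - j) <= window for i in pa[doc_id] for j in pb[doc_id]):
                hits.add(doc_id)
        return hits


index = PositionalIndex()
index.add(1, "concurrent text retrieval over the Internet")
index.add(2, "text index for document retrieval")
index.add(3, "remote backup of the database")

# Presence of "retrieval" combined with absence of "index".
present_not_absent = index.containing("retrieval") - index.containing("index")
```

A query for presence and absence reduces to set operations on the posting lists, while a proximity query compares stored word positions within each candidate document.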


Multimedia Retrieval System


S. Mehrotra*
University of Illinois

Recent advances in digital storage technology, image analysis and computer vision, and database management have created the exciting possibility of developing powerful retrieval systems for complex multimedia data. Such data have traditionally been treated as a raw, uninterpreted sequence of bits; these systems would instead support them as first-class objects that can be queried and retrieved based on their rich internal structure. Visual and aural data of this kind pose new challenges to data management. We are examining techniques for (1) effective modeling and description of visual objects, (2) support for content-based and similarity-based retrieval, and (3) query evaluation techniques for composite queries that involve multiple approximate matches based on similarity (e.g., a Boolean combination of these matches).
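
To make item (3) concrete, the sketch below ranks objects for a composite query that combines two similarity matches. It assumes a fuzzy-logic interpretation (AND as the minimum score, OR as the maximum) and a similarity of 1/(1 + Euclidean distance); both are common textbook choices used here for illustration only, not necessarily the evaluation technique under study. The feature vectors and object names are invented.

```python
import math


def similarity(u, v):
    """Map Euclidean distance to a score in (0, 1]; identical vectors score 1.0."""
    return 1.0 / (1.0 + math.dist(u, v))


def fuzzy_and(*scores):
    return min(scores)   # assumption: conjunction scored as the weakest match


def fuzzy_or(*scores):
    return max(scores)   # assumption: disjunction scored as the strongest match


# Hypothetical objects described by a color vector and a texture vector.
OBJECTS = {
    "sunset": {"color": (0.9, 0.4, 0.1), "texture": (0.2, 0.1)},
    "forest": {"color": (0.1, 0.7, 0.2), "texture": (0.8, 0.6)},
    "brick": {"color": (0.8, 0.3, 0.2), "texture": (0.9, 0.7)},
}


def rank(query_color, query_texture, combine):
    """Return object names ordered by the combined similarity score, best first."""
    scored = []
    for name, feats in OBJECTS.items():
        score = combine(
            similarity(feats["color"], query_color),
            similarity(feats["texture"], query_texture),
        )
        scored.append((score, name))
    return [name for score, name in sorted(scored, reverse=True)]


# "Color like a sunset AND texture like brick" versus the same query with OR.
and_ranking = rank((0.9, 0.4, 0.1), (0.9, 0.7), fuzzy_and)
or_ranking = rank((0.9, 0.4, 0.1), (0.9, 0.7), fuzzy_or)
```

Under the conjunctive query an object must match reasonably well on both features to rank highly, whereas the disjunctive query rewards a strong match on either feature alone.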


Remote Backup Systems


S. Mehrotra*
University of Illinois

Business organizations are increasingly demanding systems that provide continuous service with zero downtime. The key to developing such systems is replication. A viable approach is the maintenance of a remote backup system in which two copies of the database are maintained. Transaction processing takes place at the primary copy, and the log records generated propagate to the remote backup, which uses them to reconstruct a recent state of the database at the primary. We are examining efficient, scalable techniques for maintaining remote backups. Existing approaches either result in high overhead and low system throughput, or risk loss of transactions during failures, thereby sacrificing persistence for throughput. Our approach overcomes the limitations of existing backup techniques.
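
The basic log-propagation scheme described above can be sketched as follows. This is a deliberately simple, asynchronous illustration with hypothetical names (Site, Primary, ship_log) and an in-memory key-value "database"; it is not the approach developed in this project, and it exhibits exactly the weakness noted above: updates written at the primary but not yet shipped would be lost if the primary failed.

```python
class Site:
    """A database copy: a key-value store plus the last log sequence number applied."""

    def __init__(self):
        self.data = {}
        self.applied_lsn = 0

    def apply(self, record):
        lsn, key, value = record
        if lsn > self.applied_lsn:      # skip records already applied (re-delivery)
            self.data[key] = value
            self.applied_lsn = lsn


class Primary(Site):
    """The primary copy also keeps the log of every update it processes."""

    def __init__(self):
        super().__init__()
        self.log = []

    def write(self, key, value):
        record = (len(self.log) + 1, key, value)   # LSNs are 1, 2, 3, ...
        self.log.append(record)
        self.apply(record)
        return record

    def records_after(self, lsn):
        return self.log[lsn:]           # records with a sequence number above lsn


def ship_log(primary, backup):
    """Propagate to the backup every log record it has not yet applied."""
    for record in primary.records_after(backup.applied_lsn):
        backup.apply(record)


primary, backup = Primary(), Site()
primary.write("acct1", 100)
primary.write("acct2", 50)
ship_log(primary, backup)
primary.write("acct1", 70)              # written at the primary, not yet shipped
```

After the last write the backup still holds the older value of acct1; it catches up only on the next call to ship_log, which is the window in which a failure of the primary would lose the update.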


Rule Processing in Distributed and Parallel Environments

S. Mehrotra*
University of Illinois

We are examining how rule processing can be supported in distributed and parallel databases. In most existing database systems (both prototypes and commercial systems) that provide support for production rules, the rules respond to operations on centralized data, and rule processing is performed in a centralized, sequential manner. With the increasing interest in parallel and distributed database systems, techniques for processing rules in these environments are gaining importance. We are interested in developing both the theory of rule processing and, based on that theory, effective and efficient rule processing techniques for distributed and parallel database management systems.


Transaction Processing in Emerging Database Applications

S. Mehrotra*
University of Illinois

The objective of this project is to study the feasibility of designing transaction processing systems that provide adequate support for the cooperative and long-duration computations found in emerging database applications like concurrent engineering, cooperative design environments, and office workflow automation. We are interested in developing transaction processing systems that provide a programmable interface through which applications can specify to the system their own desired computation model (and the protocols to support that model). This research will pave the way for transaction processing systems that adequately support such computations.


Transient Versioning in Distributed Database Systems

S. Mehrotra*
University of Illinois

We are studying how transient versioning can be efficiently supported in distributed databases. Transient versioning is used in database systems to reduce the data contention caused by long-duration read-only transactions. The system maintains an older version of the data for read-only transactions, which can tolerate reading slightly old but still consistent data. Transient versioning thereby eliminates unnecessary interference between such read-only queries and short update transactions.
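
The idea of keeping an older version for readers can be illustrated with the single-site sketch below. The names (VersionedStore, publish_snapshot, read_only_view) are hypothetical, only one older version is kept, and nothing here addresses the distributed setting that is the subject of this project.

```python
class VersionedStore:
    """Keeps the current data plus one older, consistent snapshot for readers."""

    def __init__(self):
        self.current = {}     # state seen and modified by update transactions
        self.snapshot = {}    # older consistent state seen by read-only queries

    def update(self, key, value):
        # Short update transactions touch only the current state.
        self.current[key] = value

    def publish_snapshot(self):
        # Install a fresh copy; readers holding the previous snapshot keep it.
        self.snapshot = dict(self.current)

    def read_only_view(self):
        # A long-running query works on this snapshot for its whole duration.
        return self.snapshot


store = VersionedStore()
store.update("x", 1)
store.publish_snapshot()

view = store.read_only_view()   # a long read-only query begins
store.update("x", 2)            # a concurrent update does not disturb it
store.publish_snapshot()        # later queries will see the newer version
```

The long-running query continues to see x = 1 from its snapshot while updates proceed on the current state, so neither side has to wait for the other.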


Database Analysis

M. E. Williams,* D. Bott, A. Knackstedt
University of Illinois
(Conducted in the Coordinated Science Laboratory)

Analyses of data in the database of databases are run annually. Analyses include the number and percentage of databases by field of science, type of database, storage media, country, geographic area, and sector of the economy. Statistics are also generated regarding numbers of records within databases according to field, country, and sector of the economy. Various correlations between data items are generated and published each year.


An Assessment of Scientific and Technical Information in the United States


M. E. Williams*
National Science Foundation via University of Tennessee

The objective of this project was to examine the status, trends, opportunities, and problems of scientific and technical information (STI) dissemination in the United States. In the first phase of the project we focused on estimating and examining the size and characteristics of the demand for STI services by various user groups and the magnitude, quality, and costs of supplying STI by various sources. Issues were identified and categorized as information technology; policy, structure, and institutional; legal and ethical; economic, marketing, and financial; information content and access; attitudinal and behavioral; educational and training; and international. A book on this topic will be completed in 1995 by J. M. Griffiths, D. King, and M. E. Williams (permission to use material granted by NSF).


Mapping of Environmental Data Resources

M. E. Williams,* D. Bott, A. Knackstedt, L. Smith, G. Newby
U.S. Army Construction Engineering Research Laboratory, DACA88-93-K-0001
(Conducted in the Coordinated Science Laboratory)

The objective of this project is to develop and employ a methodology for mapping environmental information resources for use by army personnel. The resources should be accessible by automation techniques and are to be evaluated based on criteria of "value" to the army, including level of maintenance, timeliness, standardization, and accessibility. Sample data will be linked and made accessible through the Internet in WAIS format.


Database Support for Arrays in High-Performance Computing

M. S. Winslett,* K. Seamons, Y. Chen, Y. Cho, S. Kuo, M. Subramaniam
National Science Foundation, IRI 89-58582 PYI, Army UMCP, Z984116

Scientific applications often make use of large multidimensional arrays, a data type not supported in current databases. We are examining the question of support for array handling on traditional and massively parallel platforms, with an emphasis on applications that use multistage algorithms.


Secure Access to Services in an Open Networked Environment

M. S. Winslett,* V. Jones, N. Ching
National Science Foundation, IRI 89-58582 PYI; Advanced Research Projects Agency, DACA94-C-0029

With the growth and commercialization of the Internet and the popularity of new information services such as the World Wide Web, we find a need for clients and servers to be able to interact without prior knowledge of one another. Often a server will require proof that a new client possesses certain properties, e.g., student status, local residency, or an ability to pay for services to be rendered. The client may also wish guarantees that it is interacting with a bona fide server, as well as guarantees for the privacy of its interactions with the server. In this project, we are extending current-day authentication and authorization mechanisms to be applicable to such a scenario, with a focus on the needs of database applications.