COMPUTER ARCHITECTURE AND SYSTEMS

Design of I/O Subsystems for New Secondary Storage Technology

R. H. Campbell, Principal Investigator; D. Stephenson
Apple Computer Corp.

In current personal computers, existing storage device technology limits both new applications such as multimedia systems and the benefits of high-performance microprocessors and buses. Low-cost, high-density secondary storage devices, such as 1.5 in. hard disks, flash and PCMCIA cards, and optical disk drives, allow innovative I/O subsystem designs that can help storage keep pace with powerful microprocessors and high-bandwidth buses. Example I/O subsystem designs include disk arrays and log file systems. This research examines the design trade-offs involved in these I/O subsystems.
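
As a rough illustration of the disk array idea mentioned above, the sketch below maps a logical block number onto an array of disks by simple round-robin striping. It is not the design studied in this project; the names stripe_loc and stripe_map and the one-block striping unit are assumptions made only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical location of a block within a striped disk array. */
typedef struct {
    size_t disk;    /* index of the disk that holds the block */
    size_t offset;  /* block number within that disk */
} stripe_loc;

/* Map a logical block onto n_disks disks, round-robin, one block at a
 * time (assumed striping unit: one block, no parity or redundancy). */
stripe_loc stripe_map(size_t logical_block, size_t n_disks)
{
    assert(n_disks > 0);  /* an array needs at least one disk */

    stripe_loc loc;
    loc.disk = logical_block % n_disks;    /* consecutive blocks go to consecutive disks */
    loc.offset = logical_block / n_disks;  /* each full pass over the disks advances one block */
    return loc;
}
```

With this mapping, consecutive logical blocks land on different disks, so a large sequential transfer can in principle proceed on several disks at once. Choosing the striping unit and adding redundancy are among the trade-offs such a design has to weigh.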


Data Prefetching in Shared Memory Multiprocessors

J. Torrellas, Principal Investigator; D. Koufaty
National Science Foundation, MIP 93-08098, MIP 93-07910, Young Investigator Award, MIP 94-57436; National Aeronautics and Space Administration, NAG 1 613; University of Illinois

Multiprocessor systems use memory hierarchies to reduce long memory access times. However, even with tuned memory hierarchies, large machines can still lose substantial time to memory hierarchy misses. An effective technique for reducing this loss is data prefetching, in which specialized hardware and software support brings data close to the processor before the processor actually needs it. The goal is to overlap the fetching of the data with other computation, so that the processor does not stall waiting for the data. In this research, we perform a realistic study of the potential for data prefetching in numerical codes. We also design hardware support for prefetching.
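
The software side of this idea can be sketched with a compiler prefetch hint. The example below is only an illustration and is not the mechanism designed in this project: it assumes a GCC- or Clang-compatible compiler that provides __builtin_prefetch, and the function name sum_with_prefetch and the prefetch distance of 16 elements are hypothetical choices that would need tuning on a real machine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical prefetch distance in elements; a real value would be tuned. */
#define PREFETCH_DISTANCE 16

/* Sum an array while hinting that a later element will be read soon. */
double sum_with_prefetch(const double *a, size_t n)
{
    assert(a != NULL || n == 0);

    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        /* Request a[i + PREFETCH_DISTANCE] ahead of its use.
         * Arguments: address, 0 = prefetch for read, 1 = low temporal locality.
         * The hint does not change the result; it only aims to hide latency. */
        if (i + PREFETCH_DISTANCE < n)
            __builtin_prefetch(&a[i + PREFETCH_DISTANCE], 0, 1);
        sum += a[i];
    }
    return sum;
}
```

The prefetch is issued several iterations before the data is consumed, so that the fetch overlaps with the additions in between. Whether this helps depends on the access pattern and on how much prefetching the hardware already does on its own.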


Hardware and Software Support for Advanced Synchronization in Shared Memory Multiprocessors

J. Torrellas, Principal Investigator; L. Yang
National Science Foundation, MIP 93-08098, MIP 93-07910, Young Investigator Award, MIP 94-57436; National Aeronautics and Space Administration, NAG 1 613; University of Illinois

Fast process synchronization is critical to the performance of large-scale shared memory multiprocessors. The fastest way to support synchronization is to provide special-purpose hardware. Examples of such hardware are the fetch-and-phi operations of the NYU Ultracomputer and IBM RP3, the full/empty bit of the HEP and Tera Computer machines, and the QOSB primitives of the Wisconsin Multicube. Cedar provides the most complete set of hardware-supported synchronization operations thanks to a special synchronization processor. We evaluate some of this hardware. We also analyze advanced algorithms that minimize interprocessor communication when processors synchronize in scalable shared memory machines.
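
Fetch-and-add is the best-known member of the fetch-and-phi family. The sketch below uses the C11 atomic_fetch_add operation to build a simple ticket lock. It illustrates the kind of primitive discussed above, not the hardware or algorithms evaluated in this project; the ticket_lock type and its functions are hypothetical names introduced for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical ticket lock built on atomic fetch-and-add. */
typedef struct {
    atomic_uint next_ticket;  /* next ticket to hand out */
    atomic_uint now_serving;  /* ticket currently allowed to hold the lock */
} ticket_lock;

void ticket_lock_init(ticket_lock *l)
{
    assert(l != NULL);
    atomic_init(&l->next_ticket, 0);
    atomic_init(&l->now_serving, 0);
}

/* Take a unique ticket with one fetch-and-add, then wait for our turn.
 * Returns the ticket number that was drawn. */
unsigned ticket_lock_acquire(ticket_lock *l)
{
    unsigned my_ticket = atomic_fetch_add(&l->next_ticket, 1);
    while (atomic_load(&l->now_serving) != my_ticket) {
        /* spin until the holder releases and our ticket comes up */
    }
    return my_ticket;
}

/* Pass the lock to the holder of the next ticket. */
void ticket_lock_release(ticket_lock *l)
{
    atomic_fetch_add(&l->now_serving, 1);
}
```

Each acquirer performs a single atomic update on next_ticket and is then served in ticket order. All waiters spin on the same now_serving word, however, so a release still generates traffic to every waiting processor. Reducing that kind of interprocessor communication is the motivation for more advanced algorithms.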


The Illinois Aggressive COMA Multiprocessor (I-ACOMA)

J. Torrellas, Principal Investigator; J. Mitrevski, L. Yang, J. Martinez, A. Nguyen, S. Basu, D. Koufaty, A. Sharma, V. Krishnan, K. Mungnirun, E. Torres, P. Trancoso, Z. Zhang, Y. Zhang, M. Cintra
National Science Foundation Young Investigator Award, MIP 94-57436, MIP 93-08098, MIP 93-07910, ASC-9612099; National Aeronautics and Space Administration, NAG 1 613; University of Illinois

Scalable shared memory multiprocessors are a popular approach to providing large-scale computing power while maintaining programmability. What makes the base shared memory paradigm attractive is the simplicity of the programming model: memory is shared by all processors. In this project, we design the I-ACOMA multiprocessor, a new scalable shared memory multiprocessor. The issues being investigated include advanced cache coherence protocols, compiler support for the protocols, support for data prefetching and data transfer optimizations, and other memory hierarchy improvements. We also look at operating systems for scalable shared memory multiprocessors.