Unify Overview

The Unify project is one of the ongoing research projects in the Distributed Computing Systems (DCS) Lab at the University of Kentucky Department of Computer Science. The principal investigators are James Griffioen, Rajendra Yavatkar, Raphael Finkel, and James E. Lumpp, Jr. The list of graduate students who work on Unify can be found here. The Unify project is supported by grants from ARPA, EPSCoR, and NSF.

The objective of the Unify project is to develop a scalable multicomputer linking hundreds or thousands of high-performance machines in geographically distant locations. Unify's goal is to support a highly scalable distributed shared memory (DSM) paradigm that provides a convenient programming model. Conventional DSM approaches are inappropriate in such large-scale, geographically distributed environments. To achieve scalability in such an environment, Unify supports new shared memory abstractions and mechanisms that (1) mask the distribution of resources, (2) limit or reduce the frequency of communication and the amount of data transferred, (3) hide the propagation latencies typical of large-scale networks (e.g., by overlapping communication with computation), and (4) support large-scale concurrency via synchronization and consistency primitives free of serial bottlenecks. Ideally, the system should provide convenient data-sharing abstractions that exhibit performance and scalability similar to that of existing large-scale message-passing environments such as PVM and MPI. The following briefly highlights the salient features of Unify:

Single Address Space:
A single virtual address space, shared by all applications, allows applications to conveniently and efficiently share structured address-dependent data (such as trees and linked lists) as well as other address-independent data.
Multiple Memory Types:
Our design supports three basic memory abstractions for shared data. Random access memory is directly addressable. Sequential access memory is accessed in a read/front, write/append fashion. Associative memory is accessed via <key,value> pairs. Sequential access and associative memory can often be supported with weaker spatial consistency guarantees (described below) that can be implemented more efficiently than random access memory segments.
Multiple Grades of Consistency:
The Unify design supports a set of consistency management primitives that allow an application to select the appropriate consistency semantics from a spectrum of consistency protocols, including "automatic" methods where the operating system enforces consistency and "application-aided" methods where the user defines consistency checkpoints. Several existing DSM systems have demonstrated the benefits of weak application-aided temporal consistency models. In addition, Unify supports weak automatic methods where the memory becomes consistent after some time lag T. This type of automatic consistency is particularly useful for applications, such as Grapevine, that can detect stale data.

Our design also introduces a new consistency dimension called "spatial" consistency. Spatial consistency determines the relative order of the contents of the replicas of a segment. For many distributed applications that use keyed lookups or sequential access, the order of the data items within a segment is unimportant; only the values of individual data items are important. Spatial consistency allows efficient implementation of such applications.

Scalable Synchronization Primitives:
Most DSM designs support locks, semaphores, and/or barriers as the basic synchronization methods. However, these synchronization primitives pose a serious bottleneck, particularly for large systems with high latencies. To provide efficient synchronization in the presence of long and varying latencies, Unify proposes the use of a modified form of event counts and sequencers. For a large class of applications, event counts can result in reduced communication and greater concurrency. In particular, conventional synchronization methods require that all participants observe the synchronization event simultaneously. Event counts allow participants to observe the event at different times, effectively relaxing the communication constraints and allowing greater concurrency. Moreover, conventional synchronization primitives can be implemented via event counts.
Hierarchy of Sharing Domains:
We believe that sharing in a large scale distributed multicomputer will follow the principle of locality. To exploit localized sharing and communication, we partition the set of hosts into "sharing domains". Each sharing domain uses a separate multicast group to reduce the cost of intra-domain information sharing. Sharing domains distribute the burden of information retrieval and distribution by allowing any member of the domain to issue or answer inter-domain requests (addressed to multicast groups). Every host consults the hosts in its local sharing domain before going outside the domain for information. Therefore, as soon as one host obtains cross-domain information from a site in a remote domain, the information effectively becomes available to all other hosts in the sharing domain.
Reliable Multicast Support:
Although protocols such as ATM and RSVP provide quality-of-service guarantees, they do not guarantee end-to-end reliability. In fact, the performance of classical IP over ATM is currently a topic of much research because of TCP/IP's poor performance and high data loss rates in that setting. Distributed applications require a reliable multicast mechanism that not only delivers data reliably, but also attempts to deliver data to all recipients synchronously. Distributed applications typically block until the multicast information has been reliably delivered to all participants, and such delays can severely affect performance.

To achieve reliable, scalable, efficient dissemination of shared data and synchronization information, Unify supports reliable multicast via a tree-based multicast transport protocol (TMTP). TMTP builds on the efficient delivery of IP multicast (possibly via the MBONE), which is widely available. To provide reliability, TMTP combines sender-initiated and receiver-initiated approaches, constructing a separate control tree to handle error and flow control. As a result, retransmissions are handled locally in a timely fashion, avoiding retransmissions to the entire Internet. The use of localized NACKs with NACK suppression ensures quick response to lost messages with minimal overhead. Finally, batched positive acknowledgements reduce Internet traffic and eliminate the packet implosion problem.

The utility and scalability of large scale multicomputers can only be demonstrated by developing and evaluating real-life, large-scale parallel and distributed applications. We have implemented the Unify system as a runtime library on Unix workstations and run tests involving a significant number of hosts. DSM applications link with the library to create shared segments with appropriate memory types and consistency semantics.

We have implemented several common DSM applications including matrix multiplication, SOR, and Water and observed impressive speedups in our local environment. The library is currently available to, and in use by, our scientific colleagues with computationally complex problems. We are currently working to provide them with a high-level reactive object support system that will provide maximal performance despite highly dynamic changes in the runtime environment.

