Unify Overview
The Unify project is one of the ongoing research projects in the Distributed
Computing Systems (DCS) Lab at the University of Kentucky Department of
Computer Science. The principal investigators are James Griffioen, Rajendra
Yavatkar, Raphael Finkel, and James E. Lumpp, Jr. A list of the graduate
students who work on Unify is also available. The Unify project is supported
by grants from ARPA, EPSCoR, and NSF.
The objective of the Unify project is to develop a scalable multicomputer
linking hundreds or thousands of high-performance machines in geographically
distant locations. Unify's goal is to support a highly scalable distributed
shared memory programming paradigm that provides a convenient programming
model. Conventional DSM approaches are inappropriate in such large-scale
geographically distributed environments. To achieve scalability in such
an environment, Unify supports new shared memory abstractions and mechanisms
that (1) mask the distribution of resources, (2) limit/reduce the frequency
of communication and the amount of data transferred, (3) hide the propagation
latencies typical of large-scale networks (e.g., by overlapping communication
with computation), and (4) support large-scale concurrency via synchronization
and consistency primitives free of serial bottlenecks. Ideally the system
must provide convenient data sharing abstractions that exhibit performance
and scalability similar to that of existing large-scale message passing
systems such as PVM and MPI. The following briefly highlights the salient
features of Unify:
- Single Address Space:
- A single virtual address space, shared by all applications, allows
applications to conveniently and efficiently share structured address-dependent
data (such as trees and linked lists) as well as other address-independent
data.
- Multiple Memory Types:
- Our design supports three basic memory abstractions for shared data.
Random access memory is directly addressable. Sequential access memory
is accessed in a read/front, write/append fashion. Associative memory is
accessed via <key,value> pairs. Sequential access and associative
memory can often be supported with weaker spatial consistency guarantees
(described below) that can be implemented more efficiently than random
access memory segments.
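The three memory abstractions can be sketched as minimal Python classes. This is an illustrative sketch only; Unify itself is a Unix runtime library, and the class and method names here are hypothetical, not Unify's API:

```python
# Sketch of Unify's three shared-memory abstractions (hypothetical names).

class RandomAccessSegment:
    """Directly addressable memory: read/write at arbitrary offsets."""
    def __init__(self, size):
        self.data = bytearray(size)

    def write(self, offset, payload):
        self.data[offset:offset + len(payload)] = payload

    def read(self, offset, length):
        return bytes(self.data[offset:offset + length])


class SequentialAccessSegment:
    """Accessed in a read/front, write/append (queue-like) fashion."""
    def __init__(self):
        self.items = []

    def append(self, item):        # write/append
        self.items.append(item)

    def front(self):               # read/front: consume the oldest item
        return self.items.pop(0)


class AssociativeSegment:
    """Accessed via <key,value> pairs; item placement is invisible."""
    def __init__(self):
        self.table = {}

    def put(self, key, value):
        self.table[key] = value

    def get(self, key):
        return self.table.get(key)
```

Because sequential and associative segments expose no addresses, replicas only need to agree on the set of items rather than on their placement, which is what makes the weaker spatial consistency guarantees cheaper to implement than random access memory.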
- Multiple Grades of Consistency:
- The Unify design supports a set of consistency management primitives
that allow an application to select the appropriate consistency semantics
from a spectrum of consistency protocols, including "automatic"
methods where the operating system enforces consistency and "application-aided"
methods where the user defines consistency checkpoints. Several existing
DSM systems have demonstrated the benefits of weak application-aided temporal
consistency models. In addition, Unify supports weak automatic methods
where the memory becomes consistent after some time lag T. This
type of automatic consistency is particularly useful for applications,
such as Grapevine, that can detect stale data.
Our design also introduces a new consistency dimension called "spatial"
consistency. Spatial consistency determines the relative order of the contents
of the replicas of a segment. For many distributed applications that use
keyed lookups or sequential access, the order of the data items within
a segment is unimportant; only the values of individual data items are
important. Spatial consistency allows efficient implementation of such
applications.
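The "automatic with time lag T" method can be sketched as a replica that answers reads from its cached copy until the copy is older than T, then refetches. This is a simulation of the policy under assumed names (`fetch` stands in for whatever retrieves the authoritative copy), not the Unify implementation:

```python
import time

class LaggedReplica:
    """Serve cached data until it is older than lag T, then refresh.

    `fetch` is a hypothetical callback that retrieves the authoritative
    copy; `clock` is injectable so the policy can be tested.
    """
    def __init__(self, fetch, lag_t, clock=time.monotonic):
        self.fetch = fetch
        self.lag_t = lag_t
        self.clock = clock
        self.value = None
        self.fetched_at = None

    def read(self):
        now = self.clock()
        if self.fetched_at is None or now - self.fetched_at > self.lag_t:
            self.value = self.fetch()   # refresh; copy may be stale by at most T
            self.fetched_at = now
        return self.value
```

Within the lag window every read is purely local, which is the point: no communication is needed as long as the application can tolerate (or detect) data that is at most T old.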
- Scalable Synchronization Primitives:
- Most DSM designs support locks, semaphores, and/or barriers as the
basic synchronization methods. However, these synchronization primitives
pose a serious bottleneck, particularly for large systems with high latencies.
To provide efficient synchronization in the presence of long and varying
latencies, Unify proposes the use of a modified form of event counts and
sequencers. For a large class of applications, event counts can result
in reduced communication and greater concurrency. In particular, conventional
synchronization methods require that all participants observe the synchronization
event simultaneously. Event counts allow participants to observe the event
at different times, effectively relaxing the communication constraints
and allowing greater concurrency. Moreover, conventional synchronization
primitives can be implemented via event counts.
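Event counts and sequencers can be sketched as below, and building a mutual-exclusion lock from the pair illustrates the last point, that conventional primitives reduce to event counts. This is a single-machine sketch using threads, with assumed names; it is not Unify's distributed implementation:

```python
import threading

class EventCount:
    """A monotonically increasing counter that threads can await."""
    def __init__(self):
        self.count = 0
        self.cond = threading.Condition()

    def advance(self):                # signal that the event occurred once more
        with self.cond:
            self.count += 1
            self.cond.notify_all()

    def read(self):
        with self.cond:
            return self.count

    def await_(self, value):          # block until count >= value
        with self.cond:
            self.cond.wait_for(lambda: self.count >= value)

class Sequencer:
    """Hands out unique, totally ordered tickets."""
    def __init__(self):
        self.next_ticket = 0
        self.lock = threading.Lock()

    def ticket(self):
        with self.lock:
            t = self.next_ticket
            self.next_ticket += 1
            return t

class EventCountLock:
    """Mutual exclusion built from one event count and one sequencer."""
    def __init__(self):
        self.ec = EventCount()
        self.seq = Sequencer()

    def acquire(self):
        my_turn = self.seq.ticket()
        self.ec.await_(my_turn)       # wait until all earlier holders released

    def release(self):
        self.ec.advance()
```

The property that matters for Unify is that `await_` compares against a count that only grows, so each participant can observe an `advance` whenever it happens to learn of it, rather than all participants having to observe the event simultaneously.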
- Hierarchy of Sharing Domains:
- We believe that sharing in a large-scale distributed multicomputer
will follow the principle of locality. To exploit localized sharing and
communication, we partition the set of hosts into "sharing domains".
Each sharing domain uses a separate multicast group to reduce the cost
of intra-domain information sharing. Sharing domains distribute the burden
of information retrieval and distribution by allowing any member of the
domain to issue or answer inter-domain requests (addressed to multicast
groups). Every host consults the hosts in its local sharing domain before
going outside the domain for information. Therefore, as soon as one host
obtains cross-domain information from a site in a remote domain, the information
effectively becomes available to all other hosts in the sharing domain.
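The lookup discipline described above, consult the local sharing domain first, go cross-domain only on a miss, and cache the answer so the whole domain benefits, can be sketched as follows (class and parameter names are hypothetical, and the intra-domain multicast is modeled as a simple peer scan):

```python
class DomainMember:
    """A host in one sharing domain, with a local cache of shared items."""
    def __init__(self, domain):
        self.domain = domain     # peers reachable via the domain's multicast group
        self.cache = {}
        domain.append(self)

    def lookup(self, key, inter_domain_fetch):
        # 1. Intra-domain: ask the local multicast group first.
        for peer in self.domain:
            if key in peer.cache:
                return peer.cache[key]
        # 2. Inter-domain: any member may issue the remote request, and
        # 3. caching the reply makes it visible to the whole domain.
        value = inter_domain_fetch(key)
        self.cache[key] = value
        return value
```

Once any one member has paid the cross-domain cost for an item, every other member's lookup is satisfied within the domain.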
- Reliable Multicast Support:
- Although protocols such as ATM and RSVP provide quality of service
guarantees, they do not guarantee end-to-end reliability. In fact, classical
IP performance over ATM is currently a topic of much research because of
TCP/IP's poor performance and high data loss rates. Distributed applications
require a reliable multicast mechanism that not only delivers data reliably,
but also attempts to synchronously deliver data to all recipients. Distributed
applications typically block until the multicast information has been reliably
delivered to all participants. Such delays can severely affect performance.
To achieve reliable, scalable, efficient dissemination of shared data
or synchronization information, Unify supports reliable multicast via a
tree-based multicast transport protocol (TMTP).
TMTP builds on the efficient delivery of IP multicast (possibly via the
MBONE) which is widely available. To provide reliability, TMTP uses a combination
of sender and receiver initiated approaches that constructs a separate
control tree to handle error and flow control. As a result, retransmissions
are handled locally in a timely fashion, avoiding retransmissions to the
entire Internet. The use of localized NACKs with NACK suppression ensures
quick response to lost messages with minimal overhead. Finally, batched
positive acknowledgements reduce Internet traffic and eliminate the packet
implosion problem.
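The NACK-suppression rule, in which a receiver that detects a loss backs off a random interval and sends a NACK only if no peer has already multicast one, can be sketched as a toy simulation. This models only the suppression idea, not TMTP itself (which also handles flow control, local retransmission, and batched positive acknowledgements):

```python
import random

def nack_round(num_receivers, rng, overhear_delay=0.01):
    """One recovery round: all receivers missed the same packet and each
    schedules a NACK after a random backoff in [0, 1).  A multicast NACK
    takes `overhear_delay` to reach peers, so timers that fire before the
    first NACK is overheard still send; all later ones are suppressed.
    Returns the number of NACKs actually sent."""
    backoffs = sorted(rng.random() for _ in range(num_receivers))
    first = backoffs[0]                   # earliest timer multicasts its NACK
    return sum(1 for t in backoffs if t <= first + overhear_delay)
```

With a short overhearing delay only a handful of NACKs escape, instead of one per receiver, which is how suppression keeps recovery overhead low while a single local retransmission repairs the whole subtree.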
The utility and scalability of large scale multicomputers can only be
demonstrated by developing and evaluating real-life, large-scale parallel
and distributed applications. We have implemented the Unify system as a
runtime library on Unix workstations and run tests involving a significant
number of hosts. DSM applications link with the library to create shared
segments with appropriate memory types and consistency semantics.
We have implemented several common DSM applications, including matrix
multiplication, SOR, and Water, and have observed impressive speedups in our
local environment. The library is currently available to, and in use by,
our scientific colleagues with computationally complex problems. We are
currently working to provide them with a high-level reactive object support
system that will provide maximal performance despite highly dynamic changes
in the runtime environment.