E. Douglas Jensen's

Real-Time for the Real World

 
Home  |  Search  |  Contact Me   
 
 

My personal manifesto about the widely misunderstood field of real-time computing...

"I don't understand why people are frightened of new ideas. It's the old ideas that frighten me."
-- John Cage


 

Introduction
About Me
Real-Time
Distributed Real-Time
Distrib. Real-Time Java
Real-Time Java
Real-Time CORBA
Real-Time Resources
Our Documents
 

Distributed Real-Time Computing Systems

Distributed Computing Systems: Multi-Node Behavior

For our purposes here, the term distributed system refers to
a computing system layer that includes at least one programming model based on there being execution entities that exhibit graphs of multi-node behaviors – including (but not limited to)

bulletcontrol flow, such as RPC, remote method invocations
bulletdata flow, such as publish/subscribe
bulletasynchronous message-passing
bulletmobile code
bulletautonomous agents
bulletvirtual shared memory, including tuple spaces.

This view doesn’t limit the scope of the term, but it provides a focal point for considering properties of the system’s execution entities and their behaviors. Distributed systems are not normally described in terms of multi-node behavior graphs, so here are some illustrations of familiar multi-node behavior patterns.

A very common pattern of multi-node behavior, such as in a CORBA system (among many others), consists of

bulleta client execution entity (say, thread) executing a method in an object instance on some particular node
bulletat some point performing a two-way method invocation on an object instance on a remote node, then waiting (usually suspended) for the corresponding return
bulleta servant thread (typically awaking or being created and) executing the called method on the remote node
bulletthen performing a return (and typically being suspended or terminated)
bulletthe client thread (typically awaking and) resuming execution.

This 2-node behavior pattern, and the corresponding n-node behavior one, are examples of sequential multi-node behavior; we refer to this special case as trans-node behavior.

Another class of common trans-node behavior patterns uses asynchronous interprocess communication (IPC) messages, rather than synchronous method invocations.

Often, multi-node behaviors are formed with causal trees and graphs of asynchronous IPC's -- for example,

bulletexecution entity (say, thread) A sends a message to threads B and C
bulletconsequently, threads B and C react by executing and then sending messages to threads D and E, and G and H, respectively
bulletconsequently, threads D and G react by executing and then sending messages to thread A.

Directed graphs are the general case of multi-node behavior topology.

The publish/subscribe pattern is a popular 2-level tree multi-node behavior (in conventional practice, publish/subscribe is almost never either more than two levels or a cyclic graph, but these are not inherent limitations).

bulletan execution entity (say, thread) executing on some particular node publishes data by multicasting a message
bulletthe subscribing threads on other nodes all (presumably) receive the publication message and act on it.

Multi-node behaviors have multi-node properties that must be maintained.

The best-known such property is the integrity of the behavior, despite partial failures (e.g., nodes, communication network). When a node or path in a multi-node behavior graph fails, the potentially affected segments of the behavior must be notified and optionally react. The most elementary instances are link level protocols with time-outs or some combination of ack/nack responses. A familiar example at a higher level is RPC orphan detection and elimination.

Another multi-node property is distributed detection and handling of distributed events. By "events" we mean changes in the predicated system state -- a concept that supports many other multi-node properties, including timeliness, security, and fault management.

Real-time distributed computing systems are distinguished by having multi-node timeliness properties.

Real-Time Distributed Computing Systems: Timeliness

Distributed computing systems are real-time ones to the degree that they include multi-node behaviors having timeliness properties -- with the same two dimensions as real-time centralized systems, namely how well application time constraints are satisfied, and how predictable that satisfaction is -- on an end-to-end basis.

In the networking theory context, end-to-end communication is the problem of sending a sequence of messages from a sender to a receiver when the network through which they communicate is unreliable. That is not what we mean here.

End-to-end timeliness in our sense applies to that portion of a multi-node behavior within the scope of a time constraint -- i.e., from where an execution entity (say, thread) of a behavior enters a time constraint scope on some node, to where some thread of that behavior exits that scope. For example, in the Alpha real-time distributed OS kernel, a time constraint must begin and end on the same node; in Real-Time CORBA 2.0, a time constraint scope may begin on one particular node, and end on another. In the figure below, the black/gray-colored dashed line for the time constraint is intended to illustrate the former case, and also to suggest the alternative latter case (e.g., CORBA's "one way" invocation). Execution of the purple portion of the behavior has been completed, and that of the pink portion has not.

End-to-end timeliness of multi-node behaviors necessitates that each behavior's timeliness parameters (e.g., including its priority or deadline) be known, with the same semantics, at each node the behavior currently involves, and used there for resource management. Those nodes can either know these parameters a' priori, or obtain them dynamically. For example, timeliness parameters could be known a' priori if the behavior at a node is time-constrained by the nature (e.g., physical properties) of the application, independent of any time constraints on the behavior at other nodes. More commonly, the timeliness parameters must be propagated as a behavior involves other nodes. (Real-time CORBA supports both of these modes.)

End-to-end timeliness of multi-node behaviors can be achieved to a degree highly dependent on the extent of eligibility compliance: the extent to which those parameters are consistently employed at each such node, and by the network connecting them, for resolving contention for physical and logical resources (e.g., processor and communication path scheduling, lock dequeuing, etc.). Realistically, especially given COTS software and hardware products, not all resource management will employ these parameters or employ them consistently; so there will be some number and durations of end-to-end timeliness anomalies (e.g., eligibility inversions, such as the special case of distributed priority inversion []).

It is well known that in general, global optimality of multi-node sequencing, and hence of multi-node behavior end-to end timeliness, is not feasible – but approaches for application-specific trade-offs between acceptable approximations to optimality and affordable costs can be roughly placed into three categories. In the first category, every node currently involved in one or more multi-node behaviors uses a consistent (typically the same) sequencing policy. The Alpha distributed real-time OS kernel [Northcutt 87, Clark et al. 92], and its direct descendents -- the MK7 distributed real-time OS [Open Group 98], IBM’s never-released Workplace OS for PowerPC, Digital's never-released Libra real-time OS and ORB, and Real-Time CORBA 1 (fixed priority) [OMG 99] and 2 (dynamic scheduling) [OMG 01] -- are primarily in this category. In the second category, a logically singular global resource sequencer is instantiated on every node, and the instantiations interact to perform sequencing for all the nodes. This can provide improved timeliness over the first category, but potentially at higher costs (e.g., inter-node communication latencies). Alpha, MK7, and (optionally) Real-Time CORBA 2 are in this category for sequencing of their synchronizers. In the third category, one or more levels of “meta” resource sequencers operate above the node resource sequencers – e.g., as a compromise between the first and second categories, or when it is not feasible to have node resource sequencers in categories one or two. There are numerous examples of such systems, principally for the second reason.

The most natural approach is for application programmers to have constructs for directly specifying actual execution entity (say, thread) completion time constraints that are inherent in the application -- e.g., BeginTimeConstraint {Deadline,...}, EndTimeConstraint{}; see Time Constraints. Constructs for this purpose first appeared in the Alpha distributed real-time OS kernel; subsequently, they became part of the Real-Time CORBA 2.0 specification, and the Distributed Real-Time Specification for Java is expected to provide a similar capability. Time constraints must be interpreted uniformly in a distributed real-time system, so they require that the system have global physical time that is appropriately well synchronized at each node. (A weaker property is for at least those nodes involved in a time-constrained multi-node behavior to have appropriately well synchronized global physical time while the behavior is transiting them; but in practice this is not sufficient and is more difficult to achieve.)

But real-time, and particularly distributed real-time, computing systems that provide for using actual time constraints are the exception. The usual practice is that system software (e.g., ORB, JVM, OS) provides only a priority mechanism, and the programmers are required to establish the priority semantics by mapping time constraints (such as deadlines) onto them. Such mappings are complicated in all the but simplest, smallest, and least dynamic systems -- for example: priority assignments for multi-node behaviors require global knowledge of all other priority assignments on all nodes that the multi-node behaviors could involve (whereas time constraints are local knowledge); the granularity of time constraints is typically much finer than that of priority ranges; priority ranges supported by a system's constituent OS's, JVM's, ORB's, etc. may be different, in which case there must be a mapping between each pair; semantics are associated with priorities by the users, the system and application software, and the hardware -- not always (or even usually) entirely coherently.

Given COTS system software that provides only priorities, which necessitates mapping to emulate time constraints, then a compromise is a library that provides time constraint scope delineation constructs and automatically manipulates priority assignments on the affected nodes. In Real-Time CORBA 1 (Fixed Priority), the complications of just statically assigning, not including manipulating, priority assignments resulted in the scheduler service (and conforming products primarily for simple periodic cases); but in Real-Time CORBA 2 (Dynamic Scheduling), which provides for pluggable schedulers and for using actual time constraints, the scheduling service was deprecated.

Priorities per se do not differentiate real-time from non-real-time computing systems -- they are at least as widely used in non-real-time computing systems as they are real-time ones (many real-time computing systems use cyclic executives without priorities). In non-real-time computing systems, execution entity priorities are often part of the strategy for managing resources so as to achieve a balance of maximal system throughput and job fairness.

Priority compliance (a special case of eligibility compliance) is the extent to which resource contention is resolved in compliance with the priorities. Two features are important for maximal priority compliance -- in both real-time and non-real-time computing systems. The first is pre-emptive priorities; but pre-emption is not without costs -- for instance, it makes some scheduling algorithms more difficult or even infeasible. The second is some form of priority inheritance protocol to reduce the number and duration of priority inversions []; this too adds some cost.

High degrees of priority compliance can be as valuable in non-real-time systems as in real-time ones, and therefore are not a differentiating characteristic of real-time systems (e.g., OS's, ORB's, JVM's). The differentiation between real-time and non-real-time priority systems (contrary to popular opinion) arises from real-time ones: having priority semantics based on application time constraints; and employing the priorities to resolve resource contention with the objective of improving system timeliness properties -- i.e., the two dimensions of how well collective application time constraints are satisfied, and how predictable that satisfaction is.

Distributed systems that use priorities for resource access (e.g., execution) eligibility may propagate priority as a distributed behavior involves a remote node. Non-real-time as well as real-time distributed systems may seek high degrees of priority compliance for multi-node behaviors. For example, a resource contention resolution (e.g., sequencing) policy to maximize throughput could map execution entities' (say, threads') eligibility criteria -- expected execution time, execution time received thus far, etc. -- onto priorities and propagate client thread priority with a remote method invocation and subsequent return.

The Real-Time CORBA 1 specification provides (the option) for client thread priorities to be propagated to servant nodes, and intends that they be used for resource management by the ORB instances on each node and by the node OS's (via functions that map between the ORB's global priority space and each OS's local priority space). The Real-Time CORBA 1 (Fixed Priority) specification does not limit priority semantics to timeliness, so it might more accurately be called "a distributed priority system that works" -- hence, it could be valuable for non-real-time applications, just as a traditional priority-based real-time OS could be valuable for non-real-time applications.

The Real-Time CORBA 2 specification generalizes that to arbitrary timeliness parameters (such as for deadlines); the Distributed Real-Time Specification for Java is expected to provide similar capability for Real-Time RMI and the Real-Time Specification for Java's JVM's. But in both cases, arbitrary timeliness parameters could be those for (say) maximizing end-to-end throughput, without having to map them into artifactual priorities. Other parameters could be propagated in support of other end-to-end properties, such as fault tolerance, security, etc. Hence, both of these real-time distributed computing system specifications could be valuable for non-real-time systems as well as for real-time ones.

The greater the degree that a distributed system is a real-time one, the more it requires a well-defined architecture with stable interfaces between components and some level of system-wide agreement on infrastructure technology – including the semantics of timeliness and the multi-node coherence of timeliness-based  resource management. The higher the level within an enterprise, or among enterprises, such systems are deployed, the more difficult it is difficult to create the technical agreements and coordination needed -- even when the magnitudes of the multi-node behaviors' time constraints are relatively large with respect to traditional real-time practice (e.g., seconds or minutes).

to be continued...

References

Clark et al. 92 R. Clark, E. Jensen, and F. Reynolds, An Architectural Overview of the Alpha Real-Time Distributed Kernel, Proc. USENIX Workshop on Microkernel and Other Kernel Architectures, USENIX, April 1992.

Northcutt 87 J. Northcutt, Mechanisms for Reliable Distributed Real-Time Operating Systems – The Alpha Kernel, Academic Press, ISBN 0-12-521690-4, 1987.

OMG 99 Real-Time CORBA 1.0 Specification, ORBOS/99-02-12, and Errata, ORBOS/99-03-29, Object Management Group, 1999.

OMG 01 Real-Time CORBA 2.0: Dynamic Scheduling Specification, ORBOS/01-08-34, Object Management Group, 2001.

Open Group 98 MK7.3a Release Notes, The Open Group, 1998.

Add to Favorites

Print Page

Download a PDF copy of this page

 

New DARPA ARMS program on real-time distributed systems

New presentation about the Distributed Real-Time Specification for Java

IEEE Transactions on Computers, special section on Asynchronous Distributed Real-Time Systems, 8/02, Jensen & Ravindran, Eds.

Echelon 4's new web site is on-line. Echelon 4 applies system science -- computing, communications, and automatic control -- to the design and development of real-time "Enterprise Operating Systems."

Two new public papers about the Distributed Real-Time Specification for Java are here and here and both are on drtsj.org

The Real-Time CORBA 2/Dynamic Scheduling  Specification has been approved: OMG Document orbos/2001-04-01

 
 

View Site Changes  RSS feed | Site Updated 03/30/2008 |  Legal