









|
 |
|
Predictability
Introduction
This page primarily addresses
two major misconceptions in both the researcher and
practitioner parts of the real-time computing community:
 | "predictability"
is often recognized as an essential concept and property
in real-time computing, but it is almost never defined,
and it is under-employed as a property that can be
measured and adjusted. |
 | "predictability"
and "determinism," and "predictable" and
"deterministic," are widely and erroneously imagined to
be synonymous. |
Conventional real-time computing practice
applies the terms "predictable" and "deterministic" solely
to cases where there are known fixed least upper bounds on
latencies for interrupt services, software services, etc.
That usage by practitioners is correct (although the
two terms are not equivalent) regarding those latencies. It
may also be correct regarding the activity or application or
system, given the (typically unspecified or at least
under-specified) necessary presumptions about the system and
its execution environment: note that starting an activity's
execution with deterministic latency does not
necessarily imply that the activity will complete at an
acceptable, much less deterministic, time (due to resource
conflicts etc.).
Conventional real-time computing theory
applies the terms "predictable" and "deterministic" almost
exclusively to cases where all deadlines are always met --
meaning all hard deadlines, although almost all such theory
presumes the special case of there being only hard
deadlines.
That usage by researchers is correct
(although the two terms are not equivalent), given the
(typically unspecified or at least under-specified)
necessary presumptions about the system and its execution
environment.
But the converse -- i.e., that no other cases
are predictable or deterministic -- which practitioners and
researchers explicitly or implicitly assert, is incorrect,
as explained below.
Predictability of Sequence Timeliness
Optimality
Informally, something is
predictable to the degree that it can be known in advance.
Making a prediction involves using information with various
kinds and degrees of uncertainties, to either deductively or
inductively draw
conclusions having various kinds and degrees of
uncertainties.
In computing systems having time-constrained
actions, the fundamental predictions are about sequence
timeliness optimality. The information needed for sequence
optimality timeliness
predictions includes:
 |
the actions’ properties, resource
requirements, and dependencies; |
 |
various aspects of the
system's structure and behavior; |
 |
the characteristics of the
system’s execution environment; |
 |
predictability of timeliness
optimality of lower level operations, such as service
latencies. |
The conclusions are used for resource
management to achieve the desired timeliness and timeliness
predictions.
In other applications or systems
whose performance is characterized in terms of throughput,
their throughput and predictability of throughput are only
performance optimizations.
For example, with deadline time constraints,
one type of information that a system designer or user might want
to have some
application-specific form and degree of predictability of
could be (in order
of increasing detail and difficulty to predict)
 |
whether or not all deadlines will be met
(the hard real-time case in the research, as opposed to
the practitioner, community) |
 |
how many deadlines will be met |
 |
what the bounds (minimum, maximum, etc.),
central tendencies (mean, mode, median, etc.), and
dispersion (variance, coefficient of variation, kurtosis,
etc.) of tardiness, number of missed deadlines, etc. will be |
 |
what the time-constrained action completion
times will be. |
An application or system is a
real-time one in proportion to the degree that
predictability of timeliness optimality is part of the
application or system's logic, and thus a correctness
property. Contrary to popular misunderstanding, this is
independent of
 |
the extent to which the system
is a "hard" or a "soft" real-time one |
 | the magnitudes of service latencies or
time constraints (e.g., deadlines).
|
Predictability and Determinism
One of the most wide-spread misconceptions in the field of
real-time computing is the confusion of "predictability"
with "determinism," and "predictable" with "deterministic."
Predictability is a
continuum, having maximum predictability - called
determinism - as one end-point, and the rest of the
continuum being degrees of predictability (or
non-determinism), and having minimum predictability (or
maximum non-determinism) as the other end-point. A measure
for predictability (or non-determinism), and
characterization of the minimum predictability (maximally
non-deterministic) end-point, depend on the particular
predictability model - an example measure in
one class of probability-based predictability models is coefficient of variation, as discussed
below.
In that sense (for simplicity here we are
disregarding the semantics of "determinism" in the contexts
of chaos theory and quantum mechanics, not to mention
philosophy), something (e.g., that all deadlines will be
met, the mean lateness) is deterministic if it can be
known with certainty, in advance of being observed.
A deduction about something being deterministic is postulated on the basis of presumptions about specific
relevant conditions being deterministic throughout the
future operational life of the system (e.g., thread
properties, system and execution environment properties).
The suitability of a deterministic predictability model is
strongly dependent on how realistic the coverage of these presumptions
is. In most non-trivial real-time systems, there are
inherent dynamic uncertainties in the environment, system,
and load that severely limit what can be deterministic.
A sequencing criterion's timeliness
optimality factor is orthogonal to its timeliness optimality
predictability factor. These generally must be traded off
against one another according to application-specific
requirements. For example, it may be necessary to choose
between
 | one sequence that results in better
optimality (e.g., fewer missed deadlines or lower mean
tardiness), but with worse predictability (e.g., higher
coefficient of variation of the number of missed deadlines
or of mean tardiness)
|
 | another sequence that results in worse
optimality (e.g., more missed deadlines or higher mean
tardiness), but with better
predictability (e.g., lower coefficient of variation of
the number of missed deadlines or of mean tardiness). |
Non-deterministic systems generally are more
difficult to reason about than are deterministic systems.
But, as we shall see in the page on
Sequencing Algorithms
and Heuristics, there are a number of important
deterministic sequencing problems - including minimizing the
weighted (e.g., by importance) number of tardy threads, and
minimizing the weighted total thread tardiness - that are
NP-hard, but which have tractable stochastic counterparts.
The fact that most systems of every kind in
the world are non-deterministic has been an incentive
(outside of real-time computing, unfortunately) for the development of many
powerful tools for dealing with non-determinism - stochastic
scheduling theory, probability theories, and constraint
based scheduling are familiar
examples (among many others) directly relevant to real-time
computing. However, most of these tools have not yet become
part of the repertoire of either the researchers or the
practitioners in the real-time computing field. That is
because both of those communities have been slow to perceive
and respond to the needs of the majority of real-time
systems (i.e., beyond classical "real fast" static device-level
autonomic monitoring and control).
Despite it being a fundamental concept, “predictability” is virtually never
actually defined by either
practitioners or researchers in the real-time computing
community. There are two reasons for this.
The first reason is that
traditional real-time computing theory and practice are
historically focused mostly on simple, periodic, static (no
pertinent uncertainties), centralized, device-level
subsystems.
Given that context,
researchers usually presume that the sequencing
criterion timeliness optimality factor is only to meet all
deadlines (presuming the special case that all deadlines are
hard ones), and that the predictability of that factor is
limited only to knowing whether or not all the deadlines
will always be met
– i.e., the classical
hard real-time case discussed on a subsequent page.
Predictability of sequence timeliness in that case can
usually be accomplished with little or no insight into the
concepts of predictability
– e.g., analytically using rate- or
deadline-monotonic analysis, or empirically using a cyclic
(frame-based) executive.
In general, and especially in
more dynamic and asynchronous cases such as exist above the
device level, that hard real-time presumption is invalid
– so predictability of sequence timeliness involves
uncertainties, and consequently is greatly facilitated by a
deeper understanding of the concepts of predictability.
And given that usual context
of simple, periodic, static (no pertinent uncertainties),
centralized, device-level subsystems, real-time
practitioners focus on upper bounds on latencies, such as
for interrupt handling and operating system services.
Predictability there typically is thought of as having a'
priori knowledge of least upper bounds, although
occasionally it is thought of in terms of measured
distributions (not probability distributions) of the
latencies.
That leads to the second reason why “predictability”
is rarely defined at all, and never generally and precisely, in the
real-time computing community (among some others). Although the
meaning of "predictability" may seem intuitive, it is a
very complex concept in principle and practice.
Probability Models for
Predicting Timeliness Optimality
Prediction is based on some
particular model of the reality of interest
– the
information used is generally incomplete and inaccurate and
hence modeled in some way, and the predicting is performed
according to the rules of the model. The accuracy of the
prediction depends on how well the modeled information and
prediction process reflect the reality of interest. These
models are often informal and even implicit. For the purpose
of predicting timeliness optimality, especially in dynamic
computing systems, formal models are necessary.
A formal model for making predictions
involves an appropriate:
 | formal representation of
the observed or presumed information and its uncertainties; |
 | formal representation of
the predictions and their uncertainties; |
 | calculus for
analytically manipulating the observed or presumed
information and its uncertainty representation to arrive
at predictions. |
An example of a formal model
for hard real-time computing (as properly defined in the
research community) that supports predicting timeliness
(specifically, whether all hard deadlines will be met) is rate-monotonic analysis [].
The most common formal models for
making predictions under uncertainty use classical
probability theory to represent and manipulate uncertainties
both in the observed or presumed information and in the
predictions. Queuing theory is a familiar form of
probability-based modeling for predicting types of
performance important in non-time-constrained systems (e.g.,
throughput, latencies, resource utilization).
Probability-based predictability can be surprisingly
counter-intuitive, as illustrated by the simple example
below.
The most predictable probability distribution is intuitively
the deterministic (or constant) distribution (p = 1 at X = k).
That intuition has been formally
recognized since the 18th century as J. Bernoulli's
Principle of Non-Sufficient Reason, now better known as the
Principle of Indifference [Keynes 21]. We return to it
later on this page.
Thus, it might seem equally intuitive that the least
predictable distribution would be the uniform distribution
(p = P for all X).
But consider an
alternative, and also intuitive, perspective on the
predictability of distributions in the context of classical
probability theory. That is, that the predictability of a
probability density function is inversely proportional to
its variability
– as measured
(for the usual reason), by its coefficient of variation Cn
= variance/mean2. From that perspective, the
deterministic distribution, whose Cn
= 0, is still the most predictable. But the standard form of
the uniform distribution has a relatively low Cn
= 0.58 and thus relatively high predictability; and many
distributions exist that have higher Cn’s
and are thus less predictable
– for example, the extreme
mixture of exponentials distribution has an arbitrarily
large Cn
and is thus arbitrarily unpredictable.
Those two perspectives differed on what the least
predictable probability distribution is, but they agreed on
what the most predictable distribution is, based on their
mutual acceptance of the Principle of Indifference.
However, as intuitive as this principle may seem, it leads
to a number of insoluble paradoxes which reveal it to be a
heuristic that may be useful for suggesting hypotheses, rather than a
logical principle
– and thus inapplicable to many
continuous cases and even to certain discrete cases.
But the classical interpretation of probability (in which
probabilities have comparable numerical values) requires
that the Principle of Indifference be a logical principle.
This contradiction led to the subjective and frequentist interpretations of
probability (and subsequently others). [Gillies 2000]
provides a clear exposition of this topic.
Subjective interpretations
of probability are more effective then the classical
interpretation as the basis for models to perform
predictions about timeliness in asynchronous dynamic
real-time computing systems.
[Revised to this point on 7/26/03 -- to
be continued...]

It widely taken for granted
that uncertainty is adequately captured by probability
theory. However, it is more and more recognized that the
concept of uncertainty is too broad to be captured by
probability theory alone.
In many situations, the available evidence does not
determine a unique probability function. In classical probability
theory, elementary events are required to be pairwise
disjoint and the probability of each is required to be
expressed precisely by a real number in the unit interval
[0, 1]. This precision requirement of classical probability
theory, which is a consequence of the additivity axiom, is
often difficult to satisfy. This difficulty may be a result
of unavoidable measurement errors, insufficient statistical
information, missing data, conflicting evidence, etc.
Choosing a so-called "least informative" probability
function satisfying the evidence may be the best general
solution, in case one insists on using probability functions
to represent belief states. However, it may be better to
represent the available information, and no more than the
available information, by means of probabilistic
constraints. Absence of information, or ignorance, should
lead to indeterminate, imprecise, or partially specified
probabilities
The imprecision in expressing probabilities
introduces a new dimension into the formalization of
uncertainty and uncertainty-based information. It has been
shown [Walley ] that
reasoning and decision making based on imprecise
probabilities satisfy the principles of coherence and
avoidance of sure loss, which are generally viewed as
principles of rationality. Hence, the requirement of
precision (or, equivalently, the additivity axiom) can not
be justified as inevitable for rationality, as had been previously
believed. The soundness of using imprecise probabilities is
based on this important result.
It is recognized that imprecise
probabilities of different types exist and they require
different methodological treatments. The imprecision of
probabilities can be expressed, for example, by closed
intervals of real numbers or by fuzzy intervals with various
special properties. Some particular theories of imprecise
probabilities are already well developed. They include the
evidence theory developed by Shafer [ ], possibility theory
[ ], interval-valued probability distributions [ ], and
fuzzy probability distributions [ ].
[Partially revised to
this point on 1/17/03 -- to be continued...].

Peoples' ability to reason about and manage
timeliness in non-deterministic systems is affected by the
fact that all humans have significant intrinsic limitations
in reasoning about probabilities and risks. Cognitive
scientists have conclusively demonstrated that people (quite
independently of education or intelligence) consistently
make the following mistakes (see [Kahneman et al. 82]
on the Real-Time Resources
page)
 | assess probabilities differently when
given no evidence than when given irrelevant information
|
 | believe small samples are highly
representative of their parent populations
|
 | judge events as likely or frequent if
instances of it are easy to imagine or recall
|
 | avoid risks when seeking gains, but
take high risks to avoid a loss
|
 | over-value the elimination of risk and
under-value the reduction of risk
|
 | hold such faulty judgments with great
confidence. |
These limitations ought to be taken into
consideration when when choosing whether or not to treat an
application or system as deterministic or non-deterministic,
and when specifying and designing non-deterministic systems.
They are even more critical when dealing with requirements
and specifications for mission-critical and safety-critical systems (real-time or
not).
References:
Next:
Hard and Soft Real-Time
Back to:
Timeliness Optimality
or Sequencing
Add to Favorites
Print Page
Download a PDF copy of this page
|
|
 |
 |