hummel@RELAXATION.NYU.EDU (Bob Hummel) (03/31/88)
Reading the renewed discussion on the meaning of various calculi of
uncertainty reasoning prompts me to inject a note on the Dempster/Shafer
formalism. I will summarize a few observations here, but direct interested
readers to a joint paper with Michael Landy, which appeared this month in
IEEE Pattern Analysis and Machine Intelligence [1].
These observations were inspired, oddly enough, by reading Dempster's
original paper, in which he introduces the now-famous combination formula
[2]. It seems logical that the original source should contain the motiva-
tion and interpretation of the formula. But what is odd is that the
interpretation migrated over the years, and that the clear, logical founda-
tions became obscure. There have been numerous attempts to reconstruct an
explanation of the true meaning of the belief values and the normalization
terms and the combination method in the Dempster/Shafer work. Some of
these attempts succeed reasonably well. Along these lines, I think the
best work is represented by Kyberg's treatment in terms of extrema over
collections of opinions [3], and Ruspini's work connecting Dempster/Shafer
formalism to Bayesian analysis [4]. Shafer also constructs what he calls a
"canonical example" which is supposed to be used as a scale to invoke
degrees of belief into general situations, based on "coded messages." The
idea, described for example in [5], is isomorphic to the observations made
here and in [1] based on the foundations laid by Dempster [2]. The problem
is that none of these interpretations lead to generalizations and explain
the precise intent of the original formulation.
Before giving the succinct interpretation, which, it turns out, is a
statistical formulation, I should comment briefly on the compatibility of
the various interpretations. When lecturing on the topic, I have often
encountered the attitude that the statistical viewpoint is simply one
interpretation, limited in scope, and not very helpful. The feeling is
that we should be willing to view the constructs in some sort of general
way, so as to be able to map the formalism onto more general applications.
Here, I believe, is one source of the stridency: that if I would only per-
mit myself to view certain values as subjective degrees of belief in anal-
ogy with some mystical frame of reference, then I will see why certain
arguments make perfectly logical sense. Accordingly, our treatment of the
statistical viewpoint is introduced in the framework of algebraic struc-
tures, and our results are based on proving an isomorphism between an
easily interpreted algebraic structure and the structure induced by the
Dempster rule of combination acting on states of belief. So when I use the
terms "experts" and "opinions" and related terms below, an alternate
interpretation might easily use different concepts. However, any interpre-
tation that truly captures the Dempster/Shafer calculus must necessarily be
isomorphic, under some mapping identifying corresponding concepts, to the
interpretation given here.
Here is the formulation. Consider a frame of discernment, here
denoted S. Instead of giving a probability distribution over S, we con-
sider a collection of experts, say E, where each expert e in E gives an
opinion. The opinions are boolean, which is to say that expert e declares
which labels in S are possible, and which are ruled out.
For the combination formula, suppose we have two collections of
experts, E1 and E2. Each expert in E1 and each expert in E2 expresses an
opinion, in a boolean fashion, over the labels in S. (An important point
is that the frame of discernment S is the same for all collections of
experts). We now wish to combine the two collections of experts. We con-
sider the cross product E1 X E2, which is the set of all committees of two,
with a pair of experts comprised of one expert from E1 and one expert from
E2. For any such committee, say (e1,e2), we define the committee's boolean
opinion to be the logical intersection of the two composing opinions. Thus
the committee says that a label is possible only if both committee members
say that the label is possible. We regard the collection of all such com-
mittees and their opinions to be a new collection of experts E, with their
boolean opinions.
We have defined an algebraic structure. Call it the the "Boolean
opinions of Experts." The elements of this space consist of pairs (E,f),
where E is a collection of experts, and f is their opinions, formed as a
map from E to a vector of boolean statements about the labels in S. Now
define an equivalence relation. We will say that two such elements are
equivalent if the statistics over the collections of experts, among those
experts giving at least one possibility, are the same. By the statistics,
we mean the following. Let E' be the subset of experts in E for whom at
least one label is possible. For any given subset A of S, let m(A) be the
percentage of experts in E' that designate precisely A as the subset of
possible labels. Note that m of the empty set is 0, by the definition of
E', and that m forms a probability distribution over the power set of S.
It turns out that m is a mass function, used to define a belief state
on S. Further, when sets of experts combine, the statistics, represented
by the corresponding m functions, combine in exactly the Dempster rule of
combination. (This is no accident. This is the way Dempster defined it.)
Accordingly, the set of equivalence classes in the space of "Boolean
opinions of Experts" is isomorphic to the Dempster/Shafer formalism,
represented as a space of belief states formed by mass distributions m.
Some people express disappointment in the Dempster/Shafer theory, when
it is viewed this way. For example, it should be noted that no where does
the theory make use of probabilities of the labels. The theory makes no
distinction between an expert's opinion that a label is likely or that it
is remotely possible. This is despite the fact that the belief values seem
to give weighted results. Instead, the belief in a particular subset A, it
can be shown, corresponds to the fraction of experts in E' who state that
every label outside of A is impossible. The weighted values come about by
maintaining multiple boolean opinions, instead of one single weighted opin-
ion.
In the PAMI paper, Mike Landy and I suggest an extension [1], where we
track the statistics of probabilistic opinions. In this formulation, we
track the mean and covariance of the log's of probabilistic opinions.
Details are in Section 5 of the paper.
In a follow-on paper [6], presented at the 1987 IJCAI, Larry Manevitz
and I extend the formulation to weaken the necessary notion of independence
of information. It is always true that some independence assumption is
necessary. Larry and I defined a one-parameter measure of a degree of
dependence, and show how the formulas tracking means and covariances are
transformed. We also consider a case where we combine bodies of experts by
union, as opposed to cross product.
To those who study these extensions, it will become clear that the
formulas bear some resemblance to treatments of uncertainty based on Kalman
filtering. For specific applications involving the observation of data and
the estimation of parameters, the Kalman theory is certainly to be recom-
mended if it can be applied.
Robert Hummel
Courant Institute
New York University
References
1. Hummel, Robert A. and Michael S. Landy, "A statistical viewpoint on
the theory of evidence," IEEE Transactions on Pattern Analysis and
Machine Intelligence, pp. 235-247 (1988).
2. Dempster, A. P., "Upper and lower probabilities induced by a mul-
tivalued mapping," Annals of Mathematical Statistics Vol. 38 pp. 325-
339 (1967).
3. Kyburg, Jr., Henry E., "Bayesian and non-bayesian evidential updat-
ing," University of Rochester Dept. of Computer Science Tech. Rep. 139
(July, 1985).
4. Ruspini, E., Proceedings of the International Joint Conference on
Artificial Intelligence, (August, 1987). Also SRI Technical Note 408.
5. Shafer, Glenn, "Belief functions and parametric models," Journal of
the Royal Statistical Society B Vol. 44 pp. 322-352 (1982). (Includes
commentaries).
6. Hummel, Robert and Larry Manevitz, "Combining bodies of dependent
information," Tenth International Joint Conference on Artificial
Intelligence, (August, 1987).