Introduction to the Theory of Sets of Probabilities

(Probability Intervals, Belief Functions, Lower Probability, Lower Expectations, Choquet Capacities, Robust Bayesian Methods, etc.)

Fabio Cozman

I hope these pages give a brief but reasonably general presentation of the foundations of theories that handle sets of probability distributions. There are many such theories: Quasi-Bayesian theory, Lower Probability, Lower Expectations, Choquet Capacities, Robust Bayesian Methods, and other similar theories. I regret that I cannot refer to all the good work that has been published on these topics; I have tried to cite some representative papers and books, mostly foundational in character and mostly written before 1993.

Two main pieces of material can be reached from this page:

The one-page Theory in a nutshell.
A longer, but still quite informal, description of the theory of sets of distributions (and related theories).

Other sources of material on these topics are available on the web; you can find a fair amount of material at the Society for Imprecise Probability Theory and Applications.

Most of the content on this web site covers foundational concepts and basic results. A collection of practical results would also be useful, but I think the first step must be to present a consistent theory.

I focus only on proposals that maintain the basic infrastructure of Bayesian theory and augment, enrich, or generalize it. Proposals that require an entirely different view of uncertainty (like Dempster-Shafer theory), or that handle other concepts (like fuzzy logic), are not covered here. Among all the possible theories that use sets of probability distributions to represent uncertainty, there is a particular axiomatization that is very simple to present and understand. It is the axiomatization given by two statisticians, Giron and Rios, in 1980 [2]. Their paper is very nice; they call the resulting theory Quasi-Bayesian theory.

The original theory by Giron and Rios was quite elegant but did not include discussions of conditioning and independence; they also did not have a clear statement of decision criteria. I try to present their theory and fill in those gaps with ideas that have been proposed in a variety of contexts since 1980; the goal is to present the theory in a unified format so that its scope can be better analyzed.

I'm aware that there is a lot of excellent work that I have not reviewed; I apologize for omissions!

There are PostScript versions of the content that you can reach from this page.

Why so many words in the sub-title of this page?

There are several similar generalizations of probability that use sets of probability distributions:

The theories of Lower Expectations and Lower Previsions use intervals of expected losses to generate sets of distributions.
A slightly different approach is taken in theories that impose axioms on events. The resulting structures are generalizations of probability called Lower Probabilities or Choquet Capacities. Special cases of such structures are the Monotone Choquet Capacities and the Lower Envelopes. Infinitely Monotone Choquet Capacities are sometimes called Belief functions. Such structures can in most cases be represented by convex sets of probability distributions.
From a slightly different perspective, many statisticians use sets of distributions to study the robustness of a statistical analysis.

These theories have points of divergence, but this work tries to emphasize the points where there is agreement.
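
As a rough illustration of where these views meet, here is a minimal sketch in Python. It assumes a finite set of states and a convex set of distributions described by a finite list of extreme points; under that assumption the lower and upper expectations of a function are attained at extreme points, so it is enough to scan the list. The function names (`expectation`, `expectation_bounds`, `probability_bounds`) and the numbers are hypothetical and chosen only for illustration; they are not taken from any of the papers or systems mentioned on this page.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# A distribution over a finite set of states: state -> probability.
Distribution = Dict[str, float]


def expectation(p: Distribution, f: Callable[[str], float]) -> float:
    """Expected value of f under a single distribution p."""
    return sum(prob * f(state) for state, prob in p.items())


def expectation_bounds(
    extreme_points: List[Distribution], f: Callable[[str], float]
) -> Tuple[float, float]:
    """Lower and upper expectation of f over the convex hull of extreme_points.

    Expectation is linear in the distribution, so the minimum and maximum
    over the convex hull are attained at extreme points.
    """
    values = [expectation(p, f) for p in extreme_points]
    return min(values), max(values)


def probability_bounds(
    extreme_points: List[Distribution], event: Iterable[str]
) -> Tuple[float, float]:
    """Lower and upper probability of an event (expectation of its indicator)."""
    members = set(event)
    return expectation_bounds(
        extreme_points, lambda state: 1.0 if state in members else 0.0
    )


# Hypothetical example: two extreme points over the states a, b, c.
extreme_points = [
    {"a": 0.2, "b": 0.3, "c": 0.5},
    {"a": 0.4, "b": 0.4, "c": 0.2},
]

lower, upper = probability_bounds(extreme_points, {"a", "b"})
print("lower and upper probability of {a, b}:", lower, upper)

# Interval of expected losses for a hypothetical loss function.
loss = {"a": 1.0, "b": 0.0, "c": 2.0}
print("lower and upper expected loss:", expectation_bounds(extreme_points, loss.get))
```

The same set of distributions yields both an interval of expected losses and a pair of lower and upper probabilities for each event, which is the sense in which the theories listed above overlap. The sketch says nothing about conditioning or independence, where the theories differ more.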

I hope these pages are useful for anyone interested in sets of distributions, but because I work mostly in Robotics and Artificial Intelligence, I understand the theory best from that point of view.

My work with sets of probabilities

I work on both foundational and algorithmic issues (with emphasis on the latter). (Note: Preprints for most of the papers discussed below are available from my publications page.)

Most of my work on this theory can be grasped through the following papers:

F. G. Cozman. Credal networks, Artificial Intelligence Journal, vol. 120, pp. 199-233, 2000.
F. G. Cozman. Computing posterior upper expectations, International Journal of Approximate Reasoning, vol. 24, pp. 191-205, 2000.
F. G. Cozman. Calculation of Posterior Bounds Given Convex Sets of Prior Probability Measures and Likelihood Functions, Journal of Computational and Graphical Statistics, vol. 8(4), pp. 824-838, 1999.

Most of the results in the second paper, and a summary of the third paper, can be found in

F. G. Cozman. Computing Posterior Upper Expectations, First International Symposium on Imprecise Probabilities and Their Applications (ISIPTA), pp. 131-140, Ghent, Belgium, June/July, 1999.

My interests in the theory of sets of probabilities follow some general lines. First, I'm interested in efficient algorithms to obtain posterior quantities. I'm now trying to extend such algorithms to deal with more complicated cases; for example, situations where observations may have probability zero, and situations where judgements of independence are stated. Second, I'm interested in concepts and properties of irrelevance/independence connected to the theory of sets of probabilities. Starting points on this topic are:

F. G. Cozman. Separation Properties of Sets of Probability Measures, XVI Conference on Uncertainty in Artificial Intelligence, pp. 107-115, San Francisco, California, July 2000.
F. G. Cozman. Irrelevance and Independence Axioms in Quasi-Bayesian Theory, European Conference on Symbolic and Quantitative Approaches to Reasoning with Uncertainty (ECSQARU), London, England, published in Symbolic and Quantitative Approaches to Reasoning with Uncertainty, A. Hunter and S. Parsons (eds.), pp. 128-136, Springer, July, 1999.

I have also pursued some different directions, looking at the problem of sequential decision making associated with observations, and also exploring the possibility of learning convex sets of probabilities from data. While working on these things, I coded the JavaBayes system, best known for its ability to handle Bayesian networks (JavaBayes versions up to 0.347 contain facilities for specification of local and global perturbations in Bayesian networks).

Basic references

Most of the foundational issues in the theory of sets of probabilities can be absorbed through the work of two researchers:

I. Levi, whose The Enterprise of Knowledge [3] is a great analysis of many philosophical issues related to the theory. If you want Philosophy, you probably want to read this.
P. Walley, whose Statistical Reasoning with Imprecise Probabilities [4] is a tremendous summary of all that has been said about the theory in the field of Statistics (and also tries some connections with Economics and Artificial Intelligence).

These books are very dense and require some background. I'm trying to construct these informal pages for the reader who is not entirely familiar with the theory of sets of probabilities, but has already learned some probability and decision theory.

Here are four references that capture a vast portion of the theory of sets of probabilities:

1: J. O. Berger. Statistical Decision Theory and Bayesian Analysis. Springer-Verlag, 1985.
2: F. J. Giron and S. Rios. Quasi-Bayesian behaviour: A more realistic approach to decision making? In J. M. Bernardo, M. H. DeGroot, D. V. Lindley, and A. F. M. Smith, editors, Bayesian Statistics, pages 17-38. University Press, Valencia, Spain, 1980.
3: I. Levi. The Enterprise of Knowledge. The MIT Press, Cambridge, Massachusetts, 1980.
4: P. Walley. Statistical Reasoning with Imprecise Probabilities. Chapman and Hall, New York, 1991.

Fabio Cozman, fgcozman@usp.br

Acknowledgement

This work started at the Robotics Institute at the School of Computer Science, Carnegie Mellon University. I had a scholarship from CNPq (Brazil). Thanks to these organizations!