What is the Best Interpretation of Probability?
Providing an interpretation of probability (in the context of this course on philosophy) is providing an analysis of the concept of probability. A “good” interpretation would be one that meets some reasonably stringent criteria. Of course, as an exercise in philosophical understanding, any candidate interpretation should be unambiguous, non-circular, and use well-understood primitives. But to provide some criteria more specifically applicable to “probability”, I will draw upon the work of Wesley Salmon(1) and Alan Hajek(2), and identify six criteria that a “good” interpretation of probability must meet. The “best” interpretation, then, will be that one which, in my opinion, “best” meets these criteria:
Admissibility: A fundamental requirement for any concept of probability is that it satisfy the mathematical relations specified by some calculus of probability.
Ascertainability: There must be some method by which, in principle at least, we can ascertain the values of probabilities.
Non-Triviality: It is fundamental to our notion of probability that, at least in principle, it can take intermediate values. (Not just the extremes of 0 and 1.)
Applicability to Frequencies: There must be some explicable relationship between probabilities and (long-run) frequencies. Also, not quite equivalently, between probabilities and population proportions. Among other things, it should explain why more probable things/events occur more frequently than less probable things/events.
Applicability to Rational Belief: There must be some explicable relationship between probabilities and the degrees of belief, or credences, of rational agents. Among other things, it should explain why we consider it “rational” when our degrees of belief, or credences, correspond to our notion of probability.
Applicability to Inductive Inference: A good interpretation of probability should illuminate the distinction between ‘good’ and ‘bad’ inductive inferences.
The various interpretations of probability that one can find in the literature can be viewed through the lens of two different approaches:
Epistemological interpretations: probability is primarily related to human knowledge or belief.
Objective interpretations: probability is about a feature of reality independent of human knowledge or belief.
Against this framework of six criteria and two approaches, I am going to examine five different interpretations of probability that have been offered in the literature, and conclude with a final sixth interpretation that combines elements of some of the others, and which I consider to be the “best” available interpretation of probability. The six different interpretations are:
- The Classical Interpretation
- The Frequency Interpretation
- The Logical Interpretation
- The Propensity Interpretation
- The Subjectivist Interpretation
- The Conceptualist Interpretation
(a) The Classical Interpretation
This interpretation gets its name from the early work of Laplace(3), Fermat, Pascal, Bernoulli, Huygens and Leibniz, among others, from the 1600’s through the early 1800’s. The interpretation was initially developed in an effort to understand games of chance. Being a product of its times, it is based on the Newtonian premise that the World is deterministic. The probability concept is therefore an epistemic concept. It assigns probabilities in the absence of any evidence (“total ignorance”) or in the presence of symmetrically balanced evidence (“the principle of indifference”). It is, naturally enough given its initial impetus, especially well suited to the analysis of circumstances where the range of possible alternative outcomes is well understood, and there is no basis upon which to prefer one outcome over any other. And it is also well suited, for similar reasons, to the analysis of probabilities related to population proportions. The “standard” mathematical axiomatization of the probability calculus by Kolmogorov(4) is based on the Classical Interpretation of probability. The Classical Interpretation and the Kolmogorov axiomatization form the foundation of much of modern mathematical statistics.
However, the Classical Interpretation suffers from some challenges. There seems to be no way to conceive of the probability of a unique event – like, say, an earthquake that damages a nuclear reactor (Fukushima?). There is no population of possible cases across which a ratio can be established, or across which symmetrically balanced evidence can be evaluated. On the other hand, the Classical Interpretation does not seem to properly deal with frequency information. The two possible outcomes from flipping a coin, over which the indifference principle is applied, will remain “head” and “tail”, regardless of the evidence from a history of flips. There appears to be no means of adjusting the initially assumed 50:50 probabilities to account for a (say) 60:40 result of a sequence of trials.
And Bertrand’s paradox(5) shows that the Principle of Indifference can yield inconsistent results depending on how one chooses to describe the circumstances. Many situations can be described in different, but equivalent, ways that generate different populations of alternative outcomes across which the Indifference Principle is to be applied. Consider an example adapted from van Fraassen(6). A machine produces cubes with side-length randomly and uniformly distributed between 1 and 2 centimeters. What is the probability that a randomly chosen cube has side-length between 1.0 and 1.5 centimeters? The answer would seem to be ½. But consider the same production run from this perspective: the machine produces cubes with face-area uniformly distributed between 1 and 4 square centimeters. A side-length between 1.0 and 1.5 centimeters is the same thing as a face-area between 1.0 and 2.25 square centimeters. What is the probability that a randomly chosen cube has face-area in that range? Now the answer would seem to be (2.25 - 1)/(4 - 1) = 5/12. Here we have one situation that yields two different probabilities depending on how one considers it.
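The arithmetic of the cube example can be checked directly. What follows is only a minimal sketch: the helper name `uniform_prob` is hypothetical, and the code does nothing more than apply the Principle of Indifference as a uniform distribution over whichever quantity is chosen to describe the cube. The event "side-length at most 1.5 cm" is the same event as "face-area at most 1.5 × 1.5 = 2.25 cm²".

```python
from fractions import Fraction

def uniform_prob(lo, hi, a, b):
    """Probability that a quantity uniform on [lo, hi] falls in [a, b]."""
    lo, hi, a, b = map(Fraction, (lo, hi, a, b))
    # Clip the interval of interest to the support before measuring it.
    a, b = max(a, lo), min(b, hi)
    return max(b - a, Fraction(0)) / (hi - lo)

# Indifference over side-length: uniform on [1, 2] cm, event "side <= 1.5".
p_side = uniform_prob(1, 2, 1, Fraction(3, 2))

# Indifference over face-area: uniform on [1, 4] cm^2, same event is "area <= 2.25".
p_area = uniform_prob(1, 4, 1, Fraction(3, 2) ** 2)

print(p_side, p_area)  # 1/2 5/12
```

The same physical event receives two different probabilities, and nothing in the Principle of Indifference itself says which description of the cube is the right one to be indifferent over.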
Classical probability thus appears to be context dependent. While not a fatal flaw, and thoroughly in keeping with the understanding of its progenitors, it does mean that on the Classical Interpretation, probability must be an epistemic concept and not part of the objective world.
(b) The Frequency Interpretation
In response to the apparent difficulties faced by the Classical Interpretation, the Frequency Interpretation was developed in the late 1800’s and early 1900’s by the likes of John Venn, Richard von Mises, and Hans Reichenbach. It was intended to address certain of the problems experienced with the Classical treatment of frequency information. For Frequency theorists, probability is taken to be a mathematical concept dealing with mass random events: events unpredictable in detail but having a numerical proportion in the long run that is predictable. It is an application of the Classical treatment of population proportions to populations (possibly infinite) of outcomes. Unlike the Classical Interpretation, because the relevant frequencies are taken to be objective features of reality, this is an objective interpretation.
The primary problem with the Frequency Interpretation is that there does not appear to be any way to conceive of the probability of a unique event. Similarly, there does not appear to be any way to conceive of the probability of a potential or hypothetical event with no existing frequency basis: what is the probability of heads for an un-flipped coin? The only response seems to demand the objective existence of hypothetical or potential frequencies (and frequency limits). This ontological profligacy makes many philosophers uncomfortable.
Another concern for many is that the Frequency Interpretation understands probability for infinite series in terms of the limit (as the population goes to infinity). While not an ontological problem, as physics contains many examples of such limits, it raises the question of how we learn what the limit is. In an infinite sequence of coin-tosses, it remains possible that for however many tosses we have actually observed, the observed proportion of heads might not be anywhere near the “real limit” if the coin were to be tossed an infinite number of times. How, then, do we determine the limit? The frequency theorists can argue that the limit, being objective, nonetheless exists. But it renders the probability potentially unascertainable.
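The gap between observed proportions and the limit can be illustrated with a short simulation. This is a sketch under stated assumptions: the coin is a simulated fair coin, the function name `running_frequencies` is hypothetical, and the seed is there only so that the run is repeatable.

```python
import random
from fractions import Fraction

def running_frequencies(tosses):
    """Relative frequency of heads (1) after each toss in a finite sequence."""
    heads = 0
    freqs = []
    for n, toss in enumerate(tosses, start=1):
        heads += toss
        freqs.append(Fraction(heads, n))
    return freqs

# Hypothetical fair coin, simulated; the seed only makes the run repeatable.
rng = random.Random(0)
tosses = [rng.randint(0, 1) for _ in range(10_000)]
freqs = running_frequencies(tosses)

# Observed proportion of heads after increasingly long finite prefixes.
for n in (10, 100, 1_000, 10_000):
    print(n, float(freqs[n - 1]))
```

Each printed value is only the proportion over a finite prefix. Any finite prefix is logically compatible with any limiting value whatsoever, which is why the observed proportions count as evidence for the limit rather than as a determination of it.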
(c) The Logical Interpretation
The Logical Interpretation of probability was a product of the Logical Positivist program, in the first half of the 20th Century, to reduce all of philosophy and mathematics to logic and language. The most notable authors of this program were Keynes, Jeffreys, and most famously, Carnap(7).
The basic idea of the Logical Interpretation is that probability is the measurement of partial logical entailment (with probabilities 1 and 0 as limiting cases) – the measurement of the evidential link between evidence E and the hypothesis H supported by E. As such the Logical Interpretation tries to provide a framework for inductive logic.
Carnap’s version is based on a formal language of entity names and predicates, which together form a set of “state descriptions”. (Each “state description” is a logical concatenation across all entities “a” and all predicates “F” of either “Fa” or “-Fa”.) Carnap then defines “structures” on this matrix of state descriptions, and assigns probability to them according to the number of state descriptions within each structure. Hence probability assignments are a priori, as in the Classical Interpretation.
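A toy language makes the construction concrete. The sketch below assumes a hypothetical language with one predicate F and three entities, and it assumes the particular measure usually written m* in discussions of Carnap (equal weight to each structure, shared equally among the state descriptions inside it). Neither assumption is spelled out in the paragraph above, so this is one illustrative way to fill in the details rather than a full account of Carnap's system.

```python
from fractions import Fraction
from itertools import product

individuals = ["a", "b", "c"]  # hypothetical three-entity language, one predicate F

# A state description says, for each individual, whether F holds of it.
state_descriptions = list(product([True, False], repeat=len(individuals)))

# With a single predicate, a structure is fixed by how many individuals are F.
structures = {}
for sd in state_descriptions:
    structures.setdefault(sum(sd), []).append(sd)

# Assumed measure m*: equal weight per structure, split equally among its members.
m_star = {
    sd: Fraction(1, len(structures)) / len(members)
    for members in structures.values()
    for sd in members
}

def prob(event):
    """Total weight of the state descriptions in which the event holds."""
    return sum(w for sd, w in m_star.items() if event(sd))

def confirmation(h, e):
    """Degree to which evidence e supports hypothesis h: prob(h and e) / prob(e)."""
    return prob(lambda sd: h(sd) and e(sd)) / prob(e)

# How strongly do "Fa" and "Fb" together support "Fc"?
c = confirmation(lambda sd: sd[2], lambda sd: sd[0] and sd[1])
print(c)  # 3/4
```

Every number here is fixed by the choice of language and measure before any observation is made, which is the sense in which the assignments are a priori. A different choice of predicates would regroup the state descriptions and change the result.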
There are many problems with the Logical Interpretation, all having to do with the specificity of the Logical Interpretation to the language within which probability is to be understood. It is entirely unclear how or whether such a language-specific interpretation can be ported to the common English environment. Given the generally accepted failure of the Logical Positivist program, the Logical Interpretation has few advocates today.
(d) The Propensity Interpretation
This interpretation of probability owes its modern revival to Karl Popper(8), although an earlier known description was by Charles Sanders Peirce(9). Since Popper, it has been developed in a number of different flavours by quite a number of philosophers. David Miller and Donald A. Gillies, for example, have proposed propensity theories somewhat similar to Popper’s, in that propensities are defined in terms of long-run relative frequencies.(10)
On this interpretation, probability is an objective physical disposition of reality to produce outcomes of a certain kind. Presumably, such dispositions are causally effective. This interpretation allows one to make sense of single case-probabilities, as is required for certain quantum theory applications of probability — the initial focus of Popper’s effort. The propensity of a fair coin to come up with tails is ½ because of the objective nature of the coin — whether or not it is ever tossed.
However, it is quite unclear just what propensities actually are. It is therefore hard to see how this interpretation provides any clarification of what probability is. Propensity is hardly a “well understood primitive”. And the asymmetric nature of causation gives propensities some flavour of asymmetry, so that it can be hard to understand both P(A|B) and P(B|A) in the same terms.
(e) The Subjectivist Interpretation
The most widely known advocate of the Subjectivist Interpretation is Bruno de Finetti. In his Theory of Probability(11) he begins with the bold statement “Probability does not exist”. The subjectivist identifies probabilities with degrees of belief, degrees of confidence, credences, or partial beliefs of “suitable” agents. Suitable agents must be rational in a strong sense – logically consistent, having beliefs which satisfy the axioms of a probability calculus, and which are updated by Bayesian conditioning.
Many subjectivists (including de Finetti) analyze degrees of belief (probabilities) in terms of the hypothetical betting behavior of an ideal rational agent. Consider a bet where one wins W if A is true and loses L if it is false. The probability p you attribute to A is revealed by what you think the fair value of L is relative to W, that is, the value of L you would set if you did not know which side of the bet you would have to take. The bet is fair when pW = (1 - p)L, so p = L/(W + L). The problem with this approach is that most people are not “ideal rational agents”. Nor is betting a neutral measuring instrument: for most people the activity of betting seriously distorts the evidence of their degree of belief in the underlying proposition. So the Subjectivist Interpretation has a clear normative component. And it is less than clear that the probabilities involved can be ascertained through the distortions of normal tendencies to logical inconsistency and betting biases.
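The relationship between a fair bet and a degree of belief can be worked through numerically. The following is a minimal sketch with hypothetical function names. It assumes the agent values money linearly and treats a bet as fair exactly when its expected value is zero, which is the idealization the betting analysis relies on.

```python
from fractions import Fraction

def betting_quotient(win, loss):
    """Degree of belief p at which the bet is fair: p * win == (1 - p) * loss."""
    win, loss = Fraction(win), Fraction(loss)
    return loss / (win + loss)

def expected_value(p, win, loss):
    """Expected payoff of winning `win` with probability p, else losing `loss`."""
    return p * win - (1 - p) * loss

# An agent who regards "risk 1 to win 3" as fair on either side of the bet.
p = betting_quotient(3, 1)
print(p, expected_value(p, 3, 1))  # 1/4 0
```

An agent who accepts such a bet only at noticeably better terms, for instance out of aversion to risk, would have a lower degree of belief attributed to them than they actually hold, which is the distortion described above.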
One of the major challenges faced by the Subjectivist Interpretation is that it does not appear to properly treat situations of a clearly objective and a priori nature. If a jar contains a thoroughly mixed collection of 50 red and 50 white balls, the probability of picking a red ball does not seem to depend on one’s beliefs about the contents of the jar. Lewis’ “Principal Principle”(13) cannot be applied here without begging the question, since the objective chance is just what the Subjectivist is denying exists.
(f) The Conceptualist Interpretation
The Conceptualist approach is to treat “probability” as a theoretical construct, a concept, that is constructed in order to describe certain closely similar features of the real world. It is therefore an epistemic interpretation of an objective feature of the world. Most importantly, it is not a univocal concept. It is a family concept, somewhat like Wittgenstein’s treatment of the concept “game”(14). What is a “game”? No single coherent definition can be provided. Wittgenstein’s point was not that it is impossible to define “game”, but that we don’t have a single definition, and we don’t need one. Even without a single unitary definition, because we have a family conception, we use the word successfully. The same argument applies, according to the Conceptualist Interpretation, to the concept of probability. It is a concept designed to cover all those situations wherein there are a number of possible outcomes (P and Not-P at the limit), and we lack sufficient information to tell which outcome is going to occur. Based on what information we do have, we “guess”.
Members of this family include the a priori “population of indifference” and “population proportions” of the Classical Interpretation. As a theoretical construct, it can acknowledge the limit-frequencies of the Frequency Interpretation, while accepting that the observed frequencies are but evidence for the limit-frequency, and can be wrong in well understood ways. And it can incorporate the causal-linkage notions of the Propensity Interpretation without committing to an ontology of objective propensities. And even more importantly, it can incorporate the Subjectivist Interpretation notion of “degrees of belief” because now the “Principal Principle” is no longer begging the question.
By melding all of these features into one coherent concept, the strengths of one member of the family can be brought to bear on resolving the difficulties faced by other members of the family. As a “family” concept, the Conceptualist Interpretation is unambiguous, non-circular, and uses well-understood primitives. The other well-explored interpretations have drawn clear boundaries around limited portions of the concept. The Conceptualist Interpretation of probability is a concept that is:
- Admissible, because it satisfies the mathematical relations specified by some calculus of probability.
- Ascertainable, because where there are objective features of reality that determine the probabilities, we can ascertain them; and where there are not, we can ascertain the relevant normative “degrees of belief” that ought to apply.
- Non-Trivial, because probability can take intermediate values.
- Applicable to Frequencies, because of the Frequentist elements it incorporates.
- Applicable to Rational Belief, because of the Subjectivist elements it incorporates.
- Applicable to Inductive Inference, because of the Propensity elements, and the incorporation of the causality asymmetry.
It is thus the “best” interpretation of probability.
Notes & References
(1) Salmon, Wesley; The Foundations of Scientific Inference, University of Pittsburgh Press, 1967, ISBN 978-0-822-95118-6. Pg 64.
(2) Hajek, Alan; “Interpretations of Probability” in The Stanford Encyclopedia of Philosophy (Spring 2010 Edition), Edward N. Zalta (ed.), URL=http://plato.stanford.edu/archives/spr2010/entries/probability-interpret/
(3) Laplace, Pierre Simon, Marquis de; A Philosophical Essay on Probabilities, Nabu Press, 2010, ISBN 1-172-26405-8.
(4) Kolmogorov, Andrey Nikolayevich; Foundations of the Theory of Probability, Chelsea Publishing Company, New York, New York, 1956 (1933).
(5) Introduced by Joseph Bertrand in his work Calcul des Probabilités in 1889.
(6) van Fraassen, Bastiaan C.; Laws and Symmetry, Clarendon Press, Oxford, England, 1989. ISBN 0-198-24860-1.
(7) Carnap, Rudolf; Logical Foundations of Probability, 2nd edition, The University of Chicago Press; Chicago, Illinois. 1962. ISBN 0-226-09343-3.
(8) Popper, Karl; “The Propensity Interpretation of Probability” in The British Journal for the Philosophy of Science, Vol 10 (1959), Pgs 25-42.
(9) Miller, Richard W.; “Propensity: Popper or Peirce?” in The British Journal for the Philosophy of Science Vol 26, No 2 (1975), Pgs 123–132.
(10) Wikipedia contributors; “Propensity probability” in Wikipedia, The Free Encyclopedia. URL=<http://en.wikipedia.org/w/index.php?title=Propensity_probability>.
(11) de Finetti, Bruno; Theory of Probability, Volume 1, John Wiley & Sons, New York, New York, 1990.
(12) Wikipedia contributors; “Bayesian probability” in Wikipedia, The Free Encyclopedia. URL=http://en.wikipedia.org/w/index.php?title=Bayesian_probability
(13) Lewis, David; “A Subjectivist’s Guide to Objective Chance” in Philosophical Papers, Volume II. Oxford University Press. Oxford, England, 1986.
In a simplified version, this principle says that if you know the objective chance of some inherently chancy outcome, then your degree of belief in that outcome should equal the chance. If you know the distribution of balls in the jar, then your degree of belief should reflect the actual chance of drawing a red ball.
(14) Wittgenstein, Ludwig; Philosophical Investigations (1953). Blackwell Publishing, London, England. 2001. ISBN 0-631-23127-7. See Sect 3.