Fisher, Bayes, and Predictive Inference
Abstract
1. Introduction
2. Fisher on Inverse Probability in the 1920s
≪inverse probability, which like an impenetrable jungle arrests progress towards precision of statistical concepts.≫[10] (p. 311)
≪I may myself say that I learned [inverse probability] at school as an integral part of the subject, and for some years saw no reason to question its validity.≫[11] (p. 248)
≪I must indeed plead guilty in my original statement of the Method of Maximum Likelihood [in 1912] to having based my argument upon the principle of inverse probability; in the same paper, it is true, I emphasized the fact that such inverse probabilities were relative only.≫[12] (p. 326)
2.1. Fisher’s Critique of the Uniform Prior
2.2. Fisher vs. Pearson on the Correlation Coefficient
≪[Pearson’s] comments upon my methods imply such a serious misunderstanding of my meaning that a brief reply is necessary…From this passage a reader, who did not refer to my paper, which had appeared in the previous year, and to which the Cooperative Study was called an “Appendix”, might imagine that I had used Boole’s ironical phrase “equal distribution of ignorance”, and that I had appealed to “Bayes’ theorem”. I must therefore state that I did neither.≫
≪Pearson’s energy was unbounded. In the course of his long life he gained the devoted service of a number of able assistants, some of whom he did not treat particularly well. He was prolific in magnificent, or grandiose, schemes capable of realization perhaps by an army of industrious robots responsive to a magic wand. The terrible weakness of his mathematical and scientific work flowed from his incapacity in self-criticism, and his unwillingness to admit the possibility that he had anything to learn from others, even in biology, of which he knew very little. His mathematics, consequently, though always vigorous, were usually clumsy, and often misleading. In controversy, to which he was much addicted, he constantly showed himself to be without a sense of justice.≫
3. The Fiducial Argument
3.1. The Original Fiducial Argument
≪There are, however, certain cases in which statements in terms of probability can be made with respect to the parameters of the population.≫
- T be a sample statistic having a continuous distribution;
- θ be a one-dimensional parameter for the distribution of T;
- t_P(θ) be the P-th quantile for T under θ:
≪The probability integral of the exact frequency distribution, in finite samples, of an exhaustive statistic is used to form a continuum of probability-statements, of the form Pr(T < t_P(θ)) = P, and using the monotonic property of the functions t_P(θ) for all P, this is transformed to the equivalent Pr(θ > θ_P(T)) = P, a complete set of probability statements about θ, in terms known for a given sample value T.≫
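The pivotal step in this argument can be illustrated with a small Monte Carlo sketch (my own construction, not from the paper). Take T exponentially distributed with rate θ, so that the probability integral F(t; θ) = 1 − e^{−θt} is monotone in θ for fixed t; solving F(T; θ) = P for θ converts the uniform distribution of the probability integral into a probability statement about θ:

```python
import math
import random

random.seed(0)

theta0 = 2.0       # "true" rate, an assumed value for the demonstration
P = 0.95
n_trials = 100_000

covered = 0
for _ in range(n_trials):
    T = random.expovariate(theta0)   # one observation of the statistic
    # Solving 1 - exp(-theta * T) = P for theta gives the fiducial P-point:
    theta_P = -math.log(1 - P) / T
    # Fisher's transformed statement: Pr(theta0 <= theta_P(T)) = P
    covered += (theta0 <= theta_P)

print(covered / n_trials)   # close to P = 0.95
```

The simulation checks the frequency interpretation: over repeated samples, the fiducial P-point exceeds the true parameter a fraction P of the time.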
3.2. Example
≪From the table we can read off the 95 per cent. r for any given ρ, or equally the fiducial 5 per cent. ρ for any given r. Thus if a value of r were obtained from the sample, we should have a fiducial 5 per cent. ρ equal to about 0.765. The value of ρ can then only be less than 0.765 in the event that r has exceeded its 95 per cent. point, an event which is known to occur just once in 20 trials. In this sense ρ has a probability of just 1 in 20 of being less than 0.765.≫[21] (p. 534)
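Fisher read these values from his exact tables; a rough modern stand-in is his own z-transformation, under which z = atanh(r) is approximately normal with mean atanh(ρ) and standard deviation 1/√(n − 3). The sample values below (r = 0.99 from a sample of n = 4) are assumptions chosen for illustration, not stated in the excerpt, but they land close to the quoted 0.765:

```python
import math

r, n = 0.99, 4        # assumed illustrative values, not given in the excerpt
z95 = 1.6449          # upper 5 per cent point of the standard normal

# Fiducial 5 per cent point of rho via Fisher's z-transformation
# (a large-sample approximation to the exact distribution Fisher tabulated).
rho_lower = math.tanh(math.atanh(r) - z95 / math.sqrt(n - 3))
print(round(rho_lower, 3))   # roughly 0.76
```

The approximation is crude at such a small sample size, but it conveys the mechanics: invert the sampling distribution of r at its 95 per cent point to obtain a lower fiducial limit for ρ.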
3.3. A Short Period of Peaceful Coexistence
≪The fiducial frequency distribution will in general be different numerically from the inverse probability distribution obtained from any particular hypothesis as to a priori probability. Since such an hypothesis may be true, it is obvious that the two distributions must differ not only numerically, but in their logical meaning. There is … no contradiction between the two statements.≫
4. Bayes and Predictive Inference
4.1. Bayes’ “Billiard Ball”
≪In mathematical teaching the mistake is often made of overlooking the fact that Bayes obtained his probabilities a priori by an appropriate experiment, and that he specifically rejected the alternative of introducing them axiomatically on the ground that this “might not perhaps be looked at by all as reasonable”; moreover, he did not wish to “take into his mathematical reasoning anything that might admit dispute”.≫ This passage (and additional text) was added in the 2nd edition, pp. 127–128, and still further material was added to this section (5.26, “Observations of two kinds”) in the 3rd edition as well.
- Pick a point uniformly on a rectangular table.
- Project this point onto the horizontal axis of the table.
- Posit the position of this point O on the axis to be uniformly distributed.
- Record whether or not subsequent points X selected on the axis in this way lie to the left of O.
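The four steps above are straightforward to simulate. In a sketch of my own (the counts n = 10 and a = 7 are assumed for illustration), conditioning on the number of points falling left of O gives O a Beta(a + 1, n − a + 1) distribution, with mean (a + 1)/(n + 2), Laplace's rule of succession:

```python
import random

random.seed(1)

n, a = 10, 7          # assumed: n subsequent points, a of them left of O
matches = []
for _ in range(200_000):
    O = random.random()                                  # steps 1-3: O uniform
    lefts = sum(random.random() < O for _ in range(n))   # step 4: count left-of-O
    if lefts == a:                                       # condition on the data
        matches.append(O)

print(sum(matches) / len(matches))   # near (a + 1) / (n + 2) = 8/12 ≈ 0.667
```

The point of Bayes' device is visible here: the uniform distribution of O is produced by the experiment itself, not assumed as an axiom.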
4.1.1. Karl Pearson Enters the Fray
≪It has occurred to me that possibly the bull itself is a chimera, and there may be no need whatever to master it. In short, is it not possible that any continuous distribution of a priori chances would lead us equally well to the Bayes-Laplace result? If this be so, then the main line of attack of its critics fails.≫
≪Thus it would appear that the fundamental formula of Laplace …in no way depends on the equal distribution of ignorance. It is sufficient to assume any continuous distribution–which may vary from one type of a priori probability problem to a second.≫
- X is the position of the initial point O;
- F is the CDF of X;
- F is continuous, but not necessarily uniform.
≪I owe to Miss Ethel Newbold this insight into the exact relation between the two hypotheses.≫[33] (p. 192)
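Pearson's claim, as sharpened by the probability integral transform, can be checked numerically: if O and the subsequent points are all drawn from any continuous distribution F, the ranks are unchanged, so the chance that exactly a of n points fall left of O is 1/(n + 1), just as on the uniform billiard table. A sketch of my own with an exponential F (the rate 1 is an arbitrary choice):

```python
import random

random.seed(2)

n, a = 10, 7          # assumed counts for the illustration
trials = 200_000
count = 0
for _ in range(trials):
    O = random.expovariate(1.0)   # non-uniform continuous distribution for O
    lefts = sum(random.expovariate(1.0) < O for _ in range(n))
    count += (lefts == a)

print(count / trials)   # near 1 / (n + 1) = 1/11 ≈ 0.0909
```

Since U = F(O) and F(X) are uniform whenever F is continuous, the exponential experiment is the uniform one in disguise, which is exactly the relation Ethel Newbold pointed out.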
4.1.2. Fisher’s Version
≪In many continental countries this distinction, which Bayes made perfectly clear, has been overlooked, and the axiomatic approach which he rejected has actually been taught as Bayes’ method. The example of this Section exhibits Bayes’ own method, replacing the billiard table by a radioactive source, as an apparatus more suitable for the 20th century.≫
4.2. The Rule of Succession
4.3. Bayesian Prediction: A Conundrum
≪It may be noticed that the last factor in the expression developed above [predicting from … assuming a uniform prior], (p + q)^n, stands only for the binomial coefficients forming the last line, or base, of Fermat’s arithmetical triangle; but (p + q)^n is not the only polynomial in p and q the value of which is constantly equal to unity.≫
≪If, in fact, the triangle is extended to any chosen boundary, as for example in the diagram, the thirteen totals outside the boundary are the coefficients of a polynomial … of which the value is unity for all values of p. Then, based on previous experience of a successes out of n trials, we may infer the probability of reaching the terminal value to be … if the subsequent trial were made with these endpoints.≫
4.4. Fiducial Prediction
4.5. Are Fiducial Probabilities Verifiable?
≪Such fiducial probability statements about future observations are verifiable by subsequent observations to any degree of precision required.≫
5. The Fiducial Argument Revisited
5.1. The Logical Status of the Fiducial Distribution
≪It is essential to introduce the absence of knowledge a priori as a distinctive datum in order to demonstrate completely the applicability of the fiducial method of reasoning to the particular and real experimental cases for which it was developed. This last point I failed to perceive when, in 1930, I first put forward the fiducial argument for calculating probabilities. For a time this led me to think that there was a difference in logical content between probability statements derived by different methods of reasoning. There are in reality no grounds for any such distinction.≫ This contradicts Fisher’s later statement in [2] (p. 105) that he had simply failed to make clear the need for this “distinctive datum” in his 1930 paper (scarcely credible, given that essentially the same language appears in two later papers, [12,34]), and that he then compounded this by going on to criticize Neyman and Pearson for not perceiving its necessity at the time.
5.2. Is T Fixed?
≪For the population of cases relative to which a fiducial probability is defined, the value of any relevant statistic T is not regarded as fixed. This I have deliberately exerted myself to make clear since my first writings on the subject…I shall be glad to give you all possible support in dissuading mathematicians from thinking that they can obtain a true probability statement logically equivalent to one of the kind aimed at by Bayes’ theorem, yet without using the axiomatic basis of this theorem. Believe me, I have never attempted anything so foolish. The inferences which can be drawn without the aid of Bayes’ axiom seem to me of great importance, and quite precisely defined, but are certainly not statements of the distribution of a parameter over its possible values in a population defined by random samples selected to give a fixed estimate T.≫[22] (p. 124)
≪The applicability of the probability distribution to the particular unknown value of θ sought by an experimenter, without knowledge a priori, on the basis of the particular value of T given by his experiment, has been disputed, and certainly deserves to be examined.≫
5.3. Recognizable Subsets
≪Before the limiting ratio of the whole set can be accepted as applicable to a particular throw, a second condition must be satisfied, namely that no [subset having a different limiting ratio] can be recognized. This is a necessary and sufficient condition for the applicability of the limiting ratio of the entire aggregate of possible future throws as the probability of any one particular throw. On this condition we may think of a particular throw, or succession of throws, as a random sample from the aggregate, which is in this sense subjectively homogeneous and without recognizable stratification (There is an obvious and close connection here with Richard von Mises’s concept of a kollektiv, with its twin assumptions of the existence of a limiting frequency (Fisher’s first condition) and its invariance under place selection (Fisher’s second condition). Note, however, that the two concepts serve very different purposes. von Mises denied the existence of single-case probabilities; the kollektiv was designed instead to give a formal mathematical definition of a random sequence, and the absence of place selections with differing limiting frequencies was what characterized randomness. For Fisher, in contrast, the absence of recognizable subsets permitted the identification of the class frequency with the probability of the individual. For Fisher’s views on infinite populations and continuous variates as convenient fictions, see [2] (pp. 35 and 114). He was never a frequentist in the von Mises sense).≫
≪This fundamental requirement for the applicability to individual cases of the concept of classical probability shows clearly the role of subjective ignorance, as well as that of objective knowledge in a typical probability statement.≫
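A toy recognizable subset (my own construction, not Fisher's): a Markov coin whose chance of success depends on the previous outcome. The whole sequence has a limiting frequency of one half, but the throws immediately following a success form a recognizable subset with a different limiting ratio, so on Fisher's criterion the overall ratio is not the probability of a single throw for anyone who can see the previous outcome:

```python
import random

random.seed(3)

trials = 200_000
prev = False
heads = after_head = heads_after_head = 0
for _ in range(trials):
    p = 0.8 if prev else 0.2   # assumed transition chances; stationary frequency 1/2
    x = random.random() < p
    heads += x
    if prev:                   # the recognizable subset: throws following a head
        after_head += 1
        heads_after_head += x
    prev = x

print(heads / trials)                  # overall limiting ratio, near 0.5
print(heads_after_head / after_head)   # subset ratio, near 0.8
```

This is also a miniature of von Mises's place selection: the rule "take the throw after a head" selects a subsequence with a different limiting frequency, so the sequence fails both his randomness criterion and Fisher's condition.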
5.4. Recognizability and Fiducial Inference
≪The reference set for which this probability statement holds is that of the values of μ, x̄ and s corresponding to the same sample, for all samples of a given size of all normal populations. Since x̄ and s are jointly sufficient for estimation, and knowledge of μ and σ a priori is absent, there is no possibility of recognizing any sub-set of cases, within the general set, for which any different value of the probability should hold. The unknown parameter μ has therefore a frequency distribution a posteriori defined by Student’s distribution (Contrast this with Fisher’s discussion on pp. 61–62, where he advances a different, equally lapidary justification (“in the absence of a prior distribution of population values there is no meaning to be attached to the demand for calculating the results of random sampling among populations, and it is just this absence which completes the demonstration”). Savage [36] (p. 476), quoting at length from the passage containing this statement, refers to it as illustrating “Fisher’s dogged blindness about it all”).≫
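The invariance behind the quoted reference set can be exhibited directly: the Student pivot (x̄ − μ)/(s/√n) has the same distribution whatever the normal population, so no choice of (μ, σ) picks out a subset of the reference set with a different law. A Monte Carlo sketch (sample size and parameter values are assumed for illustration):

```python
import math
import random
import statistics

random.seed(4)

def pivot_samples(mu, sigma, n=5, reps=20_000):
    """Repeated draws of the Student pivot t = (xbar - mu) / (s / sqrt(n))."""
    out = []
    for _ in range(reps):
        xs = [random.gauss(mu, sigma) for _ in range(n)]
        t = (statistics.fmean(xs) - mu) / (statistics.stdev(xs) / math.sqrt(n))
        out.append(t)
    return sorted(out)

a = pivot_samples(0.0, 1.0)      # one normal population...
b = pivot_samples(10.0, 3.0)     # ...and a very different one (assumed values)
q95_a = a[int(0.95 * len(a))]
q95_b = b[int(0.95 * len(b))]
print(round(q95_a, 2), round(q95_b, 2))   # both near the Student t_4 95% point
```

Whether this invariance alone licenses the final step of the quote, treating μ as having a frequency distribution a posteriori, is of course exactly what Fisher's critics disputed.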
5.5. Fiducial Reprise
5.6. Aftermath
≪In fact, the more I consider it, the more clearly it would appear that I have been doing almost exactly what Bayes had done in the 18th century. As Lindley purports to be a protagonist of Bayes, it seems that his misunderstanding and confusion goes deeper than anyone could imagine.≫
6. Conclusions
Funding
Data Availability Statement
Conflicts of Interest
References
- Fienberg, S.E. When did Bayesian inference become “Bayesian”? Bayesian Anal. 2006, 1, 1–40.
- Fisher, R.A. Statistical Methods and Scientific Inference; Oliver & Boyd: Edinburgh, UK, 1956.
- Fisher, R.A. Statistical Methods and Scientific Inference, 2nd ed.; Oliver and Boyd: Edinburgh, UK, 1959.
- Fisher, R.A. Statistical Methods and Scientific Inference, 3rd (posthumous) ed.; Collier Macmillan: London, UK, 1973.
- Fisher, R.A. Statistical Methods and Scientific Inference, 3rd ed.; Reprinted in Statistical Methods, Experimental Design, and Scientific Inference; Bennett, J.H., Ed.; Oxford University Press: Oxford, UK, 1990.
- Bayes, T. An essay towards solving a problem in the Doctrine of Chances. Philos. Trans. R. Soc. Lond. 1763, 53, 370–418.
- Fisher, R.A. On an absolute criterion for fitting frequency curves. Messenger Math. 1912, 41, 155–160.
- Fisher, R.A. Frequency distribution of the values of the correlation coefficient in samples from an indefinitely large population. Biometrika 1915, 10, 507–521.
- Fisher, R.A. On the “probable error” of a coefficient of correlation deduced from a small sample. Metron 1921, 1, 3–32.
- Fisher, R.A. On the mathematical foundations of theoretical statistics. Philos. Trans. R. Soc. A 1922, 222, 309–368.
- Fisher, R.A. Uncertain inference. Proc. Am. Acad. Arts Sci. 1936, 71, 245–258.
- Fisher, R.A. The concepts of inverse probability and fiducial probability referring to unknown parameters. Proc. R. Soc. A 1933, 139, 343–348.
- Aldrich, J. R.A. Fisher and the making of maximum likelihood 1912–1922. Stat. Sci. 1997, 12, 162–176.
- Edwards, A.W.F. What did Fisher mean by “inverse probability” in 1912–1922? Stat. Sci. 1997, 12, 177–184.
- von Kries, J. Die Principien der Wahrscheinlichkeitsrechnung, Eine Logische Untersuchung; Mohr: Tübingen, Germany, 1886.
- von Kries, J. Die Principien der Wahrscheinlichkeitsrechnung, Eine Logische Untersuchung, 2nd ed.; Mohr: Tübingen, Germany, 1927.
- Keynes, J.M. A Treatise on Probability; Macmillan: London, UK, 1921.
- Soper, H.E.; Young, A.W.; Cave, B.M.; Lee, A.; Pearson, K. On the distribution of the correlation coefficient in small samples. Appendix II to the papers of “Student” and R. A. Fisher. A cooperative study. Biometrika 1917, 11, 328–413.
- Edwards, A.W.F. R.A. Fisher on Karl Pearson. Notes Rec. R. Soc. Lond. 1994, 48, 97–106.
- Coolidge, J.L. An Introduction to Mathematical Probability; Clarendon Press: Oxford, UK, 1925.
- Fisher, R.A. Inverse probability. Proc. Camb. Philos. Soc. 1930, 26, 528–535.
- Bennett, J.H. Statistical Inference and Analysis: Selected Correspondence of R. A. Fisher; Clarendon Press: Oxford, UK, 1990.
- Neyman, J. On the two different aspects of the representative method: The method of stratified sampling and the method of purposive selection. J. R. Stat. Soc. 1934, 97, 558–625.
- Zabell, S.L. R.A. Fisher and the fiducial argument. Stat. Sci. 1992, 7, 369–387.
- Box, J.F. R. A. Fisher: The Life of a Scientist; Wiley: New York, NY, USA, 1978.
- Edwards, A.W.F. Commentary on the arguments of Thomas Bayes. Scand. J. Stat. 1978, 5, 116–118.
- Stigler, S.M. Thomas Bayes’ Bayesian inference. J. R. Stat. Soc. Ser. A Gen. 1982, 145, 250–258.
- Zabell, S.L. Philosophy of inductive logic: The Bayesian perspective. In The Development of Modern Logic; Haaparanta, L., Ed.; Oxford University Press: Oxford, UK, 2009.
- Pearson, K. The fundamental problem of practical statistics. Biometrika 1920, 13, 1–16.
- Edgeworth, F.Y. Molecular statistics. J. R. Stat. Soc. 1921, 84, 71–89.
- Burnside, W. On Bayes’ formula. Biometrika 1924, 16, 189.
- Pearson, K. Note on the “fundamental problem of practical statistics”. Biometrika 1921, 13, 300–301.
- Pearson, K. Note on Bayes’ theorem. Biometrika 1924, 16, 190–193.
- Fisher, R.A. The fiducial argument in statistical inference. Ann. Eugen. 1935, 6, 391–398.
- Lindley, D.V.; Novick, M.R. The role of exchangeability in inference. Ann. Stat. 1981, 9, 45–58.
- Savage, L.J. On rereading R. A. Fisher. Ann. Stat. 1976, 4, 441–500.
- Buehler, R.J. Some validity criteria for statistical inferences. Ann. Math. Stat. 1959, 30, 845–863.
- Buehler, R.J.; Feddersen, A.P. Note on a conditional property of Student’s t. Ann. Math. Stat. 1963, 34, 1098–1100.
- Brown, L. The conditional level of Student’s t test. Ann. Math. Stat. 1967, 38, 1068–1071.
- Lindley, D.V. Review: Statistical Methods and Scientific Inference. Heredity 1957, 11, 280–283.
- Barnard, G.A. R.A. Fisher—A true Bayesian? Int. Stat. Rev. 1987, 55, 183–189.
- Lindley, D.V. Fiducial distributions and Bayes’ theorem. J. R. Stat. Soc. B 1958, 20, 102–107.
- Pitman, E.J.P. Statistics and science. J. Am. Stat. Assoc. 1957, 52, 322–330.
- Tukey, J.W. Some examples with fiducial relevance. Ann. Math. Stat. 1957, 28, 687–695.
- Dempster, A.P. Further examples of inconsistencies in the fiducial argument. Ann. Math. Stat. 1963, 34, 884–891.
- Wallace, D.L. The Behrens-Fisher and Fieller-Creasy problems. In R. A. Fisher: An Appreciation; Lecture Notes in Statistics; Fienberg, S.E., Hinkley, D.V., Eds.; Springer: Berlin/Heidelberg, Germany, 1980; Volume 1, pp. 119–147.
Zabell, S. Fisher, Bayes, and Predictive Inference. Mathematics 2022, 10, 1634. https://doi.org/10.3390/math10101634