Necessary/recommended level of theory for developing statistical intuition

The black hole article shows that there are formal procedures one can take to manage and mitigate, if not eliminate, the influence of groupthink, in that case from prior expectations or knowledge.

While I have to study those examples more closely, I think the fundamental issue is the appropriate language for model uncertainty (ie. probabilities about models that output probabilities)

For a Bayesian, it is probabilities all the way down, as Frank's post shows. The philosopher Richard Jeffrey also held that view.

Others, like Arthur Dempster and Glenn Shafer, are not convinced that all uncertainty can be expressed by a single probability number. Their work led to Dempster-Shafer theory, also known as the "Mathematical Theory of Evidence", which is closely related to the notion of "imprecise" (aka interval) probabilities.
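
To make the contrast concrete, here is a minimal Python sketch of the Dempster-Shafer idea (a toy example of my own, not taken from the references below): evidence is assigned to *sets* of hypotheses, and belief and plausibility bracket the uncertainty instead of pinning it to a single number.

```python
# Toy Dempster-Shafer example: mass is assigned to SETS of hypotheses,
# so ignorance can be represented directly (mass on the whole frame).
frame = frozenset({"M1", "M2", "M3"})  # three candidate models

mass = {
    frozenset({"M1"}): 0.5,        # evidence pointing to M1 alone
    frozenset({"M1", "M2"}): 0.3,  # evidence that cannot separate M1 from M2
    frame: 0.2,                    # unassigned mass = ignorance
}

def belief(A):
    """Lower probability: total mass committed to subsets of A."""
    return sum(m for B, m in mass.items() if B <= A)

def plausibility(A):
    """Upper probability: total mass not committed against A."""
    return sum(m for B, m in mass.items() if B & A)

A = frozenset({"M1"})
print(belief(A), plausibility(A))  # 0.5 1.0 -> the interval [0.5, 1.0], not a point
```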

The work in this area is more likely to be published in symbolic logic journals than in applied statistics journals, but the tools developed there are likely (IMO) to productively resolve these philosophical disputes and lead to rigorous "statistical thinking", as opposed to the all-too-common statistical rituals now at epidemic proportions.

Here are some interesting papers for the philosophically inclined. I’d start with the first one, and then the others for more formal development and justification.

  1. Crane, H. (2018) Imprecise probabilities as a semantics for intuitive probabilistic reasoning (link)
  2. Crane, H; Wilhelm, I (2018) Logic of Typicality (link)
  3. Crane, H. (2018) Logic of Probability and Conjecture (link)

For applications, the following are interesting; number 2 is a mathematical formalization of the argument @Sander has made in numerous threads that Bayesian probabilities are too optimistic if one accepts the idea that these models capture all uncertainty. Martin uses the absence of any truly "non-informative" prior to prove that any additive system of representing beliefs runs the risk of false confidence. Formal discussion at the meta-level (i.e., model criticism) can be productively done in the realm of non-additive beliefs. (A small numerical illustration of probability dilution appears after the list below.)

  1. Martin, R (2021) An imprecise-probabilistic characterization of frequentist statistical inference (link)
  2. Martin, R (2019) False confidence, non-additive beliefs, and valid statistical inference (link)
  3. Martin, R (2021) Valid and efficient imprecise-probabilistic inference across a spectrum of partial prior information (link)
  4. Balch, M. S., Martin, R., Ferson, S. (2019) Satellite conjunction analysis and the false confidence theorem (link)
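
To illustrate the satellite-conjunction pathology in paper 4, here is a small Monte Carlo sketch of "probability dilution" (my own toy numbers, not from Balch et al.): as measurement noise grows, the computed collision probability shrinks, so worse data makes the satellites look safer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D conjunction: a collision actually occurs here, since the true
# miss distance (0.3 km) is inside the danger radius (0.5 km).
# Numbers are purely illustrative.
true_miss, danger_radius = 0.3, 0.5

for sigma in [0.1, 1.0, 10.0, 100.0]:
    obs = true_miss + sigma * rng.standard_normal()   # noisy measurement
    # Distribution of the miss distance given the observation
    # (flat prior): Normal(obs, sigma^2)
    draws = obs + sigma * rng.standard_normal(100_000)
    p_collision = np.mean(np.abs(draws) < danger_radius)
    print(f"sigma={sigma:6.1f} km  P(collision)={p_collision:.3f}")

# With precise data the analysis correctly flags the danger (P near 1),
# but as sigma grows the computed probability drifts toward zero: the
# diluted analysis confidently asserts safety exactly when we know the
# least. This is the phenomenon the false confidence theorem formalizes.
```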

Christian P. Robert gives his opinion here:


Thank you for these references, especially those to Crane (including from your earlier post) and the false confidence papers. I was not aware of this work, and I find I am sympathetic to much of Crane's thinking (though I deviate sharply on a few points; I will need to start reading more of his corpus). Phenomena analogous to "probability dilution" are fairly routine in my work, and like Kay and King, and possibly Crane, I do not agree that it is the duty of statistics/computation/math to "solve" such problems by finding ways to quantify uncertainty, even if imprecisely. Neither my personal experience working with scientists for 20+ years nor my reading of the history of science is consistent with Martin's assertion, "I contend that scientists seek to convert their data, posited statistical model, etc., into calibrated degrees of belief about quantities of interest." No scientist I've ever worked with has ever behaved in such a manner, much less directly asked for this from me.

Perhaps because I've worked on the applied side of science, my collaborators are instead interested in building an evidence base for making decisions, including experimentally evaluating alternatives, under limited resource constraints. My role has been to help them design studies and to describe, summarize, and interpret the resulting data. I rarely find it warranted, necessary, or useful to include "probability as uncertainty" claims (including statistical inferences of any flavor) in advancing that goal, except in the narrow situations I allude to in my 2019 paper, which form a minority of what crosses my desk.


After studying the Carmichael and Williams exposition, and their references to Fieller's theorem and the Gleser-Hwang theorem, I find the False Confidence result less surprising.

Liseo, Brunero. (2003). Bayesian and conditional frequentist analyses of the Fieller’s problem. A critical review. Metron - International Journal of Statistics. LXI. 133-150. (link)

After showing the limitations of frequentist and non-informative Bayesian methods, the author

…adopts a robust Bayesian approach to show that it is nearly impossible to end up with a reasonable solution to the problem without introducing some prior information on the parameters
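
To see concretely why Fieller's problem defeats standard interval machinery, here is a minimal sketch (my own construction, using the standard Fieller quadratic for independent normal estimates) showing the confidence set for a ratio of means turning unbounded when the denominator is not well separated from zero:

```python
import numpy as np

def fieller_set(x1, s1, x2, s2, z=1.96):
    """Approximate 95% Fieller confidence set for mu1/mu2, given
    independent estimates x1 ~ N(mu1, s1^2) and x2 ~ N(mu2, s2^2).
    The set is {theta : (x1 - theta*x2)^2 <= z^2 (s1^2 + theta^2 s2^2)},
    a quadratic inequality in theta."""
    a = x2**2 - (z * s2)**2
    b = -2 * x1 * x2
    c = x1**2 - (z * s1)**2
    disc = b**2 - 4 * a * c
    if disc < 0:
        return "the entire real line"  # Gleser-Hwang pathology
    r1, r2 = sorted(np.roots([a, b, c]).real)
    if a > 0:
        return f"bounded interval [{r1:.2f}, {r2:.2f}]"
    return f"complement of ({r1:.2f}, {r2:.2f}) -- unbounded"

# Denominator well away from zero: an ordinary bounded interval.
print(fieller_set(x1=10, s1=1, x2=5, s2=1))
# Denominator consistent with zero: the 95% set is unbounded.
print(fieller_set(x1=10, s1=1, x2=0.5, s2=1))
```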

Upon reflection, I think the philosophical claims are too strong. I see no reason why the notion of an imprecise probability cannot be embedded in a broader Bayesian framework. A decision-theoretic approach to the design of experiments implies uncertainty about the "true" probability distribution, and hence an imprecise probability; indeed, imprecise probabilities are one way of conducting a robustness analysis of Bayesian methods. Still, the paper was very interesting.

To bring this discussion back to the original point about statistical intuition, Crane’s comment, in his “Naive Probabilism” article that you posted earlier, is crisp and emphatic:

“Probability calculations are very precise, and for that reason alone they are also of very limited use. Outside of gambling, financial applications, and some physical and engineering problems – and even these are limited – mathematical probability is of little direct use for reasoning with uncertainty.”

I think Kay & King (2020) are basically making the same point in a more long-winded (but more compelling) way. In my view, an important part of statistical intuition is the quality of judgment about whether a probability calculation could be both valid and useful. Statisticians tend to overdo it; frankly so do many physicists, perhaps because physics has so many cases where probability calculations are valid and useful. In his 1995 autobiography, Gen. Colin Powell wrote that “experts often have more data than judgment” and he was right.

Could not disagree more strongly with “little direct use”.


Crane's (and Martin's) full position is elaborated in the 2018 preprint Is Statistics Meeting the Needs of Science? (See my post above for the link.)

Blockquote
To be clear, our lack of conviction about whether statistics is meeting the challenge ought not be construed as skepticism about whether it can meet the challenge [of modern science].

After re-reading his critique of a new p-value threshold (mentioned here), I came to understand, and ultimately agree with, your position that estimation should be emphasized over testing. Tests are too easy to misuse relative to the information they provide.

Ultimately, I think everyone would learn to appreciate it if authors were encouraged to publish p-value curves and "confidence" distributions (i.e., the set of all 1-\alpha interval estimates, with \alpha ranging over 0 < \alpha < 1) as statistical summaries, regardless of whether they later use Bayesian or likelihood methods for analysis.
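
As a concrete sketch of what I mean (toy data of my own, not a real analysis), the full set of interval estimates for a normal mean takes only a few lines to compute; plotting the limits against \alpha gives the confidence curve:

```python
import numpy as np
from scipy import stats

# Hypothetical sample (illustrative numbers only)
x = np.array([4.2, 5.1, 3.8, 6.0, 4.9, 5.5, 4.4, 5.2])
n = len(x)
mean, se = x.mean(), x.std(ddof=1) / np.sqrt(n)

# The "confidence distribution" summary: every 1-alpha interval at once.
for alpha in [0.5, 0.25, 0.10, 0.05, 0.01]:
    t = stats.t.ppf(1 - alpha / 2, df=n - 1)
    print(f"{1 - alpha:4.0%} interval: ({mean - t * se:.2f}, {mean + t * se:.2f})")

# The equivalent p-value curve: two-sided p-value for each candidate mean.
mu0 = np.linspace(3.5, 6.5, 7)
p = 2 * stats.t.sf(np.abs(mean - mu0) / se, df=n - 1)
print(np.round(p, 3))
```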

IMHO statistics took a wrong turn when it began favoring inference over decision making. The two are very different, and the latter is relevant.


My 2019 paper attempted to make a similar case that statistics could offer more to science than it currently does with its exaggerated focus on inference. However, I was too optimistic. Much of what constitutes 'statistical thinking' is probabilistic thinking, which has its place for sure, but a much smaller one than statisticians demand. I'm not saying to get rid of probability, just that it must take its place as merely one of several frameworks with which to understand the use of data in the scientific enterprise, and shouldn't have a monopoly on it.

I’d like to attempt a summary of the “divergence” from the original post (which I actually had nudged along) and then redirect the discussion with a quote that, in part, captures the kind of thing I was attempting to convey in starting this topic. However, I am not opposed to going back to the current discussion (I have enjoyed them greatly and have a long reading list thanks to the posters). I was just hoping to satisfy my initial question, even if the only “real” answer is along the lines of “go back and get a graduate degree in math and/or statistics”.

Summary

It seems that there is controversy in the prob/stats world regarding even a “simple” question such as “what is probability?”, which strikes me as a philosophical question at heart; see for example this forum topic. I don’t know whether the arguments between Frequentism, Bayesianism, and Likelihoodism all essentially stem from different “views” regarding how one should define “probability”, or whether there are purely methodological differences, but it seems to be academic to me (at times even dogmatic). There are even disagreements within “camps” (for example, see the topic @R_cubed linked to on whether P-values should be abolished or the “threshold” changed).

Coincidentally, there are several reviews of Deborah Mayo’s book (Statistical Inference as Severe Testing) on Gelman’s blog (also typed up in LaTeX and posted to arXiv if you prefer that format). In one of her responses she states that

The disagreements often grow out of hidden assumptions about the nature of scientific inference and the roles of probability in inference. Many of these are philosophical.

[I can’t comment on whether this is a true representation of the field, nor did I mention her book in support of her views as I only just learned of its existence.]

I think I agree with @f2harrell’s sentiment in his post on the topic on this forum: it is all interesting intellectually but not very useful in the real world. I am, ultimately, interested in understanding more and some excellent references have been shared already; I intend to read much of what was shared and appreciate everyone’s contributions. But beyond the intrinsic reward of accumulating knowledge and intellectual discussions, I don’t have much use for philosophy currently.

In any case, I still believe that a strong foundation in math (and to some extent theory) is required to properly appreciate such issues, even if it all boils down to differences in "philosophy". I currently don't have that foundation; think of the Dunning-Kruger effect (I don't even know all that I don't know about the subject).

Back to Intuition

I recently came across an interesting historical comment by the late Sir David R. Cox recounting several “pioneers in modern statistical theory”. The quote that stood out to me was about John Tukey (Section 13 of the article):

Some 20 or more years later, he unexpectedly came to see me at home in London one Sunday. What was I working on? I told him; it was something that he was highly unlikely to have known about. After a few moments of thought he made ten suggestions. Six I had already considered, two would clearly not work, and the other two were strong ideas which had not occurred to me even after long thought on the topic. This small incident illustrates one of his many strengths: the ability to comment searchingly and swiftly on a very wide range of issues.

Although Sir David gives no indication as to what he was working on, the passage highlights what is, to me, the "ultimate goal". That is, the ability to give a thoughtful (perhaps critical) appraisal of a problem based only on the basics and a bit of thought (and probably some questions about the problem/data/goal). Is this kind of ability even achievable for a "non-Tukey" (who, by all accounts I've seen, was regarded as a brilliant man)? Or does it require years of experience, supported by a strong foundation in mathematics?

P.S. Gelman’s blog

The post on Gelman's blog linked above points to some papers that I haven't had a chance to read yet but that appear extremely interesting. There is an additional review by Christian Robert at another blog.

The first paper is a “monograph” by Robert Cousins. It looks quite interesting based on my brief reading of a few of the sections.

Another is a paper by Gelman and Christian Hennig (including extensive discussion after the article): Gelman & Hennig (2017). Beyond subjective and objective in statistics. J R Statist Soc A. I haven’t had a chance to go through it all yet.

He also links to another blog post of his discussing his article with Cosma Shalizi (Philosophy and the practice of Bayesian statistics) (along with responses).

P.P.S.

I also stumbled upon these lecture notes by Adam Caulton that look interesting. (I actually found them when I was googling "savage ramsey finetti jeffreys jaynes" :grin:)

Apologies for how long this ended up!


Richard Hamming, himself an important figure in applied math/computer science, was humbled by Tukey’s knowledge and productivity. He said:

“I worked for ten years with John Tukey at Bell Labs. He had tremendous drive. One day about three or four years after I joined, I discovered that John Tukey was slightly younger than I was. John was a genius and I clearly was not. Well I went storming into Bode’s office and said, ‘How can anybody my age know as much as John Tukey does?’ He leaned back in his chair, put his hands behind his head, grinned slightly, and said, ‘You would be surprised Hamming, how much you would know if you worked as hard as he did that many years.’ I simply slunk out of the office!”

Source:


@ChristopherTong I don't know how I failed to remember Richard Hamming, as he wrote a book in line with the theme of this thread: The Art of Probability, which is much closer to the information-theoretic perspective I find helpful.

Philosophy of science and the foundations of statistics can be good in small doses. The philosophical debates over the nature of “probability” don’t really matter in the context of data analysis.

In the context of messy, ambiguous data analysis, @Sander is always worth reading. In these articles he demonstrates how to use frequentist software to conduct an approximate Bayesian analysis.


Thanks for sharing, I greatly enjoyed his talk! It is fortunate that it was recorded.

While I suspected that hard work, perseverance, diligence, etc. are a major part of it, I do believe there are some "true geniuses" (and I know a few people whom I consider such, incidentally physicists) who seem to possess some kind of natural ability enabling them to understand complex/abstract topics more easily. Hamming actually mentions Feynman in the Q&A in this regard: he knew Feynman would win a Nobel for something. See also Oppenheimer's recommendation letter for Feynman for Berkeley:

Of these there is one who is in every way so outstanding and so clearly recognized as such, that I think it appropriate to call his name to your attention, with the urgent request that you consider him for a position in the department at the earliest time that it is possible. You may remember the name because he once applied for a fellowship in Berkeley: it is Richard Feynman. He is by all odds the most brilliant young physicist here, and everyone knows this.

I may give you two quotations of men with whom he has worked. Bethe has said that he would rather lose any two other men than Feynman from this present job, and Wigner said, “He is a second Dirac, only this time human.”

Or, for a historical example in the “arts”, there are those “prodigies” such as Mozart; while he was privileged and by all accounts an incredibly hard worker, there seemed to be something “special” about him. In creative areas, others similarly describe “getting” or “receiving” their “content” (Tolkien with invented languages, and current young prodigy Alma Deutscher when discussing melodies). Then again, I am a big fan of Mozart to begin with!

But I digress; Hamming gives good advice even if one’s goal is not to do “Nobel quality work”, and even (or especially) if one is not a “prodigy”.

I came across this important article by Philip Stark, which I had previously read, on the use of statistical models in practice. He echoes the concerns expressed by @ChristopherTong:

But not all uncertainties can be represented as probabilities.

I think this assertion requires a bit more elaboration. It seems to smuggle in the idea that one must have some physical model before invoking "probability" as a tool. Jaynes would call this the "mind projection fallacy", and he takes substantial effort to ground probability as an expression of prior knowledge (not mere subjective belief).

I still highly recommend reading (and re-reading) this from time to time. I think his observation that mathematical models are used to "persuade and intimidate" rather than predict is close enough to the mark to credit him with a bullseye.


A physical model is not mandatory for the use of probability modeling and reasoning, though having one provides welcome additional insight.

Regarding "not all uncertainties can be represented as probabilities": this can easily be shown. An example of a quantitative but non-stochastic uncertainty is the bound on the approximation error of a Taylor series approximation of a function. In some applications this bound is treated as an uncertainty, though it is purely deterministic. There are other examples in approximation theory; I concede that some theorems in that field are probabilistic rather than deterministic, but certainly not all of them (Taylor's isn't).
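
A minimal sketch of that example (my own numbers): the Lagrange form of the remainder for exp(x) yields a guaranteed error envelope with no probability distribution anywhere in sight.

```python
import math

def taylor_exp(x, n):
    """Degree-n Taylor polynomial of exp at 0, with the Lagrange
    remainder bound |R_n| <= e^|x| * |x|^(n+1) / (n+1)!."""
    approx = sum(x**k / math.factorial(k) for k in range(n + 1))
    bound = math.exp(abs(x)) * abs(x)**(n + 1) / math.factorial(n + 1)
    return approx, bound

x, n = 1.0, 5
approx, bound = taylor_exp(x, n)
actual_err = abs(math.exp(x) - approx)
print(f"approx={approx:.6f}  bound={bound:.2e}  actual error={actual_err:.2e}")
# The bound is a hard, deterministic envelope on the error: it is an
# "uncertainty" in the engineering sense, but not a stochastic one.
```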

More interesting are uncertainties that cannot be quantified at all, of which Kay and King's book Radical Uncertainty (discussed above) gives many examples. I cannot hope to summarize their arguments here, but a few quotes might give a flavor of where they are coming from.

The appeal of probability theory is understandable. But we suspect the reason that such mathematics was, as we shall see, not developed until the seventeenth century is that few real-world problems can properly be represented in this way. The most compelling extension of probabilistic reasoning is to situations where the possible outcomes are well defined, the underlying processes which give rise to them change little over time, and there is a wealth of historic information.

And

Resolvable uncertainty is uncertainty which can be removed by looking something up (I am uncertain which city is the capital of Pennsylvania) or which can be represented by a known probability distribution of outcomes (the spin of a roulette wheel). With radical uncertainty, however, there is no similar means of resolving the uncertainty – we simply do not know. Radical uncertainty has many dimensions: obscurity; ignorance; vagueness; ambiguity; ill-defined problems; and a lack of information that in some cases but not all we might hope to rectify at a future date. Those aspects of uncertainty are the stuff of everyday experience.

Radical uncertainty cannot be described in the probabilistic terms applicable to a game of chance. It is not just that we do not know what will happen. We often do not even know the kinds of things that might happen. When we describe radical uncertainty we are not talking about ‘long tails’ – imaginable and well-defined events whose probability can be estimated, such as a long losing streak at roulette. And we are not only talking about the ‘black swans’ identified by Nassim Nicholas Taleb – surprising events which no one could have anticipated until they happen, although these ‘black swans’ are examples of radical uncertainty. We are emphasizing the vast range of possibilities that lie in between the world of unlikely events which can nevertheless be described with the aid of probability distributions, and the world of the unimaginable. This is a world of uncertain futures and unpredictable consequences, about which there is necessary speculation and inevitable disagreement – disagreement which often will never be resolved. And it is that world which we mostly encounter.

I won’t provide a list of their examples, but I gave one of my own on another thread (discussion of the London Metal Exchange trading of Nickel in March of this year).

David A. Freedman's rejoinder to the discussants of his classic "shoe leather" paper contains a similar assertion to Stark's (Stark and Freedman were collaborators, so no surprise).

For thirty years, I have found Bayesian statistics to be a rich source of mathematical questions. However, I no longer see it as the preferred way to do applied statistics, because I find that uncertainty can rarely be quantified as probability.

Source: Freedman (1991): A rejoinder to Berk, Blalock, and Mason. Sociological Methodology, 21: 353-358.

RE: Freedman’s quote.

For thirty years, I have found Bayesian statistics to be a rich source of mathematical questions. However, I no longer see it as the preferred way to do applied statistics, because I find that uncertainty can rarely be quantified as probability.

My view: In Feynman’s lectures, he described the scientific method simply:

  1. Guess at a formulation of a natural law
  2. Compute the consequences
  3. Design an experiment and compare the results to experience.

Why is there such an aversion to an educated "guess" at the form of the prior? It is a starting point on the path to inquiry.

(I’ll concede things are more complicated when real risk of loss is involved).

I'm not sure why this isn't done more often, but an initial experiment can be analyzed using frequentist theory, and a prior derived via the I.J. Good/Robert Matthews "reverse Bayes" technique can then guide later experiments.
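
As a sketch of the reverse-Bayes idea (my own rendering of Matthews' "analysis of credibility"; the formula comes from asking how skeptical a mean-zero normal prior must be to drag a just-significant result back to the null), one can compute the skeptical limit directly from a reported 95% interval:

```python
import math

def skeptical_limit(lo, hi):
    """Reverse-Bayes skeptical limit for a significant result with
    95% interval (lo, hi), both above the null value 0. It is the 95%
    bound of the mean-zero normal prior that just renders the posterior
    interval non-significant: with m = (lo+hi)/2 and 1.96*se = (hi-lo)/2,
    solving for the critical prior sd gives SL = (hi-lo)^2 / (4*sqrt(lo*hi)).
    For ratio measures, apply this to the log limits and exponentiate."""
    if not 0 < lo < hi:
        raise ValueError("requires 0 < lo < hi")
    return (hi - lo) ** 2 / (4 * math.sqrt(lo * hi))

# A barely-significant result is fragile: any skeptic whose prior 95%
# range is narrower than about +/-3.72 would not find it convincing.
print(skeptical_limit(0.2, 3.8))   # ~3.72
# A comfortably-significant result survives even strong skepticism.
print(skeptical_limit(2.0, 3.0))   # ~0.10
```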


As Richard Feynman said, “The first principle is that you must not fool yourself, and you are the easiest person to fool.”

As I already mentioned above, the team that produced the first image of a black hole knew this, and took extraordinary measures to stop their prior expectations from influencing the analysis of the data, resulting in increased credibility of the result. As the Wall Street Journal’s Sam Walker reported,

The chief threat to the EHT imaging team was the consensus that black holes look like rings. To build an algorithm that predicts what data might “look” like, they would have to make hundreds of assumptions. If they harbored any prejudice, even subconsciously, they might corrupt the formulas to produce nothing but rings.

For details, see Walker’s piece (which I cited above):

I take this aversion to probability and decision theory as letting the perfect be the enemy of the good, much like the nihilist who argues that because there is the possibility of error, no knowledge is possible.

The challenge of applying probability in the context of economics remains an active area of research. Notions like “incomplete markets” (PDF) and computational decision analysis are interesting…


Nobody on this thread has taken what @R_cubed characterizes as a "nihilist" position. Kay & King outline the arena in which probability modeling can fruitfully be used, as I quoted above, and will repeat in part here:

The most compelling extension of probabilistic reasoning is to situations where the possible outcomes are well defined, the underlying processes which give rise to them change little over time, and there is a wealth of historic information.

My own post from May 23 above bears excerpting as well:

Much of what constitutes ‘statistical thinking’ is probabilistic thinking, which has its place for sure, but a much smaller one than statisticians demand. I’m not saying to get rid of probability, just that it must take its place as merely one of several frameworks with which to understand the use of data in the scientific enterprise, and shouldn’t have a monopoly on it.

This is a call for humility and a broader perspective, not nihilism.

This still understates the scope of probability, IMHO. Even though probability doesn't apply to everything, it applies to most things.

The scope of probability described by Kay and King (see quote above) was called the class of "small world" problems by them; everything else belongs to the class of "large world" problems (or "grand world" in Savage's parlance). There are numerous successful applications of probability in small world problems, such as the kinetic theory of gases, quantum theory, modeling of fiber-optic and electronic communications signals, certain types of bacterial growth models, and of course games of chance. Would someone like to offer an example or two of a large world problem where probabilistic modeling/reasoning has been successful? Perhaps I can learn something from your examples.
