Necessary/recommended level of theory for developing statistical intuition

I came across this important article by Philip Stark, which I had previously read, on the use of statistical models in practice. He echoes the concerns expressed by @ChristopherTong

But not all uncertainties can be represented as probabilities.

I think this assertion requires a bit more elaboration. It seems to smuggle in the idea that one must have some physical model before invoking “probability” as a tool. Jaynes would call this the “mind projection fallacy” and takes substantial effort to ground probability as an expression of prior knowledge (not mere subjective belief).

I still highly recommend reading (and re-reading) this from time to time. I think his observation about mathematical models being used to “persuade and intimidate” rather than predict is close enough to the mark to give him credit for a bullseye hit on the target.


A physical model is not mandatory for the use of probability modeling and reasoning, though having one provides welcome additional insight.

Regarding “not all uncertainties can be represented as probabilities”, this can easily be shown. An example of a quantitative but non-stochastic uncertainty is the bound on the approximation error of a truncated Taylor series. In some applications this bound is treated as an uncertainty, though it is purely deterministic. There are other examples in approximation theory; I concede that some theorems in that field are probabilistic rather than deterministic, but certainly not all of them. (Taylor’s isn’t.)
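To make this concrete, the Lagrange form of the remainder gives a fully deterministic error bound for a truncated Taylor series. Here is a small check (my own illustration, not from Stark’s paper) for exp near x = 1:

```python
import math

def exp_taylor(x, n):
    """Degree-n Taylor polynomial of exp about 0."""
    return sum(x**k / math.factorial(k) for k in range(n + 1))

x, n = 1.0, 5
approx = exp_taylor(x, n)
actual_error = abs(math.exp(x) - approx)

# Lagrange remainder: |R_n(x)| <= max|f^(n+1)| * |x|^(n+1) / (n+1)!
# For exp on [0, x] the (n+1)-th derivative is at most e^x, giving a
# bound that holds with certainty -- no probability distribution anywhere.
bound = math.exp(x) * abs(x)**(n + 1) / math.factorial(n + 1)
```

Here the true error (about 1.6e-3) sits below the guaranteed bound (about 3.8e-3); treating the bound as an “uncertainty” involves no randomness at all.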

More interesting are uncertainties that cannot be quantified at all, of which Kay and King’s book Radical Uncertainty (discussed above) gives many examples. I cannot hope to summarize their arguments here, but a few quotes might give a flavor of where they are coming from.

The appeal of probability theory is understandable. But we suspect the reason that such mathematics was, as we shall see, not developed until the seventeenth century is that few real-world problems can properly be represented in this way. The most compelling extension of probabilistic reasoning is to situations where the possible outcomes are well defined, the underlying processes which give rise to them change little over time, and there is a wealth of historic information.


Resolvable uncertainty is uncertainty which can be removed by looking something up (I am uncertain which city is the capital of Pennsylvania) or which can be represented by a known probability distribution of outcomes (the spin of a roulette wheel). With radical uncertainty, however, there is no similar means of resolving the uncertainty – we simply do not know. Radical uncertainty has many dimensions: obscurity; ignorance; vagueness; ambiguity; ill-defined problems; and a lack of information that in some cases but not all we might hope to rectify at a future date. Those aspects of uncertainty are the stuff of everyday experience.

Radical uncertainty cannot be described in the probabilistic terms applicable to a game of chance. It is not just that we do not know what will happen. We often do not even know the kinds of things that might happen. When we describe radical uncertainty we are not talking about ‘long tails’ – imaginable and well-defined events whose probability can be estimated, such as a long losing streak at roulette. And we are not only talking about the ‘black swans’ identified by Nassim Nicholas Taleb – surprising events which no one could have anticipated until they happen, although these ‘black swans’ are examples of radical uncertainty. We are emphasizing the vast range of possibilities that lie in between the world of unlikely events which can nevertheless be described with the aid of probability distributions, and the world of the unimaginable. This is a world of uncertain futures and unpredictable consequences, about which there is necessary speculation and inevitable disagreement – disagreement which often will never be resolved. And it is that world which we mostly encounter.

I won’t provide a list of their examples, but I gave one of my own on another thread (discussion of the London Metal Exchange trading of Nickel in March of this year).

David A. Freedman’s rejoinder to the discussants of his classic shoe leather paper contains a similar assertion to Stark’s (Stark and Freedman were collaborators, so no surprise).

For thirty years, I have found Bayesian statistics to be a rich source of mathematical questions. However, I no longer see it as the preferred way to do applied statistics, because I find that uncertainty can rarely be quantified as probability.

Source: Freedman (1991): A rejoinder to Berk, Blalock, and Mason. Sociological Methodology, 21: 353-358.

RE: Freedman’s quote.

For thirty years, I have found Bayesian statistics to be a rich source of mathematical questions. However, I no longer see it as the preferred way to do applied statistics, because I find that uncertainty can rarely be quantified as probability.

My view: In Feynman’s lectures, he described the scientific method simply:

  1. Guess at a formulation of a natural law.
  2. Compute the consequences.
  3. Design an experiment and compare the results to experience.

Why is there such an aversion to an educated “guess” at the form of the prior? It is a starting point on the path to inquiry.

(I’ll concede things are more complicated when real risk of loss is involved).

I’m not sure why this isn’t done more often, but an initial experiment can be analyzed using frequentist theory, and a prior derived via the I. J. Good/Robert Matthews reverse-Bayes technique can then guide later experiments.
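To make the reverse-Bayes idea concrete, here is a minimal sketch in the normal-conjugate setting (my own simplification, in the spirit of Matthews’ Analysis of Credibility rather than his exact formulas): given a “significant” frequentist estimate and its standard error, solve for the critical zero-centered skeptical prior under which the 95% posterior interval just touches zero.

```python
import math

def critical_prior_sd(estimate, se, z=1.96):
    """
    Reverse-Bayes sketch (normal-normal conjugate model): find the sd
    tau of a zero-mean normal prior such that the 95% posterior
    interval for the effect just touches zero. Any more skeptical
    prior (smaller tau) would render the result 'non-credible';
    wider priors leave it intact. Solved by bisection.
    """
    def posterior_lower(tau):
        w = tau**2 / (tau**2 + se**2)   # shrinkage weight on the data
        post_mean = w * estimate
        post_sd = math.sqrt(w) * se     # = sqrt(tau^2 se^2 / (tau^2 + se^2))
        return post_mean - z * post_sd

    lo, hi = 1e-9, 100.0 * se           # posterior_lower is increasing in tau
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if posterior_lower(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# A hypothetical 'significant' result: estimate 2.0, SE 0.9 (z about 2.2).
tau_crit = critical_prior_sd(estimate=2.0, se=0.9)
```

Any zero-centered prior with sd below `tau_crit` overturns the apparent significance a posteriori; that critical prior can then inform the design of the next experiment.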


As Richard Feynman said, “The first principle is that you must not fool yourself, and you are the easiest person to fool.”

As I already mentioned above, the team that produced the first image of a black hole knew this, and took extraordinary measures to stop their prior expectations from influencing the analysis of the data, resulting in increased credibility of the result. As the Wall Street Journal’s Sam Walker reported,

The chief threat to the EHT imaging team was the consensus that black holes look like rings. To build an algorithm that predicts what data might “look” like, they would have to make hundreds of assumptions. If they harbored any prejudice, even subconsciously, they might corrupt the formulas to produce nothing but rings.

For details, see Walker’s piece (which I cited above):

I take this aversion to probability and decision theory as letting the perfect be the enemy of the good, much like the nihilist who argues that because there is the possibility of error, no knowledge is possible.

The challenge of applying probability in the context of economics remains an active area of research. Notions like “incomplete markets” (PDF) and computational decision analysis are interesting…


Nobody on this thread has taken what @R_cubed characterizes as a “nihilist” position. Kay & King outline the arena in which probability modeling can fruitfully be used, as I quoted above, and will repeat in part here:

The most compelling extension of probabilistic reasoning is to situations where the possible outcomes are well defined, the underlying processes which give rise to them change little over time, and there is a wealth of historic information.

My own post from May 23 above bears excerpting as well:

Much of what constitutes ‘statistical thinking’ is probabilistic thinking, which has its place for sure, but a much smaller one than statisticians demand. I’m not saying to get rid of probability, just that it must take its place as merely one of several frameworks with which to understand the use of data in the scientific enterprise, and shouldn’t have a monopoly on it.

This is a call for humility and a broader perspective, not nihilism.

This is still understating the scope of probability IMHO. Even though probability doesn’t apply to everything it applies to most things.

The scope of probability described by Kay and King (see quote above) is what they call the class of “small world” problems; everything else belongs to the class of “large world” problems (or “grand world” in Savage’s parlance). There are numerous successful applications of probability to small world problems, such as the kinetic theory of gases, quantum theory, modeling of fiber optic and electronic communications signals, certain types of bacterial growth models, and of course games of chance. Would someone like to offer an example or two of a large world problem where probabilistic modeling/reasoning has been successful? Perhaps I can learn something from your examples.



  1. Given that too many areas of science are plagued by improper use and understanding of frequentist procedures, and have been for close to 100 years now, I think there remains large scope for what you describe as “small world” applications of probability rephrased in the language of information theory. In economic terms, large areas of scientific inquiry are well below the “efficient frontier” in terms of information synthesis. [1]

  2. I fail to see why a large class of problems described by the term “Knightian uncertainty” isn’t covered by the frequentist theory of minimax decision rules. [2][3]

  3. Some problems might require conservative extensions to probability theory. In mathematical logic, the program of Reverse Mathematics takes the theorems of classical mathematics as given and searches for the axioms needed to prove them; a number of subsystems of varying logical strength have been discovered. Likewise, there are a number of proposals for extending probability theory, e.g., I. J. Good’s dynamic probability or Richard Jeffrey’s probability kinematics. [4]

  4. I expect the R. T. Cox/E. T. Jaynes approach to Bayesian analysis to advance to the point where it can handle nonparametric problems as easily as frequentist methods do. Combining Cox-Jaynes with the Keynes/Good/Jeffreys/Walley notion of interval probabilities leads to Robust Bayesian Inference, as noted in [5].
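As a toy illustration of point 2: under Knightian uncertainty we have no probabilities over the states of the world, yet a Wald-style minimax rule remains perfectly well defined. All actions, states, and losses below are hypothetical.

```python
# Rows: actions available to a decision maker; columns: states of the
# world, over which no probability distribution is assumed to exist.
losses = {
    "hedge": [2, 2, 2],   # pay a fixed cost regardless of state
    "hold":  [0, 1, 9],   # cheap unless the bad state occurs
    "sell":  [4, 0, 3],
}

def minimax_action(loss_table):
    """Choose the action whose worst-case loss is smallest."""
    return min(loss_table, key=lambda a: max(loss_table[a]))

best = minimax_action(losses)  # -> "hedge": worst case 2, vs 9 and 4
```

No prior over states is needed anywhere in the calculation, which is exactly why minimax rules are a natural candidate for Knightian settings.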

Further Reading

  1. Anything by @Sander and colleagues on the misinterpretation of P values.
  2. Kuzmics, Christoph, Abraham Wald’s Complete Class Theorem and Knightian Uncertainty (June 27, 2017). Available at SSRN here. Also peer reviewed and published here.
  3. David R. Bickel, “Controlling the degree of caution in statistical inference with the Bayesian and frequentist approaches as opposite extremes,” Electronic Journal of Statistics, 6: 686-709 (2012). link
  4. Good, I. J. “The Interface Between Statistics and Philosophy of Science.” Statistical Science 3, no. 4 (1988): 386–97. link
  5. Stefan Arnborg, “Robust Bayesian analysis in partially ordered plausibility calculi,” International Journal of Approximate Reasoning, Volume 78, 2016, Pages 1-14, ISSN 0888-613X. link

Fascinating discussion!
A related blog post on Ramsey vs Keynes is at Syll’s blog:

which links to previous critiques of Bayes there, including most recently


With regard to @R_cubed’s comment on information-theoretic formulations of probability being deployed in “small world” problems, as I mentioned above (5/18/22 post), I agree that the “maximum entropy” approach is intriguing and worthy of consideration, and I would be open to seeing more (and judicious) attempts to use it for “small world” problems.

I would still like to see some concrete “large world” examples of successful probabilistic reasoning introduced to this thread. As @f2harrell said, “probability doesn’t apply to everything” but “it applies to most things”, so it shouldn’t be difficult to locate examples (though my personal biases may be interfering with my ability to think of one). These examples may help to show in what ways the views I offered above are poorly-baked or just plain wrong.

Thanks to @Sander_Greenland for some thought-provoking links.


I’m afraid I may be misunderstood here. I’m prepared to accept that my views, enunciated above, are too pessimistic, but only via discussion of concrete examples, not conjecture. I’ve discussed several real examples on this thread that further my perspective, but naturally those have been “cherry picked” (proton mass, black hole image, London Metals Exchange trading of Nickel in March, 2022). Where are the examples that contradict my pessimism?

RE: LME example. It isn’t clear what you are looking for. I can’t think of anyone other than a high-level LME member who would have enough information to assign a posterior probability of default over 0.5.

The LME has been in existence for just over 100 years. This isn’t the first time the exchange has closed. It is now owned by Hong Kong Exchanges and Clearing, a Chinese firm.

I’ve never worked in risk management, but I have a basic idea of how risk managers think. Shrewd risk managers might not have actually predicted a closure, but by monitoring prices of various securities and world news, they would still have been able to act: diverting trades to other exchanges, halting trading on that exchange, or even taking short positions in LME securities (e.g., put options on a more stable exchange, a synthetic short, an OTC credit default swap (CDS), etc.).

Your fundamental premise is debatable, i.e., that there exist good public-domain examples of probabilistic reasoning put to good use. Any real-world use almost certainly takes place in a zero-sum market context, where information that is useful but not widely known is valuable.

I’ll post some more thoughts on this in the thread I created on Stark’s paper.


I say for a second time (note 1), a fundamental point has been missed. Canceling 8 hours of legitimate trades is unprecedented. Period. This is a public domain example with a century of historical data, as @R_cubed noted. Yet the inconceivable still happened. What other inconceivable events might have happened but didn’t, and what priors should be assigned to them? How large should the probability space be to accommodate the infinite number of “moon made of green cheese” outcomes that must therefore be considered, even if unlikely? I submit it as an example of what Kay and King call a “large world” problem, where probabilistic reasoning would have been helpless.

This is not one of their examples, as it occurred 2 years after their book was published. However, their examples (e.g., their discussion of the financial crisis of 2008) include ones that played out in public view. If “probability applies to most things” (as observed by @f2harrell ) it should apply to most public-domain decisions too.

(1) See

I say for a second time (note 1), a fundamental point has been missed. Canceling 8 hours of legitimate trades is unprecedented. Period.

It is certainly rare, but not unprecedented. Assessing counterparty risk is part of the business. All that glitters isn’t gold.

Contracts were cancelled in 1985 during the “Tin crisis”.

The London Metal Exchange is owned by a Chinese business entity. War has restricted the supplies of nickel, driving up the price. Circumstances have the U.S. and Chinese government as adversaries. That alone should indicate to anyone sensible that:

Past performance is no guarantee of future results.

I do recall a time not all that long ago when the U.S. government banned short sales of financial institutions during a crisis.

So much for “price discovery” when the insiders don’t want such things “discovered.”


Excellent! I was not aware of the tin crisis. However, the article cited states that:

Open contracts struck at the high prices quoted on the LME at the time trading was suspended were settled with reference to the much-lower prices that were soon seen afterward in the physical market.

This is quite different from just canceling 8 hours of trades. My reading of the above is that open contracts at the time of the halt were re-priced. Thus, I stand by my assertion that the events of March 2022 at the LME nickel trading market were unprecedented. The market participants quoted in the article clearly did not see it coming. And “past performance is no guarantee of future results” is one facet of the general critique being made by K^2, as this dictum applies to “large world” problems. Market participants can try to anticipate unprecedented events using knowledge of context and global market conditions, as @R_cubed rightly notes. The only question is, do they use a probabilistic framework, or are they acting more heuristically? (And in this case it seems clear that whatever framework they used, the actual events of that day took many of them completely off guard.)

The U.S. futures markets use a sophisticated, quantitative risk assessment system known as SPAN (Standard Portfolio Analysis of Risk). U.S. stock markets use different guidelines.

Historical data (variance estimates and inter-market correlations) are used to estimate the possible worst case loss for a portfolio (usually 1 day look ahead) in order to calculate margin requirements. So those who get paid to keep exchanges open use heavily quantitative methods.

SPAN always struck me as a well thought out system that was good for traders but protected the market.
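The scenario-scan idea behind SPAN-style margining can be sketched as follows (a simplified illustration with made-up numbers; the real SPAN system scans a standardized set of price and volatility scenarios per contract and adds many refinements):

```python
# Toy sketch of scenario-based margining in the spirit of SPAN:
# revalue the portfolio under a grid of price shocks and charge
# margin equal to the worst simulated one-day loss.

def portfolio_value(price):
    """A hypothetical position: long 10 futures bought at 100."""
    return 10 * (price - 100.0)

def scan_margin(current_price, scan_range=0.15, steps=7):
    """Worst loss over symmetric price scenarios up to +/- scan_range."""
    base = portfolio_value(current_price)
    worst = 0.0
    for i in range(-steps, steps + 1):
        shocked = current_price * (1 + scan_range * i / steps)
        loss = base - portfolio_value(shocked)
        worst = max(worst, loss)
    return worst

margin = scan_margin(100.0)  # worst case: price falls 15% -> loss of 150
```

For this linear toy position the worst case sits at the edge of the scan range; the scenario-grid machinery earns its keep with options, whose losses are nonlinear in both price and volatility.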

Superb, this discussion is taking a very productive turn!

According to the CME, SPAN is a Value at Risk (VaR) based system.

While I am not immediately aware of failures of SPAN in US options trading (though I will now start looking for them), the use of VaR was implicated in the financial crisis of 2008. For instance, supposedly the failure of Northern Rock shouldn’t have happened based on a VaR analysis (this is an example discussed by Kay and King). K^2 write that

Although value at risk models may be of some use in enabling banks to monitor their day-to-day risk exposures, they are incapable of dealing with the ‘off-model’ events which are the typical cause of financial crises.

Riccardo Rebonato (I thank @R_cubed for pointing me to this author) states that VaR analysis is an “interesting and important measure” of risk “when nothing extraordinary happens”. But of course in our lifetimes markets have experienced “extraordinary” events on multiple occasions…because financial markets are actually “large world” arenas. And as we’ve seen, regulators in 2008 took a very improvisational approach to handling such crises…as far as I know, they did not rigidly follow the prescription of some probability-based framework (and thus, their behavior could not be so modeled by other market participants).

It seems that K^2 and Rebonato are saying that VaR has its honored place in risk modeling, but cannot be depended upon to fully assess risk when market conditions are rapidly becoming non-stationary, which is precisely a “large world” issue. VaR works best when the “large world” system is behaving (temporarily) in a “small world” fashion.
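For concreteness, the historical-simulation flavor of VaR amounts to little more than an empirical quantile of past losses (a minimal sketch with made-up returns; production implementations add scaling, weighting, and backtesting). The sketch also makes the K^2 critique visible: the measure can only see what lies inside its historical window.

```python
# Historical-simulation VaR: the loss threshold exceeded on only
# (1 - confidence) of past days. The daily returns below are made up.
def historical_var(returns, confidence=0.99):
    losses = sorted(-r for r in returns)   # losses, ascending
    idx = int(confidence * len(losses))    # empirical quantile index
    return losses[min(idx, len(losses) - 1)]

daily_returns = [0.01, -0.02, 0.003, -0.015, 0.007, -0.03, 0.012,
                 -0.001, 0.004, -0.008]
var_99 = historical_var(daily_returns)  # -> 0.03, the worst loss in the window
```

Note that `var_99` here simply equals the worst loss already observed: a crash larger than anything in the sample is, by construction, invisible to the measure, which is exactly the “off-model” blind spot K^2 describe.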

I don’t know of anyone who thinks financial markets are perfectly modeled by simple probability distributions, or that VaR is a complete methodology.

It is fundamental to me that markets are incomplete in the Arrow-Debreu sense: there exist more risks than there are securities to hedge them.

It is also an interesting question whether these financial “innovations” end up increasing risk rather than simply transferring it (i.e., moral hazard).

A very thorough (open source) text on VaR methods (with utility to other areas of applied stats) can be found here:

My biggest objection (which I will elaborate on later) is their recommendation to compare and contrast different “narratives.” This entire notion of “narratives” has created a cargo cult of “experts” advising representatives who, rather than communicating information to the constituents they are supposed to represent, manipulate them in the fashion of Thaler’s “nudges.”

A detailed critique can be found here:

I essentially agree with this clip from an FT review

The authors of this book belong to the elite of British economic policymaking …Their alternative to probability models seems to be, roughly, experienced judgment informed by credible and consistent “narratives” in a collaborative process. They say little about how those exercising such judgment would be held to account.