Significance tests, p-values, and falsificationism

You cannot productively discuss statistical procedures without some comprehension of the mathematics behind them, which often requires algebra and maybe a bit of calculus.

Richard Feynman, in discussing the relationship of math to physics, quoted Euclid, who said “There is no royal road” (i.e., no easy way) to learn geometry or physics. There is no royal road to learning statistics without mathematics, either.


That’s certainly a relevant question. There are two ways of answering it, one negative and one positive. The negative way (‘what falsificationism isn’t’) can be seen in my reference [1]. Misconceptions about the idea abound, and nobody should be talking about it without at least making very clear how their use of the term and the idea corresponds to what Popper’s notion actually amounts to. Because very often, the people using the term don’t, in fact, know that.

If I had to give a positive sort of definition of the term, I’d say what I said in the OP: that it’s a methodology for learning from experience given that induction doesn’t work and that all our efforts are fallible. This includes a clear-eyed view of the aforementioned asymmetry between verification and falsification: the insight that singular statements may be verified or falsified, but that universal statements can only be falsified (always assuming valid logic). It extends into the realm of what Popper called “methodological rules”, which necessarily have to complement the purely logical analysis. (Cf. §11 of LoSD)

Of course, that doesn’t restrict anyone’s ability to call themselves (or somebody else) a “falsificationist”. That usually (in practice) just means that somebody endorses one or another of the elements of falsificationism as I defined it above. In that sense, Fisher would be a falsificationist for underlining the mentioned logical asymmetry: “Every experiment may be said to exist only in order to give the facts a chance of disproving the null hypothesis.”

Let’s see, I wrote this: " ‘A single p-value doesn’t mean anything’ is false if taken literally, and embodies a confusion I find prevalent among P-values critics who fail to distinguish math objects from their various interpretations."
-that makes no mention of you personally; my sentence’s topic was instead the quoted statement and noted a general confusion the statement invites. I then explained what I meant at painstaking length. To which you replied that my explanation (which encapsulates published material) did “lengthily show off how clever we are without ever trying to find out what the other person actually means” - that’s just nasty in a direct personal way, especially when it would have sufficed to explain how I missed your intended meaning.

Regardless, you really ought to try and understand the math and logic of a criticism before you go off on the source challenging your claims. And you really ought to go back and read the literature to see how the issues you raise have been raised repeatedly and debated at length for at least a half century, without convincing any of us to become converts to falsificationism, refutationism, or whatever you want to call the philosophy you promote, even though we all agree it can be a useful perspective at times and has some wisdom worth noting. For just a small sample from epidemiology see
Buck C. Popper’s philosophy for epidemiologists. Int J Epidemiol 1975;4:159–168
Jacobsen M. Against Popperized epidemiology. Int J Epidemiol 1976;5:9–11.
Maclure M. Popperian refutation in epidemiology. Am J Epidemiol 1985;121:343–50.
Susser M. The logic of Sir Karl Popper and the practice of epidemiology. Am J Epidemiol 1986;124:711-718.
Eells E. On the alleged impossibility of inductive probability. Br J Phil Sci 1988;39:111–16.
Greenland S. Probability versus Popper: An elaboration of the insufficiency of current Popperian approaches for epidemiologic analysis. In: Rothman KJ (ed.). Causal Inference. Chestnut Hill, MA: ERI, 1988.
Susser M. Falsification, verification and causal inference in epidemiology: reconsideration in light of Sir Karl Popper’s philosophy. In: Rothman KJ (ed.). Causal Inference. Chestnut Hill, MA: ERI, 1988, 33-58.
Pearce N, Crawford-Brown D. Critical discussion in epidemiology: problems with the Popperian approach. J Clin Epidemiol 1989;42(3):177-84.
Karhausen LR. The poverty of Popperian epidemiology. Int J Epidemiol 1995;24:869–74.
as well as the Papineau article that Pedro cited,

and the two articles of mine I cited on Twitter:
Greenland S. Induction versus Popper: substance versus semantics. Int J Epidemiol 1998;27:543–548.
Greenland S. Probability logic and probabilistic induction. Epidemiology 1998;9:322–332.

The 1989 Pearce+Crawford-Brown article sums up the position of the unconvinced thusly: "The recent Popperian ‘trend’ has a positive aspect in that it has fostered deductive thinking, and exposed the shortcomings of induction. However, the restrictive Popperian framework actually inhibits discussion despite its veneer of ‘critical discussion’ ”. Indeed; a third of a century later it remains so, and it is remarkable how those who rally under the ideals of Popper have often done so by pairing the stated goal of critical rationalism with contemptuous dismissal of any criticism that is not framed and easily addressed in their terms.

This aggrieved sense of being unfairly ‘restricted’ by mostly unnamed oppressors (and indeed within the context of a ‘trend’) invites comparison with [1]. If calls for ecumenicism/pluralism are valid against criticism by the ‘big meanie’ Popper and his hordes of ‘little meanies’, why don’t they also apply against @ADAlthousePhD and comrades? What protects us from a post-hoc-power pluralism?

Recently I have resumed my old habit of ~~worshipping~~ reading Popper in the mornings. Chapter 15 in Conjectures & Refutations (CR) republishes [2], which has a wonderful passage I’ll quote at some length:

Dialecticians say that contradictions are fruitful, or fertile, or productive of progress, and we have admitted that this is, in a sense, true. It is true, however, only so long as we are determined not to put up with contradictions, and to change any theory which involves contradictions; in other words never to accept a contradiction: it is solely due to this determination of ours that criticism, i.e. the pointing out of contradictions, induces us to change our theories, and thereby to progress.

It cannot be emphasized too strongly that if we change this attitude, and decide to put up with contradictions, then contradictions must lose any kind of fertility. They would no longer be productive of intellectual progress. For if we were prepared to put up with contradictions, pointing out contradictions in our theories could no longer induce us to change them. In other words, all criticism (which consists in pointing out contradictions) would lose its force. Criticism would be answered by “And why not?” or perhaps even by an enthusiastic “There you are!”; that is, by welcoming the contradictions which have been pointed out to us.

(I ask you: Is there a more delightful expression than “There you are!” for capturing the spirit of Chang & colleagues’ posthocpowerism?) :joy:

How do we plead for pluralism without abandoning this basis for progress? In various places, Popper identifies as obscurantist certain metaphysical attitudes that hold back progress. For example, in “Three Views Concerning Human Knowledge” (CR, Ch.3), he attacks Cotes’s essentialist view of Newton’s gravitational theory as such:

That it was obscurantist is clear: it prevented fruitful questions from being raised, such as ‘What is the cause of gravity?’ or more fully, ‘Can we perhaps explain gravity by deducing Newton’s theory, or a good approximation of it, from a more general theory (which should be independently testable)?’

No other word so fully captures my own sense of Biostatistics’ failure to make progress in dose-finding methods, which is the area of my most intense engagement with the discipline. As much as I would like to analyze this obscurantism solely as a philosophical problem, such that it opens the door to rational persuasion, I have ultimately come around to a view somewhat like yours here, Sander, that a more comprehensive sociological (incl. economic) outlook is needed:

  1. Chang DC, Stapleton SM. Response: The Proliferation and Misinterpretation of “As Safe As” Statements in Surgical Science: A Call for Professional Discourse to Search for a Solution. Journal of Surgical Research. Published online August 2020:S002248042030500X. doi:10.1016/j.jss.2020.03.074

  2. Popper K. What is Dialectic? Mind. 1940;49(196):403-426.

Speaking as an interested observer not trained in philosophy, my current take on the cumulative discussion to date is that falsification is an interesting and useful concept in the abstract but not in the doing of science. I view this much as I view causal inference when not applied to randomized experiments: interesting, and useful for infinite sample sizes, but not as useful as it appears once real finite datasets are involved.


What about criticism, though? It seems to me that the crucial contribution you make in this post is to propose a variable selection method that embeds a critical principle. You have approached the variable selection problem from the perspective of fallibilism (another name for Popper’s critical rationalism — cf. RAS p.xxxv), and advanced a method that objectively reminds the user “but you could be wrong!”


I used to have an interest in (academic) philosophy of science; I’d say that I share Sander’s taste for Feyerabend’s pluralism. The parts that are still interesting for the philosophically inclined are Foundations of Mathematics (or Meta-Mathematics) and mathematical logic itself, but some mathematicians who work in foundations are less interested in logic in favor of things like combinatorics (like Doron Zeilberger from Rutgers).

An interesting philosophical discussion involves the relative merits of Zermelo-Fraenkel set theory with the Axiom of Choice (a.k.a. ZFC) and the law of excluded middle (LEM), versus varieties of constructive logic that do not assume LEM. Since Gödel, an important distinction is made between what is “true” and what is “provable”, which gets to the heart of the Popper quote above.

The Curry-Howard-Lambek correspondence (the mapping of math proofs to algorithms) addresses the notion Popper referred to above.
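For readers who have not met the proofs-as-programs idea, here is a minimal Lean 4 sketch of it. It is only an illustration and is not drawn from anything cited in this thread; the names `andSwapSketch` and `pairSwapSketch` are made up for the example. The point is that the proof of a proposition and a program on data can have the same shape.

```lean
-- Proposition level: a proof that "A and B" implies "B and A".
-- The proof term is a function that takes the evidence apart and reassembles it.
theorem andSwapSketch (A B : Prop) : A ∧ B → B ∧ A :=
  fun ⟨a, b⟩ => ⟨b, a⟩

-- Type level: the same term shape, now read as a program that swaps a pair.
def pairSwapSketch (α β : Type) : α × β → β × α :=
  fun (a, b) => (b, a)
```

Neither definition uses the law of excluded middle, which is why constructive proofs of this kind can be read directly as algorithms.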

Much of science has had its start in questions philosophers asked. If we took much of the work of the logical positivists and substituted “decidable” for “verifiable” or “falsifiable”, we would get the essentials of computational theory.

The mathematical tools of game theory are used in logic as they have been in statistical decision theory. IIRC the logician Jaakko Hintikka was the editor of a scholarly journal where Jerzy Neyman published an important paper on the applications of hypothesis tests. Despite being written in 1977, it is still worth reading.


I would submit that you should never take anything seriously that Papineau says or has said about Popper. He simply doesn’t know what he’s talking about and will misrepresent Popper’s ideas as egregiously as anyone. You want examples? You quoted some:

“Popper creates the impression that all scientists…are creative visionaries”. No, he actually didn’t. Not only can’t Papineau distinguish a normative framework from a description of science in practice, he also doesn’t understand that Popper never said that scientists had to consciously want to overthrow theories. That’s a fabrication by people who never bothered to read the actual text (of LoSD in this case) and much rather ran with their prejudices.

“There would be no point to science unless its conjectures sometimes acquired enough inductive evidence to graduate to the status of established truths.” I mean, where do you even start? It’s ignorant of what Popper argued should be the aim of science, it’s ignorant of the way Popper explained knowledge can grow in a falsificationist methodology, it’s ignorant of how Popper’s view necessarily involves choice between competing theories…


Which is, of course, a view that quite a few people have expressed over the years—and overwhelmingly, they have never seen the need to bother finding actual arguments for it that others would be able to criticise. A few of you saw a prime specimen of that on Twitter (and liked it, completely unsubstantiated as it was and remained).

So let me ask you: what are the actual arguments that falsificationism is not useful in the doing of science?

Funny, when that is precisely what you refuse to do.

You still haven’t engaged with a single thing I was talking about. (And lest somebody make the ridiculous charge of “dogma” about that again, let me add: If you think I’m mistaken in that statement, please point to a counter-example and I’ll happily discuss it.) I have expanded on all of my points and answered a few questions. There should now be enough material for anyone actually interested in good-faith discussion to find something to take me to task on, point to counter-evidence, dispute the helpfulness of definitions, or question an interpretation of what Popper’s ideas are actually about. I’m looking forward to that.

The indirectness doesn’t match how humans learn, which is mostly by continual updating without invoking a mechanism to reject hypotheses. Human activities, including scientific learning, are mostly about decisions, not inference. We act as if something is true whether it is in actuality true or not. We play the odds.


As I said in our Twitter prelude to our present debacle:

I then suggested the present blog venue and proceeded to write at length criticizing your initial set of points. You took that personally and responded with no positive clarifications or patient elaborations as a blog-discussion leader should make. Instead you supplied only contemptuous dismissal because I used statistical geometry to describe one fine point and used the phrase “Pope Popper”, an apparent blasphemy worthy of personal insult if not beheading in response. So I responded in kind, noting your failure to engage the criticisms I raised as evidence of your knowledge gaps in statistics and data analysis (which this blog is about), unsurprising gaps given your background:
Dr. Peter Monnerjahn | JKU Linz (click on CV)
Publications and Research | JKU Linz

My pointing to that is no different than your implying your opponents are ignorant of details of Popper and falsificationism - which is true enough in some instances. But where that is true your job should be as a patient instructor, not a hectoring advocate. Instead you simply refused to accept any responsibility for your own nasty postings (which one colleague - not anyone posting here or on Twitter - has described as “quick-draw posturing, condescending trolling and ridiculous dick-wagging”).

I then responded at length to Norris’s comments with my neoKuhnian views. You haven’t engaged with that lengthy response either. I’m happy to debate bona fide scholars like Norris who not only are well-read and experienced in real medical research, but foremost will patiently read and try to understand my lengthy comments before explaining in return precisely where they think I missed something or how Popper addressed my concerns. Take a lesson.

BTW, since you were so approving of Senn’s tweets listing Popper books he had read, here’s my list: The Logic of Scientific Discovery; Conjectures & Refutations (in some ways my favorite); his Postscript to LSD; Unended Quest; The Open Society and its Enemies (which Feyerabend and Lakatos called “The Open Society By One of its Enemies”); and miscellaneous papers. That was about my fill as, like some others, I found Popper quite limited (as all individual writers must be), and allocated time to read other viewpoints as seriously and at length (e.g. Russell, Wittgenstein, Kuhn, Bartley, Feyerabend, Giere, van Fraassen, Quine, Lewis, and de Finetti). And having had Popper’s former student and subsequent apostate Feyerabend for phil sci was superb preparation for battling the Popperian devotees I was soon to encounter - see the citation list I posted earlier.

Nonetheless, not being as versed in Popper as those devotees, I see they have a purpose (much like religious scholars) in preserving his legacy, and I am grateful for quotes directly from Popper that are pertinent to a discussion. I believe that some study of Popper is to be recommended to all who would be literate in philosophical foundations of scientific activity - but not at the expense of missing all the other fine thinkers of the past century. So his scholars do a service in creating their favorite selections (like Miller did) even though some may do a disservice in failing to respectfully and accurately accommodate alternative viewpoints (which frankly I think Popper himself did in his treatment of Bayesian viewpoints).

Finally, I post as a retirement hobby, so I don’t have to respond to comments which are at the level of an ISIS attack. But I will respond in kind if that’s what I’m confronted with. If instead you want to engage constructively, start with reading and responding logically and helpfully to what I’ve already posted here.


“Human activities, including scientific learning, are mostly about decisions, not inference. We act as if something is true whether it is in actuality true or not.”
-You may realize that characterization can be found in Neyman’s behavioristic statistics. I found more in common between Neyman and de Finetti than between either and Fisher; indeed, Neyman hosted de Finetti in the Berkeley symposia which helped spread de Finetti’s ideas in the Anglophone world. And among them all one can find passages which can be read as falsificationist in sentiment; Neyman was simply the most formally so.


“Popper creates the impression that all scientists…are creative visionaries”. No, he actually didn’t. Not only can’t Papineau distinguish a normative framework from a description of science in practice (…)

Papineau’s point is both descriptive and normative. He doesn’t conflate the two. As a matter of fact, not all scientists are visionaries putting forward extremely bold and falsifiable theories. That’s the descriptive claim. But that’s also perfectly fine because “speculative research is not the only kind of science, or even the most important kind.” That’s the normative claim. Some scientific theories are modest; they are not as bold and falsifiable as universal generalizations. We wouldn’t be able to even apply science without these “smaller” theories, so they are of utmost importance. Popper’s work mostly ignores this and it’s not clear how it should fit in his philosophy.

he also doesn’t understand that Popper never said that scientists had to consciously want to overthrow theories.

Why doesn’t he understand that? Can you elaborate? Yes, Popper doesn’t think falsification is easy. Sometimes we need to protect theories from falsification, even with ad hoc hypotheses (I don’t have the source right now, but he uses the introduction of neutrinos as a good example of an ad hoc hypothesis used to save a theory). But that’s why the usual “black swan” example people use to illustrate the verification/falsification asymmetry is so misleading. First, scientific theories are not so simple. Second, even in this simple example, you may be wrong about having observed a black swan (perhaps it was some other animal, for example). But, without induction, how can we choose which “basic statements” (e.g. “there’s a black swan x at place d”) to accept? Popper does give some methodological rules (the criterion needs to be intersubjective, public, etc), but he’s left with the absurd conclusion that we can never know which basic statements are true/probably true/reasonable. It’s all an act of free decision on the part of the scientific community. Not only can we never verify theories, we can never falsify them in any normal sense of the word (again, this is much more radical than the boring claim that we can be wrong, which everyone accepts).

One of the greatest appeals of Popper’s philosophy is that falsification is easy and verification is hard/impossible. But Popper himself concedes that this is not the case. (It’s not that there’s no asymmetry between verification and falsification. Carnap, Savage and others also thought that very precise universal generalizations had zero probability of being exactly true. I’m just saying the asymmetry is overstated.)

“There would be no point to science unless its conjectures sometimes acquired enough inductive evidence to graduate to the status of established truths.” I mean, where do you even start? It’s ignorant of what Popper argued should be the aim of science, it’s ignorant of the way Popper explained knowledge can grow in a falsificationist methodology, it’s ignorant of how Popper’s view necessarily involves choice between competing theories…

If you don’t start, I’ll never know what you mean.

I’m not sure I understand this correctly. Do you think your “indirectness” corresponds to what David Colquhoun says here?

The problem is that the p-value gives the right answer to the wrong question. What we really want to know is not the probability of the observations given a hypothesis about the existence of a real effect, but rather the probability that there is a real effect – that the hypothesis is true – given the observations. And that is a problem of induction.
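To make the distinction in that quote concrete, here is a minimal sketch of the Bayes' theorem arithmetic that separates the two probabilities. It is not taken from Colquhoun or from anyone in this thread: the function name and its inputs (`prior`, `power`, `alpha`) are hypothetical, and it conditions only on a result being "significant at level alpha" rather than on the exact p-value observed.

```python
def prob_real_effect_given_significant(prior: float, power: float, alpha: float) -> float:
    """P(real effect | test is significant), by Bayes' theorem.

    prior: assumed P(real effect) before seeing data (an assumption, not data)
    power: P(significant | real effect)
    alpha: P(significant | no real effect)
    """
    for name, value in (("prior", prior), ("power", power), ("alpha", alpha)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be a probability in [0, 1]")
    true_positive = prior * power            # real effect and significant
    false_positive = (1.0 - prior) * alpha   # no effect but significant anyway
    if true_positive + false_positive == 0.0:
        raise ValueError("a significant result has probability zero under these inputs")
    return true_positive / (true_positive + false_positive)


if __name__ == "__main__":
    # Illustrative inputs only: 10% of tested hypotheses real, 80% power, alpha = 0.05.
    posterior = prob_real_effect_given_significant(prior=0.1, power=0.8, alpha=0.05)
    print(f"P(real effect | significant) = {posterior:.2f}")
```

With these made-up inputs the result is 0.64, well short of what a reading of alpha as "the chance the effect is not real" would suggest. The answer moves with the assumed prior, which is where the problem of induction enters.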

[1] And I didn’t say he did. I said he couldn’t distinguish them, and I meant that his criticism is that Popper’s view doesn’t fit reality (ie he takes it to be descriptive), while Popper’s view is explicitly normative. Popper additionally went to some lengths to show that a lot of good science actually fits his model, but that’s a separate point.

[2] It certainly doesn’t ignore it; Papineau is simply wrong there. Popper is well aware that not every scientist walks around wondering what theory to falsify next; he talks about the underlying logic of theories that are open to refutation in principle (specifically: more open than their competitors). And it’s also rather clear how modest theories could fit into his philosophy, eg in the form of “degrees of falsifiability”.

[3] Because he talks about it as though Popper has stipulated a psychological attitude where falsification of your theory had to be in your mind all the time. That’s a complete misrepresentation of what Popper said. a) He even said it was important for our theories to be held “dogmatically” for some time (a most unnecessarily unfortunate choice of word, if you ask me), so that their logical consequences could be worked out in detail without the theory being given up prematurely. b) He talks about two things: the logical properties of theoretical systems on the one hand and methodological rules on the other.

[4] How is he left with that conclusion? I don’t think I know what you’re referring to there. I can say, though, that of course we can never say for certain which basic statements are true. That would imply infallibility. But why would it matter anyway? Valid logic can’t use arguments with only basic statements as premises in any case. What do we do with that logic, then? Notturno has an answer for that:

[Logic] cannot force us to accept the truth of any belief. But it can force us, if we want to avoid contradicting ourselves, to reexamine our beliefs…

I did start and gave three examples. :slight_smile:

Papineau gives that statement as a counterpoint to Popper—as if Popper hadn’t explicitly argued against (I would say: demolished) all the premises Papineau uses there. Papineau ignores that (and why) Popper’s whole philosophy hinges on the need, in his view, to do away with “the idol of certainty”. He ignores that Popper’s view of knowledge is not absolute (“the truth”) but relative: we can only make progress by finding better theories. He ignores the logical situation where “graduating to the status of established truths” is simply impossible and only choices are rationally supported.

Thanks David for this. My sense is the cited passage from C&R provides an excellent illustration of an exhortation typical of what I found in Popper, both persuasive emotionally for science as a grand knowledge enterprise, and missing the subtleties needed for everyday science as pragmatic engineering.

On the one hand it captures an important aspect of knowledge evolution - our need to resolve apparent contradictions, motivating theory modifications. On the other hand it does not speak to the constant apparent contradictions in details that one will encounter in real research, not all of which can be addressed or even need to be addressed in the time that can be allotted by real users to real problems.

On the grand scale an example would be the conflict between general relativity and quantum field theory that has obsessed some theorists for generations going back to Einstein. Although it remains unresolved to general satisfaction to this day, engineers and most physicists tolerate it, as they can make good use of the theories in the separate domains where they hold practical sway. On the small scale applied scientists tolerate inconsistencies such as a theory predicting that different methods should produce clearly heterogeneous observations when they often don’t (specific examples from past collaborations of mine available on request).

So, in isolation the passage looks like a product of its time in its overemphasis on contradiction resolution. It’s fine if taken with a caution in light of practical needs, and with awareness (which I did not see in Popper, but perhaps you can point it out) of developments in rationality and logic that followed in later decades to address those needs (like satisficing, type-II rationality, paraconsistent logics and so forth: weaker systems of reasoning than the absolute contradiction intolerance of classical deduction).

I tend to label strict adherence to classical logic as naive deductivism, understanding that the use of “naive” here is not an epithet; it is instead an acknowledgement of unawareness of later developments of specifically circumscribed contradiction tolerance and deductive gaps, which may better suit scientific activities in their more instrumental, pragmatic, “normal”/everyday mode (note that these would still regard the power abuse you cite as something to discard, since it has no rationale even in these weaker systems). This naivety is no different than the fact that most math uses naive set theory without concern about Russell-type paradoxes and the contradictions they entail, since most applied math (and much pure math as well) does not enter territory where those concerns must be heeded.

With that clarification, I hope you know I share your concerns about biostatistics and statistics in general. The personal conflict I have with some Popperians obscures the fact that I think the falsificationist/refutationist perspective is indispensable for sound research at any level, and that the epistemology implicit in much of “statistical inference” is naive in the negative sense or worse: obscurantist and even nonsensical. But the conflict over how to address the latter problem remains intense and often nasty, leading me to think that progress is arrested by a complex mix of human factors, including the innate unrecognized cognitive limitations we are saddled with, as well as unopposable social pressures.

In that light, claims about “getting beyond the statistics wars” begin to sound like “peace in our time” did in 1938. And as in Brave New World from the same decade, AI/machine learning now offers some glimmer of a future beyond the conflict, in which the goal of individual understanding is recognized as hopeless and perhaps the central obstacle to progress (after all, who understands the internet and its World Wide Web in full detail?).


Neyman thought very highly of de Finetti:

After the above article had been completed, I read a paper by B. de Finetti [3] representing the contents of his lecture delivered at the “III Entretiens de Zürich” in 1951. Seeing that Professor de Finetti represents the “subjectivist” school of thought on probability, it is a pleasure to find myself in full agreement with most of what he writes. In particular, I welcome de Finetti’s concluding paragraph directed against formulae intended to impose on thinking public normative measures of intensity of beliefs that this public should experience in the given circumstances.
(…)
The difference between the standpoint of de Finetti and the one expressed here is, perhaps, that for me philosophy of science is a little more of an empirical discipline than it is for de Finetti. As a result, my own attention is more intensely directed towards the acts of will involved in the concluding phase of any research. As a result whenever I have previously decided to act as if I were assured that a given stochastic model M adequately represents a certain class of phenomena (here de Finetti would probably say “whenever I believe that, for example, hereditary phenomena are governed by Mendel’s laws”) and I have to choose between several possible actions the desirability of which depends on the details of the phenomena that are unknown, I consider it advisable to make the choice after examining (a) the relative undesirability of consequences of the various possible errors and (b) the frequencies, implied by that same model M, with which a given rule of behavior would lead to these different errors. As far as I can see, de Finetti does not discuss these questions and I hope that, if and when they attract his attention, we shall find ourselves in further perfect agreement. Among other things, I hope that we shall agree that the awareness of the problem of inductive behavior leads to practically important and theoretically interesting new mathematical problems.

(Page 21)

(I’m not sure I agree that de Finetti’s philosophy is “a little less empirical”. He thought forecasts should be empirically tested and evaluated, it wasn’t just about coherence.)

Interestingly, Fisher (at the end of his life) and Jeffreys thought they were closer to each other than to Neyman (!). According to Keith Runcorn,

“Fisher was genial and Jeffreys was friendly. I remember saying something about their controversy and, in his charming way, Fisher said, ‘He agreed with Jeffreys approach more than the current school of Neyman,’ and Jeffreys very emphatically said, ‘Yes, we are closer in our approach.’” (Box, 1978, p. 441)
(Cited here https://faculty.educ.ubc.ca/zumbo/ins2001/Alymer.pdf)


Thanks Pedro for the direct cites! I remember vividly from his class and direct interaction Neyman being most aware of personalistic elements in statistics and hence more tolerant of subjective Bayesian than of Fisherian ideas. All that revealed how the dimension of conflict in mid-20th century statistics was in some ways more of decision/betting theorists like Neyman, Wald and de Finetti, each of whom explicated value-based or subjective elements in statistics*, vs. summarization/information theorists like Fisher and Jeffreys. But for some reason this more acrimonious split was not much emphasized compared to the frequentist vs. Bayesian distinction.

*Here’s an example from Neyman Synthese 1977 p. 106 from whence I learned of the nullistic fallacy (automatically taking “no effect” as the test hypothesis to be falsified):
Consider an experiment with mice…intended to determine whether a chemical A is carcinogenic or not. This experiment, with m mice exposed to A and n control mice, will show some numbers X and Y of mice which died from cancer. Our question is: What is our ‘hypothesis tested’? To answer this question we must first answer another question: which error in testing is the more important to avoid?
As usual, there are two possible errors. The verdicts about A may be: (i) ‘A is carcinogenic’, and (ii) ‘A is not carcinogenic’. Each of these verdicts may be wrong. Which of these errors is our ‘error of the first kind’? Here we come to the subjectivity of judging importance. From the point of view of the manufacturer the error in asserting the carcinogenicity of A is (or may be) more important to avoid than the error in asserting that A is harmless. Thus, for the manufacturers of A, the ‘hypothesis tested’ may well be: ‘A is not carcinogenic’. On the other hand, for the prospective user of the chemical A the hypothesis tested will be unambiguously: ‘A is carcinogenic’. In fact, this user is likely to hope that the probability of error in rejecting this hypothesis be reduced to a very small value!
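Neyman's point can be made concrete with a small calculation. What follows is a minimal sketch, not Neyman's own procedure: it assumes independent binomial models for X and Y, an invented rule of behavior ("declare A carcinogenic when X - Y reaches a threshold"), and made-up cancer rates. All function and parameter names are hypothetical. It computes how often the rule commits each of the two errors, then picks the threshold according to whose error is being capped.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k events in n independent trials with rate p."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def prob_declare_carcinogenic(m, n, p_exposed, p_control, threshold):
    """P(X - Y >= threshold) for independent X ~ Bin(m, p_exposed), Y ~ Bin(n, p_control)."""
    total = 0.0
    for x in range(m + 1):
        px = binom_pmf(x, m, p_exposed)
        for y in range(n + 1):
            if x - y >= threshold:
                total += px * binom_pmf(y, n, p_control)
    return total


def error_frequencies(m, n, p_background, p_if_carcinogenic, threshold):
    """Return (false_alarm, missed_harm) for the rule 'declare if X - Y >= threshold'.

    false_alarm: frequency of declaring A carcinogenic when it is harmless.
    missed_harm: frequency of not declaring A carcinogenic when it raises the
                 cancer rate from p_background to p_if_carcinogenic (assumed value).
    """
    false_alarm = prob_declare_carcinogenic(m, n, p_background, p_background, threshold)
    missed_harm = 1.0 - prob_declare_carcinogenic(
        m, n, p_if_carcinogenic, p_background, threshold
    )
    return false_alarm, missed_harm


def choose_threshold(m, n, p_background, p_if_carcinogenic, cap, protect):
    """Cap the protected party's error at `cap`, keeping the other error as small as possible.

    protect="manufacturer" caps the false-alarm frequency;
    protect="user" caps the missed-harm frequency.
    """
    thresholds = range(-n, m + 2)  # from 'always declare' to 'never declare'
    if protect == "manufacturer":
        # False alarms fall as the threshold rises: take the lowest one meeting the cap.
        for t in thresholds:
            if error_frequencies(m, n, p_background, p_if_carcinogenic, t)[0] <= cap:
                return t
    elif protect == "user":
        # Missed harm rises with the threshold: take the highest one meeting the cap.
        for t in reversed(thresholds):
            if error_frequencies(m, n, p_background, p_if_carcinogenic, t)[1] <= cap:
                return t
    else:
        raise ValueError("protect must be 'manufacturer' or 'user'")
    # Fallback for extreme caps: the most cautious rule for the protected party.
    return m + 1 if protect == "manufacturer" else -n


if __name__ == "__main__":
    # Illustrative numbers only: 20 mice per arm, 10% background rate, 40% if carcinogenic.
    for party in ("manufacturer", "user"):
        t = choose_threshold(20, 20, 0.10, 0.40, cap=0.05, protect=party)
        fa, mh = error_frequencies(20, 20, 0.10, 0.40, t)
        print(f"{party}: threshold={t}, false alarm={fa:.3f}, missed harm={mh:.3f}")
```

The same data model yields two different rules depending on which error is treated as the "error of the first kind", which is the subjectivity of judging importance that Neyman describes. The threshold rule itself is only one convenient choice for illustration.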
