**The fourth meeting of our New Phil Stat Forum*:**

**The Statistics Wars
and Their Casualties**

**January 7, 16:00 – 17:30 (London time)
11 am-12:30 pm (New York, ET)**

*note time modification and date change*

**Putting the Brakes on the Breakthrough, **

**or “How I used simple logic to uncover a flaw in a controversial 60-year old ‘theorem’ in statistical foundations” **

**Deborah G. Mayo**

**ABSTRACT:** An essential component of inference based on familiar frequentist (error statistical) notions, such as p-values, statistical significance and confidence levels, is the relevant sampling distribution (hence the term sampling theory). This results in violations of a principle known as the *strong likelihood principle* (SLP), or just the likelihood principle (LP), which says, in effect, that outcomes other than those observed are irrelevant for inferences within a statistical model. Now Allan Birnbaum was a frequentist (error statistician), but he found himself in a predicament: He seemed to have shown that the LP follows from uncontroversial frequentist principles! Bayesians, such as Savage, heralded his result as a “breakthrough in statistics”! But there’s a flaw in the “proof”, and that’s what I aim to show in my presentation by means of 3 simple examples:

**Example 1**: Trying and Trying Again

**Example 2**: Two instruments with different precisions (you shouldn’t get credit/blame for something you didn’t do)

**The Breakthrough**: Don’t Birnbaumize that data my friend

As in the last 9 years, I posted an imaginary dialogue (here) with Allan Birnbaum at the stroke of midnight, New Year’s Eve, and this will be relevant for the talk.

**Deborah G. Mayo** is professor emerita in the Department of Philosophy at Virginia Tech. Her *Error and the Growth of Experimental Knowledge* won the 1998 Lakatos Prize in philosophy of science. She is a research associate at the London School of Economics: Centre for the Philosophy of Natural and Social Science (CPNSS). She co-edited (with A. Spanos) *Error and Inference* (2010, CUP). Her most recent book is *Statistical Inference as Severe Testing: How to Get Beyond the Statistics Wars* (2018, CUP). She founded the Fund for Experimental Reasoning, Reliability and the Objectivity and Rationality of Science (E.R.R.O.R Fund) which sponsored a 2 week summer seminar in Philosophy of Statistics in 2019 for 15 faculty in philosophy, psychology, statistics, law and computer science (co-directed with A. Spanos). She publishes widely in philosophy of science, statistics, and philosophy of experiment. She blogs at errorstatistics.com and phil-stat-wars.com.

**For information about the Phil Stat Wars forum and how to join, click on this link. **

**Readings:**

**One of the following 3 papers:**

*My earliest treatment via counterexample:*

- Mayo, D. G. (2010). “An Error in the Argument from Conditionality and Sufficiency to the Likelihood Principle” in *Error and Inference: Recent Exchanges on Experimental Reasoning, Reliability and the Objectivity and Rationality of Science* (D. Mayo and A. Spanos eds.), Cambridge: Cambridge University Press: 305-14.

*A deeper argument can be found in:*

- Mayo 2014. “On the Birnbaum Argument for the Strong Likelihood Principle,” (with discussion & rejoinder) *Statistical Science*, 29(2), 227-239, 261-266.

*For an intermediate Goldilocks version (based on a presentation given at the JSM 2013):*

- Mayo 2013. “Presented Version: On the Birnbaum Argument for the Strong Likelihood Principle.” In *JSM Proceedings*, Section on Bayesian Statistical Science. Alexandria, VA: American Statistical Association, 440-453.

**This post from the Error Statistics Philosophy blog will get you oriented.** (It has links to other posts on the LP & Birnbaum, as well as background readings/discussions for those who want to dive deeper into the topic.)

**Slides and Video Links:**

D. Mayo’s** slides**: “Putting the Brakes on the Breakthrough, or ‘How I used simple logic to uncover a flaw in a controversial 60-year old ‘theorem’ in statistical foundations’”

**D. Mayo’s presentation:**

**(Link to paste in browser):** https://philstatwars.files.wordpress.com/2021/01/mayo_172021_presentation.mp4

**SHORT LINK (quick):** https://wp.me/abBgTB-x4

**Discussion **on Mayo’s presentation:

**(Link to paste in browser):** https://philstatwars.files.wordpress.com/2021/01/mayo-172021-discussion-1.mp4

**SHORT LINK (quick):** https://wp.me/abBgTB-xc

**Mayo’s Memos: **Any info or events that arise that seem relevant to share with y’all before the meeting.

**You may wish to look at my rejoinder to a number of statisticians:** Rejoinder “On the Birnbaum Argument for the Strong Likelihood Principle”. (It is also above in the link to the complete discussion in the 3rd reading option.)

**I often find it useful to look at other treatments.** So I put together** this short supplement** to glance through to clarify a few select points.

*Meeting 12 of our general Phil Stat series, which began with the LSE Seminar PH500 on May 21

Hi, I would be interested to hear your opinion on this attempt by Greg Gandenberger to prove some form of the Likelihood Principle, which (or so it claims, at least) takes into account your critique of the proof (https://gandenberger.org/wp-content/uploads/2013/11/new_proof_of_lp_post.pdf). Thanks and best!

Sebastion:

I don’t know if you were at the presentation. Your comment was in spam, probably as you haven’t commented before, and I was not notified of this, found it by accident.

I recommend that you read through my slides and what Greg says, and decide for yourself if his dismissing my disproof holds any water. Slides 62-3 address what he says. His first allegation is that

“Mayo’s comments here are true of the operational sufficiency principle. Birnbaum’s sufficiency principle does not say anything about how inference is to be performed”. (Gandenberger 2015).

I don’t know what ‘operational sufficiency’ means here except something like actual sufficiency, and yes, Birnbaum claims the LP holds for ANY account that holds Sufficiency and Conditionality. So I don’t know why Greg can see himself as showing anything at all that counters my refutation.

Yes, Birnbaum purports to derive a UNIVERSAL GENERALIZATION: For any case of informative inference (about a parameter in a given model), from any school, if SP and CP hold, then LP follows.

Are you familiar with what’s required to refute an argument purporting to show a universal generalization? Many people are not.

All I need to do is supply a single counterexample, and any LP violation in sampling theory will do. That’s because it constitutes a case of a statistical account where there are LP violations, SP and CP hold AND YET THERE IS NO LOGICAL CONTRADICTION. So

((SP & CP) -> LP)

is not a theorem.

So Birnbaum’s argument (from SP and CP to LP) is invalid. There really isn’t much more to it.

Greg also goes back, as he has in an earlier paper, to saying, well the “proof” holds for accounts that obey the LP!!! Is that what it takes to make it into the ‘breakthrough’ volume these days? I don’t think so, as that would be to show a tautology.

Birnbaum’s aim, as he makes quite clear, was to give an argument that is relevant for a sampling theorist and for “approaches which are independent of this [Bayes’] principle” [Birnbaum (1962), page 283].

Bayesians and Likelihoodists have MUCH simpler ways to derive the LP, e.g., it follows directly from inference by way of Bayes’ Theorem.

Of course ANY ARGUMENT CAN BE MADE VALID!!! How? Two ways. Include the conclusion in the premises. That’s what assuming what you’re aiming to prove amounts to—but anything can be proved that way. It’s circular. A second way is to introduce contradictory premises. Again, anything can be proved that way because anything follows from a contradiction. That is, the following is a theorem, for any X.

(A -> (~A -> X))
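The two ways of forcing validity described above can be checked mechanically with a truth table. The following is only a minimal sketch: the helper names `implies` and `is_tautology` are made up for illustration, and the formulas are treated as bare propositional schemas.

```python
from itertools import product


def implies(p, q):
    """Material conditional: p -> q."""
    return (not p) or q


def is_tautology(formula, n_vars):
    """True if `formula` holds under every assignment of truth values."""
    return all(formula(*vals) for vals in product([False, True], repeat=n_vars))


# Way 1: put the conclusion C among the premises: (P & C) -> C
circular = lambda p, c: implies(p and c, c)

# Way 2: contradictory premises: A -> (~A -> X)
explosion = lambda a, x: implies(a, implies(not a, x))

# For contrast, a schema that is not a theorem: A -> X
plain = lambda a, x: implies(a, x)

print(is_tautology(circular, 2))   # circular argument: always valid
print(is_tautology(explosion, 2))  # anything follows from a contradiction
print(is_tautology(plain, 2))      # fails when A is true and X is false
```

The first two schemas come out as tautologies whatever the conclusion is, which is why neither route shows anything about the conclusion itself.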

I hope this helps.

Thanks for the detailed answer! I was there, unfortunately I had to leave early, but I will watch the rest of the talk once I find time and think it through again with your comments here in mind. (This is of course not a contribution to this topic, but I wanted to reply and thank you anyway.)

The following includes the 3 comments and my reply regarding the forum that had been on errorstatistics.com.

***

Yusuke Ono

Thank you very much for your wonderful papers and presentation.

Yesterday, I commented in the seminar that the two likelihoods in your Example 1 aren’t proportional.

But as you replied, example (ii) in Cox (1978, p. 53) is the same as your Example 1, and it is stated there that the two likelihoods are identical.

A Japanese mathematician, Prof. Gen Kuroki, also pointed out to me on Twitter that the two likelihoods in your Example 1 are identical.

I am now checking whether what I said was false, but I am very confused. If anyone knows more or can supply the mathematical derivation, I would appreciate a hint.

Sorry for this confusion.

*******

January 8, 2021

Yusuke Ono

Now, I understand I was wrong.

A Japanese mathematician, Prof. Gen Kuroki, explained to me why the two likelihoods in your Example 1 are identical.

During the Q&A at yesterday’s session, I gave false information (I said that the two likelihoods aren’t proportional).

I am very sorry for my misunderstanding.

If you are interested in the reason why I misunderstood, I would like to explain it somewhere.

*******

January 8, 2021

Mayo

Yusuke: A lot of people get this wrong, which is why I showed the equation from Cox and Hinkley (1974) in the “supplement” I prepared for my presentation yesterday. The thing is that the example is given by BOTH sides, so if there were anything wrong with it, it would be a problem for them in their choice of example. There are zillions of LP violations: any use of a confidence level, p-value, standard error, error probability will do, and one doesn’t even need to name an example to make the points on either side. I used this dramatic example because, amazingly enough, it’s one the pro-LP people are happy about (i.e., they don’t think it should matter if you’re guaranteed to reject a null hypothesis erroneously). Less dramatic examples abound (e.g., binomial vs negative binomial). It seemed quicker in a presentation to have an example where the LP violation was a difference in p-values, rather than keep saying “an LP violation”.

If you want to explain your point, you are welcome to do so in a comment here. Thank you for your interest.
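For readers who want to see the less dramatic binomial vs negative binomial contrast in numbers, here is a rough sketch. The data (9 successes and 3 failures) and the test of θ = 0.5 against θ > 0.5 are hypothetical choices for illustration, not figures from the presentation, and the function names are made up.

```python
from math import comb


def binom_lik(theta, s=9, n=12):
    """Likelihood when n trials are fixed in advance and s successes occur."""
    return comb(n, s) * theta**s * (1 - theta) ** (n - s)


def negbinom_lik(theta, s=9, f=3):
    """Likelihood when sampling continues until the f-th failure (last trial is a failure)."""
    return comb(s + f - 1, s) * theta**s * (1 - theta) ** f


def binom_pvalue(s=9, n=12, theta0=0.5):
    """P(at least s successes in n trials) under theta0."""
    return sum(comb(n, k) * theta0**k * (1 - theta0) ** (n - k) for k in range(s, n + 1))


def negbinom_pvalue(s=9, f=3, theta0=0.5):
    """P(at least s successes before the f-th failure) under theta0.

    Equivalent to at most f-1 failures in the first s+f-1 trials.
    """
    m = s + f - 1
    return sum(comb(m, j) * (1 - theta0) ** j * theta0 ** (m - j) for j in range(f))


# The likelihood ratio is the same constant for every theta (proportional likelihoods)...
for theta in (0.2, 0.5, 0.8):
    print(theta, binom_lik(theta) / negbinom_lik(theta))

# ...yet the p-values differ because the sampling distributions differ.
print(f"binomial p-value:          {binom_pvalue():.4f}")
print(f"negative binomial p-value: {negbinom_pvalue():.4f}")
```

With these assumed numbers the likelihoods differ only by a constant factor, while the two p-values come out at roughly 0.073 and 0.033, so an account that uses the sampling distribution treats the two experiments differently.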

*******

January 8, 2021

Yusuke Ono

Thank you for your reply, and I am sorry again for my confusion.

Let me just explain where I was wrong. I calculated the density function conditioned on n, Pr(X1 = x1, X2 = x2, …, Xn = xn | N = n), for the Trying and Trying Again experiment as well. I should not have conditioned on n. The density function that I needed to calculate may be denoted by Pr(X1 = x1, X2 = x2, …, Xn = xn, N = n). This unconditional likelihood becomes identical to the one for the fixed-n experiment.
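The distinction between the two densities can be sketched as follows. This is only an outline under assumptions not stated in the thread: the observations are independent with common density f(x; θ), and the stopping rule depends on the data alone, with S_n denoting the set of sequences at which it first stops at n.

```latex
% Assumed notation (not from the post): f(x;\theta) is the density of one observation,
% S_n is the set of sequences (x_1,\dots,x_n) at which the stopping rule first stops at n.
\begin{align*}
\Pr(X_1 = x_1, \dots, X_n = x_n,\ N = n;\ \theta)
  &= \Big[\prod_{i=1}^{n} f(x_i;\theta)\Big]\,
     \mathbf{1}\{(x_1,\dots,x_n) \in S_n\}, \\[4pt]
\Pr(X_1 = x_1, \dots, X_n = x_n \mid N = n;\ \theta)
  &= \frac{\Big[\prod_{i=1}^{n} f(x_i;\theta)\Big]\,
     \mathbf{1}\{(x_1,\dots,x_n) \in S_n\}}{\Pr(N = n;\ \theta)}.
\end{align*}
```

In the first line the indicator equals 1 for the data actually observed and does not involve θ, so as a function of θ the expression matches the fixed-n likelihood. In the second line the denominator Pr(N = n; θ) generally depends on θ, which is why conditioning on n breaks the proportionality.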
