**The second meeting of our New Phil Stat Forum*:**

**The Statistics Wars
and Their Casualties**

**September 24: 15:00–16:45 (London time)
10–11:45 am (New York, EDT)**

**“Bayes Factors from all sides:
who’s worried, who’s not, and why”**

**Richard Morey**


Richard Morey is a Senior Lecturer in the School of Psychology at Cardiff University. In 2008, he earned a PhD in Cognition and Neuroscience and a Master's degree in Statistics from the University of Missouri. He is the author of over 50 articles and book chapters, and in 2011 he was awarded a Veni grant from the Netherlands Organisation for Scientific Research's Innovational Research Incentives Scheme for work in cognitive psychology. His work spans cognitive science, where he develops and critiques statistical models of cognitive phenomena; statistics, where he is interested in the philosophy of statistical inference and the development of new statistical tools for research use; and the practical side of science, where he is interested in increasing openness in scientific methodology. Morey is the author of the BayesFactor software for Bayesian inference and writes regularly on methodological topics at his blog.

**Readings:**

**R. Morey:** Should we Redefine Statistical Significance?

Relevant background readings for this meeting, covered in the initial LSE PH500 Phil Stat Seminar, can be found on the Meeting #4 blogpost.

**SIST:** Excursion 4 Tour II, **Megateam**: Redefine Statistical Significance

Information and directions for joining our forum are here.

**Slides and Video Links:**

Morey’s slides “Bayes Factors from all sides: who’s worried, who’s not, and why” are at this link: https://richarddmorey.github.io/TalkPhilStat2020/#1

Video Link to Morey Presentation: https://philstatwars.files.wordpress.com/2020/09/richard_presentation.mp4

Video Link to Discussion of Morey Presentation: https://philstatwars.files.wordpress.com/2020/09/richard_discussion.mp4

**Mayo’s Memos:** Any info or events that arise that seem relevant to share with y’all before the meeting.

*Meeting 9 of the general Phil Stat series, which began with the LSE Seminar PH500 on May 21.

Richard Morey: Nice presentation, thanks.

I had asked a question about BF versus posterior odds in which I made the mistake of mentioning Jim Berger’s name, and that dominated the response. But my real question concerns when a BF is not equal to the posterior odds. I may have misunderstood, but I thought I heard you say that you didn’t really care whether posterior odds existed? If they don’t, then how can a BF be defined, other than by just skipping to likelihood ratios (which is too often done)? It seems to me that both prior and posterior odds must exist in order for a BF to exist, by definition. And if I’m using a BF it’s because I want to do a Bayesian analysis, so I’m not willing to disregard the prior odds assigned to my model or models.
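To make the definitional point concrete, here is a minimal sketch (with illustrative numbers; two point hypotheses are used so that the Bayes factor reduces to a plain likelihood ratio):

```python
import math

def binom_loglik(k, n, p):
    # Binomial log-likelihood; the binomial coefficient is omitted
    # because it cancels in any likelihood ratio.
    return k * math.log(p) + (n - k) * math.log(1 - p)

def bayes_factor_point(k, n, p0, p1):
    # For two point hypotheses, the Bayes factor is just the likelihood ratio.
    return math.exp(binom_loglik(k, n, p1) - binom_loglik(k, n, p0))

def posterior_odds(bf, prior_odds):
    # The definitional relation at issue: posterior odds = BF * prior odds,
    # i.e., the BF is the ratio of posterior odds to prior odds.
    return bf * prior_odds

# Example: 65 successes in 100 trials; H0: p = 0.5 vs H1: p = 0.7
bf = bayes_factor_point(65, 100, 0.5, 0.7)
po = posterior_odds(bf, prior_odds=1.0)  # equal prior odds
```

With equal prior odds the posterior odds coincide with the BF; with unequal prior odds they come apart, which is exactly why discarding the prior odds changes the answer.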

Consider problems that are a little more complex than testing a parameter value – say, choosing between a binomial data model and a beta-binomial (mixture) data model. One model assumes that a set of binomial parameters are all equal; the other assumes that they are all distinct. I may well have evidence from a previous experiment (testing a vaccine to prevent PRRS in pigs, for example) that observed proportions from each binomial (live, normal young for each adult sow) tend to be somewhat variable within groups (control and vaccinated). But it is not clear from visual examination of my data that I should go with the mixture model. A frequentist might do a test for homogeneity of binomial proportions, but a Bayesian might want to use posterior odds with unequal prior odds for the two models. In this situation, would you want to ignore the posterior odds and simply use whatever you believe was suggested by the Bayes Factor?
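A simplified stand-in for this comparison can be sketched as follows (my own illustration: a common-p binomial model versus fully independent p's, each with a uniform Beta(1,1) prior, rather than a full beta-binomial mixture — chosen so both marginal likelihoods are available in closed form via the Beta function):

```python
import math

def log_beta(a, b):
    # log of the Beta function via log-gamma.
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def log_marglik_common(groups):
    # Marginal likelihood (binomial coefficients omitted; they cancel)
    # under a single shared p with a uniform Beta(1,1) prior.
    k = sum(s for s, _ in groups)
    f = sum(n - s for s, n in groups)
    return log_beta(1 + k, 1 + f)

def log_marglik_separate(groups):
    # Marginal likelihood under independent p's, each with a Beta(1,1) prior.
    return sum(log_beta(1 + s, 1 + n - s) for s, n in groups)

def bf_common_vs_separate(groups):
    # groups: list of (successes, trials) pairs, one per sow/group.
    return math.exp(log_marglik_common(groups) - log_marglik_separate(groups))

def posterior_odds(groups, prior_odds):
    # Unequal prior odds (e.g., from the earlier pig experiment) enter here.
    return bf_common_vs_separate(groups) * prior_odds
```

Homogeneous-looking proportions push the BF toward the common-p model and heterogeneous ones toward the separate-p model; the `prior_odds` argument is where the earlier evidence would tilt the final posterior odds away from the bare BF.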

Not to drag this out too long, but I appreciated your ending — I can be a frequentist when I feel that is what the problem calls for, and I can be a Bayesian when I feel that is what a problem calls for. They are, however, different approaches and different uses of concepts of probability so I am highly unlikely (however you want to interpret that) to try and mix the two in the analysis of a single problem.

Hi Mark, my position is that none of the quantities in any statistical procedure “exist” in any real way; our models are useful fictions, so if we dwell on questions like “what is the *real* type I error rate?” or “what are my true prior odds?” then we miss the usefulness of the insights that statistical procedures offer. We’re looking in the wrong place. So prior odds don’t exist, because we don’t have such beliefs. We might find that positing them is useful, but it isn’t necessary in order to glean insight from the procedure.

There are situations in which decisions are necessary and posterior probabilities can be useful. My extreme evidentialist perspective mostly reflects my own background, which is more basic science. I’m definitely in favour, though, of utilising the aspects of the procedures that seem to fit the bill for a particular problem.

Dear Richard: thanks a lot for your nice talk today. I am working in a field, basic biomedicine, where sample sizes are always an issue. My question is: are Bayes factors sensitive to “sample size” and “statistical power”? I would be grateful if you could suggest readings about it. Thanks again. Cilene.

Hi Cilene, it depends what you mean by sensitive. For a given effect size estimate, the Bayes factor changes as a function of N, because the distribution of the test statistic changes with N. How this happens is a bit different from the p value. I wrote a blog post about this here: http://bayesfactor.blogspot.com/2015/04/all-about-that-bias-bias-bias-its-no.html
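A toy illustration of this N-dependence (a simple normal-mean setup with a point null and a N(0, 1) prior on the alternative mean — an assumption for illustration, not the JZS default used in the BayesFactor software):

```python
import math

def bf10_normal(d, n, tau2=1.0):
    # BF for H1: mu ~ N(0, tau2) vs H0: mu = 0, given an observed mean d
    # of n unit-variance observations, so that d ~ N(mu, 1/n).
    v0 = 1.0 / n          # sampling variance of the mean under H0
    v1 = 1.0 / n + tau2   # marginal variance of the mean under H1
    log_bf = 0.5 * (math.log(v0) - math.log(v1)) + 0.5 * d**2 * (1/v0 - 1/v1)
    return math.exp(log_bf)
```

Holding the observed effect size d fixed, the BF in favour of H1 grows with n when d is nonzero, while at d = 0 the BF increasingly favours the null as n grows — so sample size matters, but in a different way than it does for a p value.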

Dear Cilene:

Richard will doubtless come by to answer questions when he can; in the meantime, here are some quick replies that I would give. The answer to the first is yes (check his slides and/or the reading for today).

On power, it’s important to see that Bayes factors do not use power. Power is a property of a Neyman-Pearson test: it’s the probability of rejecting the test (null) hypothesis when a specific alternative is true. It’s the complement of the type II error probability of a Neyman-Pearson test. Now, in trying to relate those tests to BFs, you might see an attempt to use power/alpha to compute a Bayesian-type posterior probability of H0. This is not a legitimate use of power, or so I argue. Here is a section of my book that discusses it, starting at Section 5.6; it is short. Morey and I have written about this “diagnostic screening” model of tests, and it is used by the authors of the “Redefine Statistical Significance” article.
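For concreteness, the “diagnostic screening” computation in question looks like the following (a sketch with illustrative numbers; this is the calculation being criticized, not endorsed):

```python
def screening_posterior_h0(alpha, power, prior_h0):
    # "Diagnostic screening" posterior P(H0 | rejection): treats alpha as the
    # false-positive rate and power as the true-positive rate, then applies
    # Bayes' rule across a population of hypotheses with P(H0) = prior_h0.
    prior_h1 = 1 - prior_h0
    return (alpha * prior_h0) / (alpha * prior_h0 + power * prior_h1)

# e.g. alpha = 0.05, power = 0.8, prior P(H0) = 0.5
p = screening_posterior_h0(0.05, 0.8, 0.5)
```

The output depends entirely on the assumed prior prevalence of true nulls, and it treats an error probability of the test procedure as if it were a likelihood for this particular hypothesis — which is the move being objected to.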

I hope this helps, until Richard swings by.

Thank you so much, dear Deborah; we will study it. Best wishes, Cilene