P-Values – love ‘em or hate ‘em?

P-values. https://en.wikipedia.org/wiki/File:P-value_Graph.png Reproduced under Creative Commons Attribution-ShareAlike 3.0 License

The Annals of Internal Medicine has today published an article by Savitz and colleagues, ‘Responding to Reviewers and Editors About Statistical Significance Testing’.  Attendees of any statistical training that I conduct will already know my opinion on p-values and their merits and drawbacks.

On their own, p-values provide very little information; they simply dichotomize findings into ‘significant’ or ‘not significant’ along a fairly arbitrary line (usually, but not always, p<0.05).  But what do they actually tell us?  They tell us nothing about the magnitude or precision of the observed effect, which is probably more important when trying to disentangle whether or not an effect is real.

Many statisticians argue for presenting confidence intervals, and I agree. However, as Savitz and colleagues note, when it comes to publication, both editors and reviewers might request that p-values be included.  Savitz and colleagues recommend the following in these situations:

  1. Clarify the philosophy that significance tests are not being used to interpret the results.
  2. Cite authoritative sources explaining that quantitative interpretation is consistent with recommendations from the American Statistical Association, the International Committee of Medical Journal Editors, and current textbooks on epidemiologic methods.

The problem with p-values needs to be addressed early, before the manuscript is even started.  We would recommend that protocols and statistical analysis plans consider using an approach that is not based on p-values. However, careful consideration of the regulatory decision-making process is also needed: the p-value is still considered the ‘cornerstone of FDA decision making’.  Kennedy-Shaffer’s essay on this topic makes interesting reading.

The FDA’s Center for Drug Evaluation and Research put out a statement in 2021 that acknowledged some of the difficulties associated with p-values, but also noted that p-values provide an important benchmark that gives drug developers certainty and provides consistency, fairness and transparency when reviewing drug applications.  So love ‘em or hate ‘em, p-values are likely to be around for a good while longer.

So what is our recommendation? If you are going to report p-values, make sure they are put into context, and always present a confidence interval alongside them.
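To make the point about magnitude and precision concrete, here is a minimal Python sketch. It assumes a simple two-group comparison of means with equal group sizes, a common standard deviation and a large-sample normal approximation. The function name `summarize_difference` and all of the numbers are hypothetical and chosen purely for illustration; they are not taken from Savitz and colleagues or from any real trial.

```python
from math import sqrt
from statistics import NormalDist


def summarize_difference(mean_a, mean_b, sd, n_per_group, level=0.95):
    """Estimate, confidence interval and two-sided p-value for a difference
    in means (normal approximation, equal group sizes, common SD)."""
    diff = mean_a - mean_b
    se = sd * sqrt(2 / n_per_group)              # standard error of the difference
    z = diff / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value
    crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return {
        "estimate": diff,
        "ci": (diff - crit * se, diff + crit * se),
        "p_value": p_value,
    }


# Hypothetical trial 1: very large sample, tiny difference (0.5 units).
large_trial = summarize_difference(120.5, 120.0, sd=10, n_per_group=20000)

# Hypothetical trial 2: very small sample, large difference (8 units).
small_trial = summarize_difference(128.0, 120.0, sd=10, n_per_group=10)

for name, res in [("large trial", large_trial), ("small trial", small_trial)]:
    low, high = res["ci"]
    print(f"{name}: estimate={res['estimate']:.2f}, "
          f"95% CI=({low:.2f}, {high:.2f}), p={res['p_value']:.3g}")
```

Under these assumptions the large trial is ‘significant’ even though its whole confidence interval sits below one unit, while the small trial is ‘not significant’ even though its interval is compatible with anything from no effect to a very large one. The p-value alone would hide both facts; the estimate and interval make them visible.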