Nobel Prize and Statistical Significance


Recently, three scientists jointly won the 2019 Nobel Prize in Physiology or Medicine for their pioneering work on how human cells respond to changing oxygen levels: William Kaelin Jr., Sir Peter Ratcliffe, and Gregg Semenza. I understand that science is certainly much more than just statistics, but I thought to myself, "We hear from critics how p-values, statistical significance, frequentism, etc., are supposedly bad for science, but critics rarely acknowledge any good in these techniques, or outright deny there is any. What if these Nobel Prize winners use p-values and statistical significance?"

So I checked some of the laureates' papers, past and present, in the general area of research their Nobel Prize was awarded for. Note that this is not to say the researchers never use any Bayesian techniques, or that they always use p-values and statistical significance. In fact, I found a small number of papers using only Bayesian techniques, some using frequentist and Bayesian techniques together, and some using none at all. Rather, this is to show that scientists of the highest caliber, doing some of the most important work, use p-values and statistical significance. I also did not compare dates closely, but I'm fairly sure some of their papers were published after the ASA II editorial and the Nature commentary advising against the phrase "statistically significant" and against dichotomizing results.

Here is a small sampling of the results:

William Kaelin Jr

Mutant p53 induces a hypoxia transcriptional program in gastric and esophageal adenocarcinoma

Transaminase Inhibition by 2-Hydroxyglutarate Impairs Glutamate Biosynthesis and Redox Homeostasis in Glioma

Autochthonous tumors driven by Rb1 loss have an ongoing requirement for the RBP2 histone demethylase

Sir Peter Ratcliffe

Inherent DNA-binding specificities of the HIF-1a and HIF-2a transcription factors in chromatin

PHD2 inactivation in Type I cells drives HIF-2a-dependent multilineage hyperplasia and the formation of paraganglioma-like carotid bodies

Gregg Semenza

Glutaminase 1 expression in colorectal cancer cells is induced by hypoxia and required for tumor growth, invasion, and metastatic colonization

A RASSF1A-HIF1a loop drives Warburg effect in cancer and pulmonary hypertension

Reciprocal Regulation of DUSP9 and DUSP16 Expression by HIF1 Controls ERK and p38 MAP Kinase Activity and Mediates Chemotherapy-Induced Breast Cancer Stem Cell Enrichment

Banerjee, Duflo, Kremer

The winners of the 2019 prize in Economics (Banerjee, Duflo, and Kremer) also use p-values and statistical-significance language in their work.

A multifaceted program causes lasting progress for the very poor: Evidence from six countries

The miracle of microfinance? Evidence from a randomized evaluation

In fact, in a 2010 New Yorker article, "The Poverty Lab," which profiles Duflo and her colleagues' work, we read that using such tools is essentially their general approach to solving important real-world problems:
"Within economics, Duflo and her colleagues are sometimes referred to as the randomistas. They have borrowed, from medicine, what Duflo calls a "very robust and very simple tool": they subject social-policy ideas to randomized control trials, as one would use in testing a drug. This approach filters out statistical noise; it connects cause and effect. The policy question might be: Does microfinance work? Or: Can you incentivize teachers to turn up to class? Or: When trying to prevent very poor people from contracting malaria, is it more effective to give them protective bed nets, or to sell the nets at a low price, on the presumption that people are more likely to use something that they've paid for? (A colleague of Duflo's did this study, in Kenya.) As in medicine, a J-PAL trial, at its simplest, will randomly divide a population into two groups, and administer a "treatment" - a textbook, access to a microfinance loan - to one group but not to the other. Because of the randomness, both groups, if large enough, will have the same complexion: the same mixture of old and young, happy and sad, and every other possible source of experimental confusion. If, at the end of the study, one group turns out to have changed-become wealthier, say - then you can be certain that the change is a result of the treatment. A researcher needs to ask the right question in the right way, and this is not easy, but then the trial takes over and a number drops into view. There are other statistical ways to connect cause and effect, but none so transparent, in Duflo's view, or so adept at upsetting expectations. Randomization "takes the guesswork, the wizardry, the technical prowess, the intuition, out of finding out whether something makes a difference," she told me. And so: in the Kenya trial, the best price for bed nets was free."
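The trial logic described in the quote above can be sketched in a few lines: randomly assign units to treatment or control, then use a standard frequentist two-sample t-test to get a p-value for the null of no treatment effect. The data here are simulated, and the effect size (0.5) is invented purely for illustration; this is a minimal sketch of the idea, not a reproduction of any J-PAL analysis.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Simulate a trial: 500 treated, 500 control, randomly assigned.
# The treatment shifts the mean outcome by 0.5 (an invented effect size).
n = 500
assignment = rng.permutation(np.repeat([0, 1], n))  # random assignment
outcomes = rng.normal(loc=0.0, scale=1.0, size=2 * n) + 0.5 * assignment

treated = outcomes[assignment == 1]
control = outcomes[assignment == 0]

# Two-sample t-test: p-value under the null of no treatment effect
t_stat, p_value = stats.ttest_ind(treated, control)
print(f"difference in means: {treated.mean() - control.mean():.3f}")
print(f"p-value: {p_value:.2e}")
```

Because assignment is random, a large difference in means with a small p-value is evidence for a causal effect of the treatment, which is exactly the "number that drops into view" the article describes.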

This was for some 2019 laureates, but what about for some past years' Nobel Prizes?

James P. Allison

Combination CTLA-4 Blockade and 4-1BB Activation Enhances Tumor Rejection by Increasing T-Cell Infiltration, Proliferation, and Cytokine Production

Restoring function in exhausted CD8 T cells during chronic viral infection

Cancer regression and autoimmunity induced by cytotoxic T lymphocyte-associated antigen 4 blockade in patients with metastatic melanoma

Synergism of Cytotoxic T Lymphocyte-associated Antigen 4 Blockade and Depletion of CD25+ Regulatory T Cells in Antitumor Therapy Reveals Alternative Pathways for Suppression of Autoreactive Cytotoxic T Lymphocyte Responses

Depletion of Carcinoma-Associated Fibroblasts and Fibrosis Induces Immunosuppression and Accelerates Pancreas Cancer with Reduced Survival

Frances H. Arnold

Structure-Guided Recombination Creates an Artificial Family of Cytochromes P450

A diverse family of thermostable cytochrome P450s created by recombination of stabilizing fragments

Random Field Model Reveals Structure of the Protein Recombinational Landscape

Remember 2013

And lest we forget, in 2013 the Nobel Prize in Physics was shared by Peter Higgs and François Englert "for the theoretical discovery of a mechanism that contributes to our understanding of the origin of mass of subatomic particles, and which recently was confirmed through the discovery of the predicted fundamental particle, by the ATLAS and CMS experiments at CERN's Large Hadron Collider". The research papers in this area tend to use p-values and statistical significance.
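Particle physics is in fact the field with one of the strictest significance conventions: a discovery is announced only when the result reaches five sigma, i.e. a one-sided p-value under the background-only hypothesis of roughly 3 in 10 million. That arithmetic can be checked directly from the standard normal tail:

```python
import math

def one_sided_p(sigma: float) -> float:
    """One-sided tail probability of a standard normal beyond `sigma` standard deviations."""
    # Survival function of the standard normal, via the complementary error function
    return 0.5 * math.erfc(sigma / math.sqrt(2))

# Five sigma: the discovery threshold used in the Higgs boson announcements
p5 = one_sided_p(5.0)
print(f"5-sigma one-sided p-value = {p5:.2e}")  # about 2.9e-07
```

So the Higgs discovery rested on a p-value threshold far more stringent than the usual 0.05, not on abandoning significance testing.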

More Examples

If we look back a few more years in Economics (the Sveriges Riksbank Prize in Economic Sciences in Memory of Alfred Nobel), Chemistry, Physiology or Medicine, and Physics, we can find more examples (which I do not list here). I did not include Literature and Peace in my review, although it is possible some data analysis was done in those areas. If we include other frequentist notions such as sampling, confidence intervals, and standard frequentist methods like Pearson/Spearman correlation, linear regression, logistic regression, ANOVA, large-sample asymptotics, and bootstrapping, the counts would also increase. Counts would increase further still if we included other researchers who base their work on the Nobel laureates' work and who use p-values, statistical significance, etc.

Also, I and others (Mayo comes to mind) are not convinced that researchers avoid p-values and statistical significance merely because these do not appear in their published papers. Researchers would have to use something of the kind to make claims about the strength (or lack thereof) of relationships, terms in models, differences between distributions, and so on. In other words, researchers could still be checking p-values and statistical significance "offline" or "behind the scenes," and then write up and publish their papers without mentioning them, according to personal taste and/or arbitrary journal standards. It is therefore not inconceivable that the examples of p-value and statistical-significance language are undercounted here.


It seems that learning about and using p-values, statistical significance, adjustments for multiple comparisons, and nonparametric frequentist tests may not hurt science, as critics claim, but instead help do really good science. If Nobel Prize-winning science isn't evidence enough of the merits of p-values and statistical significance, I really don't know what is.

Thanks for reading.
