Sunday, November 3, 2013

Non-reproducible research in the news

An epidemic of non-reproducible research in the life and behavioral sciences has been revealed in recent years. Much of the spadework has been done by John Ioannidis and collaborators, discussed earlier on DTLR. Well-known biopharmaceutical industry reports from Bayer (Prinz et al., 2011) and Amgen (Begley & Ellis, 2012) provide further confirmation.

Glenn Begley, one of the co-authors of these papers, was interviewed for a story by Jennifer Couzin-Frankel in the recent Science special issue on Communication in Science, discussed on DTLR last month. Couzin-Frankel (2013) discusses Begley's failed attempts to reproduce the results published by a prominent oncologist in Cancer Cell. At a 2011 conference, Begley invited the author to breakfast and inquired about his team's inability to reproduce the results from the paper. According to Begley, the oncologist replied, “We did this experiment a dozen times, got this answer once, and that's the one we decided to publish.” Begley couldn't believe what he'd heard.

Indeed, I am shocked but not surprised. Shocked, because it displays an utter lack of critical thinking on the oncologist's part. Not surprised, because in my experience critical thinking is rarely formally taught to scientific researchers, and the incentive system for scientists rewards such lax behavior. The oncologist may have forgotten why he got into science and medicine to begin with. The pressures of a career in academic medicine may have corrupted his integrity, but the work of Ioannidis and others alluded to above shows that this phenomenon is distressingly common.
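
To put a rough number on the anecdote (my own back-of-the-envelope sketch, not anything from Couzin-Frankel's article, and the 5% per-experiment false-positive rate is simply an assumed conventional alpha level): if each of the dozen experiments had no real effect at all, the chance of seeing at least one "positive" result is nearly a coin flip.

```python
# Back-of-the-envelope sketch (my own assumption, not from the article):
# if a "positive" experiment is just a false positive occurring with
# probability alpha = 0.05, how likely is at least one positive in
# twelve independent attempts?
alpha = 0.05      # assumed per-experiment false-positive rate
n_attempts = 12   # the dozen experiments in Begley's anecdote

p_at_least_one = 1 - (1 - alpha) ** n_attempts
print(f"P(at least one 'positive' in {n_attempts} tries) = {p_at_least_one:.2f}")
# ~0.46 -- so publishing the single success tells us almost nothing
```

In other words, selecting the one success out of twelve attempts is exactly the kind of result that noise alone can easily produce.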

The rest of Couzin-Frankel's article discusses how clinical studies often get published even when the primary objective of the study has failed. Usually (but not always) the authors are up front about the failure, but they try to spin the results positively in various ways. For instance, by making enough unplanned post hoc statistical comparisons, they will inevitably find one that achieves (nominal) statistical significance, and they will use that to justify publication. Evidently journals allow this to occur, resulting in tremendous bias in what gets published. These are examples of selective reporting (cherry-picking) and exaggeration that produce misleading interpretations. This is not how science ought to be done.
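
A quick simulation makes the point concrete. This is my own illustrative sketch, not anything from the article; the number of post hoc comparisons (20) and the group size (30) are assumptions chosen purely for illustration.

```python
# Sketch of the multiplicity problem (my own illustration, not from the
# article): run many unplanned two-sample comparisons on pure-noise data
# and see how often at least one comes out "significant" at the 0.05 level.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_studies = 1000       # simulated "studies"
n_comparisons = 20     # unplanned post hoc comparisons per study (assumed)
n_per_group = 30       # subjects per arm (assumed)

studies_with_a_hit = 0
for _ in range(n_studies):
    p_values = []
    for _ in range(n_comparisons):
        a = rng.normal(size=n_per_group)   # "treatment" arm, no true effect
        b = rng.normal(size=n_per_group)   # "control" arm, no true effect
        _, p = stats.ttest_ind(a, b)
        p_values.append(p)
    if min(p_values) < 0.05:
        studies_with_a_hit += 1

print(f"Null studies with at least one nominally significant comparison: "
      f"{studies_with_a_hit / n_studies:.0%}")
# Roughly 1 - 0.95**20, i.e. about 64% of completely null studies can
# still report a "positive" finding if the comparisons are cherry-picked.
```

Without a prespecified analysis plan or a multiplicity correction, "significant" post hoc findings like these are close to guaranteed.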

Couzin-Frankel's article ends with a discussion of journals dedicated to publishing negative results, as well as recent efforts by mainstream medical journals to allow publishing negative studies.

Non-reproducible research has also gotten the attention of The Economist, which ran a cover story and editorial on it a few weeks ago. As additional evidence they cite the following statistic: “In 2000-2010 roughly 80,000 patients took part in clinical trials based on research that was later retracted because of mistakes or improprieties.” Thus there are real consequences. Patients are needlessly exposed to clinical trials that may have negligible scientific value; their altruism is being abused. This should be a worldwide scandal, and I congratulate The Economist for shining a harsh light on the problem.

The Economist points out that much of this research is publicly funded, and hence a scientific scandal becomes a political and financial one. “When an official at America's National Institutes of Health (NIH) reckons, despairingly, that researchers would find it hard to reproduce at least three-quarters of all published biomedical findings, the public part of the process seems to have failed.” They then discuss the journal PLoS One, which publishes papers without regard to novelty or significance, judging them only on methodological soundness. “Remarkably, almost half the submissions to PLoS One are rejected for failing to clear that seemingly low bar.” Among the statistical issues the article discusses are multiplicity, blinding, and overfitting.

The Economist discusses the main reasons for these problems: scarcity of funding for science, which leads to hyper-competition; the incentive system that rewards non-reproducible research and punishes those interested in reproducibility; incompetent peer review; and statistical malpractice. They suggest a number of solutions: raising publication standards, particularly on statistical matters; making study protocols publicly available prior to running a trial; making trial data publicly available; and making funding available for attempts to reproduce work, not just to publish new work.

The Economist's article has generated a certain amount of controversy, but I think it gets it mostly right.  I would have formulated the statistical discussion differently, and I think the article misses the chance to point out more fundamental statistical problems.  I also don't give much weight to the comments by Harry Collins about "tacit knowledge".  A truly robust scientific result should be reproducible under slightly varying conditions.

References


Begley, C.G., and Ellis, L.M. (2012): Drug development: raise standards for preclinical cancer research. Nature, 483: 531-533.

Couzin-Frankel, J. (2013): The power of negative thinking. Science, 342: 68-69.

Prinz, F., Schlange, T., and Asadullah, K. (2011): Believe it or not: how much can we rely on published data on potential drug targets? Nature Reviews Drug Discovery, 10: 712.

