Sunday, February 23, 2014

A call for more reproducible research in drug discovery/development

Phase III drug trials are typically randomized, blinded, controlled clinical trials.  However, last month in JAMA, Djulbegovic et al. (2014) noted that "more than 80% of phase 1 studies and more than 50% of phase 2 studies are currently nonrandomized."  They argue that these early phase studies should all be randomized, and that even preclinical studies in animals and cell cultures should be randomized as well.  (Randomization is probably even less prevalent in preclinical research than in the early phase studies discussed in the quote.)  In other words, the authors advocate study designs that encourage reproducibility across the spectrum of clinical and preclinical research.

They argue that nonrandomized studies can easily lead to incorrect decisions, both for and against a treatment, so randomized studies are ultimately more efficient and provide stronger backing for decision making.  (They also make the case that randomized studies are more ethical; I don't find their reasoning here quite as compelling.)  They hope that wider use of rigorous study designs across the drug development arena could be one way to address the industry's infamously high failure rate.

Here is a key passage from the article, describing the literature in preclinical research.

This literature yields an excess of statistically significant findings that cannot be eventually replicated, let alone translated into clinical successes.  For preclinical research conducted by the industry, routine adoption of rigorous randomized designs should be straightforward--no company wants to spend millions of dollars for the clinical testing of useless treatments.  In fact, industry researchers have taken the lead in raising the concerns about the reproducibility of preclinical research and suggesting partial solutions.  For preclinical research conducted by non industry researchers, similar rigorous practices can also be routinely adopted and requested.  Funders and journals can specify that they will sponsor and publish animal studies only if they fulfill rigorous randomization criteria.  Justified exceptions to this rule are likely to be rare.
(I have not included the footnotes; see the original.)  The major lesson for me in the above passage is that the main contribution of statistics to such studies is in the design, not the analysis.  "Statistically significant" findings by no means guarantee that a study can be reproduced, whereas sound study design greatly enhances the likelihood that it can be.  Unfortunately, much of the teaching and practice of statistics, by statisticians and non-statisticians alike, tends to emphasize the mathematical and computational side rather than study design.
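
As a side note, the randomization itself is mechanically simple; the discipline lies in committing to it up front.  Here is a minimal sketch (my own illustration, not from the article) of permuted-block randomization for allocating animals to two arms, with hypothetical animal IDs and group sizes:

```python
import random

def block_randomize(subject_ids, arms=("treatment", "control"), block_size=4, seed=None):
    """Assign subjects to arms using permuted blocks so the arms stay balanced."""
    rng = random.Random(seed)
    assignments = {}
    block = []
    for subject in subject_ids:
        if not block:
            # Build a fresh balanced block and shuffle its order.
            block = list(arms) * (block_size // len(arms))
            rng.shuffle(block)
        assignments[subject] = block.pop()
    return assignments

# Hypothetical example: 12 animals allocated before any outcomes are measured.
animals = ["mouse_%02d" % i for i in range(1, 13)]
print(block_randomize(animals, seed=42))
```

Logging the seed along with the allocation makes the assignment itself auditable and reproducible.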

I'd like to see a devil's advocate's response to all this.  I find the authors' views compelling and have difficulty imagining the grounds on which one might disagree.

Reference

Benjamin Djulbegovic, Iztok Hozo, and John P. A. Ioannidis, 2014:  Improving the drug development process:  more not less randomized trials.  Journal of the American Medical Association, 311 (4):  355-356.


Saturday, February 1, 2014

Responses to "When Mice Mislead"

This past week's issue of Science (the Jan. 24, 2014 issue) has two letters to the editor responding to a report last November, "When Mice Mislead" by Jennifer Couzin-Frankel, which I discussed in an earlier post.  The first letter, by Richard Traystman and Paco Herson, points to earlier findings in the stroke research community similar to those reported by Couzin-Frankel.  Most importantly, they assert that "It is unlikely that poor methods used in animal studies account for all the negative clinical trials that have been performed based on preclinical studies.  After all, some investigators do perform appropriate experiments, and even those studies rarely lead to positive clinical trials."  The authors point out that mouse studies are usually done with healthy young mice, whereas human subjects in neuroprotective drug clinical trials are often older and have many comorbidities.  They propose that aged mice with comorbid diseases be used in stroke studies, as a better animal model of human disease.

The second letter is from statistician Gary Churchill.  He zeroes in on one key question:  "Was the result replicated in more than one genetic background?"  He goes on to identify two "root causes" of nonreproducible research:

Science today is driven by an incentive system that often rewards precedence and impact over quality of the work.  Statistical training of scientists often emphasizes analytical techniques over experimental design and quantitative reasoning.  These are systemic problems that will not change without substantial effort.
Meanwhile, Churchill endorses the message of Couzin-Frankel's article with his maxim:  "Be wise, randomize."

I think that both of these letters add value to the original piece by Couzin-Frankel.  Churchill's second "root cause" strikes me as especially important:  statisticians, as well as the scientists and mathematicians who teach statistics, are all guilty of overemphasizing methodology, modeling, and inference at the expense of study design and critical thinking.
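
To make Churchill's point about genetic background concrete, here is a small sketch (mine, not his) of stratified randomization: animals are randomized within each strain, so the treatment effect can later be checked for replication across genetic backgrounds.  The cohort below is purely illustrative.

```python
import random
from collections import defaultdict

def stratified_randomize(mice_by_strain, arms=("treatment", "control"), seed=None):
    """Randomize within each strain so every genetic background appears in both arms."""
    rng = random.Random(seed)
    allocation = defaultdict(dict)
    for strain, mice in mice_by_strain.items():
        shuffled = list(mice)
        rng.shuffle(shuffled)
        for i, mouse in enumerate(shuffled):
            allocation[strain][mouse] = arms[i % len(arms)]
    return dict(allocation)

# Illustrative cohort spanning three genetic backgrounds.
cohort = {
    "C57BL/6": ["m01", "m02", "m03", "m04"],
    "BALB/c": ["m05", "m06", "m07", "m08"],
    "DBA/2": ["m09", "m10", "m11", "m12"],
}
print(stratified_randomize(cohort, seed=1))
```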

References

Jennifer Couzin-Frankel, 2013: When mice mislead. Science, 342: 922-925.

Richard J. Traystman and Paco S. Herson, 2014:  Misleading results:  translational challenges.  Science, 343:  369-370.

Gary Churchill, 2014:  Misleading results:  don't blame the mice.  Science, 343:  370.


The value and place of prespecifying data analysis plans

DTLR does not usually stray into the social sciences, but a paper in Science last month (Miguel et al., 2014) provides another opportunity to dwell on reproducible research. Prospective, designed experiments are becoming more common in the social and behavioral sciences, particularly in economics and program evaluation. However, as in the natural sciences, “Commentators point to a dysfunctional reward structure in which statistically significant, novel, and theoretically tidy results are published more easily than null, replication, or perplexing results.” Reporting standards in social science journals are as lax as those in biology journals, and “researchers have incentives to analyze and present data to make them more 'publishable,' even at the expense of accuracy.” Examples of poor practices include publishing only the positive results from a larger study with mixed or null findings, and presenting exploratory findings dressed up as confirmatory results.

The authors propose that three core practices be emphasized: disclosure, registration and preanalysis plans, and open data and materials. These concepts are familiar to those who work in clinical trials. However, the authors believe the situation can be improved even beyond the medical trial model, in which the “dominant role” of government regulatory agencies “arguably slows adoption of innovative statistical methods.” The authors also resist a “one-size-fits-all” approach to trial registration, preferring a method-specific approach. They foresee some convergence between methods used in behavioral research and those in medical trials, particularly in the neuroscience arena.

Near the end of the paper, there is a particularly eloquent passage that I'd like to quote in full.
The most common objection to the move toward greater research transparency pertains to preregistration. Concerned that preregistration implies a rejection of exploratory research, some worry that it will stifle creativity and serendipitous discovery. We disagree.
Scientific inquiry requires imaginative exploration. Many important findings originate as unexpected discoveries. But findings from such inductive analysis are necessarily more tentative because of the greater flexibility of methods and tests and, hence, the greater opportunity for the outcome to obtain by chance. The purpose of prespecification is not to disparage exploratory analysis but to free it from the tradition of being portrayed as formal hypothesis testing.
The above two paragraphs carry over readily to all experimental research, not just that in the social and behavioral sciences.
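
To make the confirmatory/exploratory distinction concrete, here is a toy sketch (my own, not from Miguel et al.) of what a preanalysis plan commits to: the primary outcome, the test, and the significance level are fixed before the data arrive, and anything outside the plan is still reported, but flagged as exploratory. The outcome name and group structure are hypothetical.

```python
from scipy import stats

# Prespecified before data collection (the "preanalysis plan"):
PRIMARY_OUTCOME = "math_test_score"   # hypothetical primary outcome
PRIMARY_TEST = stats.ttest_ind        # two-sample t-test, chosen in advance
ALPHA = 0.05                          # significance level, fixed in advance

def confirmatory_analysis(treatment_values, control_values):
    """Run only the prespecified primary comparison."""
    statistic, p_value = PRIMARY_TEST(treatment_values, control_values)
    return {"outcome": PRIMARY_OUTCOME, "statistic": statistic,
            "p_value": p_value, "significant": p_value < ALPHA}

def exploratory_analysis(values, label):
    """Anything not in the plan is still reported, but labeled as exploratory."""
    return {"label": label, "exploratory": True,
            "n": len(values), "mean": sum(values) / len(values)}
```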


Reference

E. Miguel et al., 2014: Promoting transparency in social science research. Science, 343: 30-31.