Earlier this month, Scientific American
blogger (and neuroscience graduate student) Jared Horvath posted a
rebuttal to the Economist cover story “How Science Goes Wrong”,
which I discussed in a previous post. Let's examine Horvath's post
in detail; as we'll see, there is not much to like.
Horvath asserts that “[U]nreliable
research and irreproducible data have been the status quo since the
inception of modern science. Far from being ruinous, this unique
feature of research is integral to the evolution of science.”
Horvath presents several historical examples of great scientists
publishing findings based on data that later investigators have not
been able to reproduce. Three cases are described in detail
(Galileo, Dalton, and Millikan) and others are mentioned (Mendel,
Darwin, and Einstein). Even so, he maintains that their
contributions were of tremendous value to science. “Their work, if
ultimately invalid, proved useful.” Horvath concludes that “If
replication were the gold standard of scientific progress, we would
still be banging our heads against our benches trying to arrive at
the precise values that Galileo reported. Clearly this isn't the
case.”
I believe Horvath's reading of history
is deeply flawed. First of all, my take-home from his case studies
is that it is very easy for scientists to fool themselves, and
sometimes they are lucky enough to be right. Second, the
“irreproducible” findings were only useful because they were
consistent with the findings of others, which in the aggregate
pointed the way to correct theories. In modern times, the
development of scientific theory would be greatly expedited by
publishing more defensible results in the first place.
Horvath goes on to describe the
serendipitous route to the discovery of Viagra by Pfizer scientists,
stating that it illustrates “the true path by which science
evolves.” Although the drug failed for its original indication, the
trial data revealed other, unexpected applications. “Had the initial
researchers been able to massage their data to a point where they
were able to publish results that were later found to be
irreproducible, this would not have changed the utility of a sub-set
of their results for the field of male potency.”
Again, I disagree with Horvath's
interpretation of events. The point of clinical research is not to
massage the data until it looks reproducible. Massaged data was never
the drug's problem; the problem was its failure to meet the original
study objectives. If the study design and execution are sound, no
fiddling with the data is needed at all, and that is precisely how
the Pfizer scientists salvaged the drug. It is a non sequitur to leap
from the nonlinear path of drug development to a blessing of
non-reproducible research.
Horvath goes on to criticize the
conventional portrait of the scientific method as progressing in
discrete, cumulative steps. “In reality, science progresses in
subtle degrees, half-truths and chance. An article that is 100
percent valid has never been published. While direct replication may
be a myth, there may be information or bits of data that are useful
among the noise.” Once again, I find these arguments non sequiturs.
The orthodox portrayal of the scientific method has been criticized
by countless philosophers of science. However, this issue is
completely disconnected from that of non-reproducible research. If we
were to bless the half-baked publication of results, as Horvath seems
to do, we would also be blessing the kind of work discussed by Glenn
Begley. Begley once cornered an oncologist
whose work he (Begley) could not reproduce. Upon questioning, the
oncologist admitted, “We did this experiment a dozen times, got
this answer once, and that's the one we decided to publish”
(Couzin-Frankel, 2013).
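To see why this practice is so corrosive, a quick back-of-the-envelope
simulation helps (a minimal sketch in Python of the general
cherry-picking problem, not a model of the actual experiments): even
when there is no real effect at all, the chance that at least one of
a dozen runs clears the conventional p < 0.05 bar is nearly
fifty-fifty.

    # Minimal sketch: under the null hypothesis a p-value is uniform on [0, 1],
    # so the best of 12 independent runs clears p < 0.05 with probability
    # 1 - 0.95**12, even though no real effect exists.
    import random

    random.seed(0)
    trials = 100_000
    hits = sum(
        min(random.random() for _ in range(12)) < 0.05  # best p-value of 12 runs
        for _ in range(trials)
    )
    print(f"simulated: {hits / trials:.1%}")   # ~46%
    print(f"analytic:  {1 - 0.95 ** 12:.1%}")  # 46.0%

Publish only the winner, and the literature fills up with “effects”
that were never there.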
Horvath goes on to celebrate what we can learn from failure, claiming
that “with enough time all scientific wells run dry.” Learning from
failure is certainly a
good thing, as discussed by Couzin-Frankel (2013). However, the
right way to do that is to learn from failures resulting from well
designed, conducted, and reported studies, not from the garbage that
the Economist article was addressing. In the latter case, I believe
“garbage in, garbage out” is the lesson. Horvath seems to be arguing
for “garbage in, gospel out” (the ironic motto of this blog)!
Do all scientific wells run dry? I
think not. The classical mechanics of Galileo and Newton still
provides the framework for the study of classical fluid dynamics, a
discipline that thrives to this day. The concepts of
Darwinian evolution continue to influence how we fight infectious
disease, for instance, by updating the influenza vaccine on an annual
basis.
I'm all for disclosing the messy nature
of scientific progress, learning from failure (and publishing the
results), and so on. However, all of these things can only be done
when critical thinking (including statistical thinking) is constantly
at work throughout the process. We should all be seeking
to make research reproducible not by “massaging data” but by
thinking critically, using good study design and execution, employing sound data
collection, and providing full disclosure of methods and results. It seems to
me that Horvath has not really understood the reasons for
non-reproducible research that have been put forward by Ioannidis (2005)
and others.
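For readers who want the arithmetic behind Ioannidis's argument, his
model reduces to a single formula for the post-study probability that
a claimed finding is true. The sketch below is simply that formula in
Python; the function name ppv is my shorthand, and the parameter
values are illustrative choices, not numbers taken from the paper.

    # Minimal sketch of the Ioannidis (2005) model: with pre-study odds R that
    # a tested relationship is true, significance level alpha, and power
    # 1 - beta, the positive predictive value of a claimed finding is
    #     PPV = (1 - beta) * R / (R - beta * R + alpha)
    def ppv(R, alpha=0.05, beta=0.2):
        return (1 - beta) * R / (R - beta * R + alpha)

    print(f"exploratory field, R = 0.01: PPV = {ppv(0.01):.0%}")  # ~14%
    print(f"confirmatory setting, R = 1: PPV = {ppv(1.0):.0%}")   # ~94%

When few of the hypotheses a field tests are true, most of its
“positive” findings will be false, and Ioannidis shows that bias and
many teams chasing the same question drive the numbers lower still.
That is exactly why design and disclosure matter.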
I hate to end the year on a sour note,
but this is likely to be my last post for 2013. Nonetheless, the raw
material for this blog is considerable, and hopefully I will have
more to post in the new year. Thanks for reading!
References
Jennifer Couzin-Frankel, 2013: The power of negative thinking. Science, 342: 68-69.
John P.A. Ioannidis, 2005: Why most published research findings are false. PLoS Medicine, 2(8): e124, 696-701.