Sunday, December 29, 2013

The replication myth?

Earlier this month, Scientific American blogger (and neuroscience graduate student) Jared Horvath posted a rebuttal to the Economist cover story on “How Science Goes Wrong”, the latter of which I discussed in a previous post. Let's examine Horvath's post in detail; as we'll see, there is not much to like.

Horvath asserts that “[U]nreliable research and irreproducible data have been the status quo since the inception of modern science. Far from being ruinous, this unique feature of research is integral to the evolution of science.” Horvath presents several historical examples of great scientists publishing findings based on data that later investigators have not been able to reproduce. Three cases are described in detail (Galileo, Dalton, and Millikan) and others are mentioned (Mendel, Darwin, and Einstein). Nonetheless, he agrees that their contributions were of tremendous value to science. “Their work, if ultimately invalid, proved useful.” Horvath concludes that “If replication were the gold standard of scientific progress, we would still be banging our heads against our benches trying to arrive at the precise values that Galileo reported. Clearly this isn't the case.”

I believe Horvath's reading of history is deeply flawed. First of all, my take-home from his case studies is that it is very easy for scientists to fool themselves, and sometimes they are lucky enough to be right. Second, the “irreproducible” findings were only useful because they were consistent with the findings of others, which in the aggregate pointed the way to correct theories. In modern times, the development of scientific theory would be greatly expedited by publishing more defensible results in the first place.

Horvath goes on to describe the serendipitous route to the discovery of Viagra by Pfizer scientists, stating that it illustrates “the true path by which science evolves.” Although the original goal of the drug had failed, the data revealed other, unexpected applications. “Had the initial researchers been able to massage their data to a point where they were able to publish results that were later found to be irreproducible, this would not have changed the utility of a sub-set of their results for the field of male potency.”

Again, I disagree with Horvath's interpretation of events. The point of clinical research is not to massage the data until they are reproducible. That was not in fact the problem with the drug; the problem was its failure to meet the original study objectives. If the study design and execution are sound, no fiddling with the data is needed at all, and that is precisely how the Pfizer scientists salvaged the drug. It is a non-sequitur to jump from the nonlinear path of drug development to a blessing of non-reproducible research.

Horvath goes on to criticize the conventional portrait of the scientific method as progressing in discrete, cumulative steps. "In reality, science progresses in subtle degrees, half-truths and chance. An article that is 100 percent valid has never been published. While direct replication may be a myth, there may be information or bits of data that are useful among the noise." Once again, I find these arguments non-sequiturs. The orthodox portrayal of the scientific method has been criticized by countless philosophers of science. However, this issue is completely disconnected from that of non-reproducible research. If we were to bless the publication of half-baked results, as Horvath seems to do, we would also bless the kind of work discussed by Glenn Begley. Begley once cornered an oncologist whose work he (Begley) could not reproduce. Upon questioning, the oncologist admitted, "We did this experiment a dozen times, got this answer once, and that's the one we decided to publish" (Couzin-Frankel, 2013).

Horvath goes on to celebrate what we can learn from failure, claiming that "with enough time all scientific wells run dry." Learning from failure is certainly a good thing, as discussed by Couzin-Frankel (2013). However, the right way to do that is to learn from failures arising from well-designed, well-conducted, and well-reported studies, not from the garbage that the Economist article was addressing. In the latter case, I believe "garbage in, garbage out" is the lesson. Horvath seems to be arguing for "garbage in, gospel out" (the ironic motto of this blog)!

Do all scientific wells run dry? I think not. The classical mechanics of Galileo and Newton still provides the framework for the study of classical fluid dynamics, a lively discipline that thrives to this day. The concepts of Darwinian evolution continue to influence how we fight infectious disease, for instance, by updating the influenza vaccine on an annual basis.

I'm all for disclosing the messy nature of scientific progress, learning from failure (and publishing the results), and so on. However, all of these things can only be done when critical thinking (including statistical thinking) is constantly at work throughout the process. We should all be seeking to make research reproducible not by "massaging data" but by thinking critically, using good study design and execution, employing sound data collection, and providing full disclosure of methods and results. It seems to me that Horvath has not really understood the reasons for non-reproducible research that have been put forward by Ioannidis (2005) and others.

I hate to end the year on a sour note, but this is likely to be my last post for 2013. Nonetheless the raw material for this blog is considerable, and hopefully I will have more to post in the new year. Thanks for reading!

References


Jennifer Couzin-Frankel, 2013:  The power of negative thinking.  Science, 342:  68-69.

John P.A. Ioannidis, 2005:  Why most published research findings are false.  PLoS Medicine, 2 (8),  e124:  696-701.

Wednesday, December 25, 2013

Some notes on high energy physics

DTLR has been on hiatus for the last month or so, due to travel and other commitments.  I will slowly try to get back into the groove of things in the coming weeks.

A couple of items of interest related to high energy physics have appeared in the interim.  The first is a very candid Guardian interview with 2013 Physics Nobel Laureate Peter Higgs, by Decca Aitkenhead.  I found a few things striking in the interview.  First, he did not seem particularly bothered by the fact that only two of the six or so theorists involved in the prediction of the Higgs particle were awarded the Nobel Prize.  I was bothered, as described in earlier posts here and here.  Second, the article states that Higgs has not written many papers over the course of his career, and was considered an embarrassment to his department for his lack of productivity.  Again he does not seem to be bothered by this.  I do sympathize with his complaint that in today's research environment, he might not have had the time or space for the deep thinking required for formulating his theory.  In fact he says he might not be able to get a job in today's environment.  This should lead us to reflect on the degraded condition of scientific research infrastructure in our own time.

On the other hand, the lack of productivity would certainly bother me if it were to happen to me.  I would have either changed fields or careers, or somehow found some other way to contribute to society.  Perhaps Higgs continued at least to teach before his retirement?  Richard Hamming (1997) was unsympathetic to great scientists whose productivity slowed when they were given the opportunity to work without constraints, as at the Institute for Advanced Study in Princeton, NJ.  Hamming, like the Nobel laureate economist James M. Buchanan (1994), most admired hard workers who were driven to be productive.  (Hamming names John Tukey as a prototype of the successful hard-working scientist.)

Third, the story of the rejection of Higgs' initial manuscript is perhaps a salutary one, as it forced him to rewrite it and explicitly identify the new particle.  Evidently it made the importance of the paper more obvious and led to publication.  This is a (perhaps rare) example of peer review doing what it was intended to do.

Finally, it is stated that Higgs turned down a knighthood, but was tricked into accepting another national honor.  I agree wholeheartedly with his explanation:  "I'm rather cynical about the way the honours system is used, frankly. A whole lot of the honours system is used for political purposes by the government in power."  I also concur with his discomfort with the name "God Particle" that has been used to describe the Higgs boson.

Looking forward now, the incoming Fermilab director, Nigel Lockyer, has an interesting perspective on the future of big high energy physics experiments, published in Nature earlier this month.  The lack of resources for such big projects is forcing global cooperation.  For instance, he says that the world really needs a long baseline neutrino experiment, but that the world can only afford to pay for one.  With three candidate sites, physicists from all nations will need to work together at whichever one ends up being funded.  It is interesting to note that he says that talk of brain drains and gains has been replaced with 'brain circulation.'  All of this sits well with me.

References


James M. Buchanan, 1994:  Ethics and Economic Progress.    University of Oklahoma Press.  See Chapter 1 in particular.

Richard W. Hamming, 1997:  The Art of Doing Science and Engineering:  Learning to Learn.  Gordon and Breach.  See Chapter 30 in particular.




Tuesday, November 19, 2013

Follow up on John Bohannon's 'Open access sting'

Last month, I made note of the 'open access sting' carried out by John Bohannon and published in Science.  Last week, this post at the Scholarly Kitchen featured an interview with Bohannon in which he is given an opportunity to answer his critics.  I find all of his answers persuasive.  Bohannon has adequately defended his work, in my view.

However, I still sympathize with the critics who would have liked a control group of non-open-access publishers as well as those who believe the peer review system, writ large, is broken.  Bohannon makes clear that addressing both of these issues was out of scope for his project.  That's fair.  Many of us, however, are indeed concerned with these broader matters.  I don't think it would be necessary to carry out a 'sting' to demonstrate that non-open-access publishers, including top-notch ones like Science, have a track record of publishing substandard or even deeply flawed research.  I wrote at length about this earlier.

H/T:  In the Pipeline

Friday, November 15, 2013

bioRxiv goes live

In an earlier post I mentioned the forthcoming preprint server for the life sciences, bioRxiv.  According to Nature, the site has now launched.  See the write-up by Ewen Callaway here.



The STEM-Crisis Myth

The Chronicle of Higher Education this week has an article by Michael Anft examining the "STEM-Crisis Myth", or "The much-hyped shortage of science and tech graduates."  Politicians, business and academic leaders, and professional science societies are constantly warning of a shortage of American college students who want to study science and engineering.  The actual data backing up such claims, the Chronicle shows, is at best contestable.  The article tries to air both sides of the debate.  The president of Arizona State University is quoted as stating, "there's too much reliance on anecdotes" about the alleged poor job market for science graduates.

The Chronicle cites data alleging that about half of STEM majors leave their field within 10 years, and that 1 in 5 American scientists contemplate leaving the country.  There are dueling studies, however.  My personal experience is more consistent with the allegation that the STEM-jobs crisis is a myth.  The industry in which I once worked has experienced a massive downsizing of its workforce in the last decade or so, including in my field.  More senior scientists have had trouble finding new positions that make full use of their talents.  Meanwhile, young graduates have trouble finding jobs, and many post-docs have been trapped in academic limbo, with too many chasing too few faculty and industry positions.  Sequestration and the instability of the federal budget threaten federal funding for science across academia as well as the national labs.  Put simply, there isn't much money available for basic and applied research in the public and private sectors these days.

The 'myth' has been discussed in other venues long before this, and I am glad that the Chronicle has decided to cover it.  DTLR encourages skepticism about the alleged shortage of scientists in the labor market.  Those who have been talking up the shortage have a vested interest in increasing enrollments in universities, membership in scientific societies, and expanding the labor pool of science graduates in order to push down labor costs.  This includes academic, business, and professional society leaders across the sciences, as well as politicians.  The rhetoric is highly self-serving, because it promotes their own vested interests at the expense of the young people to whom they are serving up deceptive statements.  DTLR believes that any professional society, university, or business leader is committing fraud when they recruit youngsters into science and engineering with the promise of a bountiful job market when they graduate.  Many of them realize this, because they take an alternate tack by arguing that STEM training is a good foundation for any career, as the Chronicle notes, and the article ends by promoting the liberal arts idea of a "broader education" including the sciences and humanities.  I'm not sure there is much data to support these views.

Read the article and decide for yourself.

Reference


Michael Anft, 2013:  The STEM-Crisis Myth.  Chronicle of Higher Education, LX (11):  A30-A33 (Nov. 15, 2013).






Sunday, November 3, 2013

Non-reproducible research in the news

An epidemic of non-reproducible research in the life and behavioral sciences has been revealed in recent years. Much of the spadework has been done by John Ioannidis and collaborators, discussed earlier on DTLR. Well-known biopharmaceutical industry reports from Bayer (Prinz et al., 2011) and Amgen (Begley & Ellis, 2012) provide further confirmation.

Glenn Begley, one of the co-authors of these papers, was interviewed for a story by Jennifer Couzin-Frankel in the recent Science special issue on Communication in Science, discussed on DTLR last month. Couzin-Frankel (2013) discusses Begley's failed attempts to reproduce the results published by a prominent oncologist in Cancer Cell. At a 2011 conference, Begley invited the author to breakfast and inquired about his team's inability to reproduce the results from the paper. According to Begley, the oncologist replied, “We did this experiment a dozen times, got this answer once, and that's the one we decided to publish.” Begley couldn't believe what he'd heard.

Indeed, I am simultaneously shocked and not surprised. Shocked, because it displays an utter lack of critical thinking on the oncologist's part. Not surprised, because in my experience critical thinking is rarely formally taught to scientific researchers, and the incentive system for scientists rewards such lax behavior. The oncologist may have forgotten why he got into science and medicine to begin with. The pressures of a career in academic medicine may have corrupted his integrity, but the work of Ioannidis and others alluded to above shows that this phenomenon is pretty common.

The rest of Couzin-Frankel's article discusses how clinical studies often get published even when the primary objective of the study has failed. Usually (but not always) the authors are up front about the failure, but try to spin the results positively in various ways. For instance, by making enough unplanned post hoc statistical comparisons, they will inevitably find one that achieves (nominal) statistical significance, and they will use that to justify publication. Evidently journals allow this to occur, resulting in tremendous bias in what gets published. These are examples of selective reporting (cherry-picking) and exaggeration that result in misleading interpretations. This is not how science ought to be done.
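
To see why this practice nearly guarantees a publishable p-value, consider a minimal simulation (a sketch of my own for illustration, not anything from Couzin-Frankel's article): if twenty unplanned comparisons are run on pure noise, the chance that at least one of them clears p < 0.05 is about 1 - 0.95^20, or roughly 64 percent. Pre-specifying the primary comparison, or correcting for multiplicity, is exactly what prevents this.

import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

n_studies = 2_000       # simulated "studies"
n_comparisons = 20      # unplanned post hoc comparisons per study
n_per_group = 30        # subjects per arm

studies_with_hit = 0
for _ in range(n_studies):
    p_values = []
    for _ in range(n_comparisons):
        # Both arms are drawn from the same distribution, so any
        # "significant" difference is a false positive by construction.
        a = rng.normal(size=n_per_group)
        b = rng.normal(size=n_per_group)
        _, p = stats.ttest_ind(a, b)
        p_values.append(p)
    if min(p_values) < 0.05:
        studies_with_hit += 1

# Prints roughly 0.64, far above the advertised 5 percent false positive rate.
print(studies_with_hit / n_studies)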

Couzin-Frankel's article ends with a discussion of journals dedicated to publishing negative results, as well as recent efforts by mainstream medical journals to allow publishing negative studies.

Non-reproducible research has also gotten the attention of The Economist, which ran a cover story and editorial on it a few weeks ago. As additional evidence they cite the following statistic: “In 2000-2010 roughly 80,000 patients took part in clinical trials based on research that was later retracted because of mistakes or improprieties.” Thus there are real consequences. Patients are needlessly exposed to clinical trials that may have negligible scientific value; their altruism is being abused. This should be a worldwide scandal, and I congratulate The Economist for shining a harsh light on the problem.

The Economist points out that much of this research is publicly funded, and hence a scientific scandal becomes a political and financial one. “When an official at America's National Institutes of Health (NIH) reckons, despairingly, that researchers would find it hard to reproduce at least three-quarters of all published biomedical findings, the public part of the process seems to have failed.” They then discuss the journal PLoS One, which publishes papers without regard to novelty and significance, but only for methodological soundness. “Remarkably, almost half the submissions to PLoS One are rejected for failing to clear that seemingly low bar.” Among the statistical issues the article discusses are multiplicity, blinding, and overfitting.

The Economist then discusses the main reasons for these problems: scarcity of funding for science, which leads to hyper-competition; an incentive system that rewards non-reproducible research and punishes those interested in reproducibility; incompetent peer review; and statistical malpractice. They suggest a number of solutions: raising publication standards, particularly on statistical matters; making study protocols publicly available prior to running a trial; making trial data publicly available; and making funding available for attempts to reproduce work, not just to publish new work.

The Economist's article has generated a certain amount of controversy, but I think it gets it mostly right.  I would have formulated the statistical discussion differently, and I think the article misses the chance to point out more fundamental statistical problems.  I also don't give much weight to the comments by Harry Collins about "tacit knowledge".  A truly robust scientific result should be reproducible under slightly varying conditions.

References


C. G. Begley and L. M. Ellis, 2012: Drug development: raise standards for preclinical cancer research. Nature, 483: 531-533.

Jennifer Couzin-Frankel, 2013: The power of negative thinking. Science, 342: 68-69.

F. Prinz, T. Schlange, and K. Asadullah, 2011: Believe it or not: how much can we rely on published data on potential drug targets? Nature Reviews Drug Discovery, 10: 712.


Data management plan to be included in open access mandate

This month's APS News has a front-page report by Michael Lucibella, “Open Access Mandate will Include Raw Data.” The story focuses on the forthcoming mandate from the U.S. Office of Science and Technology Policy (OSTP), regarding open access to journal papers derived from federally funded research, one year after publication. Lucibella says that although no official statement has been made, it is expected that the mandate would include a data management plan, to make data sets generated by public funds available to the public as well. The story quotes the OSTP memo stating that “scientific data resulting from unclassified research supported wholly or in part by Federal funding should be stored and publicly accessible to search, retrieve, and analyze.” Specifics are “just starting to take shape.” The story goes on to outline various challenges to such a mandate.

One valuable feature of the mandate is that "Data points that have been expunged from the final analysis will likely have to be included, the idea being that scientists can evaluate why those points were eliminated." In principle this is a good thing, but it will be nearly impossible to enforce. There may also be some subjectivity involved: data that are clearly the result of documented technical errors should probably not be included (in my view), and transcription errors should be corrected before posting. In addition, I would like metadata to be included along with the raw data files.
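
To illustrate what I have in mind, here is a minimal sketch of how expunged points could be retained in a posted data set rather than silently dropped. The column names and the exclusion reason are my own invention for illustration and are not part of the OSTP memo or the APS News story.

import pandas as pd

# Hypothetical raw data release: excluded points are kept, flagged, and annotated.
raw = pd.DataFrame({
    "sample_id":        ["S01", "S02", "S03", "S04"],
    "measurement":      [1.02, 0.98, 7.45, 1.05],
    "excluded":         [False, False, True, False],
    "exclusion_reason": [None, None, "documented instrument fault; see run log", None],
})

# Archive the full record, including the excluded point and the reason for exclusion.
raw.to_csv("raw_with_exclusions.csv", index=False)

# The analysis data set is then derived from the archive in one reproducible step.
analysis = raw[~raw["excluded"]].drop(columns=["excluded", "exclusion_reason"])
print(analysis)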

The story states that computer codes would not be included in the mandate, “though talks are continuing over this point.” The story quotes statistician Victoria Stodden, who expresses concern about the omission of computer codes, which will obstruct reproducible research. I share Stodden's concern and I hope the mandate will include computer codes.

Modulo the concern about computer programs, DTLR endorses both the mandate to make journal articles public after one year and the mandate to make the data publicly available.  I was a co-author on two publications where we provided supplemental information that included data sets and computer scripts.  However, I've co-authored nearly 20 refereed papers in total, and obviously most of them did not include such supplemental information.  As a result, all these years later, it is impossible for me to reproduce any of that work.  (Caveat:  some of this research was not financed by public funds; nonetheless I believe the principle should apply to all published research.)  I wish such a mandate had been in place at the beginning of my career, so that all of my published work could be reproducible.  With job changes and so on, I've long lost track of data sets and computer codes that were employed in doing the work reported in those papers.

Reference


Michael Lucibella, 2013:  Open Access Mandate will Include Raw Data.  APS News, November 2013 (front-page story).

Online comments for scientific papers

This week's Nature has an editorial ("Time to Talk") discussing online commenting for scientific papers. The value of such a feature is to make a more permanent record of the sort of critical give-and-take over research that might be witnessed when work is presented at a conference or department seminar, or behind the scenes during peer review. Presently "lively debates on blogs and social media" can be difficult to find and preserve, as they are too diffuse (such as on individual blogs like DTLR). The editorial points out some fields, such as evolutionary biology, where an established online hub already exists. The alternative model is for each journal to host its own commenting feature. The editorial states that Nature's and the Public Library of Science's own commenting systems have not gained much traction.

Thus, Nature is excited that PubMed is stepping into the breach by adding a commenting feature that might allow it to serve as a highly accessible and visible hub for comments. Most of the rest of the editorial focuses on the potential for incivility in online commenting. For example, they cite the journal ACS Nano expressing concern that scientists are trigger-happy with charges of misconduct such as plagiarism or scientific fraud.

The comments on the Nature editorial itself are valuable (there are 3 as I write this). The first comment points out that one aspect not discussed is anonymity, and there are both pros and cons to it. I would favor allowing anonymous comments but giving signed comments greater visibility. For instance, signed comments could be visible by default, whereas one might have to click to see the anonymous comments. Editors should moderate comment sections and remove irresponsible content.

Thus far I myself have not offered user comments to any scientific literature. I am more likely to write about research here, on DTLR, unless I feel I have an extremely important point to make, and that I am extremely confident about making it. However, frankly I don't spend a lot of time reading original research these days, so I don't expect I'll be making much of a contribution.

A related development mentioned in the editorial is the imminent release by the Cold Spring Harbor Laboratory of bioRxiv, a preprint server for biologists, in the same mold as arXiv for physics. It will offer user comments.

Wednesday, October 30, 2013

Communication in Science: Pressures and Predators

The October 4, 2013, issue of Science features a special section, "Communication in Science: Pressures and Predators." The item receiving the most attention is a sting operation by John Bohannon, who submitted versions of a fake, deliberately flawed article to 304 open access journals. Roughly half of the journals accepted it for publication. It is difficult to separate journals that did so because of a slipshod editorial process (a failing shared by many traditional journal publishers) from those that are deliberately predatory, the topic of Jeffrey Beall's blog.

Bohannon's experiment was criticized by Michael Eisen for not including a control group, that is, a group of traditional, subscription-based journals, in its sample. Philip Moriarty at Physics Focus echoes Eisen's criticisms. Both are fairly hostile to traditional journals for similar reasons. The epidemic of non-reproducible research, discussed in earlier posts on DTLR, serves to illustrate the brokenness of the peer review system that they speak of. (Ironically, Eisen himself is interviewed in the piece that follows Bohannon's in the Science special feature, about "heretical" publisher Vitek Tracz.)  Eisen and Moriarty are pretty angry at Science for its hypocrisy, but they should have read the rest of the special feature. Jennifer Couzin-Frankel's piece on "The Power of Negative Thinking", which advocates publication of negative results, is quite blunt about the epidemic of non-reproducible research. For me, Couzin-Frankel's piece is the most important article in the special feature, and I will dedicate a separate post to it shortly.  (Also of great interest is the Policy Forum piece by Diane Harley -- I may write further about that one too.)

In the meantime, the points made by Eisen and Moriarty are well taken. Nonetheless, as Bohannon, Beall, and others have shown, the open access journal movement has opened the floodgates for hundreds of predatory online journals that maintain no standards whatsoever. This surely deserves the widespread coverage that Bohannon's piece has garnered. I only wish that Couzin-Frankel's article had attracted equal scrutiny, along with the recent moves by Nature to raise its level of play on these matters, discussed earlier on DTLR.

It should be disclosed that I once served as a referee for an open access journal from a publisher on Beall's list. Thus I can testify that they (OMICS Group) at least did send out one paper from one journal for review. (Fortunately my review was positive and the paper was published; I do not know what would have happened had I submitted a negative review. I also strongly suspect that I was chosen to referee the paper because the submitting author was asked to provide a list of names of potential reviewers. Many traditional journals do the same.) Some of the other publishers on Beall's list do not bother with even the appearance of a legitimate review process. I have also had a paper of my own rejected by an open access journal, one not included on Beall's list. The publisher of that journal, Hindawi, also passed Bohannon's test and rejected his phony paper.

Recommended reading


The Special Section in Science contains the following items (as well as a number of sidebar pieces by Jon Cohen and the other authors). Readers' attention was also called to a number of related Perspective and other items appearing in the same issue.

Richard Stone and Barbara Jasny, 2013: Scientific discourse: buckling at the seams. Science, 342: 57.

A cartoon by Randall Munroe (XKCD).

John Bohannon, 2013: Who's afraid of peer review? Science, 342: 60-65. (“A spoof paper concocted by Science reveals little or no scrutiny at many open-access journals”)

Tania Rabesandratana, 2013: The seer of science publishing. Science, 342: 66-67. (“Vitek Tracz was ahead of the pack on open access. Now he wants to rewrite the rules of peer review”)

Jennifer Couzin-Frankel, 2013: The power of negative thinking. Science, 342: 68-69. (“Gaining ground in the ongoing struggle to coax researchers to share negative results”)

David Malakoff, 2013: Hey, you've got to hide your work away. Science, 342: 70-71. (“Debate is simmering over how and when to publish sensitive data”)

Eliot Marshall, 2013: Lock up the genome, lock down research? Science, 342: 72-73. ("Researchers say that gene patents impede data sharing and innovation; patent lawyers say there's no evidence for this")

Jeffrey Mervis, 2013: The annual meeting: improving what isn't broken. Science, 342: 74-79. (“Annual meetings are moneymakers for most scientific societies, and scientists continue to flock to them. But as the world changes, how long can the status quo hold?”)

Diane Harley, 2013: Scholarly communication: cultural contexts, evolving models. Science, 342: 80-82.

Tuesday, October 29, 2013

Should there be an alternative to the Nobel Prize?

At Physics Focus, Tara Shears has a post that makes a similar suggestion to one that appeared in DTLR earlier this month, that a prize should be awarded for an accomplishment rather than to a set of individuals.  Naturally my thinking on the issue aligns well with hers.  One of the commenters on her post, John Duffield, pointed out that the Nobel committee is bound to obey the terms of Alfred Nobel's will, and that the suggestions of Shears and Sean Carroll (New York Times) should be applied to a new prize, not the Nobels.  I surely agree here too.  Thus, we'll always have the Nobels, warts and all, but their prestige will need to be rivaled by new prizes that better reflect the scientific process.  Will a wealthy benefactor step forward, one interested in reforming the reward system in science? Alternatively, Science and other magazines usually publish a list of the top 10 discoveries of the year or some such, and these put the focus on the achievements rather than the individuals.  Perhaps such lists, if suitably hyped up, could achieve what Carroll, Shears, and I are aiming for?

Sabine Hossenfelder has an opposing view on the Nobel Prizes on her blog.  It consists of two lines of reasoning.  The first boils down to the following.

Giving such an honor to institutions is akin to doing away with private property in communism and believing that everybody cares for the well-being of the group as they do for their own. It doesn’t work because most people want to be recognized as individuals, not as members of collectives. That’s true also for scientists.

I found this an unexpected but quite thoughtful contribution to the conversation.  It is true that as a member of a 'collaboration', winning a prize is not exactly something one can place on one's CV.  Indeed, I found it inappropriate when individual members of the IPCC claimed to be "Nobel Laureates".  In any case, the Nobel Prizes will continue to be awarded according to Alfred Nobel's will.  Perhaps having them co-exist with the Science list of top discoveries is both realistic and desirable.  The prestige and/or visibility of the Science list (or equivalent) just needs to be elevated to close to the level of the Nobel Prizes.

I do think that this year's prize, which gave short shrift to Kibble and robbed Brout because he had the ill fortune to die too young, still illustrates unresolved issues with the Nobel Prizes.  The Nobel committee probably would have lost credibility for not recognizing in some way the discovery of the Higgs boson, but there are few good ways to do so within the constraints of Nobel's will.  Still, perhaps Kibble should have been given the third slot.  A precedent is the 2001 Nobel in Physics for Bose-Einstein condensation, which went not just to Wieman and Cornell, who were "first", but also to Ketterle, whose early work arguably went further than Wieman and Cornell's but was published 4 months later.

Hossenfelder's second line of reasoning is that Nobel Laureates can be powerful spokespeople for their fields.  I don't find this one compelling.  Some of the leading spokespeople for physics, such as Stephen Hawking and Neil deGrasse Tyson, are not Nobel Laureates.  These folks are much better known to the public than most living Nobel Laureates in physics.

Monday, October 28, 2013

Upcoming on DTLR

This month has by far been the busiest since DTLR began in July of this year.  And yet my posts on the three biggest fish are still in the planning stages.  So, here I just want to alert readers to the background material on two of these.

First, there is the special section in Science on "Communication in Science:  Pressures and Predators," published earlier this month.  Second, there is the cover story in The Economist from last week, "How Science Goes Wrong."  (See also the accompanying Leader.)  Both of these tackle central topics of this blog, and I am grateful for the high profile these issues have taken.

The third 'big fish' of the month that I plan to comment on is a reaction to the federal government shutdown at the beginning of the month.  (Full disclosure:  I was furloughed as a result of this episode.)  I plan to muse on the role of private funding for science.

All three topics are very timely and I hope to have my comments available soon.  In the meantime, on the technology side, see this interesting post by David Auerbach at Slate on the travails of the Affordable Care Act's website rollout this month.

Sunday, October 27, 2013

Software validation in computational biology

Last month in Nature, there was a brief article by Erika Check Hayden about an experiment in peer review of scientific software being carried out by the new Mozilla Science Lab. Nine papers published in PLoS Computational Biology, selected by its editors, would have their code subjected to peer review by software engineers. The experiment and its motivation are described in the article; I also recommend reading the user comments posted at the end. (See also the earlier pieces by Zeeya Merali and Nick Barnes, published together in Nature in 2010.)  Apparently there has been some controversy, as scientists are understandably nervous about having their work subjected to a new form of review. However, scientists are not well trained in software development concepts such as version control, validation, and verification, and the code they write may become difficult to maintain or, even worse, produce undetected errors that work their way into published research.

The Mozilla Science Lab was introduced this past summer, and is led by Kaitlin Thaney. It sponsors Greg Wilson's Software Carpentry; I strongly recommend having a look at the latter's website. I've read Wilson's essays in Computing in Science and Engineering and other publications over the years, and have been sympathetic to his views. I've heard rumors that some of the code at Fermilab is spaghetti code, with bits and pieces of it written by many hands over many decades. Such an unwieldy mass of legacy code is almost impossible to maintain. I was told about one bug whose fix generated another, more serious bug that was impossible to debug. It was decided to restore the original bug and leave it in the code!

I am fortunate in that one of my formative experiences was an internship with a small company that, as a matter of survival, implemented a fairly disciplined software construction methodology, based in part on Steve McConnell's Code Complete. Because the company was small and had a certain rate of turnover, all of their software had to be highly maintainable, assuming the original coder was no longer employed at the firm. It was a point of pride there that you wouldn't be able to tell who wrote a piece of code found in the software they developed, without looking at the header (which had version control data), for we all conformed to the same software style.

DTLR endorses the Mozilla experiment in peer review of software. I hope we learn a lot from their experiment, even if it is deemed to be a failure in the end. In a letter to the editor, Alden and Read (2013) state that software quality should be built in from the beginning, before any data are taken, and not “inspected in” at the peer review stage. They are of course right, but to protect the rest of the community I do think software peer review is a concept that should at least be explored.
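
As a small illustration of what building quality in can look like (my own sketch, not anything prescribed by the Mozilla Science Lab or by Alden and Read), even a single unit test written alongside an analysis routine documents the intended behavior and catches regressions before they reach a published result. A version-controlled test suite of this kind is also exactly the sort of artifact a software reviewer could examine.

import numpy as np

def standard_error(x):
    """Standard error of the mean, using the sample (n - 1) standard deviation."""
    x = np.asarray(x, dtype=float)
    return x.std(ddof=1) / np.sqrt(len(x))

def test_standard_error():
    # Hand-checked case: for [1, 2, 3, 4], s = sqrt(5/3), so SE = s / sqrt(4).
    expected = np.sqrt(5.0 / 3.0) / 2.0
    assert np.isclose(standard_error([1, 2, 3, 4]), expected)

if __name__ == "__main__":
    test_standard_error()
    print("All tests passed.")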

References


Nick Barnes, 2010: Publish your computer code: it is good enough. Nature, 467: 753.

Zeeya Merali, 2010: Computational science: ...error. Nature, 467: 775-777.

Erika Check Hayden, 2013: Mozilla plan seeks to debug scientific code. Nature, 501: 472.

Kieran Alden and Mark Read, 2013: Scientific software needs quality control. Nature, 502: 448.


Biology's dry future

A few weeks ago, Science magazine featured a very interesting story by Robert F. Service titled “Biology's Dry Future.” The subtitle tells us, “The explosion of publicly available databases housing sequences, structures, and images allows life scientists to make fundamental discoveries without ever getting their hands 'wet' at the lab bench.” The story highlights two quotes from interviews. The first is by Atul Butte of Stanford University School of Medicine: “I'm like a kid in a candy store. There is so much we can do.” The second is by David Heckerman of Microsoft Research: “You basically don't need a wet lab to explore biology.”

The title of the story is not quite accurate. There will always need to be wet lab biologists to do experiments and generate data. What is novel here is the new breed of biologist who works on data generated by other labs, but need not have a lab themselves. This may be new to biology, but physicists have long had a split between experimentalists and theorists, recently joined by computationalists.

Of particular interest to DTLR are the three “growing pains” mentioned in the article: data access, data standardization, and genetic privacy. I will focus on the first two here. Regarding data access:

In many cases, researchers who have spent their careers generating powerful data sets are reluctant to share. They may be hoping to mine it themselves before others make discoveries based on their work. Or the data may be raw and in need of further analyses or annotation. “These are really hard problems,” Butte says. “We need better systems to reward people that share their data.”

DTLR endorses that last sentence. First of all, anyone who makes the effort to generate a good data set should make the effort to document and annotate it for use. Even if the data are never shared, pretending that they might be shared one day instills the necessary discipline for documentation and annotation. Moreover, if the work is publicly funded, then in my view the social contract requires that the data be made available to the broader scientific community at some point, perhaps after an appropriate period of exclusive use, say, no more than two years. (This is about the time needed for a grad student or post-doc to squeeze at least one paper out of the results.) The new Nature online journal for data sets would provide an excellent venue to generate a peer-reviewed publication for the data set alone, rather than for discoveries that can be made with it. Bear in mind that in physics, Nobel prizes are awarded to both theorists and experimentalists. Biology as a discipline should adopt a similar cultural mindset to reward both wet bench and dry bench biologists.

Regarding standardization:

Not only do research groups file their data using different software tools and file formats, but also in many cases the design of the experiments—and therefore precisely what is being measured—can differ. Butte and others argue that dealing with multiple file formats is somewhat cumbersome but that the problem is surmountable. But it can be harder to account for differences in experimental design when comparing large data sets.

DTLR could not have said it better. The core problem here is experimental design, and it will always be a limiting factor for dry lab biologists trying to combine data from more than one experiment. A similar problem exists in clinical medicine, under the term 'meta-analysis', and I'm not sure there are really good solutions there either. The best approach, in my view, is to take any findings based on multiple data sets as tentative, exploratory, and hypothesis-generating, rather than definitive. The findings should then be confirmed (or refuted) in a new experiment. This is where the dry lab biologist might have to return to the bench.
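
To make the meta-analysis analogy concrete, here is a minimal sketch of the standard DerSimonian-Laird random-effects calculation; the numbers are invented for illustration, and this is my own example, not one from the Science story. The between-study variance tau^2 is precisely where differences in experimental design show up: the pooling arithmetic quantifies the heterogeneity, but it cannot remove it.

import numpy as np

# Hypothetical effect estimates (e.g., log fold-changes) and standard errors
# from three studies with differing designs; the values are invented.
effects = np.array([0.40, 0.10, 0.65])
se      = np.array([0.12, 0.15, 0.20])

# Fixed-effect (inverse-variance) pooling
w = 1.0 / se**2
pooled_fixed = np.sum(w * effects) / np.sum(w)

# Cochran's Q and the DerSimonian-Laird estimate of between-study variance
Q = np.sum(w * (effects - pooled_fixed) ** 2)
k = len(effects)
tau2 = max(0.0, (Q - (k - 1)) / (np.sum(w) - np.sum(w**2) / np.sum(w)))

# Random-effects pooling: heterogeneity widens the uncertainty of the summary
w_re = 1.0 / (se**2 + tau2)
pooled_re = np.sum(w_re * effects) / np.sum(w_re)
se_re = np.sqrt(1.0 / np.sum(w_re))

print("Q =", round(Q, 2), " tau^2 =", round(tau2, 3))
print("fixed-effect estimate:  ", round(pooled_fixed, 3))
print("random-effects estimate:", round(pooled_re, 3), "+/-", round(se_re, 3))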

Finally, DTLR cautions that dry lab biologists should still spend some time in the lab, at least while in training. There is no substitute for bench time for getting a feel for how sloppy and imprecise experimental data can be, and for learning where the pitfalls and potential systematic and random errors may arise. It is too easy for a dry bench scientist to take data found in a database at face value. Spending time at the bench will provide a needed reality check.

Reference


Robert F. Service, 2013: Biology's dry future. Science, 342: 186-189.

Sexual harassment in science

Last week's issue of Nature calls for an end to sexual harassment in science. The editorial was triggered by a scandal centered on a blogger who was the editor of the Scientific American blogs. He has resigned. The editorial casts a wider net, introducing the generic character “Dr. Inappropriate” who represents “the widespread tacit acceptance of adolescent behavior.” DTLR endorses their call for fighting back against sexual harassment in science. Although I am keenly sensitive to the potential for false accusations, I condemn scientists who commit sexual harassment, particularly when they do so in the context of an unequal power relationship with the target.

As a male scientist, I have been fortunate not to have been the target of such harassment. However, as a graduate student in the late 1990s, I witnessed an incident involving two “Dr. Inappropriates” that shocked me out of my sheltered perception about the behavior of mature scientists.

The setting was as follows. Our department had a weekly seminar, and on one occasion the speaker was a distinguished physics educator from another university.  She was the author of a textbook being used in an experimental introductory class in my department. She had brought along her (female) teaching assistant (now a faculty member in her own right). Following the seminar, they invited the audience to join them in a nearby classroom where they would run a simulation of a cooperative learning class, with us as the 'students'. (Around this time I myself had already taught a couple of recitation sections using the cooperative learning format.) A sales representative from the publisher of the experimental textbook was also present; she, too, was a woman.

I was one of the more junior graduate students at the time, just getting started in research.  (At the time there were very few female faculty or students in the department; there are considerably more today.)  I sat down at one of the tables in the classroom, and was soon joined by three male physics faculty members. (Two are now retired; the third is still on the faculty as I write.) Let us call them E, K, and H.  Professor E had a strong interest in physics education, and I had worked with him in the past. The other two didn't know me very well. In any case, K took the time to introduce himself to me again, and there was some small talk. H more or less ignored me.  (I have other unflattering stories to tell about them, but those will be for another time.)

So, the simulated cooperative learning class began. We were all given an assignment to work on in our groups. Each group consisted of whoever was sitting at a table together. The seminar speaker and her assistant circulated in the room, acting as Socratic facilitators for the groups. They listened to our conversations among ourselves, and tried to help us along. E played along with the simulation. However, K and H started drifting into “adolescent behavior”. First, K commented to the rest of us that the teaching assistant, out of earshot, was cute. K and H then pretended to be 'bad students' and basically annoyed the teaching assistant when she stopped by to listen to us or help us along. Unlike E, they were not taking the exercise seriously.  

At one point, the publisher's sales representative came over, obviously interested in promoting the textbook. By now I sensed that K had lost interest in the whole exercise, but H (who was on the textbook committee) started chatting with the sales rep. I can't remember the nature of what he said, but I do remember the impression that he started getting very flirtatious with her. Her reaction was very polite and professional, and she rapidly maneuvered the conversation toward discussion of her kids. She showed H a wallet picture of the children. This put an end to the flirtation, and they started talking about the textbook instead.

The sales rep's maneuver struck me as masterful. It could not have been the first time she had to deal with this sort of thing, and she maneuvered H out of the 'adolescent' behavior firmly but without ruffling his feathers at all. I appreciate the awkward position she was in. She was obviously trying to curry favor with a member of the textbook committee, but she was also quick to shut down the flirtation.

This was the first time I had seen physics professors behaving badly. I regret that I did not challenge them. It would have been risky for me to do so, but that is no excuse. A challenge is precisely what the Nature editorial recommends as a non-legalistic way to discourage such conduct. (However, based on my other interactions with K and H, I speculate they would have ignored any such challenge from me, or even attempted retaliation.) I can only imagine how much more difficult it would have been had I been a female graduate student instead. It would be totally demoralizing to witness the entire episode and understand that this is how some physics professors are capable of treating women when in a professional setting.

I ran into E a few days later, and he expressed his dismay at his colleagues' behavior. (He was also a little unhappy with me, as I pretty much said nothing throughout the whole exercise, failing to back him up in playing along with the simulation.) It is notable that even E did not feel empowered to challenge K and H in person at the time – all three were tenured. Tenured professors are untouchable.

What do I take away from this? It isn't clear whether sexual harassment by any legal definition took place. The episode I describe falls into what the Nature editorial calls a grey zone. K's comment about the 'cute' teaching assistant occurred when she was out of earshot, and didn't lead to anything further. H's flirtations ended once it was made clear that they were unwelcome. K's and H's 'bad student' act was not sexual harassment, but just general 'adolescent' behavior. Nonetheless, my witnessing of this episode made it possible for me to imagine that sexual harassment might indeed take place, under different circumstances, and that it could be perpetrated by people who are set up to be mentors and authority figures to me and other students. Indeed, what I observed that day seems relatively harmless compared to the Scientific American blog scandal, where a real abuse of power is alleged.

The reason the episode was a shock to me was that every single faculty member I had ever interacted with until then was, in my view, an honorable and professional person, and I had never seen professors act immaturely until that moment. Perhaps the one positive outcome was that my blissful naivete ended that afternoon.

Recommended reading


Eileen Pollack, 2013:  Why are there still so few women in science?  New York Times, Oct. 3, 2013.

UPDATE:  A bit off topic, but Sabine Hossenfelder has a (back)reaction to Pollack's piece on her blog.


Friday, October 25, 2013

October 2013 is a great month for physics fans

Of course the month began with the announcement of the Nobel Prizes in physics, chemistry, and medicine. The conversation about the physics prize in particular, discussed earlier on this blog, has been lively. Meanwhile, here in the U.S., Physics Today offered its usual bounty in this month's issue, including an article on "Measuring the Hubble Constant" by prominent astrophysicists Mario Livio and Adam Riess, the latter a Nobel Laureate. (More on this article below.) And the 50th anniversary issue of the New York Review of Books features an essay on the last 50 years or so of the development of particle physics and cosmology by Nobelist Steven Weinberg (Nov. 7, 2013 issue). Both fields, though in states of disunity and confusion in the early 1960s, gradually came to develop their own highly successful and unifying 'standard models'. The two fields also converged. Weinberg manages to tell the story without naming a single physicist, including himself.

In the U.K., it gets even better. Physics World celebrates its 25th anniversary by publishing a spectacular special issue, which is available for free download (volume 26, number 10, October 2013). And last weekend the Financial Times' FT Weekend Magazine offered its first special issue devoted to a single science, physics (Oct. 18, 2013 issue). (Unfortunately there doesn't seem to be a single link that compiles all the online articles. Here is the lead editorial.) Finally, even The Economist covers physics in an article about the possibility of particle accelerators made of glass, which would allow them to be smaller (Oct. 19, 2013 issue).

Returning to the article by Livio and Riess, I note with particular interest a discussion comparing measurements of the Hubble constant H0 from global methods, such as those made by the Planck satellite, with measurements based on local objects.

Local measurements of H0 are complementary to other, higher-redshift probes. Indeed, we'd be remiss if we did not note an apparent tension, at the 3 sigma level, between current measurements of H0 based on local objects and its deduced value based on the standard cosmological model and new Planck results for the cosmic microwave background. That tension may be the harbinger of new physics, but past experience indicates that discrepancies below 3 sigma disappear when more data are available.

Indeed, the threshold for discovery in physics is often five sigma, which was the threshold used in the discovery of the Higgs boson. As a physicist who has strayed into the life sciences, I find two aspects particularly striking. First, I admire physicists' relentless skepticism of 3 sigma results (2 sigma is routinely considered 'statistically significant' in the life and behavioral sciences) and their willingness to collect more data. The epidemic of non-reproducible research in the life and behavioral sciences betrays the precise opposite tendency in those fields. Second, in physics when we talk about measuring universal constants, there are often many independent procedures for measuring the same universal phenomena. This has been true throughout the history of physics. In the clinical sciences, there is usually exactly one clinically meaningful endpoint, and substitutes (biomarkers, surrogate endpoints) may provide compelling evidence, but never sufficient evidence in a true outcomes study. If you want to prevent a relapse of cancer, you must measure the time to relapse. If you want to prevent a heart attack, measure the time to the next one.
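
For readers outside physics, the sigma thresholds translate into tail probabilities as follows (a quick sketch of my own; particle physics usually quotes the one-sided probability, while the life sciences usually report two-sided p-values).

from scipy.stats import norm

for sigma in (2, 3, 5):
    one_sided = norm.sf(sigma)        # P(Z > sigma)
    two_sided = 2 * norm.sf(sigma)    # P(|Z| > sigma)
    print(f"{sigma} sigma: one-sided p = {one_sided:.1e}, two-sided p = {two_sided:.1e}")

# Roughly: 2 sigma corresponds to the familiar p ~ 0.05 (two-sided), 3 sigma to
# p ~ 0.003 (two-sided), and 5 sigma to p ~ 3e-7 (one-sided), the particle
# physics discovery threshold.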

References


The Economist, 2013: Small really is beautiful. The Economist, vol. 409, no. 8858, pp. 83-84.

Mario Livio and Adam G. Riess, 2013: Measuring the Hubble constant. Physics Today, 66 (10): 41-47.

Steven Weinberg, 2013: Physics: what we do and don't know. The New York Review of Books, vol. LX, no. 17, pp. 86-88.



Sunday, October 20, 2013

Making decisions with experts

In yesterday's New York Times, economist Noreena Hertz writes about medical decision making, illustrating it with her personal experience. The piece is entitled “Why we make bad decisions” and focuses on the case where a lay person must make a decision about his or her health or finances in consultation with one or more experts. Physicians often make errors, and the more confident ones are more prone to error. Unfortunately the lay person usually defers to the expert.

How does a lay person evaluate expert opinion, or aggregate multiple, conflicting expert opinions? The author suggests that first, you have to educate yourself. Go into a conversation with the expert as knowledgeable as you can be, with as much literacy in the jargon of the field as you can pick up. Be aware of your state of mind and your lack of objectivity. There are known biases and heuristics that might lead to irrational decisions; be aware of them and of your vulnerability to them. Her example is the optimistic bias many of us seem to have, well documented in studies of human behavior. We have a tendency to latch on to good news and tune out bad news.

I would add that often the expert is responding to a set of incentives that differ from yours. Thus, you need to be your own best advocate; do not rely on the expert to have your best interests at heart.

Over the years I've dipped into the literature on “judgment and decision making,” which is what this area of research is called in psychology, or “behavioral economics,” which is what economists call it. I hope to learn more about the field in the future.

DTLR salutes Professors Casadevall and Fang

Back in January, Science magazine featured a story by Jennifer Couzin-Frankel about a pair of dissident medical scientists, Arturo Casadevall and Ferric Fang. They're convinced that science today is very unhealthy, and they have done a number of data-driven studies on the infrastructure of scientific research to diagnose various problems with the enterprise. They've looked at peer review, retraction rates, funding mechanisms, and the incentive system for scientists. Initially I was not familiar with their work, so I began by reading their two editorials in Infection and Immunity, here and here, alluded to in their Huffington Post blog post from 2012. (They also revisit these issues in a post on Project Syndicate from this past summer.) Their intent is to spark conversation rather than provide definitive solutions, so I will accept the invitation to discuss their findings. They write, "What we propose is nothing less than a comprehensive reform of scientific methodology and culture," and I certainly agree that such a dramatic overhaul is needed.

This blog, DTLR, has been particularly concerned about non-reproducible research. Casadevall and Fang do not address this particular issue directly. Rather, they tackle some of the underlying pressures on scientists due to the incentive system they currently face. Their diagnosis of the problems has the ring of truth, and I refer readers to their papers; I won't rehash their work here. Rather, I want to comment on their proposed solutions. Addressing the issues they've identified is likely to reduce the incidence of non-reproducible research that I've written about on this blog.

First, the authors propose reforming the reward system for scientists. This includes eliminating the priority rule, which gives credit to the first to publish. The authors recognize the value of competition, but they want to introduce complementary reward mechanisms for collaborative research. No specific details are given, so the idea needs to be fleshed out. Nonetheless I agree on the principle. They also want to replace easy "surrogate methods of quality, such as grant dollars, bibliometric analysis, and journal impact factor" with "careful peer evaluations of scientific quality". Of course, this is easier said than done. In principle I agree here too, but we need to see a detailed, specific proposal on how this would be done. Arguments about the flaws of impact factors, h-indices, and so on are a perennial favorite in the scientific community, so doubtless there are many ideas out there.

Next, the authors talk about re-embracing philosophy. It sounds as though they want to add one or more philosophy courses to the curriculum for science students. Practically, this would be difficult, as the curriculum is already packed full. Moreover, I think that if students are forced to take such a course, many may not take it seriously and others may resent it. I happen to be one of those who would welcome such a course (and I did take Philosophy of Science in college), but I would hesitate to impose my values on others. Also, some students might rather read a book or two than take a full-blown, for-credit course on the subject. Thus, I would make such a course available to every student, but keep it optional rather than required. (In my case, I've continued to read bits and pieces of philosophy throughout my career in science.)

Next, they call for enhanced training in probability and statistics. They recognize the value of statistics in the “design, execution, and interpretation of many scientific experiments.” I would certainly applaud this, with a caveat: training should be focused on statistical thinking, with statistical methodology taking a back seat. Too often the training students do get in statistics is focused on methods and software rather than critical thinking, and the misuse of statistical methods, abundant in the literature, is the result. Many scientists, and many statisticians, are simply not qualified to teach statistical thinking. My book review of Marder (2009) on this blog touches on many issues in this regard, and I plan to tackle other specific cases in future posts. Also in a future post, or more likely in another forum altogether, I will expand on the distinction between statistical thinking and statistical methodology in relation to scientific training and practice.

Next, the authors call for developing checklists for reducing methodological errors. This is a superb suggestion, and they provide an example for one case, “observation in which a stimulus elicits an effect”. The example is well thought out. Earlier on this blog I noted a new checklist introduced by Nature for encouraging reproducible research. These are welcome steps forward; more progress is needed.

The authors then turn to structural reforms. They call for more public funding for science and an increase in the number and diversity of new scientists. They find that in recent years, directed research has overshadowed investigator-initiated research in NIH funding. Here I part company with the authors. I do agree that working to increase the diversity of the scientific community is important. I cannot agree that increasing the absolute size of the community, either in the number of scientists or the amount of funding, is a realistic or desirable goal. Frankly, until the scientific community cleans up its own act, by dramatically restructuring the incentive system so that reproducible research is rewarded and non-reproducible research is punished, I would vote to reduce science funding even further, and I would discourage young people from considering science careers. This is a very harsh stance, but I don't think any stance less extreme will convey my level of anger and dismay. As a practical matter, it is politically unlikely in the United States that funding for science will increase when funding for many other worthy societal goals is necessarily decreasing.

Finally, the authors call for reform of the regulatory burden on scientists, restrictions on laboratory size, and a “scientific study of science.” I do not dwell in the academic trenches of science, so I will only briefly discuss the last point. They discuss trying to discover the optimal number of scientists in society, the optimal research group size, the optimal length of scientific training, and so on. I would encourage study of all of these questions, but I would not expect blanket “optimal” answers to result. The system must allow for slack and variability. Such research can only provide guidance, not mandatory and immutable rules. In fact, I would expect studies addressing some of the questions they pose to end up completely inconclusive.

Nonetheless, Casadevall and Fang deserve much praise for asking very tough questions and calling for a dramatic reform of the science community's infrastructure. Although I do not agree with all of their positions, on balance I believe they are on target, and the rest of us should rally around them. Science needs more internal critics like them and like John P. A. Ioannidis (whom I've written about in an earlier post). All of these folks go out on a limb to make controversial observations about how science functions and how it goes wrong, as well as proposing solutions. We all need to follow their lead.

References


Arturo Casadevall and Ferric C. Fang, 2012: Reforming science: methodological and cultural reforms. Infect. Immun., 80 (3): 891-896.

Jennifer Couzin-Frankel, 2013: Shaking up science. Two journal editors take a hard look at honesty in science and question the ethos of their profession. Science, 339: 386-389.

Ferric C. Fang and Arturo Casadevall, 2012: Reforming science: structural reforms. Infect. Immun., 80 (3): 897-901.




Saturday, October 19, 2013

Why can't science and the humanities get along?

Last week's cover of the Chronicle Review features a rambling essay by historian David A. Hollinger, “The Rift: Can STEM and the humanities get along?” It focuses on the alleged rift between the sciences and the humanities within the academy. At some level I don't really care about the culture of the academy, which is disconnected from the rest of society in many ways. However, let me try to make sense of the article anyway.

First, Hollinger makes the case that there is such a rift (based on anecdotal evidence: complaints he's heard from colleagues), and that the wedge “threatens the ability of all modern disciplines to provide—in the institutional context of universities—the services for which they have been designed.” I would argue, though not here, that there are many other reasons that universities struggle to provide the services for which they were designed, at a reasonable cost, and that the rift Hollinger speaks of makes a negligible contribution. This is really a topic not for a separate post but for another blog entirely, so I will not pursue it further here. (Please consult the Spellings Commission Report on the Future of Higher Education if this interests you.)

Next, Hollinger dismisses the “Two Cultures” conversation of C.P. Snow, because the current rift takes place in a far more hostile political, social, and cultural setting than that of 50 years ago. Hollinger points to the more recent culture wars, but fails to mention the notorious Alan Sokal hoax and the 'science wars' of 1990s vintage. Hollinger states that the problem boils down to the accusation that humanities courses and research are often tainted by leftist ideological bias (or political correctness), and that few have accused science courses of displaying such biases. Hollinger counters that scientists and engineers are just as capable of making “fools of themselves” in other ways. I have no dispute with this point, as much space on this blog is devoted to the problems of non-reproducible research in science and medicine. In fact, the problems I write about are arguably much worse for society than any relatively harmless indoctrination that takes place in humanities classrooms for privileged middle-class students. Nonetheless, this does not excuse or make permissible such indoctrination, if indeed it is taking place. The fact that both the sciences and the humanities are capable of crummy work does not explain why there is a rift between them. [Regarding indoctrination in the classroom, see Fish (2008).]

Next, Hollinger discusses a report by the Commission on the Humanities and Social Sciences, which apparently prompted him to write this piece. I have not read that report, but his main complaint is that it ignores the “deep kinship between humanistic scholarship and natural science.” Hollinger states that the humanities “are the great risk takers in the tradition of the Enlightenment”. He provides two arguments for this claim. First, the humanities have defended the use of evidence and reason, which should give them common cause with the sciences in a culture that is rapidly devaluing such tenets of critical thinking. This is only partially true. Critical thinking is certainly valued in some sectors of the humanities, but it is flouted in others, as the Sokal hoax showed. (This blog shows that science is guilty of the same.) If I had time I would present other examples of bad reasoning found in the journal literature of literary studies and other humanities fields, but I'm devoting this blog to bad reasoning found in the sciences. Suffice it to say that I do not think conditions are better in the humanities than they are in the sciences, but this is based on personal experience, and I haven't made the effort to document it.

Second, Hollinger presents a single example of risk taking by the humanities: the hiring of Harold Cruse in the 1960s by the University of Michigan. Cruse did not complete college, but he wrote a critically acclaimed book on African-American intellectual history, and was hired as a full professor of history. However, today's academy would not take such a risk, in either the sciences or the humanities. The tenure system, as it now operates, discourages risk taking and rewards conformism for those on the tenure track. It also protects substandard work by those who achieve tenure. There are certainly exceptions, but by and large the incentives created by the tenure system have the opposite effect of its intent to protect faculty who are doing unpopular research and questioning authority and vested interests.

Hollinger states that the humanities exist in the borderlands between mere opinion and the “methodologically narrower, largely quantitative, rigor-displaying disciplines”, or between “scholarship and ideology.” Hence the risk taking is greater there than in the sciences. Examples include the studies of historically disadvantaged groups, such as women's studies, Latino studies, and the like. This is probably the only point in the article in which I find any merit.

Hollinger mentions Texas Republicans who want to remove a critical thinking component from public education. He states that “Colleagues in the natural sciences would do well to imagine what society would look like with a significantly diminished place for the human sciences. Techies yes, critics no.” Actually it doesn't require much imagination. The Soviet Union was a superb example, and physicist Andrei Sakharov was supremely accomplished as both a “techie” and a “critic”.

Finally, Hollinger complains about salary inequity among various university departments. He shows that he has a good understanding of the economic reasons for such inequities: in fields where talented individuals have attractive alternatives to working in a university, a university has to compete financially for those individuals' services. Nonetheless, Hollinger seems to feel affronted by this state of affairs. I think he should take his own advice and enroll in a microeconomics course on the other side of his campus. He also thinks universities are degenerating into “sites where lots of independently interesting valuable things happen” instead of coherent institutions where “being a professor was a calling in itself.” My reaction to this is: so what?

In summary, there is very little in the essay that I can use. There is good stuff and bad stuff going on in both the humanities and in the sciences. Does any of this explain why there is a rift between them? I am a scientist by training and by employment, but I read a lot of history (and occasional bits of philosophy), listen to J.S. Bach, and look at the paintings of George Tooker. Hollinger's article does not seem to be relevant to my intellectual life. What about yours?

References


Stanley Fish, 2008: Save the World on Your Own Time. Oxford University Press.

David A. Hollinger, 2013: The rift: Can STEM and humanities get along? The Chronicle Review, Oct. 18, 2013 issue, pp. B6-B9.

Wednesday, October 16, 2013

Will you help me replicate your experiment?

Last week in Nature Medicine, Elizabeth Devitt wrote about two studies that illustrate the perverse incentives in the scientific community in relation to reproducible research and data stewardship in the life sciences.

First, authors of papers in the Annals of Internal Medicine were surveyed about their willingness to provide additional information about their study protocols to other investigators who may be trying to replicate their experiments.  The proportion answering 'yes' dropped from about 80% in 2008 to 60% today (see her article for a graphic).  Now, I don't know whether the size of this drop is comparable to the sampling error; it is not clear what survey methodology was used or how precise the estimates are (a rough numerical sketch follows below).  Nonetheless, the article discusses the lack of incentive for scientists to help others reproduce their experiments.  This is a major gap in the incentive system for science, where competition for funding and status is fierce.  We should take heart that a majority of authors still answer 'yes'.
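To give a rough sense of the sampling-error question, here is a minimal Python sketch, assuming purely hypothetical sample sizes (the article does not report them), of an approximate 95% confidence interval for the drop in the proportion answering 'yes'.

# Rough sketch: how large might sampling error be for the reported drop
# from ~80% to ~60% willingness?  The sample sizes below are hypothetical;
# the article does not report the survey methodology.
from math import sqrt

def diff_of_proportions_ci(p1, n1, p2, n2, z=1.96):
    """Approximate 95% CI for p1 - p2 using the normal approximation."""
    diff = p1 - p2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff - z * se, diff + z * se

# Hypothetical: 150 respondents in each survey wave.
lo, hi = diff_of_proportions_ci(0.80, 150, 0.60, 150)
print(f"Estimated drop: 0.20, approximate 95% CI: ({lo:.2f}, {hi:.2f})")
# With 150 respondents per wave the interval excludes zero; with much
# smaller samples it might not, which is why the methodology matters.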

The second item Devitt mentions is a study showing a decline over time even in the availability of such information, as data become lost or inaccessible.  This study surveyed authors of papers in Molecular Ecology published from 1991 to 2011.  It illustrates the issue of data stewardship that Nature's new open data journal will hopefully help to address.

More on the 2013 Nobel Prize in Physics

In a New York Times opinion piece, Sean Carroll argues persuasively that in the future, the Nobel committee should consider institutions and collaborations for the award of the Nobel Prize in Physics.  He states that there is a self-imposed tradition of awarding the prize to at most three living individuals per year.  This seems to be consistent across the Nobel science prizes, but not the Peace Prize.  (The literature prize seems to go to a single living author each year.)  As long as the terms of Alfred Nobel's will are adhered to, I find Carroll's argument compelling, and this year's prize to Englert and Higgs, to the exclusion of others, makes the case quite vividly.

There is only the logistical difficulty of how an institution or collaboration would make use of the prize money.  An institution could certainly put the money in its endowment or operational fund, and the Nobel Peace Prizes awarded to institutions provide a precedent.  How about a more loosely defined 'collaboration'?  This is certainly less clear.  Would all its members be considered 'Nobel laureates'?  Some have tried to take that approach with individual IPCC members after the IPCC won half of the Nobel Peace Prize in 2007.  (IPCC stands for the Intergovernmental Panel on Climate Change.)  I felt this was highly unjustified.

In the case of the Higgs theorists (setting aside for now the two experimental groups who made the discovery last year), one of them was dead by the time of the prize, and the others could be divided into at least three or four groups of collaborators; a number of others, such as Anderson, Goldstone, and Nambu, had done precursor work.  Laureates Englert and Higgs represented two of the pivotal collaborations; a third, involving T.W. Kibble, was completely excluded from the Prize.  So no single 'collaboration' could be credited with even the theoretical proposal of the BEH (Brout-Englert-Higgs) mechanism.

Thus, while Carroll's proposal is a sound one, it is not clear to me how it could be applied to the discovery of the Higgs boson, an achievement which nicely reflects the messy nature of scientific discovery.  Could up to three 'collaborations' receive the award?  If so, would all members be considered Nobel Laureates?  Would the qualified individuals need to be identified by name by the Nobel Committee?   Carroll's piece is a good start to the conversation, but the proposal needs to be fleshed out more thoroughly before it could be seriously considered.




More applause to the Nature family of journals

Continuing its journalistic examination of the infrastructure of science publishing, and attempts to improve it, the Nature family of journals has made a number of new steps recently.

On the journalistic side, the current issue has a very thought-provoking report by Eugenie Samuel Reich on the effects of publishing in high-prestige journals.  The psychology here is similar to that of hiring managers who use the schools that candidates graduated from as a filter for 'quality'.  Especially for candidates whose accomplishments are not personally known and understood by the hiring manager, the school filter is a heuristic device for winnowing down the candidate pool.  Similarly, unless you're in the same field as a given scientist, it is difficult to evaluate the quality of that person's work and its impact in his or her community.  Seeing which journals the scientist has published in serves as a heuristic for making such an evaluation.  Since doing so is easy, and alternative approaches are considerably more difficult and time consuming, the journal filter is widely used and has real impact in terms of hiring, funding, and status.  Hence, those who have published in Nature or Science belong to a 'golden club'.  Reich's report makes a notable point in the case of astronomy, where people within the field evaluate papers as they are posted on the arXiv preprint server.  This doesn't help those from outside the field, because a preprint server has essentially no peer review, and only experts can use it well.

Incidentally, the Nobel Prizes serve a similar role for the general public, who are not familiar with the prizes given within a scientific field or sub-field, but everyone knows that the Nobels are the world's highest form of recognition.

On the operational side, the Nature family of journals is apparently the first and only publisher that makes its citation and bibliographic information openly available as "linked open data", according to David Shotton, director of the Open Citations Corpus, in this Nature commentary.  Moreover, Nature is launching an online data journal, according to this announcement.  This step is welcome, as it contributes to both reproducible research and data stewardship, topics discussed on DTLR in recent months.

Finally, back in August Nature Methods, having concluded a series of tutorials on data visualization ('Points of View'), started another tutorial series on statistics, called 'Points of Significance.'  The announcement appeared here and was discussed on their blog by Daniel Evanko here.  It is clear to me that both statistical training and the understanding of statistical thinking vary greatly among scientists, and hopefully the new series will provide good advice.  Of course, how well the articles in the series are written, and what they cover, will be crucial to their success.  (The failure of formal statistics courses to produce well-informed scientists is a major reason such a series is needed in the first place.  It is very difficult to convey the concepts of statistical thinking, so the challenge for the authors is considerable.)  The first two entries are on sampling error and error bars, both good nuts-and-bolts topics, but hopefully they will get to the nuances of study design and the pitfalls of data analysis in good time; a small illustration of the distinction they start with appears below.
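As my own toy illustration of the distinction behind those first two entries (not an example taken from the series), here is a small Python sketch contrasting the standard deviation of the data with the standard error of the mean, a common choice of error bar.

import random
import statistics

random.seed(1)
for n in (10, 100, 1000):
    sample = [random.gauss(mu=50, sigma=10) for _ in range(n)]
    sd = statistics.stdev(sample)      # spread of the data
    sem = sd / (n ** 0.5)              # uncertainty of the estimated mean
    print(f"n={n:5d}  mean={statistics.mean(sample):6.2f}  SD={sd:5.2f}  SEM={sem:5.2f}")
# The SD stays near 10 regardless of n, while the SEM shrinks roughly as
# 1/sqrt(n); error bars built from the two convey very different things.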

As one of the 'prestige' journals, Nature is in a unique position to drive real change in the infrastructure of science publishing.  Unfortunately I do not subscribe to their journals (except that I did receive the inaugural year of Nature Physics, some time ago), but I do receive Science, which has not emerged as a leader in reproducible research.


Sunday, October 13, 2013

Book Review. Probability: A Very Short Introduction, by John Haigh

Probability:  A Very Short Introduction by John Haigh (2012) is a nontechnical and quite brief (main text:  117 pages) overview of probability theory and applications.  Only elementary arithmetic, and occasionally high school algebra, are used, and equations are absent except in a 2-page appendix.  Most of the math is explained verbally.  There are occasional excursions into stochastic processes, decision theory, game theory, information theory, and statistics.  (Statistical physics is notably absent among the applications discussed.)  Readers should note that the book is written from a British perspective, and references to British sports (cricket), television shows, and King Richard III may be lost on US readers.

Chapter 1, "Fundamentals", begins by explaining and comparing three interpretations of the concept of probability:  classical ("objective"), frequentist, and subjective.  The classical approach is based on first principles, such as symmetry.  For instance, one could decree that each face of a (fair) die is equally probable.  The frequentist approach states that probabilities emerge as empirical, long run frequencies of trial replications.  The subjective approach refers to a degree of belief, or a personal opinion.  The mathematics of probability are largely independent of which interpretation is being used.  Chapter 2, "Workings of Probability", introduces a few basic mathematical ideas:  the addition law for disjoint events, the multiplication law (in the context of conditioning), independence, and ways to handle overlapping and multiple events. 

Chapter 3, "Historical sketch", uses a historical account to introduce (in lesser detail) the laws of large numbers, the central limit theorem, Bayes' theorem, probability distributions (including the Gaussian and Poisson), the Markov property, the law of the iterated logarithm, measure theory, and martingales.  I am persuaded by the author's argument (p. 35) that the normal distribution's name is unfortunate, and that "Gaussian" should be preferred.  Chapter 4, "Chance Experiments", introduces additional distributions (uniform, binomial, geometric, and exponential) and mentions several extreme value distributions.  The concepts of mean and variance are also introduced.

Chapter 5, "Making Sense of Probabilities", is perhaps the most interesting one in the book.  This chapter and those that follow describe a series of case studies where probabilistic thinking can be applied.  This chapter starts with explaining odds in gambling.  Next, biostatistics appears with a comparison of the risk difference with the risk ratio, a topic discussed at greater length by Gigerenzer (2002), a book with which I will make several comparisons.  After a short discussion on "combining tiny probabilities" (referring to the Borel-Cantelli lemmas), there is a very good 3-page section titled "Some Misunderstandings".  Several of the misunderstandings are connected to quoting probabilities without specifying the reference class, an issue emphasized in Gigerenzer (2002).  The chapter ends with sections on "Describing Ignorance" (which introduces the Beta family of distributions) and "Utility".

Chapter 6, "Games People Play", discusses various applications in gambling as well as TV game shows.  Proficient card counters are banned by casinos; the author finds that "No better tribute to the power of understanding probability has ever been paid" (p. 77). Chapter 7, "Applications in Science, Medicine, and Operations Research" tackles Brownian motion, pseudo-random number generators, Monte Carlo simulations, error correcting codes, and several case studies in health:  amniocentesis decision making, estimating probabilies of carrying a gene based on knowledge of phenotypes in one's family tree, modeling epidemics, and improving the efficiency of mass blood tests by pooling samples.  The chapter concludes with sections on airline overbooking and queuing theory.

Chapter 8, "Other Applications", discusses law (particularly the Prosecutor's Fallacy), randomized response in surveys, and diagnostic tests for the use of performance enhancing drugs in sports.  In this latter section, the subtleties of a diagnostic test's sensitivity and specificity are finally discussed (pp. 100-102).  (As most readers will eventually have personal experience with medical diagnostics, I much prefer the more in-depth treatment of this topic found in Gigerenzer, 2002.)  The chapter concludes with sections on applications in sports and finance, including the Black-Scholes equation. 

The final chapter, "Curiosities and Dilemmas", returns to games, including a discussion of Parrondo's Paradox.  It concludes with some dilemmas surrounding genetic testing.  The book ends with the aforementioned appendix as well as a brief list of References and Further Reading.

The first sentence in the book is "Probability is the formalization of the study of the notion of uncertainty".  I take issue with this claim.  At the very least, the phrase "a formalization" should replace "the formalization".  There are other kinds of uncertainty, such as those found in approximation theory (where the error bounds on a given approximation can be thought of as a kind of deterministic uncertainty) or fuzzy logic.  Moreover, Heisenberg's uncertainty principle is nowhere to be found in the book; it can be thought of as a consequence of wave-particle duality in quantum physics.

There are many nontechnical books about probability theory and applications; many are reviewed by the eminent probabilist David Aldous, including an earlier and longer work by John Haigh.  I've read none of the books he has reviewed except for Taleb (2005), which covers some of the psychological aspects of applying probability in real life.  However, I have read Gigerenzer (2002), which is not mentioned by Aldous but remains my own favorite work on probabilistic thinking; this is why I make several comparisons between it and Haigh (2012) above.  To finish the comparison, Gigerenzer (2002) is very much focused on risk and its psychology, whereas Haigh (2012) is concerned with probability at large and gives a more systematic and wider overview of its theory and applications.  That said, I do think Gigerenzer is stronger when it comes to explaining the pitfalls of probabilistic thinking.  One strength of Haigh (2012), compared to the works reviewed by Aldous, is that it is a very quick read, appropriate for the VSI (Very Short Introductions) series.  From that perspective alone, it may have a niche audience.

References

 

Gerd Gigerenzer, 2002:  Calculated Risks:  How to Know When the Numbers Deceive You.  Simon & Schuster.

John Haigh, 2012:  Probability:  A Very Short Introduction.  Oxford University Press.  (Very Short Introductions, Vol. 310.)

Nassim Nicholas Taleb, 2005: Fooled by Randomness: The Hidden Role of Chance in Life and in the Markets. Second edition. Random House.