Tuesday, November 19, 2013

Follow up on John Bohannon's 'Open access sting'

Last month, I made note of the 'open access sting' carried out by John Bohannon and published in Science.  Last week, this post at the Scholarly Kitchen featured an interview with Bohannon in which he was given an opportunity to answer his critics.  I find all of his answers persuasive; Bohannon has, in my view, adequately defended his work.

However, I still sympathize with the critics who would have liked a control group of non-open-access publishers as well as those who believe the peer review system, writ large, is broken.  Bohannon makes clear that addressing both of these issues was out of scope for his project.  That's fair.  Many of us, however, are indeed concerned with these broader matters.  I don't think it would be necessary to carry out a 'sting' to demonstrate that non-open-access publishers, including top-notch ones like Science, have a track record of publishing substandard or even deeply flawed research.  I wrote at length about this earlier.

H/T:  In the Pipeline

Friday, November 15, 2013

bioRxiv goes live

In an earlier post I mentioned the forthcoming preprint server for the life sciences, bioRxiv.  According to Nature, the site has now launched.  See the write-up by Ewen Callaway here.

The STEM-Crisis Myth

The Chronicle of Higher Education this week has an article by Michael Anft examining the "STEM-Crisis Myth", or "The much-hyped shortage of science and tech graduates."  Politicians, business and academic leaders, and professional science societies are constantly warning of a shortage of American college students who want to study science and engineering.  The actual data backing up such claims, the Chronicle shows, are at best contestable.  The article tries to air both sides of the debate.  The president of Arizona State University is quoted as stating that "there's too much reliance on anecdotes" about the alleged poor job market for science graduates.

The Chronicle cites data alleging that about half of STEM majors leave their field within 10 years, and that 1 in 5 American scientists contemplate leaving the country.  There are dueling studies, however.  My personal experience is more consistent with the claim that the STEM-jobs crisis is a myth.  The industry in which I once worked has undergone a massive downsizing of its workforce in the last decade or so, including in my field.  More senior scientists have had trouble finding new positions that make full use of their talents.  Meanwhile, young graduates have trouble finding jobs, and many post-docs have been trapped in academic limbo, with too many chasing too few faculty and industry positions.  Sequestration and the instability of the federal budget threaten federal funding for science across academia as well as the national labs.  Put simply, there isn't much money available for basic and applied research in the public and private sectors these days.

The 'myth' has been discussed in other venues long before this, and I am glad that the Chronicle has decided to cover it.  DTLR encourages skepticism about the alleged shortage of scientists in the labor market.  Those who have been talking up the shortage have a vested interest in increasing university enrollments, growing the membership of scientific societies, and expanding the labor pool of science graduates in order to push down labor costs.  This includes academic, business, and professional-society leaders across the sciences, as well as politicians.  The rhetoric is highly self-serving: it promotes their own vested interests at the expense of the young people to whom they are serving up deceptive statements.  DTLR believes that any professional society, university, or business leader commits fraud when recruiting youngsters into science and engineering with the promise of a bountiful job market upon graduation.  Many of them seem to realize this, because they take an alternative tack by arguing that STEM training is a good foundation for any career, as the Chronicle notes, and the article ends by promoting the liberal-arts idea of a "broader education" spanning the sciences and humanities.  I'm not sure there is much data to support these views.

Read the article and decide for yourself.

Reference


Anft, M. (2013):  The STEM-Crisis Myth.  Chronicle of Higher Education, LX(11):  A30-A33 (Nov. 15, 2013).

Sunday, November 3, 2013

Non-reproducible research in the news

An epidemic of non-reproducible research in the life and behavioral sciences has come to light in recent years. Much of the spadework has been done by John Ioannidis and collaborators, as discussed earlier on DTLR. Well-known biopharmaceutical industry reports from Bayer (Prinz et al., 2011) and Amgen (Begley & Ellis, 2012) provide further confirmation.

Glenn Begley, one of the co-authors of these papers, was interviewed for a story by Jennifer Couzin-Frankel in the recent Science special issue on Communication in Science, discussed on DTLR last month. Couzin-Frankel (2013) recounts Begley's failed attempts to reproduce the results published by a prominent oncologist in Cancer Cell. At a 2011 conference, Begley invited the author to breakfast and asked why Begley's own team had been unable to reproduce the paper's results. According to Begley, the oncologist replied, “We did this experiment a dozen times, got this answer once, and that's the one we decided to publish.” Begley couldn't believe what he'd heard.

Indeed, I am at once shocked and unsurprised. Shocked, because the reply displays an utter lack of critical thinking on the oncologist's part. Unsurprised, because in my experience critical thinking is rarely formally taught to scientific researchers, and the incentive system for scientists rewards such lax behavior. The oncologist may have forgotten why he got into science and medicine to begin with. The pressures of a career in academic medicine may have corrupted his integrity, but the work of Ioannidis and others alluded to above shows that this phenomenon is quite common.

The rest of Couzin-Frankel's article discusses how clinical studies often get published even when the primary objective of the study has failed. Usually (but not always) the authors are up front about the failure, but they try to spin the results positively in various ways. For instance, if they make enough unplanned post hoc statistical comparisons, they will inevitably find one that achieves (nominal) statistical significance, and they'll use that to justify the publication. Evidently journals allow this to occur, resulting in tremendous bias in what gets published. These are examples of selective reporting (cherry-picking) and exaggeration that produce misleading interpretations. This is not how science ought to be done.
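
To make the multiplicity problem concrete, here is a minimal simulation of my own (an illustration, not anything from Couzin-Frankel's article): when a study with no real effect is sliced into many unplanned comparisons, each tested at the nominal 5% level, the chance that at least one comes up "significant" grows rapidly.

```python
import numpy as np

rng = np.random.default_rng(0)

def prob_any_significant(n_comparisons, n_trials=100_000, alpha=0.05):
    """Estimate the chance that at least one of n_comparisons
    null comparisons reaches nominal significance at level alpha."""
    # Under the null hypothesis, each p-value is uniform on [0, 1].
    p_values = rng.uniform(size=(n_trials, n_comparisons))
    return (p_values.min(axis=1) < alpha).mean()

for k in (1, 5, 10, 20):
    print(f"{k:2d} comparisons -> P(at least one 'significant') "
          f"~ {prob_any_significant(k):.2f}")
```

With 20 such looks at pure noise, roughly two out of three studies will have something nominally significant to report, which is exactly why unplanned post hoc comparisons are so treacherous.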

Couzin-Frankel's article ends with a discussion of journals dedicated to publishing negative results, as well as recent efforts by mainstream medical journals to publish negative studies.

Non-reproducible research has also gotten the attention of The Economist, which ran a cover story and editorial on it a few weeks ago. As additional evidence they cite the following statistic: “In 2000-2010 roughly 80,000 patients took part in clinical trials based on research that was later retracted because of mistakes or improprieties.” Thus there are real consequences. Patients are needlessly exposed to clinical trials that may have negligible scientific value; their altruism is being abused. This should be a worldwide scandal, and I congratulate The Economist for shining a harsh light on the problem.

The Economist points out that much of this research is publicly funded, and hence a scientific scandal becomes a political and financial one as well. “When an official at America's National Institutes of Health (NIH) reckons, despairingly, that researchers would find it hard to reproduce at least three-quarters of all published biomedical findings, the public part of the process seems to have failed.” They then discuss the journal PLoS One, which publishes papers without regard to novelty or significance, judging them only on methodological soundness. “Remarkably, almost half the submissions to PLoS One are rejected for failing to clear that seemingly low bar.” Among the statistical issues the article discusses are multiplicity, blinding, and overfitting.
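
It is worth spelling out the arithmetic that produces such bleak numbers. The figures below are illustrative assumptions of my own (not taken from The Economist's article), but they show how a low base rate of true hypotheses, combined with conventional significance and power levels, makes a large share of nominally positive findings false.

```python
# Illustrative false-discovery arithmetic. All numbers here are
# assumptions chosen for the example, not figures from The Economist.
n_hypotheses = 1000    # hypotheses tested
frac_true    = 0.10    # suppose only 10% describe real effects
alpha        = 0.05    # conventional significance level
power        = 0.80    # conventional statistical power

n_true = n_hypotheses * frac_true      # 100 real effects
n_null = n_hypotheses - n_true         # 900 true nulls

true_positives  = power * n_true       # 80 correctly detected
false_positives = alpha * n_null       # 45 spurious "discoveries"

share_false = false_positives / (true_positives + false_positives)
print(f"{true_positives:.0f} true positives, "
      f"{false_positives:.0f} false positives")
print(f"Share of positive results that are wrong: {share_false:.0%}")  # ~36%
```

Under these assumptions, more than a third of the "discoveries" are false before any questionable research practices even enter the picture.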

The Economist discusses the main reasons for these problems: scarcity of funding for science, which leads to hyper-competition; an incentive system that rewards non-reproducible research and punishes those interested in reproducibility; incompetent peer review; and statistical malpractice. They suggest a number of solutions: raising publication standards, particularly on statistical matters; making study protocols publicly available before a trial is run; making trial data publicly available; and making funding available for attempts to reproduce work, not just to publish new work.

The Economist's article has generated a certain amount of controversy, but I think it gets it mostly right.  I would have formulated the statistical discussion differently, and I think the article misses the chance to point out more fundamental statistical problems.  I also don't give much weight to the comments by Harry Collins about "tacit knowledge".  A truly robust scientific result should be reproducible under slightly varying conditions.

References


Begley, C.G., and Ellis, L.M. (2012): Drug development: raise standards for preclinical cancer research. Nature, 483: 531-533.

Couzin-Frankel, J. (2013): The power of negative thinking. Science, 342: 68-69.

Prinz, F., Schlange, T., and Asadullah, K. (2011): Believe it or not: how much can we rely on published data on potential drug targets? Nature Reviews Drug Discovery, 10: 712.


Data management plan to be included in open access mandate

This month's APS News has a front-page report by Michael Lucibella, “Open Access Mandate will Include Raw Data.” The story focuses on the forthcoming mandate from the U.S. Office of Science and Technology Policy (OSTP), regarding open access to journal papers derived from federally funded research, one year after publication. Lucibella says that although no official statement has been made, it is expected that the mandate would include a data management plan, to make data sets generated by public funds available to the public as well. The story quotes the OSTP memo stating that “scientific data resulting from unclassified research supported wholly or in part by Federal funding should be stored and publicly accessible to search, retrieve, and analyze.” Specifics are “just starting to take shape.” The story goes on to outline various challenges to such a mandate.

One valuable feature of the mandate is that “Data points that have been expunged from the final analysis will likely have to be included, the idea being that scientists can evaluate why those points were eliminated.” In principle this is a good thing, but it will be nearly impossible to enforce. There may also be some subjectivity involved: data that clearly stem from documented technical errors should probably not be included (in my view), and transcription errors should be corrected before posting. I would also like metadata to be included along with the raw data files.
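
As a sketch of what compliance might look like in practice (the file layout and field names below are my own invention, not anything the mandate specifies), one could post the raw measurements with an explicit exclusion flag and documented reason for each expunged point, together with a small metadata sidecar:

```python
import csv
import json

# Hypothetical example: every collected data point is retained, and
# exclusions are flagged with a documented reason, not silently dropped.
rows = [
    # (sample_id, measurement, excluded, exclusion_reason)
    ("S001", 12.7, False, ""),
    ("S002", 13.1, False, ""),
    ("S003", -999.0, True, "instrument fault logged 2013-06-02"),
    ("S004", 12.9, False, ""),
]

with open("measurements.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["sample_id", "measurement", "excluded", "exclusion_reason"])
    writer.writerows(rows)

# Metadata sidecar recording units, provenance, and instrumentation,
# so the raw file can still be interpreted years later.
metadata = {
    "study": "example assay",          # hypothetical identifiers
    "units": {"measurement": "mg/L"},
    "instrument": "spectrometer (model and calibration to be recorded)",
    "collected": "2013-06",
    "notes": "excluded rows retained per data management plan",
}
with open("measurements.meta.json", "w") as f:
    json.dump(metadata, f, indent=2)
```

A reviewer or re-analyst can then filter on the excluded flag and judge for themselves whether each exclusion was warranted.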

The story states that computer codes would not be included in the mandate, “though talks are continuing over this point.” The story quotes statistician Victoria Stodden, who expresses concern that omitting computer code will obstruct reproducible research. I share Stodden's concern, and I hope the mandate will come to include computer codes.

Modulo the concern about computer programs, DTLR endorses both the mandate to make journal articles public after one year and the mandate to make the data publicly available.  I was a co-author on two publications where we provided supplemental information that included data sets and computer scripts.  However, I've co-authored nearly 20 refereed papers in total, and most of them did not include such supplemental information.  As a result, all these years later, it is impossible for me to reproduce most of that work.  (Caveat:  some of this research was not financed by public funds; nonetheless I believe the principle should apply to all published research.)  I wish such a mandate had been in place at the beginning of my career, so that all of my published work could be reproducible.  With job changes and so on, I've long since lost track of the data sets and computer codes that were employed in doing the work reported in those papers.

Reference


Lucibella, M. (2013):  Open Access Mandate will Include Raw Data.  APS News, November 2013.

Online comments for scientific papers

This week's Nature has an editorial (“Time to Talk”) discussing online commenting on scientific papers. The value of such a feature is to make a more permanent record of the sort of critical give-and-take over research that might be witnessed when work is presented at a conference or department seminar, or behind the scenes during peer review. Presently, “lively debates on blogs and social media” can be difficult to find and preserve, as they are too diffuse (scattered across individual blogs like DTLR, for instance). The editorial points out that some fields, such as evolutionary biology, already have an established online hub. The alternative model is for each journal to host its own commenting feature; the editorial states that neither Nature's nor the Public Library of Science's has gained much traction.

Thus, Nature is excited that PubMed is stepping into the breach by adding a commenting feature that might allow it to serve as a highly accessible and visible hub for comments. Most of the rest of the editorial focuses on the potential for incivility in online commenting. For example, they cite the journal ACS Nano expressing concern that scientists are trigger-happy with charges of misconduct such as plagiarism or scientific fraud.

The comments on the Nature editorial itself are valuable (there are three as I write this). The first points out that one aspect not discussed is anonymity, which has both pros and cons. I would favor allowing anonymous comments, but giving signed comments greater visibility. For instance, signed comments could be visible by default, whereas one might have to click to see the anonymous ones. Editors should moderate comment sections and remove irresponsible content.

Thus far I have not offered comments on any scientific paper myself. I am more likely to write about research here, on DTLR, unless I feel I have an extremely important point to make and am extremely confident about making it. Frankly, however, I don't spend a lot of time reading original research these days, so I don't expect I'll be making much of a contribution.

A related development mentioned in the editorial is the imminent release by the Cold Spring Harbor Laboratory of bioRxiv, a preprint server for biologists, in the same mold as arXiv for physics. It will offer user comments.