This month's APS News has a front-page
report by Michael Lucibella, “Open Access Mandate will Include Raw
Data.” The story focuses on the forthcoming mandate from the
U.S. Office of Science and Technology Policy (OSTP) requiring open access to
journal papers derived from federally funded research one year after
publication. Lucibella says that although no official statement has
been made, it is expected that the mandate would include a data
management plan, to make data sets generated by public funds
available to the public as well. The story quotes the OSTP memo
stating that “scientific data resulting from unclassified research
supported wholly or in part by Federal funding should be stored and
publicly accessible to search, retrieve, and analyze.” Specifics
are “just starting to take shape.” The story goes on to outline
various challenges to such a mandate.
One valuable feature of the mandate is
that “Data points that have been expunged from the final analysis
will likely have to be included, the idea being that scientists can
evaluate why those points were eliminated.” In principle this is a
good thing, but it will be nearly impossible to enforce. There
may also be some subjectivity involved: data that clearly result from
documented technical errors should probably not be included (in my
view), and transcription errors should be corrected before posting.
I would also like metadata to be included along with the raw data
files.
The story states that computer codes
would not be included in the mandate, “though talks are continuing
over this point.” The story quotes statistician Victoria
Stodden, who expresses concern about the omission of computer codes,
which will obstruct reproducible research. I share Stodden's concern
and I hope the mandate will include computer codes.
Modulo the concern about computer
programs, DTLR endorses both the mandate to make journal articles
public after one year and the mandate to make the data
publicly available. I was a co-author on two publications for which we provided supplemental information that included data sets and computer scripts. However, I've co-authored nearly 20 refereed papers in total, and most of them did not include such supplemental information. As a result, all these years later, it is impossible for me to reproduce any of that work. (Caveat: some of this research was not financed by public funds; nonetheless, I believe the principle should apply to all published research.) I wish such a mandate had been in place at the beginning of my career, so that all of my published work could be reproducible. With job changes and so on, I have long since lost track of the data sets and computer codes used in the work reported in those papers.
Reference
Here is the link to the full issue (PDF): http://www.aps.org/publications/apsnews/201311/upload/November-2013.pdf