Last month in Nature, there was a brief article by Erika Check Hayden about an experiment in peer
review of scientific software being carried out by the new Mozilla Science Lab. Nine papers published in PLoS Computational Biology,
selected by its editors, would have their code subjected to peer
review by software engineers. The experiment and its motivation are
described in the article; I also recommend reading the user comments
posted at the end. (See also the earlier pieces by Zeeya Merali and Nick Barnes, published together in Nature in 2010.) Apparently there has
been some controversy, as scientists are understandably nervous about
having their work subjected to a new form of review. However,
scientists are not well trained in software development concepts such
as version control, validation, and verification, and the code they
write may become difficult to maintain or, even worse, produce
undetected errors that work their way into published research.
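To make the verification point concrete, here is a minimal sketch, in Python, of the kind of unit test a software reviewer would expect to find alongside scientific code. The function and its tests are my own illustration, not drawn from any of the papers under review; the second test guards against exactly the kind of silent numerical error, a variance formula that degrades when the data carry a large offset, that can otherwise slip into published results.

    # A minimal sketch of unit-testing scientific code; the names here
    # are illustrative, not taken from any real project.
    import math
    import unittest

    def sample_variance(xs):
        """Unbiased sample variance via the two-pass algorithm
        (n - 1 in the denominator), chosen for numerical stability."""
        n = len(xs)
        if n < 2:
            raise ValueError("need at least two observations")
        mean = sum(xs) / n
        return sum((x - mean) ** 2 for x in xs) / (n - 1)

    class TestSampleVariance(unittest.TestCase):
        def test_known_value(self):
            # The variance of 1..5 is exactly 2.5.
            self.assertAlmostEqual(sample_variance([1, 2, 3, 4, 5]), 2.5)

        def test_shift_invariance(self):
            # Adding a large constant must not change the variance;
            # the naive one-pass formula E[x^2] - E[x]^2 fails this
            # check through catastrophic cancellation.
            xs = [1.0, 2.0, 3.0]
            shifted = [x + 1e9 for x in xs]
            self.assertTrue(math.isclose(sample_variance(xs),
                                         sample_variance(shifted),
                                         rel_tol=1e-6))

    if __name__ == "__main__":
        unittest.main()

A few tests like these, kept under version control and run before every change, are cheap insurance; I would guess they are precisely the sort of discipline the Mozilla reviewers will be probing for.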
The Mozilla Science Lab was introduced
this past summer, and is led by Kaitlin Thaney. It sponsors Greg
Wilson's Software Carpentry; I strongly recommend having a look at
its website. I've read Wilson's essays in Computing in
Science and Engineering and other publications over the years, and
have been sympathetic to his views. I've heard rumors that some
of the code at Fermilab is spaghetti code, with bits and pieces of it
written by many hands over many decades. Such an unwieldy mass of
legacy code is almost impossible to maintain. I was told about one
bug whose fix generated
another, more serious bug that was impossible to debug. It was decided to restore
the original bug and leave it in the code!
I am fortunate in that one of my
formative experiences was an internship with a small company that, as
a matter of survival, implemented a fairly disciplined software
construction methodology, based in part on Steve McConnell's Code Complete. Because the company was small and saw regular staff
turnover, all of its software had to be highly maintainable even
after the original coder had left the firm. It was a point of pride
there that, without looking at the header (which carried the version
control data), you couldn't tell who had written a given piece of
code, because we all conformed to the same coding style.
DTLR endorses the Mozilla experiment in
peer review of software. I hope we learn a lot from their
experiment, even if it is deemed to be a failure in the end. In a
letter to the editor, Alden and Read (2013) state that software
quality should be built in from the beginning, before any data are
taken, and not “inspected in” at the peer review stage. They are
of course right, but to protect the rest of the community I do think
software peer review is a concept that should at least be explored.
References
Nick Barnes, 2010: Publish your computer code: it is good enough. Nature, 467: 753.
Zeeya Merali, 2010: Computational science: ...Error. Nature, 467: 775-777.
Erika Check Hayden, 2013: Mozilla plan seeks to debug scientific code. Nature, 501: 472.
Kieran Alden and Mark Read, 2013: Scientific software needs quality control. Nature, 502: 448.