Wednesday, October 30, 2013

Communication in Science: Pressures and Predators

The October 4, 2013, issue of Science features a special section, “Communication in Science: Pressures and Predators.” The item receiving the most attention is a sting operation by John Bohannon, who submitted versions of a fake, deliberately flawed article to 304 open access journals. Roughly half of the journals accepted the paper for publication. It is difficult to separate those that did so because of a slipshod editorial process (a failing shared by many traditional journal publishers) from those that are deliberately predatory, the topic of Jeffrey Beall's blog.

Bohannon's experiment was criticized by Michael Eisen for not including a control group, that is, a sample of traditional, subscription-based journals. Philip Moriarty at Physics Focus echoes Eisen's criticisms. Both are fairly hostile to traditional journals for similar reasons. The epidemic of non-reproducible research, discussed in earlier posts on DTLR, illustrates the brokenness of the peer review system they speak of. (Ironically, Eisen himself is interviewed in the piece that follows Bohannon's in the Science special feature, about “heretical” publisher Vitek Tracz.)  Eisen and Moriarty are pretty angry at Science for its hypocrisy, but they should have read the rest of the special feature. Jennifer Couzin-Frankel's piece, “The Power of Negative Thinking”, which advocates publication of negative results, is quite blunt about the epidemic of non-reproducible research. For me, Couzin-Frankel's piece is the most important article in the special feature, and I will dedicate a separate post to it shortly.  (Also of great interest is the Policy Forum piece by Diane Harley; I may write further about that one too.)

In the meantime, the points made by Eisen and Moriarty are well taken. Nonetheless, as Bohannon, Beall, and others have shown, the open access journal movement has opened the floodgates for hundreds of predatory online journals that maintain no standards whatsoever. This surely deserves the widespread coverage that Bohannon's piece has garnered. I only wish that Couzin-Frankel's article had attracted equal attention, along with the recent moves by Nature, discussed earlier on DTLR, to raise its level of play on these matters.

It should be disclosed that I once served as a referee for an open access journal from a publisher on Beall's list. Thus I can testify that the publisher (OMICS Group) did, at least once, send out a paper from one of its journals for review. (Fortunately my review was positive and the paper was published; I do not know what would have happened had I submitted a negative review. I also strongly suspect that I was chosen to referee the paper because the submitting author was asked to provide a list of potential reviewers. Many traditional journals do the same.) Some of the other publishers on Beall's list do not bother with even the appearance of a legitimate review process. I have also had a paper of my own rejected by an open access journal, one not included on Beall's list. The publisher of that journal, Hindawi, also passed Bohannon's test and rejected his phony paper.

Recommended reading


The special section in Science contains the following items (as well as a number of sidebar pieces by Jon Cohen and the other authors). Readers are also directed to a number of related Perspectives and other items appearing in the same issue.

Richard Stone and Barbara Jasny, 2013: Scientific discourse: buckling at the seams. Science, 342: 57.

A cartoon by Randall Munroe (XKCD).

John Bohannon, 2013: Who's afraid of peer review? Science, 342: 60-65. (“A spoof paper concocted by Science reveals little or no scrutiny at many open-access journals”)

Tania Rabesandratana, 2013: The seer of science publishing. Science, 342: 66-67. (“Vitek Tracz was ahead of the pack on open access. Now he wants to rewrite the rules of peer review”)

Jennifer Couzin-Frankel, 2013: The power of negative thinking. Science, 342: 68-69. (“Gaining ground in the ongoing struggle to coax researchers to share negative results”)

David Malakoff, 2013: Hey, you've got to hide your work away. Science, 342: 70-71. (“Debate is simmering over how and when to publish sensitive data”)

Eliot Marshall, 2013: Lock up the genome, lock down research? Science, 342: 72-73. (“Researchers say that gene patents impede data sharing and innovation; patent lawyers say there's no evidence for this”)

Jeffrey Mervis, 2013: The annual meeting: improving what isn't broken. Science, 342: 74-79. (“Annual meetings are moneymakers for most scientific societies, and scientists continue to flock to them. But as the world changes, how long can the status quo hold?”)

Diane Harley, 2013: Scholarly communication: cultural contexts, evolving models. Science, 342: 80-82.

Tuesday, October 29, 2013

Should there be an alternative to the Nobel Prize?

At Physics Focus, Tara Shears has a post making a suggestion similar to one that appeared in DTLR earlier this month: that a prize should be awarded for an accomplishment rather than to a set of individuals.  Naturally my thinking on the issue aligns well with hers.  One of the commenters on her post, John Duffield, pointed out that the Nobel committee is bound to obey the terms of Alfred Nobel's will, and that the suggestions of Shears and Sean Carroll (New York Times) should be applied to a new prize, not the Nobels.  I surely agree here too.  Thus, we'll always have the Nobels, warts and all, but their prestige will need to be rivaled by new prizes that better reflect the scientific process.  Will a wealthy benefactor interested in reforming the reward system in science step forward? Alternatively, Science and other magazines usually publish a list of the top 10 discoveries of the year or some such, and these put the focus on the achievements rather than the individuals.  Perhaps such lists, if suitably hyped up, could achieve what Carroll, Shears, and I are aiming for?

Sabine Hossenfelder has an opposing view on the Nobel Prizes on her blog.  It consists of two lines of reasoning.  The first boils down to the following.

Giving such an honor to institutions is akin to doing away with private property in communism and believing that everybody cares for the well-being of the group as they do for their own. It doesn’t work because most people want to be recognized as individuals, not as members of collectives. That’s true also for scientists.

I found this an unexpected but quite thoughtful contribution to the conversation.  It is true that a prize won by a 'collaboration' is not exactly something an individual member can place on a CV.  Indeed, I found it inappropriate when individual members of the IPCC claimed to be "Nobel Laureates".  In any case, the Nobel Prizes will continue to be awarded according to Alfred Nobel's will.  Perhaps having them co-exist with the Science list of top discoveries is both realistic and desirable.  The prestige and/or visibility of the Science list (or equivalent) just needs to be elevated to close to the level of the Nobel Prizes.

I do think that this year's prize, which gave short shrift to Kibble and robbed Brout because he had the ill fortune to die too young, still illustrates unresolved issues with the Nobel Prizes.  The Nobel committee would probably have lost credibility had it not recognized the discovery of the Higgs boson in some way, but there are few good ways to do so within the constraints of Nobel's will.  Still, perhaps Kibble should have been given the third slot.  A precedent is the 2001 Nobel in Physics for Bose-Einstein condensation, which went not just to Wieman and Cornell, who were the "first", but also to Ketterle, whose early work arguably went further than Wieman and Cornell's but was published four months later.

Hossenfelder's second line of reasoning is that Nobel Laureates can be powerful spokespeople for their fields.  I don't find this one compelling.  Some of the leading spokespeople for physics, such as Stephen Hawking and Neil deGrasse Tyson, are not Nobel Laureates.  These figures are much better known to the public than most living Nobel Laureates in physics.

Monday, October 28, 2013

Upcoming on DTLR

This month has by far been the busiest since DTLR began in July of this year.  And yet my posts on the three biggest fish are still in the planning stages.  So, here I just want to alert readers to the background material on two of these.

First, there is the special section from Science on "Communication in Science: Pressures and Predators," published earlier this month.  Second, there is the cover story in The Economist from last week, "How Science Goes Wrong."  (See also the accompanying Leader.)  Both tackle central topics of this blog, and I am grateful for the high profile these issues have taken.

The third 'big fish' of the month that I plan to comment on is a reaction to the federal government shutdown at the beginning of the month.  (Full disclosure:  I was furloughed as a result of this episode.)  I plan to muse on the role of private funding for science.

All three topics are very timely and I hope to have my comments available soon.  In the meantime, on the technology side, see this interesting post by David Auerbach at Slate on the travails of the Affordable Care Act's website rollout this month.

Sunday, October 27, 2013

Software validation in computational biology

Last month in Nature, there was a brief article by Erika Check Hayden about an experiment in peer review of scientific software being carried out by the new Mozilla Science Lab. Nine papers published in PLoS Computational Biology, selected by its editors, would have their code subjected to peer review by software engineers. The experiment and its motivation are described in the article; I also recommend reading the user comments posted at the end. (See also the earlier pieces by Zeeya Merali and Nick Barnes, published together in Nature in 2010.)  Apparently there has been some controversy, as scientists are understandably nervous about having their work subjected to a new form of review. However, scientists are not well trained in software development concepts such as version control, validation, and verification, and the code they write may become difficult to maintain or, even worse, produce undetected errors that work their way into published research.

The Mozilla Science Lab was introduced this past summer, and is led by Kaitlin Thaney. It sponsors Greg Wilson's Software Carpentry; I strongly recommend having a look at the latter's website. I've read Wilson's essays in Computing in Science and Engineering and other publications over the years, and have been sympathetic to his views. I've heard rumors that some of the code at Fermilab is spaghetti code, with bits and pieces written by many hands over many decades. Such an unwieldy mass of legacy code is almost impossible to maintain. I was told about one bug whose fix generated another, more serious bug that was impossible to debug. It was decided to restore the original bug and leave it in the code!

I am fortunate in that one of my formative experiences was an internship with a small company that, as a matter of survival, implemented a fairly disciplined software construction methodology, based in part on Steve McConnell's Code Complete. Because the company was small and had a certain rate of turnover, all of their software had to be highly maintainable, assuming the original coder was no longer employed at the firm. It was a point of pride there that you wouldn't be able to tell who wrote a piece of code found in the software they developed, without looking at the header (which had version control data), for we all conformed to the same software style.

DTLR endorses the Mozilla experiment in peer review of software. I hope we learn a lot from their experiment, even if it is deemed to be a failure in the end. In a letter to the editor, Alden and Read (2013) state that software quality should be built in from the beginning, before any data are taken, and not “inspected in” at the peer review stage. They are of course right, but to protect the rest of the community I do think software peer review is a concept that should at least be explored.
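To make the "built in from the beginning" point concrete, here is a minimal sketch, in Python, of the kind of automated test that can accompany a scientific routine from day one rather than being "inspected in" later. The function, its name, and the tolerance are hypothetical, invented purely for illustration; the point is only that the expected answer lives in a test that runs with every change to the code.

```python
# A minimal sketch of "building quality in": pair each scientific routine
# with an automated test against a known answer. The function and the
# tolerance below are hypothetical, for illustration only.
import math
import unittest


def decay_fraction(rate, time):
    """Fraction of a sample remaining after exponential decay."""
    if rate < 0 or time < 0:
        raise ValueError("rate and time must be non-negative")
    return math.exp(-rate * time)


class TestDecayFraction(unittest.TestCase):
    def test_half_life(self):
        # After one half-life (rate = ln 2 per unit time), half the sample remains.
        rate = math.log(2.0)
        self.assertAlmostEqual(decay_fraction(rate, 1.0), 0.5, places=12)

    def test_rejects_negative_inputs(self):
        with self.assertRaises(ValueError):
            decay_fraction(-1.0, 1.0)


if __name__ == "__main__":
    unittest.main()
```

Run automatically, for example from a version control hook or a continuous integration service, such tests catch regressions long before any reviewer, human or otherwise, ever sees the code.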

References


Nick Barnes, 2010: Publish your computer code: it is good enough. Nature, 467: 753.

Zeeya Merali, 2010: Computational science: ...error. Nature, 467: 775-777.

Erika Check Hayden, 2013: Mozilla plan seeks to debug scientific code. Nature, 501: 472.

Kieran Alden and Mark Read, 2013: Scientific software needs quality control. Nature, 502: 448.


Biology's dry future

A few weeks ago, Science magazine featured a very interesting story by Robert F. Service titled “Biology's Dry Future.” The subtitle tells us, “The explosion of publicly available databases housing sequences, structures, and images allows life scientists to make fundamental discoveries without ever getting their hands 'wet' at the lab bench.” The story highlights two quotes from interviews. The first is by Atul Butte of Stanford University School of Medicine: “I'm like a kid in a candy store. There is so much we can do.” The second is by David Heckerman of Microsoft Research: “You basically don't need a wet lab to explore biology.”

The title of the story is not quite accurate. There will always need to be wet lab biologists to do experiments and generate data. What is novel here is the new breed of biologist who works on data generated by other labs but need not have a lab of his or her own. This may be new to biology, but physicists have long had a split between experimentalists and theorists, recently joined by computationalists.

Of particular interest to DTLR are the three “growing pains” mentioned in the article: data access, data standardization, and genetic privacy. I will focus on the first two here. Regarding data access:

In many cases, researchers who have spent their careers generating powerful data sets are reluctant to share. They may be hoping to mine it themselves before others make discoveries based on their work. Or the data may be raw and in need of further analyses or annotation. “These are really hard problems,” Butte says. “We need better systems to reward people that share their data.”

DTLR endorses that last sentence. First of all, anyone who makes the effort to generate a good data set should make the effort to document and annotate it for use. Even if the data are never shared, pretending that they might be shared one day instills the necessary discipline for documentation and annotation. Moreover, if the work is publicly funded, then in my view the social contract requires that the data be made available to the broader scientific community at some point, perhaps after an appropriate period of exclusive use, say, no more than two years. (This is about the time needed for a grad student or post-doc to squeeze at least one paper out of the results.) The new Nature online journal for data sets would provide an excellent venue for a peer reviewed publication credited to the data set alone, rather than to discoveries that can be made with it. Bear in mind that in physics, Nobel prizes are awarded to both theorists and experimentalists. Biology as a discipline should adopt a similar cultural mindset to reward both wet bench and dry bench biologists.

Regarding standardization:

Not only do research groups file their data using different software tools and file formats, but also in many cases the design of the experiments—and therefore precisely what is being measured—can differ. Butte and others argue that dealing with multiple file formats is somewhat cumbersome but that the problem is surmountable. But it can be harder to account for differences in experimental design when comparing large data sets.

DTLR could not have said it better. The core problem here is experimental design, and it will always be a limiting factor for dry lab biologists trying to combine data from more than one experiment. A similar problem exists in clinical medicine, under the term 'meta-analysis', and I'm not sure there are really good solutions there either. The best approach, in my view, is to take any findings based on multiple data sets as tentative, exploratory, and hypothesis-generating, rather than definitive. The findings should then be confirmed (or refuted) in a new experiment. This is where the dry lab biologist might have to return to the bench.
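For readers who have not seen one, here is a minimal sketch of the arithmetic behind a fixed-effect (inverse-variance) meta-analysis, in Python; the effect estimates and standard errors are invented for illustration. The pooling itself is trivial; the hard part, as argued above, is deciding whether the studies are comparable enough to pool at all.

```python
# Minimal sketch of a fixed-effect (inverse-variance) meta-analysis.
# The per-study effect estimates and standard errors are invented
# purely for illustration.
import math

# (effect estimate, standard error) from three hypothetical studies
studies = [(0.30, 0.10), (0.10, 0.15), (0.25, 0.08)]

weights = [1.0 / se**2 for _, se in studies]
pooled = sum(w * est for (est, _), w in zip(studies, weights)) / sum(weights)
pooled_se = math.sqrt(1.0 / sum(weights))

print(f"pooled effect = {pooled:.3f} +/- {pooled_se:.3f} (1 standard error)")
```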

Finally, DTLR cautions that dry lab biologists should still spend some time in the lab, at least while in training. There is no substitute for bench time for getting a feel for how sloppy and imprecise experimental data can be, and for where the pitfalls and potential systematic and random errors may arise. It is too easy for a dry lab scientist to take data found in a database at face value. Spending time at the bench will provide a needed reality check.

Reference


Robert F. Service, 2013: Biology's dry future. Science, 342: 186-189.

Sexual harassment in science

Last week's issue of Nature calls for an end to sexual harassment in science. The editorial was triggered by a scandal centered on the editor of the Scientific American blog network, who has since resigned. The editorial casts a wider net, introducing the generic character “Dr. Inappropriate,” who represents “the widespread tacit acceptance of adolescent behavior.” DTLR endorses the call for fighting back against sexual harassment in science. Although I am keenly sensitive to the potential for false accusations, I condemn scientists who commit sexual harassment, particularly when they do so in the context of an unequal power relationship with the target.

As a male scientist, I have been fortunate not to have been the target of such harassment. However, as a graduate student in the late 1990s, I witnessed an incident involving two “Dr. Inappropriates” that shocked me out of my sheltered perception about the behavior of mature scientists.

The setting was as follows. Our department had a weekly seminar, and on one occasion the speaker was a distinguished physics educator from another university.  She was the author of a textbook being used in an experimental introductory class in my department. She had brought along her (female) teaching assistant (now a faculty member in her own right). Following the seminar, they invited the audience to join them in a nearby classroom, where they would run a simulation of a cooperative learning class with us as the 'students'. (Around this time I had already taught a couple of recitation sections using the cooperative learning format.) A sales representative from the publisher of the experimental textbook was also present; she, too, was a woman.

I was one of the more junior graduate students at the time, just getting started in research.  (At the time there were very few female faculty or students in the department; there are considerably more today.)  I sat down at one of the tables in the classroom, and was soon joined by three male physics faculty members. (Two are now retired; the third is still on the faculty as I write.) Let us call them E, K, and H.  Professor E had a strong interest in physics education, and I had worked with him in the past. The other two didn't know me very well. In any case, K took the time to introduce himself to me again, and there was some small talk. H more or less ignored me.  (I have other unflattering stories to tell about them, but those will be for another time.)

So, the simulated cooperative learning class began. We were all given an assignment to work on in our groups. Each group consisted of whoever was sitting at a table together. The seminar speaker and her assistant circulated in the room, acting as Socratic facilitators for the groups. They listened to our conversations among ourselves, and tried to help us along. E played along with the simulation. However, K and H started drifting into “adolescent behavior”. First, K commented to the rest of us that the teaching assistant, out of earshot, was cute. K and H then pretended to be 'bad students' and basically annoyed the teaching assistant when she stopped by to listen to us or help us along. Unlike E, they were not taking the exercise seriously.  

At one point, the publisher's sales representative came over, obviously interested in promoting the textbook. By now I sensed that K had lost interest in the whole exercise, but H (who was on the textbook committee) started chatting with the sales rep. I can't remember the nature of what he said, but I do remember the impression that he started getting very flirtatious with her. Her reaction was very polite and professional, and she rapidly maneuvered the conversation toward discussion of her kids. She showed H a wallet picture of the children. This put an end to the flirtation, and they started talking about the textbook instead.

The sales rep's maneuver struck me as masterful. It could not have been the first time she had to deal with this sort of thing, and she maneuvered H out of the 'adolescent' behavior firmly but without ruffling his feathers at all. I appreciate the awkward position she was in. She was obviously trying to curry favor with a member of the textbook committee, but also quick to eliminate the flirtation.

This was the first time I had seen physics professors behaving badly. I regret that I did not challenge them. It would have been risky for me to do so, but that is no excuse. A challenge is precisely what the Nature editorial recommends as a non-legalistic way to discourage such conduct. (However, based on my other interactions with K and H, I speculate that they would have ignored any such challenge from me, or even attempted to retaliate.) I can only imagine how much more difficult it would have been had I been a female graduate student instead. It would be totally demoralizing to witness the entire episode and understand that this is how some physics professors are capable of treating women in a professional setting.

I ran into E a few days later, and he expressed his dismay at his colleagues. (He was also a little unhappy with me, as I said almost nothing throughout the whole exercise, failing to back him up in playing along with the simulation.) It is notable that even E did not feel empowered to challenge K and H in person at the time; all three were tenured. Tenured professors are untouchable.

What do I take away from this? It isn't clear whether sexual harassment by any legal definition took place. The episode I describe falls into what the Nature editorial calls a grey zone. K's comment about the 'cute' teaching assistant occurred when she was out of earshot, and didn't lead to anything further. H's flirtation ended once it was made clear that it was unwelcome. K's and H's 'bad student' act was not sexual harassment, just general 'adolescent' behavior. Nonetheless, witnessing this episode made it possible for me to imagine that sexual harassment might indeed take place, under different circumstances, and that it could be perpetrated by people who are set up to be mentors and authority figures to me and other students. Indeed, what I observed that day seems relatively harmless compared with the Scientific American blog scandal, where a real abuse of power is alleged.

The reason the episode was a shock to me was that every single faculty member I had ever interacted with until then was, in my view, an honorable and professional person, and I had never seen professors act immaturely until that moment. Perhaps the one positive outcome was that my blissful naivete ended that afternoon.

Recommended reading


Eileen Pollack, 2013:  Why are there still so few women in science?  New York Times, Oct. 3, 2013.

UPDATE:  A bit off topic, but Sabine Hossenfelder has a (back)reaction to Pollack's piece on her blog.


Friday, October 25, 2013

October 2013 is a great month for physics fans

Of course the month began with the announcement of the Nobel Prizes in physics, chemistry, and medicine. The conversation about the physics prize in particular, discussed earlier on this blog, has been lively. Meanwhile, here in the U.S., Physics Today offered its usual bounty in this month's issue, including an article on “Measuring the Hubble Constant” by prominent astrophysicists Mario Livio and Adam Riess, the latter a Nobel Laureate. (More on this article below.) Meanwhile, the 50th anniversary issue of the New York Review of Books features an essay by Nobelist Steven Weinberg on the last 50 years or so of the development of particle physics and cosmology (Nov. 7, 2013 issue). Both fields, though in states of disunity and confusion in the early 1960s, gradually came to develop their own highly successful and unifying 'standard models'. The two fields also converged. Weinberg manages to tell the story without naming a single physicist, including himself.

In the U.K., it gets even better. Physics World celebrates its 25th anniversary by publishing a spectacular special issue, which is available for free download (volume 26, number 10, October 2013). And last weekend the Financial Times' FT Weekend Magazine offered its first special issue devoted to a single science, physics (Oct. 18, 2013 issue). (Unfortunately there doesn't seem to be a single link that compiles all the online articles. Here is the lead editorial.) Finally, even The Economist covers physics in an article about the possibility of particle accelerators made of glass, which would allow them to be smaller (Oct. 19, 2013 issue).

Returning to the article by Livio and Riess, I note with particular interest a discussion comparing measurements of the Hubble constant H0 based on global methods, such as those made by the Planck satellite, with measurements based on local objects.

Local measurements of H0 are complementary to other, higher-redshift probes. Indeed, we'd be remiss if we did not note an apparent tension, at the 3 sigma level, between current measurements of H0 based on local objects and its deduced value based on the standard cosmological model and new Planck results for the cosmic microwave background. That tension may be the harbinger of new physics, but past experience indicates that discrepancies below 3 sigma disappear when more data are available.

Indeed, the threshold for discovery in physics is often five sigma, which was the threshold used in the discovery of the Higgs boson. As a physicist who has strayed into the life sciences, I find two aspects particularly striking. First, I admire physicists' relentless skepticism of 3 sigma results (2 sigma is routinely considered 'statistically significant' in the life and behavioral sciences) and their willingness to collect more data. The epidemic of non-reproducible research in the life and behavioral sciences betrays precisely the opposite tendency in those fields. Second, when physicists set out to measure universal constants, there are often many independent procedures for measuring the same phenomenon; this has been true throughout the history of physics. In the clinical sciences, there is usually exactly one clinically meaningful endpoint, and substitutes for it (biomarkers, surrogate endpoints) may provide compelling evidence but are never sufficient in a true outcomes study. If you want to prevent a recurrence of cancer, you must measure the time to recurrence. If you want to prevent a heart attack, measure the time to the next one.
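For readers outside physics, the sigma thresholds translate into Gaussian tail probabilities as in the short sketch below (conventions differ between fields, so both one-sided and two-sided values are shown):

```python
# Rough translation of "n sigma" into Gaussian tail probabilities.
# Conventions differ (one-sided vs. two-sided); both are shown here.
from scipy.stats import norm

for n_sigma in (2, 3, 5):
    one_sided = norm.sf(n_sigma)         # P(Z > n_sigma)
    two_sided = 2 * norm.sf(n_sigma)     # P(|Z| > n_sigma)
    print(f"{n_sigma} sigma: one-sided p = {one_sided:.2e}, "
          f"two-sided p = {two_sided:.2e}")
```

The 5 sigma convention corresponds to a one-sided p-value near 3 x 10^-7, roughly five orders of magnitude stricter than the 2 sigma (p of about 0.05) bar routine in the life and behavioral sciences.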

References


The Economist, 2013: Small really is beautiful. The Economist, vol. 409, no. 8858, pp. 83-84.

Mario Livio and Adam G. Riess, 2013: Measuring the Hubble constant. Physics Today, 66 (10): 41-47.

Steven Weinberg, 2013: Physics: what we do and don't know. The New York Review of Books, vol. LX, no. 17, pp. 86-88.



Sunday, October 20, 2013

Making decisions with experts

In yesterday's New York Times, economist Noreena Hertz writes about medical decision making, illustrating it with her personal experience. The piece is entitled “Why we make bad decisions” and focuses on the case where a lay person must make a decision about his or her health or finances in consultation with one or more experts. Physicians often make errors, and the more confident ones are more prone to error. Unfortunately the lay person usually defers to the expert.

How does a lay person evaluate expert opinion, or aggregate multiple, conflicting expert opinions? The author suggests that first, you have to educate yourself. Go into a conversation with the expert as knowledgeable as you can be, with as much literacy in the jargon of the field as you can pick up. Be aware of your state of mind and your lack of objectivity. There are known biases and heuristics that might lead to irrational decisions; be aware of them and of your vulnerability to them. Her example is the optimistic bias many of us seem to have, well documented in studies of human behavior: we tend to latch on to good news and tune out bad news.

I would add that often the expert is responding to a set of incentives that differ from yours. Thus, you need to be your own best advocate; do not rely on the expert to have your best interests at heart.

Over the years I've dipped into the literature on “judgment and decision making,” which is what this area of research is called in psychology, or “behavioral economics,” which is what economists call it. I hope to learn more about the field in the future.

DTLR salutes Professors Casadevall and Fang

Back in January, Science magazine featured a story by Jennifer Couzin-Frankel about a pair of dissident medical scientists, Arturo Casadevall and Ferric Fang. They're convinced that science today is very unhealthy, and they have done a number of data-driven studies on the infrastructure of scientific research to diagnose various problems with the enterprise. They've looked at peer review, retraction rates, funding mechanisms, and the incentive system for scientists. Initially I was not familiar with their work, so I began by reading their two editorials in Infection and Immunity (here and here), alluded to in their Huffington Post blog post from 2012. (They also revisit these issues in a post on Project Syndicate from this past summer.) Their intent is to spark conversation rather than provide definitive solutions, so I will accept the invitation to discuss their findings. They write, “What we propose is nothing less than a comprehensive reform of scientific methodology and culture,” and I certainly agree that such a dramatic overhaul is needed.

This blog, DTLR, has been particularly concerned about non-reproducible research. Casadevall and Fang do not address this particular issue directly. Rather, they tackle some of the underlying pressures on scientists created by the incentive system they currently face. Their diagnosis of the problems has the ring of truth, and I refer readers to their papers; I won't rehash their work here. Rather, I want to comment on their proposed solutions. Addressing the issues they've identified is likely to reduce the incidence of the non-reproducible research that I've written about on this blog.

First, the authors propose reforming the reward system for scientists. This includes eliminating the priority rule, which gives credit to the first to publish. The authors recognize the value of competition, but they want to introduce complementary reward mechanisms for collaborative research. No specific details are given, so the idea needs to be fleshed out. Nonetheless I agree with the principle. They also want to replace easy “surrogate methods of quality, such as grant dollars, bibliometric analysis, and journal impact factor” with “careful peer evaluations of scientific quality”. Of course, this is easier said than done. In principle I agree here too, but we need to see a detailed, specific proposal on how this would be done. Arguments about the flaws of impact factors, h-indices, and so on are a perennial favorite in the scientific community, so doubtless there are many ideas out there.

Next, the authors talk about re-embracing philosophy. It sounds as though they want to add one or more philosophy courses to the curriculum for science students. Practically, this would be difficult, as the curriculum is already packed full. Moreover, I think that if students are forced to take such a course, many may not take it seriously and others may resent it. I happen to be one of those who would welcome such a course (and I did take Philosophy of Science in college), but I would hesitate to impose my values on others. Also, some students might rather read a book or two than take a full-blown, for-credit course on the subject. Thus, I would make such a course available to every student, but keep it optional rather than required. (In my case, I've continued to read bits and pieces of philosophy throughout my career in science.)

Next, they call for enhanced training in probability and statistics. They recognize the value of statistics in the “design, execution, and interpretation of many scientific experiments.” I would certainly applaud, with a caveat. Training should be focused on statistical thinking, with statistical methodology taking a back seat. Too often the training students do get in statistics is focused on methods and software rather than critical thinking. The misuse of statistical methods, abundant in the literature, is the result. Many scientists, and many statisticians, are simply not qualified to teach statistical thinking. My book review of Marder (2009) on this blog touches on many issues in this regard, and I plan to tackle other specific cases in future posts.  Also in a future post, or more likely in another forum altogether, I will expand on the discussion of statistical thinking vs. statistical methodology in relation to scientific training and practice.

Next, the authors call for developing checklists for reducing methodological errors. This is a superb suggestion, and they provide an example for one case, “observation in which a stimulus elicits an effect”. The example is well thought out. Earlier on this blog I noted a new checklist introduced by Nature for encouraging reproducible research. These are welcome steps forward; more progress is needed.

The authors then turn to structural reforms. They call for more public funding for science and an increase in the number and diversity of new scientists. They find that in recent years, directed research has overshadowed investigator-initiated research in NIH funding. Here I part company with the authors. I do agree that working to increase the diversity of the scientific community is important. I cannot agree that increasing the absolute size of the community, either in the number of scientists or the amount of funding, is a realistic or desirable goal. Frankly, until the scientific community cleans up its own act, by dramatically restructuring the incentive system so that reproducible research is rewarded and non-reproducible research is punished, I would vote to reduce science funding even further, and I would discourage young people from considering science careers. This is a very harsh stance, but I don't think any stance less extreme will convey my level of anger and dismay. As a practical matter, in the United States it is politically unlikely that funding for science will increase when funding for many other worthy societal goals is necessarily decreasing.

Finally, the authors call for reform in the regulatory burden on scientists, restrictions on laboratory size, and a “scientific study of science.” I do not dwell in the academic trenches of science, so I will only briefly discuss the last point. They discuss trying to discover the optimal number of scientists in society, optimal research group size, optimal time for scientific training, etc. I would encourage study of all of these questions, but I would not expect blanket “optimal” answers to result. The system must allow for slack and variability. Such research can only provide guidance, not mandatory and immutable rules. In fact, I would expect studies to address some of the questions they pose to end up completely inconclusive.

Nonetheless, Casadevall and Fang deserve much praise for asking very tough questions and calling for a dramatic reform of the science community's infrastructure. Although I do not agree with all of their positions, on balance I believe they are on target, and the rest of us should rally around them. Science needs more internal critics like them and like John P.A. Ioannidis (whom I've written about in an earlier post). All of these folks go out on a limb to make controversial observations about how science functions and how it goes wrong, as well as proposing solutions. We all need to follow their lead.

References


Arturo Casadevall and Ferric C. Fang, 2012: Reforming science: methodological and cultural reforms. Infect. Immun., 80 (3): 891-896.

Jennifer Couzin-Frankel, 2013: Shaking up science. Two journal editors take a hard look at honesty in science and question the ethos of their profession. Science, 339: 386-389.

Ferric C. Fang and Arturo Casadevall, 2012: Reforming science: structural reforms. Infect. Immun., 80 (3): 897-901.




Saturday, October 19, 2013

Why can't science and the humanities get along?

Last week's cover of the Chronicle Review features a rambling essay by historian David A. Hollinger, “The Rift: Can STEM and the humanities get along?” It focuses on the alleged rift between the sciences and the humanities within the academy. At some level I don't really care about the culture of the academy, which is disconnected from the rest of society in many ways. However, let me try to make sense of the article anyway.

First, Hollinger makes the case that there is such a rift (based on anecdotal evidence: complaints he has heard from colleagues), and that the wedge “threatens the ability of all modern disciplines to provide—in the institutional context of universities—the services for which they have been designed.” I would argue, though not here, that there are many other reasons universities struggle to provide those services at a reasonable cost, and that the rift Hollinger speaks of makes a negligible contribution. This is really a topic not for a separate post but for another blog entirely, so I will not pursue it further here. (Please consult the Spellings Commission Report on the Future of Higher Education if this interests you.)

Next, Hollinger dismisses the “Two Cultures” conversation of C.P. Snow, because the current rift takes place in a far more hostile political, social, and cultural setting than 50 years ago. Hollinger points to the more recent culture wars, but fails to mention the notorious Alan Sokal hoax, and the 'science wars' of 1990s vintage. Hollinger states that the problem boils down to the accusation that humanities courses and research are often tainted by leftist ideological bias (or political correctness), and that few have accused science courses of displaying such biases. Hollinger counters that scientists and engineers are just as capable of making “fools of themselves” in other ways. I have no dispute with this point, as much space on this blog is devoted to the problems of non-reproducible research in science and medicine. In fact, the problems I write about are arguably much worse for society than any relatively harmless indoctrination that takes place in humanities classrooms for privileged middle class students. Nonetheless, this does not excuse or make permissible such indoctrination, if indeed it is taking place. The fact that both the sciences and the humanities are capable of crummy work does not explain why there is a rift between them. [Regarding indoctrination in the classroom, see Fish (2008).]

Next, Hollinger discusses a report by the Commission on the Humanities and Social Sciences, which apparently prompted him to write this piece. I have not read that report, but his main complaint is that it ignores the “deep kinship between humanistic scholarship and natural science.” Hollinger states that the humanities “are the great risk takers in the tradition of the Enlightenment”. He provides two arguments for this claim. First, the humanities have defended the use of evidence and reason, which should give them common cause with the sciences in a culture that is rapidly devaluing such tenets of critical thinking. This is only partially true. Critical thinking is certainly valued in some sectors of the humanities, but it is flouted in others, as the Sokal hoax showed. (This blog shows that science is guilty of the same.) If I had time I would present other examples of bad reasoning found in the journal literature of literary studies and other humanities fields, but I'm devoting this blog to bad reasoning found in the sciences. Suffice it to say that I do not think conditions are better in the humanities than they are in the sciences, but this is based on personal experience, and I haven't made the effort to document it.

Second, Hollinger presents a single example of risk taking by the humanities: the hiring of Harold Cruse in the 1960s by the University of Michigan. Cruse did not complete college, but he wrote a critically acclaimed book on African-American intellectual history, and was hired as a full professor of history. However, today's academy would not take such a risk, in either the sciences or the humanities. The tenure system, as it now operates, discourages risk taking and rewards conformism for those on the tenure track. It also protects substandard work by those who achieve tenure. There are certainly exceptions, but by and large the incentives created by the tenure system have the opposite effect of its intent to protect faculty who are doing unpopular research and questioning authority and vested interests.

Hollinger states that the humanities exist in the borderlands between mere opinion and the “methodologically narrower, largely quantitative, rigor-displaying disciplines”, or between “scholarship and ideology.” Hence the risk taking is higher there than in the sciences. Examples include the studies of historically disadvantaged groups such as women's studies, Latino studies, etc. This is probably the only point made in the article that I remotely find merit in.

Hollinger mentions Texas Republicans who want to remove a critical thinking component from public education. He states that “Colleagues in the natural sciences would do well to imagine what society would look like with a significantly diminished place for the human sciences. Techies yes, critics no.” Actually it doesn't require much imagination. The Soviet Union was a superb example, and physicist Andrei Sakharov was supremely accomplished as both a “techie” and a “critic”.

Finally, Hollinger complains about salary inequity among various university departments. He shows that he has a good understanding of the economic reasons for such inequities: in fields where talented individuals have attractive alternatives to working in a university, a university will have to compete financially for that person's services. Nonetheless Hollinger seems to feel affronted by this state of affairs. I think he should take his own advice and enroll in a microeconomics course on the other side of his campus. He also thinks universities are degenerating into “sites where lots of independently interesting valuable things happen” instead of coherent institutions, where “being a professor was a calling in itself.” My reaction to this is So What?

In summary, there is very little in the essay that I can use. There is good stuff and bad stuff going on in both the humanities and in the sciences. Does any of this explain why there is a rift between them? I am a scientist by training and by employment, but I read a lot of history (and occasional bits of philosophy), listen to J.S. Bach, and look at the paintings of George Tooker. Hollinger's article does not seem to be relevant to my intellectual life. What about yours?

References


Stanley Fish, 2008: Save the World on Your Own Time. Oxford University Press.

David A. Hollinger, 2013: The rift: Can STEM and humanities get along? The Chronicle Review, Oct. 18, 2013 issue, pp. B6-B9.

Wednesday, October 16, 2013

Will you help me replicate your experiment?

Last week in Nature Medicine, Elizabeth Devitt wrote about two studies that illustrate the perverse incentives in the scientific community in relation to reproducible research and data stewardship in the life sciences.

First, authors of papers in the Annals of Internal Medicine were surveyed about their willingness to provide additional information about their study protocols to other investigators who might be trying to replicate their experiments.  The proportion answering 'yes' dropped from about 80% in 2008 to 60% today (see her article for a graphic).  Now, I don't know whether the size of this drop is comparable to the sampling error; it is not clear what survey methodology was used or how precise the estimates are.  Nonetheless, the article discusses the lack of incentive for scientists to help others reproduce their experiments.  This is a major gap in the incentive system for science, where competition for funding and status is fierce.  We should take heart that a majority of authors still answer 'yes'.
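As a back-of-the-envelope illustration of the check I have in mind, the sketch below computes the sampling error of the drop; since the article does not report the sample sizes, the ones used here are invented.

```python
# Back-of-the-envelope check: is an 80% -> 60% drop large compared with
# sampling error? The sample sizes below are invented; the Nature Medicine
# piece does not report them.
import math

p1, n1 = 0.80, 100   # hypothetical 2008 survey
p2, n2 = 0.60, 100   # hypothetical current survey

se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
z = (p1 - p2) / se
print(f"difference = {p1 - p2:.2f}, standard error = {se:.3f}, z = {z:.1f}")
```

With a hundred respondents in each wave, a 20-point drop would be roughly three standard errors and hard to dismiss; with only a few dozen respondents per wave it would be far less impressive. Without the actual sample sizes, the question remains open.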

The second item Devitt mentions is a study showing that even the availability of such information declines over time, as data become lost or inaccessible.  This study surveyed authors of papers in Molecular Ecology published from 1991 to 2011.  It illustrates the issue of data stewardship that Nature's new open data journal will hopefully help to address.

More on the 2013 Nobel Prize in Physics

In a New York Times opinion piece, Sean Carroll argues persuasively that in the future, the Nobel committee should consider institutions and collaborations for the award of the Nobel Prize in Physics.  He states that there is a self-imposed tradition of awarding the prize to at most three living individuals per year.  This seems to be consistent across the Nobel science prizes, but not the Peace Prize.  (The literature prize seems to go to a single living author each year.)  As long as the terms of Alfred Nobel's will are adhered to, Carroll's argument is persuasive to me, and this year's prize to Englert and Higgs, to the exclusion of others, certainly makes the case quite vividly.

There is only the logistical difficulty of how an institution or collaboration would make use of the prize money.  An institution could certainly put the money in its endowment or operational fund, and the Nobel Peace Prizes awarded to institutions provide a precedent.  How about a more loosely defined 'collaboration'?  This is certainly less clear.  Would all its members be considered 'Nobel laureates'?  Some tried to take that approach with individual IPCC members when the IPCC won half of the Nobel Peace Prize in 2007.  (IPCC stands for the Intergovernmental Panel on Climate Change.)  I felt this was highly unjustified.

In the case of the Higgs theorists (setting aside for now the two experimental groups who made the discovery last year), one of them was dead by the time of the prize, and the others could be divided into at least three or four groups of collaborators; a number of others, such as Anderson, Goldstone, and Nambu, had done precursor work.  Laureates Englert and Higgs represented two of the pivotal collaborations; a third, led by T.W. Kibble, was completely excluded from the prize.  So no single 'collaboration' could be credited for even the theoretical proposal of the BEH (Brout-Englert-Higgs) mechanism.

Thus, while Carroll's proposal is a sound one, it is not clear to me how it could be applied to the discovery of the Higgs boson, an achievement which nicely reflects the messy nature of scientific discovery.  Could up to three 'collaborations' receive the award?  If so, would all members be considered Nobel Laureates?  Would the qualified individuals need to be identified by name by the Nobel Committee?   Carroll's piece is a good start to the conversation, but the proposal needs to be fleshed out more thoroughly before it could be seriously considered.




More applause to the Nature family of journals

Continuing its journalistic examination of the infrastructure of science publishing, and attempts to improve it, the Nature family of journals has made a number of new steps recently.

On the journalistic side, the current issue has a very thought-provoking report by Eugenie Samuel Reich on the effects of publishing in high-prestige journals.  The psychology here is similar to that of hiring managers who use the schools that candidates graduated from as a filter for 'quality'.  Especially for those candidates whose accomplishments are not known and understood by the hiring manager personally, use of the school-filter is a heuristic device for winnowing down the candidate pool.  Similarly, unless you're in the same field as a given scientist, it is difficult for you to evaluate the quality of that person's work and its impact in his/her community.  Seeing which journals the scientist has published in works as a heuristic for making such an evaluation.  Since doing so is easy, and alternative approaches are considerably more difficult and time consuming, the journal-filter is widely used and has real impact in terms of hiring, funding, and status.  Hence, those who have published in Nature or Science belong to a 'golden club'.  Reich's report makes a notable point in the case of astronomy, as people within the field evaluate papers as they are posted on the arXiv preprint server.  This doesn't help those from outside the field, because a pre-print server has essentially no peer review, and only experts can use it well.

Incidentally, the Nobel Prizes serve a similar role for the general public, who are not familiar with the prizes given within a scientific field or sub-field, but everyone knows that the Nobels are the world's highest form of recognition.

On the operational side, the Nature family of journals is apparently the first and only publisher to make its citation and bibliographic information openly available as "linked open data", according to David Shotton, director of the Open Citations Corpus, in this Nature commentary.  Moreover, Nature is launching an online data journal, according to this announcement.  This step is welcome, as it contributes both to reproducible research and to data stewardship, topics discussed on DTLR in recent months.

Finally, back in August Nature Methods, having concluded a series of tutorials on data visualization ('Points of View'), started another tutorial series on statistics, called 'Points of Significance.'  The announcement appeared here and was discussed on their blog by Daniel Evanko here.  It is clear to me that both statistical training and the understanding of statistical thinking vary greatly among scientists, and hopefully the new series will provide good advice.  Of course, how well the articles in the series are written, and their content, will be crucial to their success.  (The failure of formal statistics courses to produce well informed scientists is a major reason such a series is needed in the first place.  It is very difficult to convey the concepts of statistical thinking, so the challenge for the authors is considerable.)  The first two entries are on sampling error and error bars, both good nuts-and-bolts topics, but hopefully they will get to the nuances of study design and the pitfalls of data analysis in good time.
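As a taste of the nuts-and-bolts material those first two columns cover, here is a small sketch (simulated data, invented parameters) contrasting the standard deviation, which describes the spread of the data, with the standard error of the mean, which describes how precisely the mean is known and shrinks as the sample grows:

```python
# Sampling error vs. spread: the standard deviation describes the data,
# while the standard error of the mean (SD / sqrt(n)) describes how well
# the mean is pinned down. Simulated data, for illustration only.
import numpy as np

rng = np.random.default_rng(1)
for n in (10, 100, 1000):
    sample = rng.normal(loc=5.0, scale=2.0, size=n)
    sd = sample.std(ddof=1)
    sem = sd / np.sqrt(n)
    print(f"n = {n:4d}: mean = {sample.mean():.2f}, SD = {sd:.2f}, SEM = {sem:.2f}")
```

Error bars built from the standard deviation stay roughly the same size as more data are collected, while error bars built from the standard error shrink; confusing the two in a figure changes what the figure is actually claiming.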

As one of the 'prestige' journals, Nature is in a unique position to drive real change in the infrastructure of science publishing.  Unfortunately I do not subscribe to their journals (except that I did receive the inaugural year of Nature Physics, which was some time ago) but I do receive Science, which has not emerged as a leader in reproducible research.


Sunday, October 13, 2013

Book Review. Probability: A Very Short Introduction, by John Haigh

Probability:  A Very Short Introduction by John Haigh (2012) is a nontechnical and quite brief (main text: 117 pages) overview of probability theory and applications.  Only elementary arithmetic and occasionally high school algebra are used, but equations are absent except in a 2-page appendix.  Most of the math is explained verbally.  There are occasional excursions into stochastic processes, decision theory, game theory, information theory, and statistics.  (Statistical physics is notably absent among the applications discussed.)  Readers should note that the book is written from a British perspective, and references to British sports (cricket), television shows, and King Richard III may be lost on US readers.

Chapter 1, "Fundamentals", begins by explaining and comparing three interpretations of the concept of probability:  classical ("objective"), frequentist, and subjective.  The classical approach is based on first principles, such as symmetry.  For instance, one could decree that each face of a (fair) die is equally probable.  The frequentist approach states that probabilities emerge as empirical, long run frequencies of trial replications.  The subjective approach refers to a degree of belief, or a personal opinion.  The mathematics of probability are largely independent of which interpretation is being used.  Chapter 2, "Workings of Probability", introduces a few basic mathematical ideas:  the addition law for disjoint events, the multiplication law (in the context of conditioning), independence, and ways to handle overlapping and multiple events. 

Chapter 3, "Historical sketch", uses a historical account to introduce (in lesser detail) the laws of large numbers, the central limit theorem, Bayes' theorem, probability distributions (including the Gaussian and Poisson), the Markov property, the law of the iterated logarithm, measure theory, and martingales.  I am persuaded by the author's argument (p. 35) that the normal distribution's name is unfortunate, and that "Gaussian" should be preferred.  Chapter 4, "Chance Experiments", introduces additional distributions (uniform, binomial, geometric, and exponential) and mentions several extreme value distributions.  The concepts of mean and variance are also introduced.

Chapter 5, "Making Sense of Probabilities", is perhaps the most interesting one in the book.  This chapter and those that follow describe a series of case studies where probabilistic thinking can be applied.  This chapter starts with explaining odds in gambling.  Next, biostatistics appears with a comparison of the risk difference with the risk ratio, a topic discussed at greater length by Gigerenzer (2002), a book with which I will make several comparisons.  After a short discussion on "combining tiny probabilities" (referring to the Borel-Cantelli lemmas), there is a very good 3-page section titled "Some Misunderstandings".  Several of the misunderstandings are connected to quoting probabilities without specifying the reference class, an issue emphasized in Gigerenzer (2002).  The chapter ends with sections on "Describing Ignorance" (which introduces the Beta family of distributions) and "Utility".

Chapter 6, "Games People Play", discusses various applications in gambling as well as TV game shows.  Proficient card counters are banned by casinos; the author finds that "No better tribute to the power of understanding probability has ever been paid" (p. 77). Chapter 7, "Applications in Science, Medicine, and Operations Research" tackles Brownian motion, pseudo-random number generators, Monte Carlo simulations, error correcting codes, and several case studies in health:  amniocentesis decision making, estimating probabilies of carrying a gene based on knowledge of phenotypes in one's family tree, modeling epidemics, and improving the efficiency of mass blood tests by pooling samples.  The chapter concludes with sections on airline overbooking and queuing theory.

Chapter 8, "Other Applications", discusses law (particularly the Prosecutor's Fallacy), randomized response in surveys, and diagnostic tests for the use of performance enhancing drugs in sports.  In this latter section, the subtleties of a diagnostic test's sensitivity and specificity are finally discussed (pp. 100-102).  (As most readers will eventually have personal experience with medical diagnostics, I much prefer the more in-depth treatment of this topic found in Gigerenzer, 2002.)  The chapter concludes with sections on applications in sports and finance, including the Black-Scholes equation. 

The final chapter, "Curiosities and Dilemmas", returns to games, including a discussion of Parrondo's Paradox.  It concludes with some dilemmas surrounding genetic testing.  The book ends with the aforementioned appendix as well as a brief list of References and Further Reading.

The first sentence in the book is "Probability is the formalization of the study of the notion of uncertainty".  I take issue with this claim.  At the very least, the phrase "a formalization" should replace "the formalization".  There are other kinds of uncertainty, such as those found in approximation theory (where the error bounds on a given approximation can be thought of as a kind of deterministic uncertainty) or fuzzy logic.  Moreover, Heisenberg's uncertainty principle, which can be thought of as a consequence of wave-particle duality in quantum physics, is nowhere to be found in the book.

There are many nontechnical books about probability theory and applications; many are reviewed by the eminent probabilist David Aldous, including an earlier and longer work by John Haigh.  I've read none of the books he has reviewed except for Taleb (2005), which covers some of the psychological aspects of applying probability in real life.  However I have read Gigerenzer (2002), not mentioned by Aldous, but which remains my own favorite work on probabilistic thinking. This is why I make several comparisons above between it and Haigh (2012).  To finish the comparison, Gigerenzer (2002) seems very much focused on risk and its psychology, whereas Haigh (2012) is concerned with probability at large, and gives a more systematic and wider overview of its theory and applications.  However, I do think Gigerenzer is stronger when it comes to explaining the pitfalls of probabilistic thinking.  One strength of Haigh (2012), compared to the works reviewed by Aldous, is that Haigh (2012) is a very quick read, appropriate for the VSI (Very Short Introductions) series.  From that perspective alone, it may have a niche audience. 

References

 

Gerd Gigerenzer, 2002:   Calculated Risks:  How to Know When the Numbers Deceive You.  Simon & Schuster.

John Haigh, 2012:  Probability:  A Very Short Introduction.  Oxford University Press.  (Very Short Introductions, Vol. 310.)

Nassim Nicholas Taleb, 2005: Fooled by Randomness: The Hidden Role of Chance in Life and in the Markets. Second edition. Random House.

Tuesday, October 8, 2013

The 2013 Nobel Prize in Physics

Congratulations to Francois Englert and Peter Higgs for the 2013 Nobel Prize in Physics, for the theoretical discovery of the Higgs boson, announced earlier today after a one hour delay, presumably due to vigorous discussion within the academy.  Indeed, the award is controversial, as it excludes other theorists who some have argued are equally deserving.

At this hour the best write-up that I've found is by Dan Clery at Science Magazine.  It focuses on precisely this controversy.  Charles Day at Physics Today notes that the experimental groups that discovered the Higgs boson at the Large Hadron Collider, though mentioned by the Nobel committee, were also not honored this year by the Prize.

The inherent difficulties of the Nobel Prize are illustrated by the controversy.  Underlining this, Tom Siegfried posted yesterday at Science News a list of "Top 10 Physicists with no Nobel", although his list is limited to physicists he has interviewed!  (This is hardly a fair criterion....)

The main issue here is that a prize such as this reinforces the undue emphasis on priority, or time of publication.  This is misleading in a discipline that seeks to establish fairly permanent (i.e., reproducible) knowledge about nature.  It would be more fitting to celebrate the process of science by awarding a prize for an accomplishment, rather than to a set of individuals.  That way, the cooperative nature of the scientific endeavor would receive focus, and all who take part in a particular achievement could be recognized for their contributions.

The emphasis on priority and publishing "first" has led many to publish premature and, in many cases, non-reproducible findings.  This is one of the worst features of contemporary science, in my view.

Nonetheless, let us choose today to celebrate all who were involved in the discovery of the BEH mechanism (Brout-Englert-Higgs; the late Dr. Brout was disqualified from the Nobel solely due to not being alive).  Congratulations!

Saturday, October 5, 2013

The Ph.D. Placement Project

Last summer, the Chronicle of Higher Education launched its Ph.D. Placement Project.  Across university graduate programs of all disciplines, it is fairly uncommon for departments to track the "outcomes" of their Ph.D. programs, as measured by the placement of their graduates.  The Chronicle's reporter, Audrey Williams June, presents a case study of one faculty member who actually did create a database of his program's Ph.D. graduates and their subsequent careers (the sociology program at the City University of New York's Graduate Center).

My impression is that in the sciences and engineering, placements are not systematically tracked by graduate departments, and when they are, summary data are not routinely provided to current or prospective graduate students. I welcome data to prove me wrong, though.  If I'm right, I believe this is a scandal, and university departments should no longer be able to get away with it.  Departments that are afraid to generate or disclose such data are behaving in a self-serving way, unbefitting their nonprofit status in the economy.  It is difficult for me to conceive of a rational defense of such practices.

Therefore DTLR calls on all graduate degree programs in science and engineering to initiate placement studies of their graduates, whether the graduates stay within the profession or not, and to publicly disclose the results, at least in summary form, once enough data points have been gathered to ensure the privacy of the graduates themselves.  DTLR endorses any efforts by public and private funding agencies and alumni groups to withhold funding from any graduate program that fails to commit to such an initiative.