Tuesday, July 30, 2013

Welcome to “Tide” and “Gyre”, supercomputers for operational weather forecasting

Last Thursday, the National Weather Service (NWS) brought online its two new “clone” IBM supercomputers, Tide and Gyre, for operational weather forecasting. The machines are the workhorses of the NWS National Centers for Environmental Prediction (NCEP)'s Weather and Climate Operational Supercomputing System (WCOSS). The two identical supercomputers are located in different places (the primary machine in Reston, VA; the backup in Orlando, FL).

Patrick Thibodeau's write-up at Computerworld seems to have led media coverage of the event. He refers to last year's Hurricane Sandy, which engendered “a belief that the European Center for Medium-Range Weather Forecasts (ECMWF) had a better storm track model further out. Criticism over the U.S. forecasting ability has followed post Sandy.” Bringing the new computers online, replacing the previously used pair of IBM supercomputers, is a step in the right direction. (A planned phase II of the transition will lead to further improvements.) However, the U.S. will also need to maintain investments in Earth-observing weather satellites, as noted (for instance) by Stephen Stirling of New Jersey's Star-Ledger.

The NWS computers do work of national importance, and it is reassuring to read about the verification and validation of the systems prior to going live. Further details can be found in this post by Steve Tracton at the Washington Post. He describes sensitive dependence on initial conditions, manifested as a divergence in forecasts from the same forecast model run on both the old and new supercomputers. This phenomenon, as he points out, is closely related to the path-breaking work on chaos in meteorology and computational science by the late Edward N. Lorenz (1963, 1989).

Update (29 Aug 2013): see also the NOAA press release.

References

E. N. Lorenz, 1963: Deterministic nonperiodic flow. J. Atmos. Sci., 20: 130-141.

E. N. Lorenz, 1989: Computational chaos: a prelude to computational instability. Physica D, 35: 299-317.


Monday, July 29, 2013

Nature's policy on reproducible research

The Diffusion Tensor Literary Review (DTLR) applauds the Nature family of journals for its new policy on reproducible research. Nature has done a great job covering non-reproducible research in science, as illustrated by this open-access collection of previous reports. The new policy is embodied in a reporting checklist for life science articles. If adequately enforced, it should be a major step forward in promoting reproducible research. I am pretty happy with what is on the checklist, and it would be boring to rehash it here (please go read it yourself). Instead, I will dwell on a few deficiencies, which I don't think are minor. Despite these complaints, I hope other journals consider formulating reproducible research policies of their own.

As Nature acknowledges, the checklist is not exhaustive. One item in particular should have been mentioned: if data, signal, or image analysis software was used either for data reduction or analysis, the software version, settings, and options used for such analysis need to be disclosed. Unfortunately, scientists often do not realize that such procedural minutiae matter, let alone require disclosure. In my experience, use of software (including commercial software) can be a minefield if users have free rein to fiddle with settings and options that could affect the data reduction and/or data analysis. In some cases, the hardware-software interface (e.g., data acquisition) itself requires further elucidation. I recall an occasion when I was working with a commercial biosignal device that allowed the user to choose data reporting at (say) either 500 Hz or 1 kHz. A colleague and I wondered what the true sampling rate was, and how the reported data were being generated (downsampling? interpolation?). After we pressed the manufacturer for an explanation, it turned out the actual sampling rate was non-uniform, due to the physics of the transducer, and interpolation was used to report an artificially equally-spaced signal. Most users would have no need of such details, but we were processing the fine structure of these signals, and really did need to understand the data acquisition and reporting process.
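The data-reporting anecdote can be sketched in a few lines. The function below is hypothetical (the manufacturer never disclosed its actual scheme beyond "interpolation"): it takes non-uniformly spaced raw samples and reports them on a uniform grid via linear interpolation, which is one plausible way a device could manufacture an "equally-spaced" signal.

```python
# Hypothetical illustration: a transducer samples at non-uniform times
# (raw_t, raw_v), but the device reports values on a uniform time grid
# by linear interpolation between the bracketing raw samples.

import bisect

def linear_resample(raw_t, raw_v, uniform_t):
    """Linearly interpolate raw samples (raw_t sorted) onto uniform_t.

    Points outside [raw_t[0], raw_t[-1]] are linearly extrapolated
    from the nearest segment -- one of many silent choices a device
    firmware might make without telling the user.
    """
    out = []
    for t in uniform_t:
        # Index of the raw sample just past t, clamped to a valid segment.
        i = min(max(bisect.bisect_right(raw_t, t), 1), len(raw_t) - 1)
        t0, t1 = raw_t[i - 1], raw_t[i]
        v0, v1 = raw_v[i - 1], raw_v[i]
        out.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
    return out
```

The point for reproducibility is that the reported grid looks like pristine 1 kHz data, while any signal variation faster than the local raw spacing has been silently smoothed away; whether that matters depends entirely on whether you are analyzing the fine structure.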

The other deficiency I am not happy about is the checklist's use of p-values as an example of statistical test results, in the guidance on figure legends. The p-value is one of the least informative statistical inferences that could be reported in an analysis, since it speaks only to statistical significance. Point and interval estimation of appropriate quantities is more likely to convey both statistical and practical/clinical significance. In any case, attempts to boil statistical results down into a single number (e.g., a p-value or a correlation coefficient) hide the richness of information contained in the data. Statistical tests, typically formulated as null hypothesis tests, lack any meaningful measures of magnitude and precision. I will return to this point at greater length in a future post.
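To make the contrast concrete, here is a small sketch (the numbers and function name are invented for illustration) of the kind of report I would rather see than a bare p-value: a point estimate of the difference between two group means together with a large-sample 95% confidence interval, which conveys magnitude and precision at once.

```python
# Illustrative sketch: report an effect as estimate + interval rather
# than as a bare p-value. Uses a large-sample normal approximation
# (z = 1.96 for a 95% interval); a t-based interval would be more
# appropriate for small samples.

import math
from statistics import mean, stdev

def diff_of_means_ci(a, b, z=1.96):
    """Point estimate and approximate 95% CI for mean(a) - mean(b)."""
    d = mean(a) - mean(b)
    se = math.sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    return d, (d - z * se, d + z * se)
```

A p-value of 0.04 says only "the difference is probably not zero"; an interval such as (0.2, 7.8) in the units of measurement additionally shows whether the effect could still be practically trivial at its lower end, which is exactly the information a reader of a figure legend needs.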

As I said in the previous post, there needs to be a major cultural and infrastructural change in the scientific community, and it can best be driven by funding agencies, journals, and employers. The Nature policy is an excellent contribution to infrastructural change. Ultimately, though, scientists themselves need to be upset enough about non-reproducible research in their own fields before things will change. Driving the principles of reproducible research from the top down is necessary but not sufficient. Without support from the bottom up, scientists may view reproducibility policies as just more bureaucracy, and may even seek ways to circumvent their spirit (if not their letter).

Sunday, July 28, 2013

Reproducible research in computational science and engineering

The Diffusion Tensor Literary Review (DTLR) endorses the principles of reproducible research outlined in the final report of the ICERM Workshop on Reproducibility in Computational and Experimental Mathematics.  A summary of the findings was published last month by Stodden, Borwein, and Bailey (2013).

The message seems to be that computational science has failed to maintain standards of reproducibility expected in theoretical and experimental science, let alone good software engineering principles.  This has led to a "credibility crisis".  The workshop participants agreed on three principles, which I quote verbatim from Stodden, et al. (2013):

  1. "It is important to promote a culture change that will integrate computational reproducibility into the research process."
  2. "Journals, funding agencies, and employers should support this culture change."
  3. "Reproducible research practices and the use of appropriate tools should be taught as standard operating procedure in relation to computational aspects of research."

Read the final report of the workshop: it is surprisingly direct, candid, and eloquent for a committee-produced document. I have not studied the associated wiki in detail, but the report establishes the big picture. We need to get the scientific/engineering community on board with these basic principles. It is difficult to overstate how major a cultural change is called for, as the current infrastructure actively discourages scientists from implementing principles of reproducible research.

Personally, as a taxpayer and as a scientist, I find the current state of affairs disgraceful.  Scientists and engineers typically spend other people's money (usually at the expense of the taxpayer) to do their work.  Ensuring that results are reproducible seems to be a minimum expectation for publication, yet in computational science and engineering it is not.  Often scientists cannot even reproduce their own work, let alone the work of others.  We need to be better caretakers of the limited amount of money that society collectively has to spend on science and engineering.

Reproducible research is a theme I plan to return to in future posts on DTLR.  Watch this space for more.

Reference

V. Stodden, J. Borwein, and D. H. Bailey, 2013: "Setting the default to reproducible" in computational science research. SIAM News, 46 (5): 4-6.

Big game hunting in virology: pandoraviruses and their cousins

Earlier this month, the discovery of two new “giant” viruses, dubbed pandoraviruses, was published in Science (Philippe, et al., 2013). This discovery by husband-and-wife team, Jean-Michel Claverie and Chantal Abergel, and their collaborators in Marseille, France, has received a lot of press coverage, including Science's own report (Pennisi, 2013) and a New York Times piece by Carl Zimmer (2013). If you google “pandoravirus” you can find other coverage, including this summary by Ker Than for Inside Science News Service. The pandoravirus work reinforces the earlier breakthrough discovery of the first “giant” virus, mimivirus, a decade earlier, as well as subsequent discoveries of other “giant” viruses and related developments. Zimmer (2011) wrote about some of these earlier findings in the Epilogue of his superb (and short) expository book on virology for laypersons, A Planet of Viruses.
[Image: Pandoravirus on the cover of Science]

Both the original serendipitous discovery of mimivirus and the more directed effort that found the pandoraviruses focused on amoeba-infecting viruses. Algae and possibly other forms of sea life can also be infected by mimiviruses (Zimmer, 2011). As a layperson myself, I wondered whether more complex organisms, including humans, need to worry about “giant” viruses. It initially occurred to me that the agents of most major infectious diseases of humans, and of plants and animals of economic significance, had probably been identified. If any of these agents had been “giant” viruses, wouldn't we have known about them earlier?

Apparently not! Zimmer (2011) notes that the mimivirus has been found in the lungs of hospital patients with pneumonia. “It's not clear yet if mimiviruses actually cause pneumonia...or if they just colonize people who are already sick” (p. 91). More interestingly, Zimmer (2013) cites a preprint reporting another “giant” virus in the blood of (apparently) asymptomatic blood donors (Popgeorgiev, et al., in press). Some blood donors even have antibodies to these viruses. Such viruses had never been seen before in human blood because conventional virus testing uses ultrafiltration, which removes “giant” viruses before you have a chance to find them. Popgeorgiev, et al. (in press) point to an earlier paper (Lagier, et al., 2012) reporting yet another “giant” virus found in the stool sample of (again) an asymptomatic person, suggesting that such viruses may be present in the gut flora. (Both of these papers are from the Marseille-area researchers involved in the original mimivirus discovery.)

The tentative evidence suggests, therefore, that “giant” viruses may be ubiquitous, despite their recent discovery. These viruses may be residents of the human virome, and presumably the viromes of other animals and of plants. Naturally, a great deal of research needs to be done, but the hypothesis, once posed, seems eminently plausible. The “giant” viruses' influence on the health and sickness of their hosts remains to be determined. While the pandoraviruses have deservedly attracted the headlines, the papers from Marseille on “giant” viruses in the human virome foreshadow how such research could benefit us in everyday life.

References


J.-C. Lagier, et al., 2012: Microbial culturomics: paradigm shift in the human gut microbiome study. Clinical Microbiology and Infection, 18: 1185-1193.

E. Pennisi, 2013: Ever-bigger viruses shake tree of life. Science, 341: 226-227.

N. Philippe, et al., 2013: Pandoraviruses: amoeba viruses with genomes up to 2.5 Mb reaching that of parasitic eukaryotes. Science, 341: 281-286.

N. Popgeorgiev, et al., in press: Giant Blood Marseillevirus recovered from asymptomatic blood donors. J. Infectious Diseases.

C. Zimmer, 2011: A Planet of Viruses. University of Chicago Press.

C. Zimmer, 2013: Changing view on viruses: not so small after all. New York Times, Science Desk, July 18, 2013.


Tuesday, July 23, 2013

On Gollub's “Continuum mechanics in physics education”

Continuum mechanics is the study of deformable media, including solid and fluid dynamics, using a macroscopic, continuum approximation, as opposed to a microscopic, molecular approach. Like electromagnetism, continuum mechanics can be formulated as a local field theory, using a handful of partial differential equations (PDEs) for vector and scalar fields, supplemented with boundary conditions. Thermodynamic equation(s) of state may also be needed. In fluid dynamics, for instance, the basic PDEs are statements of conservation laws, such as the balances of mass, momentum, and energy. For Newtonian fluids, these basic equations are called the Navier-Stokes equations. When dealing with electrically conducting fluids, the basic PDEs must also be coupled with Maxwell's equations.
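To make the preceding paragraph concrete, the incompressible Navier-Stokes equations, in one standard form, consist of exactly the kind of conservation-law PDEs described above: a mass balance and a momentum balance for the velocity field.

```latex
% Mass and momentum balance for an incompressible Newtonian fluid
% with constant density \rho and dynamic viscosity \mu; \mathbf{u} is
% the velocity field, p the pressure, and \mathbf{f} a body force
% (e.g., gravity) per unit volume.
\begin{align}
  \nabla \cdot \mathbf{u} &= 0, \\
  \rho \left( \frac{\partial \mathbf{u}}{\partial t}
      + (\mathbf{u} \cdot \nabla)\,\mathbf{u} \right)
  &= -\nabla p + \mu \nabla^{2} \mathbf{u} + \mathbf{f}.
\end{align}
```

For compressible flow one adds an energy balance and an equation of state, and for conducting fluids the body force includes the Lorentz force, coupling these equations to Maxwell's, as noted above.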

In the early 21st century U.S. physics curriculum, continuum mechanics is notably absent, except in certain corners of plasma physics, astrophysics, and perhaps biological physics. Most of the teaching in solid and fluid dynamics occurs in schools of engineering and the geosciences, with occasional coverage by applied mathematicians. A similar fate has befallen another great branch of classical physics, acoustics, which could be thought of as a special branch of continuum mechanics.

[Photo: Jerry Gollub]
Long ago, James Lighthill (1962) made the case that fluid dynamics is a bona fide branch of physics, and Jerry Gollub (2003, 2008) has argued in favor of restoring some role for continuum mechanics in the physics curriculum. Gollub notes that there are a number of texts that could be used for a dedicated course in fluid dynamics for physics students. Several others, by and for physicists, either focusing on fluids or tackling continuum mechanics as a whole, have been published since he first wrote. However, given the already crowded physics curriculum, Gollub has tried to incorporate continuum mechanics into an existing course, rather than creating a new one. By curtailing discussion of other topics, he covers selected topics in continuum mechanics in an introductory physics course for physics majors, as well as in the more advanced undergrad mechanics course. He acknowledges the challenge for other physicists to do the same, since not many have expertise of their own in continuum mechanics, and few of the popular mechanics texts include these topics. I know of two major graduate level mechanics texts that cover continua: Fetter and Walecka (1980/2003), and José and Saletan (1998). At the undergrad level, Taylor (2005) and Chaichian, et al. (2012) take a brief look at continua in their final chapters. (Readers, do you know of any others?)

The Navier-Stokes equations occasionally show up in graduate level texts in statistical mechanics, such as Huang (1987, Ch. 5) and Reichl (2008, Ch. 8). They have even made appearances in advanced electromagnetism texts, such as the extremely cursory section on magnetohydrodynamic waves in Jackson (1999, Sec. 7.7), where viscosity is ignored. A far more substantive treatment integrating electromagnetism and continuum mechanics may be found in Kovetz (2000). In my view, the undergrad courses in statistical physics and electromagnetism are probably less appropriate places to introduce fluids than a classical mechanics course, although fluid dynamics should be considered a good option at the grad level for any of these courses.

Another appealing way to tackle fluid dynamics is by combining it with a course on nonlinear dynamics and chaos theory, as reflected in textbooks like Hilborn (2000). (A truly dual-topic course might have two texts, one each for fluids and nonlinear dynamics.) It is also possible, as Gollub suggests, to incorporate further coverage of fluid and solid mechanics in more advanced courses such as condensed matter theory and materials science. As mentioned, this approach is already frequently taken in plasma physics, astrophysics, and (hopefully) biological physics. Aref (2008) notes that “The somewhat applied subfield of fluid dynamics known as fluidics has taken on new life in the context of microfluidics and nanofluidics,” presenting yet another opportunity.

Gollub (2008) rightly suggests the use of the Multimedia Fluid Mechanics DVD-ROM as a supplement to the textbooks (Homsy, et al., 2008); this item is now often found packaged with conventional fluid dynamics textbooks. Fluid or solid mechanics projects could be considered in an advanced lab course as well.

Realistically, it is unlikely that continuum mechanics will find a place among the standard topics of the core physics curricula. We are lucky if it gets included as a special topic in another course, but even this will be hit-or-miss across physics departments nationwide. This may be a blessing in disguise, if physics students serious about learning continuum mechanics are sent to good engineering or geoscience courses offered on their campuses. One of the greatest rewards of working in fluid dynamics for me has been the relentless interdisciplinarity of the enterprise. Meetings of the American Physical Society's Division of Fluid Dynamics, the Acoustical Society of America, and the Society of Rheology (all member societies of the American Institute of Physics) are places where physicists are often outnumbered by engineers and applied mathematicians.

References

H. Aref, 2008: Something old, something new. Phil. Trans. R. Soc. A, 366: 2649-2670.

M. Chaichian, I. Merches, and A. Tureanu (2012): Mechanics: An Intensive Course. Springer.

A.L. Fetter and J.D. Walecka, 1980/2003: Theoretical Mechanics of Particles and Continua. McGraw-Hill (original) and Dover (reprint).

J. Gollub, 2003: Continuum mechanics in physics education. Physics Today, 56 (12), 10-11.

J. Gollub, 2008: Teaching about fluids. Physics Today, 61 (10), 8-9.

R.C. Hilborn, 2000: Chaos and Nonlinear Dynamics: An Introduction for Scientists and Engineers, 2d ed. Oxford University Press.

G.M. Homsy, et al., 2008: Multimedia Fluid Mechanics, 2d ed. Cambridge University Press.

K. Huang, 1987: Statistical Mechanics, 2d ed. Wiley.

J.D. Jackson, 1999: Classical Electrodynamics, 3d ed. Wiley.

J.V. José and E.J. Saletan, 1998: Classical Dynamics: A Contemporary Approach. Cambridge University Press.

A. Kovetz, 2000: Electromagnetic Theory. Oxford University Press.

M.J. Lighthill, 1962: Fluid dynamics as a branch of physics. Physics Today, 15 (2): 17-20.

L.E. Reichl, 2008: A Modern Course in Statistical Physics, 3d ed. Wiley-VCH.

J.R. Taylor, 2005: Classical Mechanics. University Science Books.

Sunday, July 21, 2013

Welcome to DTLR!

Greetings and welcome to the Diffusion Tensor Literary Review (DTLR)!  This blog aspires to be a place to discuss broad issues in science, engineering, and medicine.  Of particular interest is the history and philosophy of these fields, and related policy issues.

My posts will be infrequent, "few but ripe" as Gauss would say. Or, as Wolfgang Pauli put it, “I don't mind your thinking slowly: I mind your publishing faster than you can think.” The views expressed here do not necessarily reflect the views of anyone other than the author.