Monday, June 28, 2021

The proton radius puzzle and the hazards of combining data from multiple studies

If journalism is a first draft of history, Physics World writer Edwin Cartlidge has done a superb job this month of reporting on the "Proton Radius Puzzle" and the pitfalls of combining data from multiple studies.  Cartlidge's piece is also an instructive case study of a phenomenon described by David Bailey a few years ago.

Over time, a number of experiments around the world, using different physical principles, have attempted to measure the radius of the proton.  An international group, CODATA, has the task of compiling all such data and reporting the community's best estimate.  This is done by first voting on which studies should be included; the data are then combined in a weighted average, with the weights presumably determined by the error bars reported by the individual studies.  The hope is that the combined estimate will be more accurate, with narrower error bars, than any of the individual findings.  The former chair of the CODATA group working on the proton radius is quoted in Cartlidge's article arguing that this process incorporates all "individually credible results" but passes no judgment on whether each of those results is "right or wrong", a task that "would require superhuman powers".
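
To make the averaging step concrete, here is a minimal sketch (in Python) of a standard inverse-variance weighted mean, which is presumably close in spirit to what CODATA computes; the measurement values and error bars below are illustrative stand-ins, not the actual CODATA inputs.

    import numpy as np

    # Hypothetical proton-radius measurements (in femtometers) with 1-sigma
    # error bars; the numbers are illustrative, not the actual CODATA inputs.
    r   = np.array([0.8770, 0.8758, 0.8764, 0.8409])  # last entry: a CREMA-like outlier
    err = np.array([0.0045, 0.0077, 0.0089, 0.0004])

    # Inverse-variance weighting: each study gets weight 1/sigma^2, and the
    # combined error bar is 1/sqrt(sum of weights), which shrinks as more
    # studies are added -- the property at the heart of the puzzle.
    w = 1.0 / err**2
    r_bar = np.sum(w * r) / np.sum(w)
    sigma_bar = np.sqrt(1.0 / np.sum(w))
    print(f"combined: {r_bar:.4f} +/- {sigma_bar:.4f} fm")

Note how a single very precise outlier (the CREMA-like entry) dominates the weighted average, which is exactly why its inclusion or exclusion matters so much.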

Well, for about eight years the measurement by the CREMA experiment, which used a unique muon-based technique, was excluded from the average because it was an outlier compared to other reports (in exactly the sense that Bailey has described).  In the interim, however, other groups using more conventional approaches began to obtain results comparable to the lower value given by CREMA.  In 2018 CODATA finally incorporated the outlying results, though, unsurprisingly, the stated error bars of the combined estimate increased.  See Cartlidge's article for the twists and turns of the story.

DTLR's interest here is in the whole concept of combining data.  Something like this is widely practiced in statistics, under the name "meta-analysis".  I consider this poor practice, because it sweeps under the rug potential systematic errors in the individual results.  In the proton radius case, Cartlidge even seems to suggest that groupthink might have been at play in the CODATA decisions.

Here is DTLR's opinion about combining data from multiple studies.  Don't do it.  Instead of meta-analysis, the individual study results, with their error bars, should simply be displayed together.  Users should be directed to critically review the study design, execution, analysis, and reporting of the individual studies, seeking out differences among them.  Authors of systematic reviews should use their judgment and discuss the similarities and differences, without blindly pooling all the data together.  Cartlidge writes, "the CREMA result was not really at odds with individual spectroscopy experiments – all but one differed by no more than 1.5 standard deviations, or σ. The only significant disparity – of at least 5 σ – arose when the conventional data were averaged and the error bars shrunk. But that disparity could only be maintained if the muon result itself was kept out of the fitting process – given how much it would otherwise shift the CODATA average towards itself."  My interpretation is that the artificial task of combining the data was itself the source of the confusion; it would have been avoided by simply presenting all the individual study results separately.  The field has clearly not reached sufficient maturity for a combined "best" estimate to be meaningful, in my opinion, and this is probably even more true of the meta-analyses often reported in the medical and public health literature.
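
The arithmetic behind that quote is worth spelling out.  In an equal-weight average of N studies, the pooled error bar shrinks like 1/sqrt(N), so offsets that are individually unremarkable can look highly significant in aggregate.  A toy calculation, with numbers invented purely for illustration:

    import numpy as np

    # Each conventional study sits only 1.5 sigma above a muon-based value,
    # but equal-weight averaging of N such studies shrinks the pooled error
    # bar by sqrt(N), inflating the apparent tension.
    sigma_each = 0.006            # per-study error bar (invented for illustration)
    offset = 1.5 * sigma_each     # each study is 1.5 sigma above the muon value
    for n in (1, 4, 12):
        sigma_pooled = sigma_each / np.sqrt(n)
        print(f"N={n:2d}: pooled tension = {offset / sigma_pooled:.1f} sigma")
    # N=12 already yields ~5.2 sigma from individually unremarkable offsets.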

Just days after Cartlidge's article came out, he published another piece that also has the combination of data at its heart.  This one was about gravitational waves, where the story is complicated even further by the waveform modeling required to interpret gravitational-wave signals.


Theory, computation, and machine learning in climate science

This month's Physics Today features an excellent article by Schneider, Jeevanjee, and Socolow, "Accelerating Progress in Climate Science".  In particular, its philosophical orientation with regard to the interacting roles of theory, computation, and machine learning is among the best articulated I have seen, and it is relevant well beyond climate science.  The emerging role of machine learning in the study of the atmosphere is a topic I've written about previously (here and here).

The authors write, "Researchers have made deductive inferences from fundamental physical laws with some success.  But deducing, say, a coarse-grained description of clouds from the underlying fundamental physical laws has remained elusive.  Similarly, brute-force computing will not resolve all relevant spatial scales anytime soon.  Resolving just the meter-scale turbulence in low clouds globally would require about a factor of 10^11 increase in computer performance.  Such a performance boost is implausible in the coming decades and would still not suffice to handle droplet and ice-crystal formation."
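
A rough way to see where a factor like that comes from: for an explicit-time-step three-dimensional simulation, cost scales like the number of grid points, (1/Δx)^3, times the number of time steps, which under a CFL-type constraint also grows like 1/Δx, giving cost proportional to Δx^-4.  This back-of-envelope scaling is my gloss, not the authors' calculation:

    # Back-of-envelope, not the authors' calculation: an explicit 3-D
    # simulation costs ~(1/dx)^3 grid points times ~(1/dx) time steps (CFL),
    # so cost scales like dx**-4, and a 10^11 cost increase implies a grid
    # refinement of roughly (10^11)^(1/4).
    refinement = 1e11 ** 0.25
    print(f"implied grid refinement: ~{refinement:.0f}x")  # ~562x finer grid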

They continue, "Machine learning (ML) has undeniable potential for harnessing the exponentially growing volume of Earth observations that is available.  But purely data-driven approaches cannot fully constrain the vast number of coupled degrees of freedom in climate models.  Moreover, the future changed climate we want to predict has no observed analogue, which creates challenges for ML methods because they do not easily generalize beyond the training data."

The authors go on to describe a concept they call parametric sparsity, comparing Newtonian gravity (with a single free parameter) to Ptolemaic epicycles and equants, "the deep learning approach of its time."  They note that Newtonian gravity has a remarkable track record of "out-of-sample predictions, uncertainty estimates, and causal explanations."  The authors seem to argue that Ptolemy's theory, like deep learning, is a massively parameterized model of empirical data: overfitted to the training data, and providing little guidance on what to expect outside it.  The analogy is interesting but imperfect.  As I noted previously, deep learning has demonstrated, under some circumstances, an ability to generalize beyond the training data.  I wrote then, "We do not know under what circumstances such generalization can reliably occur, and I believe any such claims about these generalizations must be validated with independent data sets."  Thus, the authors' skepticism about such generalizability is a welcome pragmatic attitude.
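
The epicycles-versus-gravity contrast can be made concrete with a toy fit.  The sketch below, entirely my own construction and not from the article, fits noisy samples of a one-parameter law with both the correct sparse model and a heavily parameterized polynomial, then asks each to extrapolate:

    import numpy as np

    rng = np.random.default_rng(0)

    # Fit noisy samples of the one-parameter law y = 3x on x in [1, 2] with
    # (i) that one-parameter model and (ii) a 14th-degree polynomial, then
    # ask both to extrapolate to x = 4, outside the training interval.
    x_train = np.linspace(1.0, 2.0, 20)
    y_train = 3.0 * x_train + rng.normal(0.0, 0.05, x_train.size)

    a_hat = np.sum(x_train * y_train) / np.sum(x_train**2)  # least-squares slope
    poly = np.polynomial.Polynomial.fit(x_train, y_train, deg=14)

    x_test = 4.0
    print(f"truth:                  {3.0 * x_test:.2f}")
    print(f"one-parameter law:      {a_hat * x_test:.2f}")  # lands near 12
    print(f"14th-degree polynomial: {poly(x_test):.2f}")    # typically far off

The polynomial matches the training data almost perfectly yet typically extrapolates wildly, while the sparse model lands near the truth, which is the authors' point in miniature.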

The authors write, "Climate science needs to predict a climate that hasn't been observed, on which no model can be trained, and that will only emerge slowly.  Generalizability beyond the observed sample is essential for climate predictions, and interpretability is necessary to have trust in models.  Additionally, uncertainties need to be quantified for proactive and cost-effective climate adaptation."  They advocate for the use of theory to develop coarse-grained models for use in computational simulations.  "Where theory reaches its limits, data-driven approaches can harness the detailed Earth observations now available." The authors' advocacy of theory-first, empirical modeling second, might be seen as an answer to Kerry Emanuel's concern about computing too much and thinking too little (discussed here).

I might depart a little from the authors in expressing some skepticism about the quality of uncertainty quantification.  Any such quantification is likely to be done in the context of the model itself, and thus fail to account for model uncertainty, which can never be fully quantified.  See also my discussion of "Escape from Model-Land" here.

Nonetheless, readers interested in climate science, and more broadly the interacting roles of theory, computation, and machine learning in the scientific endeavor (which truly must be coupled with experiment and observation) should check out the Physics Today article and think about how its ideas might apply to their own work.


Reference


T. Schneider, N. Jeevanjee, and R. Socolow, 2021:  Accelerating progress in climate science.  Physics Today, 74 (6), 44-51.


Wednesday, June 9, 2021

The 200th Anniversary of the Navier-Stokes Equations

Earlier today, Nature Physics published a "Measure for Measure" note by Oxford professor Julia Yeomans about the Navier-Stokes equations and the Reynolds number.  She reminds us that next year, 2022, will mark the 200th anniversary of the first appearance of these equations, at the hands of Claude-Louis Navier in 1822.  Investigating further (Rouse & Ince, 1957; Darrigol, 2005; Eckert, 2006), it appears that Navier read his papers at the French Royal Academy of Sciences that year, but the written publication appeared later, in 1827.  Navier's original formulation was based on a now-discredited molecular model.  Meanwhile, also in 1822, A. L. Cauchy published his theory of stress in continua.  S. D. Poisson, Barré de Saint-Venant, and I. S. Gromeka are others who contributed to the theoretical development of the Navier-Stokes equations.  In most historians' view, the definitive derivation of the Navier-Stokes equations was given by George Gabriel Stokes in 1845.  Nonetheless, it was indeed Navier in 1822 who first presented the equations.  Prof. Yeomans also discusses Osborne Reynolds' 1883 paper on the nondimensional parameter named in his honor.
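
For readers who want the equations themselves, here is their modern incompressible form, together with Reynolds' nondimensional parameter (u is velocity, p pressure, ρ density, μ dynamic viscosity, f a body force, and U and L characteristic velocity and length scales):

    \rho\left(\frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u}\cdot\nabla)\mathbf{u}\right)
      = -\nabla p + \mu\,\nabla^{2}\mathbf{u} + \mathbf{f},
    \qquad \nabla\cdot\mathbf{u} = 0,
    \qquad \mathrm{Re} = \frac{\rho U L}{\mu}.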

DTLR is grateful to Nature Physics and Prof. Yeomans for bringing these notions to the attention of the journal's readers.  As I've written before, fluid mechanics has been underrepresented in the U.S. physics curriculum, and it's nice to see pieces like this in physics journals.

References


O. Darrigol, 2005:  Worlds of Flow:  A History of Hydrodynamics from the Bernoullis to Prandtl.  Oxford University Press.

M. Eckert, 2006:  The Dawn of Fluid Dynamics:  A Discipline between Science and Technology.  Wiley-VCH.

H. Rouse and S. Ince, 1957:  History of Hydraulics.  Iowa Institute of Hydraulic Research.

J. M. Yeomans, 2021:  Fluid flows on many scales.  Nature Physics, 17:  756.