Saturday, April 25, 2020

A critique of "AI Feynman"

Udrescu & Tegmark (2020) describe a neural network-based machine learning algorithm for "symbolic regression", which they define as determining the form of a functional relationship between a response variable and a list of input variables from a data set.  Kepler's realization that Mars' orbit is elliptical, inferred from observational astronomy data, is the prime example they give.  Their "AI Feynman" algorithm has a number of modules, including automated dimensional analysis, polynomial fitting, neural network-based data interpolation, tests for symmetry and separability, and so on.  The paper concludes with the ambitious declaration, "We look forward to the day when, for the first time in the history of physics, a computer, just like Kepler, discovers a useful and hitherto unknown physics formula through symbolic regression!"  While the paper reports important accomplishments and leaves the door open to further improvements in the authors' methodology, DTLR believes that the effort is less than meets the eye.  Here I will discuss some methodological weaknesses, most of which the authors acknowledge, as well as a conceptual weakness.
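To make the task concrete, here is a minimal brute-force sketch of symbolic regression in the above sense: enumerate a tiny grammar of candidate expressions and keep the one that best fits the data.  This toy is not the authors' algorithm (which layers dimensional analysis, neural-network interpolation, and symmetry/separability tests on top of the search); the grammar and function names are purely illustrative.

```python
import itertools
import numpy as np

def candidates():
    # Hypothetical tiny grammar: f(x, y) built from a few unary forms
    # combined by * or +.  A real system searches a far larger space.
    unary = [("id", lambda z: z), ("sq", lambda z: z**2), ("sqrt", np.sqrt)]
    for (n1, f1), (n2, f2) in itertools.product(unary, repeat=2):
        yield (f"{n1}(x)*{n2}(y)", lambda x, y, f1=f1, f2=f2: f1(x) * f2(y))
        yield (f"{n1}(x)+{n2}(y)", lambda x, y, f1=f1, f2=f2: f1(x) + f2(y))

def symbolic_regress(x, y, target):
    """Return the candidate expression with the lowest RMS error."""
    best_name, best_err = None, np.inf
    for name, f in candidates():
        err = np.sqrt(np.mean((f(x, y) - target) ** 2))
        if err < best_err:
            best_name, best_err = name, err
    return best_name, best_err

# Inputs sampled uniformly on [1, 5], mirroring the paper's setup
rng = np.random.default_rng(0)
x, y = rng.uniform(1, 5, 1000), rng.uniform(1, 5, 1000)
print(symbolic_regress(x, y, x**2 * y))  # recovers "sq(x)*id(y)"
```

Even this toy illustrates the core difficulty: the search space of expressions explodes combinatorially, which is why the authors rely on physics-motivated structure (units, symmetry, separability) to prune it.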

Methodological weaknesses


The authors compare their "AI Feynman" algorithm to a commercial software package called Eureqa.  They defined a set of 100 algebraic equations from the Feynman Lectures and compared the two algorithms' ability to discover the correct functional relationship, using data simulated from the "true" relationship (both without and with Gaussian noise in the response variable, but not in any of the independent variables).  Data for the input variables were sampled uniformly between 1 and 5, a rather arbitrary choice.  The authors triumphantly report that AI Feynman had a 100% success rate, while Eureqa achieved only 71%.  However, buried in the paper is the fact that they used the Feynman Lectures equations as a training set on which their algorithm was tuned (i.e., potentially overfit).  Thus the comparison with Eureqa on this data set is wholly unfair, and the claimed 100% success rate is not generalizable.  Recognizing this, they define a set of 20 equations drawn from other classic physics texts, reporting that AI Feynman solved 90% while Eureqa solved only 15%.  Finally, a third test set of somewhat arbitrary mathematical relationships was defined; AI Feynman achieved 67% success compared to Eureqa's 49%.  The authors explain that this last set of equations may lack the nice properties of physical laws that AI Feynman was designed to exploit.
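For concreteness, the evaluation setup as described above can be sketched in a few lines.  The sample size and noise level below are my illustrative choices, not values taken from the paper; only the uniform [1, 5] sampling and response-only noise follow the authors' description.

```python
import numpy as np

def make_benchmark(true_fn, n_inputs, n_samples=10_000, noise_sd=0.01, seed=0):
    """Simulate data from a known 'true' relationship.

    Inputs are sampled uniformly on [1, 5]; Gaussian noise (scaled to the
    response's spread) is added to the response only, never to the inputs.
    """
    rng = np.random.default_rng(seed)
    X = rng.uniform(1, 5, size=(n_samples, n_inputs))
    y = true_fn(*X.T)
    return X, y + noise_sd * np.std(y) * rng.standard_normal(n_samples)

# e.g. Newtonian gravity, with G treated as a fourth input variable
X, y = make_benchmark(lambda G, m1, m2, r: G * m1 * m2 / r**2, n_inputs=4)
```

Written out this way, the artificiality is plain: every input is observed without error, densely and uniformly over a convenient box, which no real experiment provides.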

Data from real experiments would rarely resemble the simulated data that the authors used.  At the early stages of an investigation, part of the domain of the input variables may even be experimentally/observationally inaccessible.  Thus the authors' comparison exercise seems artificial, though a necessary preliminary step before tackling real data.  Nonetheless it is important to point this out to readers who might find the reported performance metrics impressive.  The authors acknowledge that they did not include differential or integral equations in their evaluation, nor noise in the input variables, nor real data sets, though they hope to address all these weaknesses in future work.

A word about dimensional analysis:  In the authors' Newtonian gravity example (their Fig. 2), why were the temperatures of the masses not considered as candidate input variables?  Why is gravity completely ignored in a dimensional analysis of the ideal gas law (Lemons, 2017)?  In both cases, physical intuition is the answer.  The first step in any dimensional analysis is of course the selection of candidate variables to include in the functional relationship.  In the authors' exercise, the candidate variables were pre-selected from knowledge of the true relationship.  In reality, physical intuition (which requires training; it is not the same as natural intuition) must be used.  Santiago (2019) states that this step is the most difficult and requires the most experience.  The physicist generating the experimental data has often already made the decision on which variables should be measured, so the point may be moot, at least for physics problems.  However, if the method is extended to other, more empirical domains, such as public health, the social and behavioral sciences, and finance, it will run into trouble.  It is often far less obvious which variables should be considered even as candidates for inclusion, and dimensional analysis may cease to be an operative method.  In fact, it is often the case that you have little or no data on the input variables you really need, while you have plenty of data on variables of limited relevance to your problem. 
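The bookkeeping side of dimensional analysis is mechanical and easy to automate; it is the prior choice of candidate variables that demands intuition.  A minimal sketch of that bookkeeping, with each quantity's dimensions written as an exponent vector over (mass, length, time), using the Newtonian gravity example (the variable table is mine, for illustration):

```python
import numpy as np

# Dimensions as exponent vectors over the base units (M, L, T).
DIM = {
    "F":  np.array([1, 1, -2]),   # force: M L T^-2
    "G":  np.array([-1, 3, -2]),  # gravitational constant: M^-1 L^3 T^-2
    "m1": np.array([1, 0, 0]),    # mass: M
    "m2": np.array([1, 0, 0]),    # mass: M
    "r":  np.array([0, 1, 0]),    # separation: L
}

def dims_of_product(powers):
    """Dimensions of a monomial prod(var**p); powers maps name -> exponent."""
    return sum(p * DIM[v] for v, p in powers.items())

# F = G * m1 * m2 / r^2 is dimensionally consistent iff the exponents match
lhs = DIM["F"]
rhs = dims_of_product({"G": 1, "m1": 1, "m2": 1, "r": -2})
print(np.array_equal(lhs, rhs))  # True
```

The check succeeds only because the table above already excludes irrelevant quantities like the masses' temperatures.  That exclusion, the hard step, happened before any code ran.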

Conceptual weakness


When Max Planck first obtained his blackbody radiation law, as an empirical formula that satisfied constraints imposed both by the known data and by physical intuition, he did not publish it.  Why?  Because at that moment the formula was strictly empirical; it did not provide any physical insight.  Planck published it only after he developed a theoretical model of resonators in thermal equilibrium with the radiation, from which the formula could be derived.  A key feature of his model was the quantization of energy.  The latter was not taken seriously until Einstein showed that energy quantization could also resolve the mystery of the photoelectric effect, superficially a completely different physics problem.  The new physics wasn't just one new equation; it was a new concept.  If energy quantization were an intrinsic property of nature, it would manifest in many physics problems, and physicists gradually discovered that this was indeed the case.  Symbolic regression is at best a contributing factor in discovering new physics; it cannot be the sole tool for doing so.

References


D. S. Lemons, 2017:  A Student's Guide to Dimensional Analysis.  Cambridge University Press, Sec. 1.8.

J. G. Santiago, 2019:  A First Course in Dimensional Analysis.  MIT Press, Sec. 5.2.

S.-M. Udrescu and M. Tegmark, 2020:  AI Feynman:  a physics-inspired method for symbolic regression.  Science Advances, 6:  eaay2631.