Monday, January 20, 2014

Hurricane Sandy, climate change, and the limits of data science

Climate science is a touchy subject because, unfortunately, nearly all discussion of it quickly becomes entwined with politics. Kerr (2013) recently offered a reminder of this. In the President's State of the Union address a year ago, Mr. Obama said “We can choose to believe that Superstorm Sandy, and the most severe drought in decades, and the worst wildfires some states have ever seen were all just a freak coincidence. Or we can choose to believe in the overwhelming judgment of science—and act before it's too late.”

However, Kerr (2013) notes that “there is little or no evidence that global warming steered Sandy into New Jersey or made the storm any stronger. And scientists haven't even tried yet to link climate change with particular fires.” Kerr also points to a Republican Congressman's equally incorrect claim that “Extreme weather isn't linked to climate change.” Several heat waves, Kerr states, have indeed been “securely linked” to global warming. He adds that “Links between extreme weather and climate change are not only often scientifically suspect, they may also be a risky strategy to take climate change seriously.” After all, climate is by definition a statistical average of weather, and weather is what we experience from day to day, so no single storm, drought, or fire can by itself confirm or refute a change in that average. The President, alas, was wrong.

Last March, I attended a lecture by Dr. Richard A. Anthes, an eminent meteorologist, president emeritus of the University Corporation for Atmospheric Research, and a former president of the American Meteorological Society. He pointed out that the track of Hurricane Sandy, with its left turn towards New Jersey, had been predicted days in advance by the forecast model of the European Centre for Medium-Range Weather Forecasts (ECMWF). The U.S. forecast models were not able to give as early a warning because of technical limitations of the computers and computer models (see my earlier post).

Let's use this example in a thought experiment on how data could be used to make a hurricane forecast. A purely empirical approach (whether by conventional statistical methods or by data mining/machine learning) would likely have failed: never before had a hurricane approached New Jersey from the east in late October. The ECMWF model uses data too, but not for statistical forecasting. It uses data as initial conditions to simulate the atmosphere using partial differential equations that incorporate subject matter knowledge of the physics and chemistry of the atmosphere and ocean. By doing so, it forecast the formation of the storm before it actually formed, as well as its subsequent track. The successful forecast of the ECMWF model was interpreted correctly by political authorities and heeded by the public, saving tens of thousands of lives. If you want an example of science at its best, here it is.
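To make the idea of "data as initial conditions" concrete, here is a minimal sketch of a physics-based forecast in miniature. It is a toy and has nothing to do with the actual ECMWF code: it assumes a one-dimensional periodic domain, a single disturbance carried along by a constant wind, and a first-order upwind finite-difference scheme. The function name advect_upwind and every number in it are illustrative assumptions of mine.

```python
import math

def advect_upwind(u0, speed, dx, dt, steps):
    """Integrate du/dt + speed * du/dx = 0 on a periodic grid (first-order upwind)."""
    if speed < 0:
        raise ValueError("this sketch assumes speed >= 0")
    c = speed * dt / dx  # Courant number; the scheme is stable only for c <= 1
    if c > 1:
        raise ValueError("time step too large for stability (CFL condition)")
    u = list(u0)
    for _ in range(steps):
        # For i == 0, u[i - 1] is u[-1], which gives the periodic boundary.
        u = [u[i] - c * (u[i] - u[i - 1]) for i in range(len(u))]
    return u

# "Observed" initial condition: a Gaussian bump centred at x = 20 (assumed values).
n, dx = 100, 1.0
u0 = [math.exp(-((i * dx - 20.0) ** 2) / 20.0) for i in range(n)]

# Forecast: 60 steps of dt = 0.5 at speed 1.0 should carry the bump about 30 units.
forecast = advect_upwind(u0, speed=1.0, dx=dx, dt=0.5, steps=60)
peak_start = max(range(n), key=lambda i: u0[i])
peak_end = max(range(n), key=lambda i: forecast[i])
print(peak_start, peak_end)
```

Note that the forecast position of the bump comes from the equation and the initial state alone. No record of earlier bumps is consulted, which is why this kind of model can, in principle, predict a track that has never been observed before.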

Make no mistake: meteorology as a science does make judicious use of statistical and Monte Carlo methods. The atmospheric sciences, however, are driven primarily by methods based on subject matter knowledge, not by strictly empirical methods such as those used by statisticians and data scientists (“superficial statistics” in the devastating words of Salby, 2012). Note that both approaches are equally data hungry.

Let's return to climate now. The whole point of climate change is that data in the future will not be like data from the past. As Showstack (2013) reports, Kathryn Sullivan, acting administrator of the National Oceanic and Atmospheric Administration (NOAA), gave the keynote address at a meeting of the National Research Council's Board on Earth Sciences and Resources in November. She said, “The past is no longer prologue when it comes to the risks we bear at any given place on this planet. The statistical pattern of our past cannot be relied upon fully to tell us what our future will be.” Isn't this a conundrum for a statistician or a data scientist? What good is the training data when we know it will not be representative of the data we will see in the future?
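The conundrum can be illustrated with a small sketch on synthetic data. It assumes a made-up annual series (a fixed cycle plus an optional linear drift) and a deliberately simple statistical forecaster, the historical average, which predicts every future year from the mean of the training years. The drift rate, the cycle length, and the function names are all hypothetical and are not taken from any real climate record.

```python
import math

def synthetic_series(n_years, drift_per_year):
    """Hypothetical annual values: an 11-year cycle plus a linear drift."""
    return [10.0 + drift_per_year * t + 0.5 * math.sin(2 * math.pi * t / 11.0)
            for t in range(n_years)]

def mean_abs_error(predicted, actual):
    """Mean absolute error of one constant prediction against a list of values."""
    return sum(abs(predicted - a) for a in actual) / len(actual)

def climatology_error(series, n_train):
    """Error of forecasting every later year with the training-period mean."""
    train, future = series[:n_train], series[n_train:]
    baseline = sum(train) / len(train)
    return mean_abs_error(baseline, future)

# Train on the first 50 "years", then score on the following 50.
stationary_err = climatology_error(synthetic_series(100, 0.0), n_train=50)
drifting_err = climatology_error(synthetic_series(100, 0.03), n_train=50)
print(stationary_err, drifting_err)
```

When the series is stationary, the historical average is a reasonable guide to the future. Once the underlying mean drifts, the same forecaster is biased by roughly the amount of drift accumulated between the training period and the forecast period, no matter how much past data it was given.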

In my view, the solution to all this is to use first-principles modeling, as climate scientists do. In climatology, as in meteorology, knowledge of the physics and chemistry of the atmosphere, embodied in the partial differential equations of the models, is preferable to the methods of statistics and data science for prediction.

References

Richard A. Kerr, 2013: In the hot seat. Science, 342: 688-689.

Murray L. Salby, 2012: Physics of the Atmosphere and Climate. Cambridge University Press, p. xvi.

Randy Showstack, 2013: Earth sciences and societal needs explored at National Research Council meeting. EOS, Transactions of the American Geophysical Union, 94 (48): 457-459.





