struggling with problems already partly solved by others


Climate modelers see as their frontier the problem of dealing with spontaneous dynamics in systems such as the atmosphere or ocean, dynamics which are not directly forced by boundary conditions such as the radiative forcing due to increased greenhouse gas (“GHG”) concentrations in the atmosphere. (See, for instance, the talk by Clara Deser, and a paper on which she is a coauthor. See this also.) The key problem for geophysicists is that the actual worldline of Earth’s climate is one of a potentially infinite number of trajectories extending forward from any time point, and, due to inherent limitations of physical modeling, climate models can forecast the entire cone of trajectories that develop from a given point, but they cannot predict any specific trajectory. (See Kharin for an overview of the problem.) Thus, inverse inference regarding the severity of GHG impacts from observed data is tenuous, except over long periods, such as hundreds of years, from the paleoclimate record, or from ab initio physics.
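To make the “cone of trajectories” concrete, here’s a minimal sketch in Python, using the chaotic Lorenz-63 system as a toy stand-in for any real climate model (the integration scheme, step size, and perturbation are all illustrative choices, not anyone’s actual setup): integrate two initial states differing by one part in a million and watch how far apart they end up.

```python
# Toy illustration of sensitive dependence: two nearly identical initial
# states of the Lorenz-63 system, integrated with a simple Euler scheme,
# end up on very different trajectories. No specific climate model is
# implied; this only illustrates why a model can forecast the cone of
# possible futures but not the particular trajectory realized.

def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One Euler step of the Lorenz-63 equations."""
    x, y, z = state
    return (x + dt * sigma * (y - x),
            y + dt * (x * (rho - z) - y),
            z + dt * (x * y - beta * z))

def integrate(state, n_steps=2000):
    for _ in range(n_steps):
        state = lorenz_step(state)
    return state

a = integrate((1.0, 1.0, 1.0))
b = integrate((1.0, 1.0, 1.0 + 1e-6))  # perturbed by one part per million
separation = sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5
# After 20 model time units the tiny perturbation has been amplified by
# many orders of magnitude.
```

The two runs are each, individually, perfectly plausible futures; only an ensemble of such runs says anything about what the system is likely to do.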

Now, one of the very nice things about practicing Statistics is that it affords a view of problems seen in many fields. There are plenty of reasons why, for instance, computer scientists concerned with the Internet might not read the annals of population biology research, but in my profession, problems and results from population biology have been among the most productive of any field. Similarly, I’d be surprised if geophysicists read many population biology journals. Yet, as is sometimes the case, the same problem geophysicists face, others face too, and some other field gets to it sooner. In this case, the frontierspeople were biologists.

Simon Wood makes it very clear that this problem arises in population biology as well, and he puts down some solid ideas about how to approach it. Others, including Perretti, Munch, and Sugihara, and Hartig, Calabrese, Reineking, Wiegand, and Huth (paper here), are engaged with it as an active research topic, complete with lively discussion. The similarity to the climate models problem should be apparent from a read of Wood’s 2010 paper.

Wood urges something which reminds me very much of two contributions from Statistics: one is Art Owen’s work on empirical likelihood, and the other is the explosion of interest in the Bayesian world in something called approximate Bayesian computation (“ABC“). I’m going to say a lot more about ABC in future posts, noting, for instance, that there are already almost a half dozen R packages facilitating its use. But my point is that it should be useful for geophysicists, climatologists, and meteorologists to avail themselves of the thought and work already devoted to the problem by others. Indeed, some are beginning to notice. (See also Vrugt and coauthors’ major paper on applications.)
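For readers who haven’t met ABC, the rejection-sampling version fits in a few lines. The sketch below is in Python rather than R, on a hypothetical toy problem (inferring the mean of a Gaussian whose likelihood we pretend is unavailable); the prior, summary statistic, and tolerance are all illustrative choices. The idea: draw a parameter from the prior, simulate data forward, and keep the draw only if a summary of the simulation lands close to the observed summary.

```python
# Minimal ABC rejection sampler on a toy problem. All settings here --
# the Gaussian simulator, the flat prior, the tolerance -- are illustrative.
import random

random.seed(42)
TRUE_MU, SIGMA, N = 3.0, 1.0, 100
observed = [random.gauss(TRUE_MU, SIGMA) for _ in range(N)]
obs_summary = sum(observed) / N          # summary statistic: sample mean

def simulate_summary(mu):
    """Forward-simulate a dataset and return its summary statistic."""
    return sum(random.gauss(mu, SIGMA) for _ in range(N)) / N

accepted, eps = [], 0.1
while len(accepted) < 200:
    mu = random.uniform(-10.0, 10.0)     # draw from a flat prior
    if abs(simulate_summary(mu) - obs_summary) < eps:
        accepted.append(mu)              # keep draws whose simulations match

posterior_mean = sum(accepted) / len(accepted)
```

The accepted draws approximate the posterior, and at no point is a likelihood evaluated; only the forward simulator is run, which is exactly why the method appeals when the model is a black-box simulation.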

(Update, 2nd January 2015)

T. Toni, D. Welch, N. Strelkowa, A. Ipsen, M. P. H. Stumpf, “Approximate Bayesian computation scheme for parameter inference and model selection in dynamical systems“, J. R. Soc. Interface (2009) 6, 187–202, http://dx.doi.org/doi:10.1098/rsif.2008.0172, published online 9 July 2008 [emphasis added in title], with Abstract:

Approximate Bayesian computation (ABC) methods can be used to evaluate posterior distributions without having to calculate likelihoods. In this paper, we discuss and apply an ABC method based on sequential Monte Carlo (SMC) to estimate parameters of dynamical models. We show that ABC SMC provides information about the inferability of parameters and model sensitivity to changes in parameters, and tends to perform better than other ABC approaches. The algorithm is applied to several well-known biological systems, for which parameters and their credible intervals are inferred. Moreover, we develop ABC SMC as a tool for model selection; given a range of different mathematical descriptions, ABC SMC is able to choose the best model using the standard Bayesian model selection apparatus.
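The sequential flavor Toni et al. describe can be caricatured briefly. The Python sketch below (a deliberately simplified illustration on the same kind of toy Gaussian problem, not the full algorithm) anneals through a decreasing tolerance schedule, seeding each round’s proposals from the previous round’s survivors; note it omits the importance-weight correction that the real ABC SMC requires, so it illustrates the tolerance schedule only.

```python
# Simplified illustration of the ABC SMC idea: a decreasing tolerance
# schedule, with each accepted population seeding the next. NOTE: the
# full algorithm of Toni et al. also reweights particles by the ratio of
# prior to perturbation kernel; that correction is omitted here.
import random

random.seed(1)
SIGMA, N = 1.0, 100
observed_mean = 3.0                      # pretend summary of the observed data

def simulate_mean(mu):
    return sum(random.gauss(mu, SIGMA) for _ in range(N)) / N

particles = [random.uniform(-10.0, 10.0) for _ in range(200)]  # prior draws
for eps in (2.0, 0.5, 0.1):              # tolerances tighten each round
    survivors = []
    while len(survivors) < 200:
        mu = random.choice(particles) + random.gauss(0.0, 0.5)  # perturb
        if abs(simulate_mean(mu) - observed_mean) < eps:
            survivors.append(mu)
    particles = survivors

estimate = sum(particles) / len(particles)
```

The payoff over plain rejection is efficiency: early, loose rounds cheaply herd the particles into the right region, so the expensive final tolerance wastes far fewer simulations.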

But also see C. P. Robert, J.-M. Cornuet, J.-M. Marin, N. S. Pillai, “Lack of confidence in approximate Bayesian computation model choice”, http://arxiv.org/abs/1102.4432, with Abstract:

Approximate Bayesian computation (ABC) have become an essential tool for the analysis of complex stochastic models. Grelaud et al. [(2009) Bayesian Anal 3:427–442] advocated the use of ABC for model choice in the specific case of Gibbs random fields, relying on an intermodel sufficiency property to show that the approximation was legitimate. We implemented ABC model choice in a wide range of phylogenetic models in the Do It Yourself-ABC (DIY-ABC) software [Cornuet et al. (2008) Bioinformatics 24:2713–2719]. We now present arguments as to why the theoretical arguments for ABC model choice are missing, because the algorithm involves an unknown loss of information induced by the use of insufficient summary statistics. The approximation error of the posterior probabilities of the models under comparison may thus be unrelated with the computational effort spent in running an ABC algorithm. We then conclude that additional empirical verifications of the performances of the ABC procedure as those available in DIY-ABC are necessary to conduct model choice.

Finally, a tutorial on ABC by Richard Wilkinson:

And a lecture updating the situation as of 2012 by Christian Robert and Peter Mueller:

Approximate Bayesian computation (ABC): advances and questions

There’s also a very fine illustration of ABC via rejection sampling in R available from Florian Hartig. A quote from this is key to many modern problems:

The interesting point about this approach is that, while maybe not as efficient as the ABC-MCMC, it will scale linearly on any parallel cluster that is available to you. So if you can get your hands on a large cluster, you can make MANY calculations in parallel.
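Hartig’s point about linear scaling can be sketched with the Python standard library alone (a toy Gaussian simulator again stands in for a real model, and every name and setting below is illustrative): because each ABC rejection trial is independent, each worker can consume its own random stream, and the accepted draws from all workers are simply concatenated.

```python
# Embarrassingly parallel ABC rejection: each process runs independent
# prior draws and forward simulations, so throughput grows with the
# number of workers. Toy Gaussian simulator; all settings illustrative.
import random
from multiprocessing import Pool

def abc_worker(seed, n_draws=5000, eps=0.1, obs_mean=3.0, sigma=1.0, n=100):
    """Run n_draws independent ABC trials; return the accepted parameters."""
    rng = random.Random(seed)            # independent stream per worker
    accepted = []
    for _ in range(n_draws):
        mu = rng.uniform(-10.0, 10.0)    # flat prior
        sim = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if abs(sim - obs_mean) < eps:
            accepted.append(mu)
    return accepted

if __name__ == "__main__":
    with Pool(4) as pool:
        chunks = pool.map(abc_worker, range(4))   # one seed per worker
    posterior = [mu for chunk in chunks for mu in chunk]
    # The pooled draws approximate the same posterior a serial run would.
```

Unlike MCMC, there is no chain to synchronize, which is why the speedup is linear in the number of workers up to whatever cluster you can get your hands on.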

Update, 29th January 2015

A. Solonen, P. Ollinaho, M. Laine, H. Haario, J. Tamminen, H. Järvinen, “Efficient MCMC for climate model parameter estimation: Parallel adaptive chains and early rejection“, Bayesian Analysis, 7(3), 2012, 715–736.

Update, 24th July 2016

Berner et al. do “stochastic parameterization”, and this is related to the work by Ye, Beamish, Munch, Perretti, and colleagues on model-free forecasting.
