## On nested equivalence classes of climate models, ordered by computational complexity

I’m digging into the internals of ABC (approximate Bayesian computation), for professional and scientific reasons. I’ve linked a great tutorial elsewhere, and argued that this framework, advanced by Wood, Wilkinson (Robert), Wilkinson (Darren), Hartig and colleagues, and Robert and colleagues, is very similar to the climate modeling problem. One catch, however: results from climate models cannot be computed cheaply. So how in the world can ABC help?

Recall the observation made by many, but here most typically associated with Kharin, that

> There is basically one observational record in climate research. It is hardly possible to perform real independent experiments. Classical hypothesis testing is “frequentist” in nature (repeated sampling). Dynamical models can create independent data, subject to model deficiencies.

Within any specific observational record, there is a finite amount of information about the climate system. Presumably, assuming pretty standard continuity conditions, there is a hypersphere of boundary and initial conditions for any given climate model which will produce that observational record. Moreover, over a set of climate models, there is a collection of not necessarily overlapping hyperspheres of boundary and initial conditions which give rise to that specific observational record.

Now, for any such set, posit a means of ranking the models’ complexity. This can be done using the Akaike information criterion, for example. (Not a very Bayesian thing to do, admittedly. But I’ve dallied with the dark side before.) The number of parameters, the number of CPU cycles needed to simulate a climate year, or many other measures of complexity could be included in such a determination.

Within each hypersphere of parameters for each climate model, there is a nested set of simpler models, judged “simpler” by the same metric used to rank the climate models against one another, which map a subset of their parametric hyperspheres to the observational record. These are computationally cheaper. In particular, were “the perfect set of parameters” known, presumably a Taylor expansion about those parameters could be used, within the limits of the observational record, to reproduce it. These are multivariate polynomials.
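The Taylor-expansion idea can be made concrete with a small sketch: fit a quadratic (second-order, Taylor-like) polynomial surrogate to an “expensive” model in a neighborhood of a reference parameter point, then evaluate the cheap polynomial instead. The `expensive_model` function below is a made-up stand-in, not any actual climate model; the point is only that a low-order multivariate polynomial reproduces a smooth model well within a small parametric ball.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "expensive" model: maps two parameters to a scalar summary
# of the observational record. Purely illustrative.
def expensive_model(theta):
    a, b = theta
    return np.sin(a) * np.exp(-b**2) + 0.5 * a * b

# Sample parameters near a reference point theta0 and fit a quadratic
# surrogate by least squares.
theta0 = np.array([0.3, -0.1])
X = theta0 + 0.05 * rng.standard_normal((200, 2))
y = np.array([expensive_model(t) for t in X])

def quad_features(X):
    # Design matrix for a full quadratic in two variables.
    a, b = X[:, 0], X[:, 1]
    return np.column_stack([np.ones_like(a), a, b, a * b, a**2, b**2])

coef, *_ = np.linalg.lstsq(quad_features(X), y, rcond=None)

# Inside the sampled region the surrogate tracks the model closely.
theta_test = theta0 + np.array([0.02, -0.03])
approx = (quad_features(theta_test[None, :]) @ coef)[0]
print(approx, expensive_model(theta_test))
```

The surrogate costs a handful of multiplications per evaluation, which is the property that matters for the argument: within the limits set by the record, the cheap nested model is indistinguishable from the expensive one.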

I suggest, therefore, that much simpler models, along the lines of many of those offered in Professor Ray Pierrehumbert’s text, perhaps in combination, would suffice to reproduce the single specific observational record we have. The embellishments of the existing climate models are motivated by what we know of physics at scale, and from observations of the atmosphere-ocean-land system, but if the observational record is all that’s given, at some point the contributions of those embellishments are dominated by the observational errors of the record. Some of the climate model parameters, or their arithmetic combinations, will survive incorporation into the simpler models. And these models might well be computationally simple enough to be candidates for massive ABC calibrations.
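What such a “massive ABC calibration” looks like can be sketched with plain rejection ABC, assuming a toy one-parameter model (`simple_model` below is invented for illustration, loosely in the spirit of a zero-dimensional energy-balance model, not any published one): draw the parameter from a prior, simulate, and keep only draws whose summary statistics land within a tolerance of the observed record’s summaries.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical simple model: one sensitivity parameter lam maps a forcing
# series to a temperature-anomaly series. Purely illustrative.
def simple_model(lam, forcing):
    return lam * np.cumsum(forcing) / len(forcing)

forcing = np.linspace(0.0, 2.0, 50)
lam_true = 0.8
# The single "observational record": one noisy realization.
observed = simple_model(lam_true, forcing) + 0.01 * rng.standard_normal(50)

def summary(y):
    # Two summary statistics: record mean and final anomaly.
    return np.array([y.mean(), y[-1]])

# ABC rejection: sample the prior, simulate, accept if close to the data.
prior_draws = rng.uniform(0.0, 2.0, size=20_000)
eps = 0.02
s_obs = summary(observed)
accepted = [lam for lam in prior_draws
            if np.linalg.norm(summary(simple_model(lam, forcing)) - s_obs) < eps]

posterior = np.array(accepted)
print(len(posterior), posterior.mean())
```

The accepted draws concentrate around the generating value, and the whole loop is feasible only because each simulation is cheap, which is exactly the role the nested simpler models would play.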

Sure, when the observational record is extended, the simple models may fall short. But they’ve done their job: they will have informed good combinations of climate model parameters that constrain either the more complicated existing models, or expansions of the simple models used in the ABC runs.