## climate internal variability is just residual variance from modeling with a smooth curve?

I happened across what I consider to be an amazing slide while “reading around” the work of Deser and colleagues. It is reproduced below, taken from Dagg and Wills:

They are actually just conveying a definition from Hawkins and Sutton (2009), from their Appendix A, Equation (1), and the paragraph following it. Specifically, internal variability is defined as the residual random variable after fitting a fourth-order polynomial to a time series using ordinary least squares. The data were single members of prediction ensembles from three scenarios for each of 15 models from an IPCC study. The source of the time series is really not all that important to my point here. Now, quoting Hawkins and Sutton:

Each individual prediction was fit, using ordinary least squares, with a fourth-order polynomial over the years 1950–2099. The raw predictions $X$ for each model $m$, scenario $s$ and year $t$ can be written as

$X_{m,s,t} = x_{m,s,t} + i_{m, s} + \epsilon_{m,s,t}$ $(1)$

where the reference temperature is denoted by $i$, the smooth fit is represented by $x$, and the residual (internal variability) is $\epsilon$. The reference temperatures used were the year 2000 (Fig. 3) and the mean of the years 1971 to 2000 (for all other analyses), both of which were estimated from the smooth fits.
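To make the decomposition in Equation (1) concrete, here is a minimal sketch in Python with NumPy. It is not Hawkins and Sutton's code: the function name `decompose`, the synthetic series, and the rescaling of the time axis are my own assumptions for illustration. It fits a fourth-order polynomial by ordinary least squares, takes the reference level $i$ as the mean of the smooth fit over 1971 to 2000, and calls whatever is left over $\epsilon$.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def decompose(years, series, ref_years=(1971, 2000), degree=4):
    """Split a series into reference level i, smooth anomaly x, and residual eps.

    Hypothetical helper for illustration; mirrors X = x + i + eps.
    """
    # Centre and scale the time axis so the degree-4 fit is well conditioned.
    t = (years - years.mean()) / (years.max() - years.min())
    coeffs = P.polyfit(t, series, degree)   # ordinary least squares fit
    smooth = P.polyval(t, coeffs)           # fitted curve, equal to x + i
    # Reference temperature estimated from the smooth fit, not the raw data.
    in_ref = (years >= ref_years[0]) & (years <= ref_years[1])
    i_ref = smooth[in_ref].mean()
    x = smooth - i_ref                      # smooth change relative to reference
    eps = series - smooth                   # residual, labeled internal variability
    return i_ref, x, eps


# Synthetic stand-in for one model/scenario run (assumed values, not real data).
rng = np.random.default_rng(0)
years = np.arange(1950, 2100)
series = 14.0 + 0.00015 * (years - 1950) ** 2 + rng.normal(0.0, 0.15, years.size)

i_ref, x, eps = decompose(years, series)
print("reference level:", i_ref)
print("residual variance:", eps.var())
```

By construction the three pieces add back up to the original series, and the residual has zero mean and is orthogonal to every polynomial term up to fourth order. Anything the polynomial cannot bend to follow ends up in `eps`, whatever its physical origin.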

Okay, forget about how the smooth fit was obtained. Hawkins and Sutton used an OLS-derived polynomial, but it could have been a Bayesian smoother or some other more elaborate model-based or model-free device (e.g., splines) which delivers a curve with vanishing high-order time derivatives. What’s astonishing to me is that they explicitly identify the residual as internal variability. I thought internal variability was more complicated than that. I thought internal variability was something like the time integral of variance in climate if suddenly all forcings (boundary conditions) were held at constant values, integrating out to infinity. (The transition to constancy in the conditions can be smoothed, if you like, to impede “ringing” effects.) Apparently not.

Now, if that were a model I had, I would be keenly interested in seeing what made up $\epsilon_{m,s,t}$. In particular, I’d like to further break it down into variance terms that are statistically independent of one another.
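As a first, hedged step toward that kind of breakdown, one could check whether the residual has any memory and then split its sample variance by frequency band. The sketch below is only an illustration under assumptions of mine: the residual is a synthetic AR(1) series, the 10-year split period is arbitrary, and the helper names `lag1_autocorr` and `variance_by_band` are hypothetical. Note also that Fourier components are orthogonal in the sample, which makes the band variances additive, but that is weaker than statistical independence.

```python
import numpy as np


def lag1_autocorr(eps):
    """Sample lag-1 autocorrelation of a residual series."""
    e = eps - eps.mean()
    return float(np.dot(e[:-1], e[1:]) / np.dot(e, e))


def variance_by_band(eps, split_period=10.0):
    """Split the sample variance of eps into slow and fast Fourier bands.

    Periods longer than split_period (in years) count as slow. The two
    parts sum to the population variance of eps by Parseval's theorem.
    """
    n = eps.size
    e = eps - eps.mean()
    spec = np.fft.rfft(e)
    freqs = np.fft.rfftfreq(n, d=1.0)        # cycles per year
    power = np.abs(spec) ** 2 / n ** 2
    # rfft keeps only non-negative frequencies: double the interior bins,
    # but not the zero-frequency bin or (for even n) the Nyquist bin.
    weights = np.full(freqs.size, 2.0)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[-1] = 1.0
    power = power * weights
    slow = freqs < 1.0 / split_period
    return float(power[slow].sum()), float(power[~slow].sum())


# Synthetic red-noise residual (assumed AR(1) with coefficient 0.6).
rng = np.random.default_rng(1)
n = 150
shocks = rng.normal(0.0, 0.1, size=n)
eps = np.zeros(n)
for k in range(1, n):
    eps[k] = 0.6 * eps[k - 1] + shocks[k]

r1 = lag1_autocorr(eps)
slow_var, fast_var = variance_by_band(eps)
print("lag-1 autocorrelation:", r1)
print("slow / fast variance:", slow_var, fast_var)
```

A lag-1 autocorrelation well away from zero, or a large share of variance in the slow band, would suggest the residual is not plain white noise and that there is structure worth unpacking.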

There’s nothing at all wrong with modeling things this way, even using a polynomial. But lumping things into an unexplored internal variability and leaving it at that seems, well, incomplete. Maybe there’s a good reason, like it’s already known what’s in $\epsilon_{m,s,t}$ and it’s been decided by the field to be uninteresting or not useful. Or maybe for some good reason $\epsilon_{m,s,t}$ really is the time derivative of the time integral of variance in climate if suddenly all forcings (boundary conditions) were held at constant values.

But I sure would like to know.


### 4 Responses to climate internal variability is just residual variance from modeling with a smooth curve?

1. jyyh says:

Well, I guess the residual should more properly be called ‘the variation unexplained by the fitting algorithm’, or some such, but this is very frequently called natural variation, e.g., in ecology studies, so it might have been translated to ‘internal variation in the earth system’. It’s pretty easy to fit a simple function to any series, but then there are issues that require more attention, I guess. One climate-related source of uncertainty that may begin to get an explanation would be here, http://www.climatecentral.org/news/corals-secrets-of-warming-18468 , but the study is still at a pretty early stage. And what would drive the decadal cycles over the planet is also still a mystery. It’s not as if many natural phenomena on earth follow a long cycle, so it’s rather curious why these appear to exist.

2. Thanks much, jyyh.

Also found this article by Hartig, Calabrese, Reineking, Wiegand, and Huth (“Statistical inference for stochastic simulation models – theory and application”, Ecology Letters, 2011, 14: 816–827), which poses and solves a problem eerily similar to the one addressed in climate geophysics, and is identical to the one described by Simon Wood. I talked about that here.

Happy New Year!

3. jyyh says:

Happy New Year to you too!

Thanks for the Hartig article. I didn’t know the mathematical foundations of these methods were being solidified. Previously, every such experiment with models having multiple interdependent variables had to be checked individually for errors, making the review process really tedious, from what I gathered of the discussions between ecology doctors at Alma Mater. All a bit (or much) too difficult for me. I think I heard of Simon Wood as this young brilliant guy some time in the 2000s, but since then I’ve not been much in contact with the academic world.