Category Archives: information theoretic statistics

Complexity vs Simplicity in Geophysics

Originally posted on GeoEnergy Math:
In our book Mathematical GeoEnergy, several geophysical processes are modeled — from conventional tides to ENSO. Each model fits the data applying a concise physics-derived algorithm — the key being the algorithm’s conciseness but not…

Procrustes tangent distance is better than SNCD

I’ve written two posts here on using a Symmetrized Normalized Compression Divergence or SNCD for comparing time series. One introduced the SNCD and described its relationship to compression distance, and the other applied the SNCD to clustering days at a … Continue reading

A look at an electricity consumption series using SNCDs for clustering

(Slightly amended with code and data link, 12th January 2019.) Prediction of electrical load demand or, in other words, electrical energy consumption is important for the proper operation of electrical grids, at all scales. RTOs and ISOs forecast demand based … Continue reading

Series, symmetrized Normalized Compressed Divergences and their logit transforms

(Major update on 11th January 2019. Minor update on 16th January 2019.) On comparing things The idea of calculating a distance between series for various purposes has received scholarly attention for quite some time. The most common application is … Continue reading

The Johnson-Lindenstrauss Lemma, and the paradoxical power of random linear operators. Part 1.

Updated, 2018-12-04. I’ll be discussing the ramifications of: William B. Johnson and Joram Lindenstrauss, “Extensions of Lipschitz mappings into a Hilbert space,” Contemporary Mathematics, 26:189–206, 1984, for several posts here. Some introduction and links to proofs and explications will be … Continue reading

Liang, information flows, causation, and convergent cross-mapping

Someone recommended the work of Liang recently in connection with causation and attribution studies, and their application to CO2 and climate change. Liang’s work is related to information flows and transfer entropies. As far as I know, the definitive work … Continue reading

Just because the data lies sometimes doesn’t mean it’s okay to censor it

Or, there’s no such thing as an outlier … Eli put up a post titled “The Data Lies. The Crisis in Observational Science and the Virtue of Strong Theory” at his lagomorph blog. Think of it: Data lying. Obviously this … Continue reading

Why smooth?

I’ve encountered a number of blog posts this week which seem not to understand the Bias-Variance Tradeoff in regard to Mean-Squared-Error. These arose in connection with smoothing splines, which I was studying in connection with multivariate adaptive regression splines, that … Continue reading

Polls, Political Forecasting, and the Plight of Five Thirty Eight

On 17th October 2016 at 7:30 p.m., Nate Silver of FiveThirtyEight.com wrote about how, as former Secretary of State Hillary Clinton’s polling numbers got better, it was more difficult for FiveThirtyEight’s models to justify increasing her probability of winning, although … Continue reading

On Smart Data

One of the things I find surprising, if not astonishing, is that in the rush to embrace Big Data, a lot of learning and statistical technique has been left apparently discarded along the way. I’m hardly the first to point … Continue reading

Six cases of models

The previous post included an attempt to explain land surface temperatures as estimated by the BEST project using a dynamic linear model including regressions on both quarterly CO2 concentrations and ocean heat content. The idea was to check the explanatory … Continue reading

On Munshi mush

(Slightly updated on 2016-06-11.) Professor Emeritus Jamal Munshi of Sonoma State University has papers recently cited in science denier circles as evidence that the conventional associations between mean global surface temperature and cumulative carbon emissions are, well, bunk, due to … Continue reading

Cory Lesmeister’s treatment of Simpson’s Paradox (at “Fear and Loathing in Data Science”)

(Updated 2016-05-08, to provide reference for plateaus of ML functions in vicinity of MLE.) Simpson’s Paradox is one of those phenomena of data which really give Statistics a substance and a role, beyond the roles it inherits from, say, theoretical … Continue reading

Gavin Simpson updates his temperature analysis

See the very interesting discussion at his blog, From the bottom of the heap. It would be nice to see some information theoretic measures on these results, though.

HadCRUT4 and GISTEMP series filtered and estimated with simple RTS model

Happy Vernal Equinox! This post has been updated today with some of the equations which correspond to the models. An assessment of whether or not there was a meaningful slowdown or “hiatus” in global warming was recently discussed by Tamino … Continue reading

p-values and hypothesis tests: the Bayesian(s) rule

The American Statistical Association, of which I am a longtime member, issued an important statement today which will hopefully move statistical practice in engineering and especially in the sciences away from the misleading practice of using p-values and hypothesis tests. … Continue reading

dynamic linear model applied to sea-level-rise anomalies

I spent much of the day working up a function for level+trend dynamic linear modeling based upon the dlm package by Petris, Petrone, and Campagnoli, while trying some calculations and code for regime shift detection. One of the test cases … Continue reading

Southern New England Meteorology Conference, 24th October 2015

I attended the 2015 edition of the Southern New England Meteorology Conference in Milton, MA, near the Blue Hill, and its Blue Hill Climatological Observatory, of which I am a member, as well as of the American Meteorological Society. I … Continue reading

“Cauchy Distribution: Evil or Angel?” (from Xian)

“Cauchy Distribution: Evil or Angel?”, from Professor Christian Robert.

Bayesian change-point analysis for global temperatures, 1850-2010

Professor Peter Congdon reports on two Bayesian models for global temperature shifts in his textbook, Applied Bayesian Modelling, as “Example 6.12: Global temperatures, 1850-2010”, on pages 252-253. A direct link is available online. The first is apparently original with Congdon, … Continue reading

engineering and understanding with stable models

Stable distributions, or Lévy α-stable models, are a class of probability distributions which contains the Gaussian, the Cauchy (or Lorentz), and the Lévy distribution. They are parameterized by an α which is 0 < α ≤ 2. Values of α of 1 or less give distributions … Continue reading

“… making a big assumption …”

“That’s making a big assumption.” (This post is a follow-on from an earlier one.) In the colloquial, the phrase means basing an argument on a precondition which is unusual or atypical or offends common sense. When applied to scientific hypotheses, … Continue reading