Category Archives: information theoretic statistics

Complexity vs Simplicity in Geophysics

Really interesting mechanistic reductionism illustrating what it means to explain phenomena scientifically. It’s all about the maths.

Posted in abstraction, American Association for the Advancement of Science, American Meteorological Association, American Statistical Association, Azimuth Project, complex systems, control theory, differential equations, dynamical systems, eigenanalysis, information theoretic statistics, mathematics, Mathematics and Climate Research Network, mechanistic models, nonlinear systems, Paul Pukite, spectra, spectral methods, spectroscopy, theoretical physics, wave equations, WHT | Leave a comment

Procrustes tangent distance is better than SNCD

I’ve written two posts here on using a Symmetrized Normalized Compression Divergence or SNCD for comparing time series. One introduced the SNCD and described its relationship to compression distance, and the other applied the SNCD to clustering days at a … Continue reading

Posted in data science, dependent data, descriptive statistics, divergence measures, hydrology, Ian Dryden, information theoretic statistics, J.T.Kent, Kanti Mardia, non-parametric statistics, normalized compression divergence, quantitative ecology, R statistical programming language, spatial statistics, statistical series, time series | 1 Comment
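For readers who have not met compression-based divergences, here is a minimal sketch of the general idea. It is not the SNCD as defined in the posts: it uses the textbook normalized compression distance with `zlib` as the compressor, and the symmetrization (averaging over both concatenation orders), the helper names, and the integer quantization of the series are all assumptions made for illustration.

```python
import math
import random
import zlib


def series_to_bytes(values, scale=10):
    """Quantize a numeric series to integers and encode it as comma-separated text.

    The quantization step (scale=10) is an illustrative assumption.
    """
    return ",".join(str(round(v * scale)) for v in values).encode("ascii")


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data at maximum compression level."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance using the concatenation x + y."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def symmetrized_ncd(x: bytes, y: bytes) -> float:
    """Average over both concatenation orders (an assumed symmetrization)."""
    return 0.5 * (ncd(x, y) + ncd(y, x))


# Small demonstration: two phase-shifted periodic series versus uniform noise.
rng = random.Random(0)
regular = series_to_bytes(math.sin(2 * math.pi * t / 24) for t in range(480))
shifted = series_to_bytes(math.sin(2 * math.pi * (t + 3) / 24) for t in range(480))
noise = series_to_bytes(rng.uniform(-1.0, 1.0) for _ in range(480))

if __name__ == "__main__":
    print("regular vs shifted:", symmetrized_ncd(regular, shifted))
    print("regular vs noise:  ", symmetrized_ncd(regular, noise))
```

The two periodic series share structure the compressor can reuse, so their divergence should come out well below that of the periodic series against noise. Real compressors only approximate the ideal, so values can stray slightly outside the unit interval.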

A look at an electricity consumption series using SNCDs for clustering

(Slightly amended with code and data link, 12th January 2019.) Prediction of electrical load demand or, in other words, electrical energy consumption is important for the proper operation of electrical grids, at all scales. RTOs and ISOs forecast demand based … Continue reading

Posted in American Statistical Association, consumption, data streams, decentralized electric power generation, dendrogram, divergence measures, efficiency, electricity, electricity markets, energy efficiency, energy utilities, ensembles, evidence, forecasting, grid defection, hierarchical clustering, hydrology, ILSR, information theoretic statistics, local self reliance, Massachusetts, microgrids, NCD, normalized compression divergence, numerical software, open data, prediction, rate of return regulation, Sankey diagram, SNCD, statistical dependence, statistical series, statistics, sustainability, symmetric normalized compression divergence, time series | 2 Comments

Series, symmetrized Normalized Compressed Divergences and their logit transforms

(Major update on 11th January 2019. Minor update on 16th January 2019.) On comparing things. The idea of calculating a distance between series for various purposes has received scholarly attention for quite some time. The most common application is … Continue reading

Posted in Akaike Information Criterion, bridge to somewhere, computation, content-free inference, data science, descriptive statistics, divergence measures, engineering, George Sughihara, information theoretic statistics, likelihood-free, machine learning, mathematics, model comparison, model-free forecasting, multivariate statistics, non-mechanistic modeling, non-parametric statistics, numerical algorithms, statistics, theoretical physics, thermodynamics, time series | 4 Comments

The Johnson-Lindenstrauss Lemma, and the paradoxical power of random linear operators. Part 1.

Updated, 2018-12-04. I’ll be discussing the ramifications of: William B. Johnson and Joram Lindenstrauss, “Extensions of Lipschitz mappings into a Hilbert space,” Contemporary Mathematics, 26:189–206, 1984, for several posts here. Some introduction and links to proofs and explications will be … Continue reading

Posted in clustering, data science, dimension reduction, information theoretic statistics, Johnson-Lindenstrauss Lemma, k-NN, Locality Sensitive Hashing, mathematics, maths, multivariate statistics, non-parametric model, numerical algorithms, numerical linear algebra, point pattern analysis, random projections, recommender systems, science, stochastic algorithms, stochastics, subspace projection methods | 1 Comment
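To make the lemma concrete: it says that n points in a high-dimensional space can be mapped linearly into roughly O(log n / ε²) dimensions while keeping every pairwise distance within a factor of 1 ± ε. The sketch below uses a scaled Gaussian random matrix, which is one common construction and not necessarily the one treated in the post. The dimensions, seed, and function names are illustrative assumptions.

```python
import math
import random


def random_projection_matrix(k, d, rng):
    """k x d matrix of independent N(0, 1/k) entries (a common JL-style construction)."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]


def project(matrix, point):
    """Apply the linear map to one point."""
    return [sum(r * x for r, x in zip(row, point)) for row in matrix]


def distance(u, v):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def distance_ratios(points, matrix):
    """Ratio of projected to original distance for every pair of points."""
    projected = [project(matrix, p) for p in points]
    ratios = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            ratios.append(
                distance(projected[i], projected[j]) / distance(points[i], points[j])
            )
    return ratios


if __name__ == "__main__":
    rng = random.Random(0)
    points = [[rng.gauss(0.0, 1.0) for _ in range(500)] for _ in range(10)]
    matrix = random_projection_matrix(200, 500, rng)
    ratios = distance_ratios(points, matrix)
    print("smallest ratio:", min(ratios), "largest ratio:", max(ratios))
```

The ratios should cluster around 1, and the spread narrows as the target dimension k grows. The matrix is drawn without looking at the data, which is the "paradoxical" part.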

Liang, information flows, causation, and convergent cross-mapping

Someone recommended the work of Liang recently in connection with causation and attribution studies, and their application to CO2 and climate change. Liang’s work is related to information flows and transfer entropies. As far as I know, the definitive work … Continue reading

Posted in Akaike Information Criterion, American Association for the Advancement of Science, Anthropocene, attribution, carbon dioxide, climate, climate change, climate disruption, complex systems, convergent cross-mapping, ecology, Egbert van Nes, Ethan Deyle, Floris Takens, George Sughihara, global warming, Hao Ye, Hyper Anthropocene, information theoretic statistics, Lenny Smith, model-free forecasting, nonlinear systems, physics, statistics, Takens embedding theorem, theoretical physics, Timothy Lenton, Victor Brovkin | 1 Comment

Just because the data lies sometimes doesn’t mean it’s okay to censor it

Or, there’s no such thing as an outlier … Eli put up a post titled “The Data Lies. The Crisis in Observational Science and the Virtue of Strong Theory” at his lagomorph blog. Think of it: Data lying. Obviously this … Continue reading

Posted in Akaike Information Criterion, American Association for the Advancement of Science, American Meteorological Association, American Statistical Association, AMETSOC, Anthropocene, Bayes, Bayesian, climate, climate change, climate models, data science, dynamical systems, ecology, Eli Rabett, environment, Ethan Deyle, George Sughihara, Hao Ye, Hyper Anthropocene, information theoretic statistics, IPCC, Kalman filter, kriging, Lenny Smith, maximum likelihood, model comparison, model-free forecasting, physics, quantitative ecology, random walk processes, random walks, science, smart data, state-space models, statistics, Takens embedding theorem, the right to know, Timothy Lenton, Victor Brovkin | 1 Comment

Why smooth?

I’ve encountered a number of blog posts this week which seem not to understand the Bias-Variance Tradeoff in regard to Mean-Squared-Error. These arose in connection with smoothing splines, which I was studying in connection with multivariate adaptive regression splines, that … Continue reading

Posted in Akaike Information Criterion, American Statistical Association, Antarctica, carbon dioxide, climate change, denial, global warming, information theoretic statistics, likelihood-free, multivariate adaptive regression splines, non-parametric model, science denier, smoothing, splines, statistical dependence | 1 Comment
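For reference, the tradeoff mentioned in this excerpt rests on the standard decomposition of mean-squared error for an estimator of a fixed quantity. The symbols below are notation assumed here, not taken from the post: the estimator is written as theta-hat and the target as theta.

```latex
\operatorname{MSE}(\hat{\theta})
  = \mathbb{E}\!\left[(\hat{\theta} - \theta)^2\right]
  = \underbrace{\mathbb{E}\!\left[(\hat{\theta} - \mathbb{E}[\hat{\theta}])^2\right]}_{\text{variance}}
  + \underbrace{\left(\mathbb{E}[\hat{\theta}] - \theta\right)^2}_{\text{bias}^2}
```

The cross term vanishes because the deviation of the estimator from its own mean has expectation zero. Smoothing typically accepts some bias in exchange for a larger drop in variance, which can lower the total.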

Polls, Political Forecasting, and the Plight of Five Thirty Eight

On 17th October 2016 at 7:30 p.m., Nate Silver of FiveThirtyEight.com wrote about how, as former Secretary of State Hillary Clinton’s polling numbers got better, it was more difficult for FiveThirtyEight’s models to justify increasing her probability of winning, although … Continue reading

Posted in abstraction, American Statistical Association, anemic data, citizen science, citizenship, civilization, economics, education, forecasting, information theoretic statistics, mathematics, maths, politics, prediction markets, sociology, the right to know, theoretical physics, thermodynamics | Leave a comment

On Smart Data

One of the things I find surprising, if not astonishing, is that in the rush to embrace Big Data, a lot of learning and statistical technique has been left apparently discarded along the way. I’m hardly the first to point … Continue reading

Posted in Akaike Information Criterion, Bayes, Bayesian, Bayesian inversion, big data, bigmemory package for R, changepoint detection, data science, data streams, dlm package, dynamic generalized linear models, dynamic linear models, dynamical systems, Generalize Additive Models, generalized linear models, information theoretic statistics, Kalman filter, linear algebra, logistic regression, machine learning, Markov Chain Monte Carlo, mathematics, mathematics education, maths, maximum likelihood, MCMC, Monte Carlo Statistical Methods, multivariate statistics, numerical analysis, numerical software, numerics, quantitative biology, quantitative ecology, rationality, reasonableness, sampling, smart data, state-space models, statistical dependence, statistics, the right to know, time series | Leave a comment

Six cases of models

The previous post included an attempt to explain land surface temperatures as estimated by the BEST project using a dynamic linear model including regressions on both quarterly CO2 concentrations and ocean heat content. The idea was to check the explanatory … Continue reading

Posted in AMETSOC, anemic data, Anthropocene, astrophysics, Bayesian, Berkeley Earth Surface Temperature project, BEST, carbon dioxide, climate, climate change, climate data, climate disruption, climate models, dlm package, dynamic linear models, dynamical systems, environment, fossil fuels, geophysics, Giovanni Petris, global warming, greenhouse gases, Hyper Anthropocene, information theoretic statistics, maths, maximum likelihood, meteorology, model comparison, numerical software, Patrizia Campagnoli, Rauch-Tung-Striebel, Sonia Petrone, state-space models, stochastic algorithms, stochastic search, SVD, time series | 1 Comment

On Munshi mush

(Slightly updated on 2016-06-11.) Professor Emeritus Jamal Munshi of Sonoma State University has papers recently cited in science denier circles as evidence that the conventional associations between mean global surface temperature and cumulative carbon emissions are, well, bunk, due to … Continue reading

Posted in Bayes, Bayesian, Berkeley Earth Surface Temperature project, BEST, carbon dioxide, cat1, climate, climate change, climate data, climate education, climate models, convergent cross-mapping, dynamic linear models, ecology, ENSO, environment, Ethan Deyle, evidence, geophysics, George Sughihara, global warming, greenhouse gases, information theoretic statistics, Kalman filter, mathematics, maths, meteorology, model comparison, NOAA, oceanography, prediction, state-space models, statistics, Takens embedding theorem, Techno Utopias, the right to know, theoretical physics, time series, zero carbon | 1 Comment

Cory Lesmeister’s treatment of Simpson’s Paradox (at “Fear and Loathing in Data Science”)

(Updated 2016-05-08, to provide reference for plateaus of ML functions in vicinity of MLE.) Simpson’s Paradox is one of those phenomena of data which really give Statistics a substance and a role, beyond the roles it inherits from, say, theoretical … Continue reading

Posted in Akaike Information Criterion, approximate Bayesian computation, Bayes, Bayesian, evidence, Frequentist, games of chance, information theoretic statistics, Kalman filter, likelihood-free, mathematics, maths, maximum likelihood, Monte Carlo Statistical Methods, probabilistic programming, rationality, Rauch-Tung-Striebel, Simpson's Paradox, state-space models, statistical dependence, statistics, stochastics | Leave a comment

Gavin Simpson updates his temperature analysis

See the very interesting discussion at his blog, From the bottom of the heap. It would be nice to see some information theoretic measures on these results, though.

Posted in AMETSOC, Anthropocene, astrophysics, Berkeley Earth Surface Temperature project, carbon dioxide, changepoint detection, climate, climate change, climate data, climate disruption, climate models, ecology, environment, evidence, Gavin Simpson, Generalize Additive Models, geophysics, global warming, HadCRUT4, hiatus, Hyper Anthropocene, information theoretic statistics, Kalman filter, maths, meteorology, numerical analysis, R, rationality, reasonableness, splines, time series | Leave a comment

HadCRUT4 and GISTEMP series filtered and estimated with simple RTS model

Happy Vernal Equinox! This post has been updated today with some of the equations which correspond to the models. An assessment of whether or not there was a meaningful slowdown or “hiatus” in global warming was recently discussed by Tamino … Continue reading

Posted in AMETSOC, anemic data, Bayesian, boosting, bridge to somewhere, cat1, changepoint detection, climate, climate change, climate data, climate disruption, climate models, complex systems, computation, data science, dynamical systems, geophysics, George Sughihara, global warming, hiatus, information theoretic statistics, machine learning, maths, meteorology, MIchael Mann, multivariate statistics, physics, prediction, Principles of Planetary Climate, rationality, reasonableness, regime shifts, sea level rise, time series | 5 Comments

p-values and hypothesis tests: the Bayesian(s) rule

The American Statistical Association, of which I am a longtime member, issued an important statement today which will hopefully move statistical practice in engineering and especially in the sciences away from the misleading practice of using p-values and hypothesis tests. … Continue reading

Posted in approximate Bayesian computation, arXiv, Bayes, Bayesian, Bayesian inversion, bollocks, Christian Robert, climate, complex systems, data science, Frequentist, information theoretic statistics, likelihood-free, Markov Chain Monte Carlo, MCMC, Monte Carlo Statistical Methods, population biology, rationality, reasonableness, science, scientific publishing, statistical dependence, statistics, stochastics, Student t distribution | Leave a comment

dynamic linear model applied to sea-level-rise anomalies

I spent much of the day working up a function for level+trend dynamic linear modeling based upon the dlm package by Petris, Petrone, and Campagnoli, while trying some calculations and code for regime shift detection. One of the test cases … Continue reading

Posted in Bayesian, citizen science, climate change, climate data, climate disruption, dynamic linear models, floods, forecasting, Frequentist, global warming, icesheets, information theoretic statistics, Kalman filter, meteorology, open data, sea level rise, state-space models, statistics, time series | 1 Comment

Southern New England Meteorology Conference, 24th October 2015

I attended the 2015 edition of the Southern New England Meteorology Conference in Milton, MA, near the Blue Hill, and its Blue Hill Climatological Observatory, of which I am a member as well as of the American Meteorological Society. I … Continue reading

Posted in Anthropocene, capricious gods, climate, Dan Satterfield, dynamical systems, ensembles, ENSO, environment, floods, forecasting, geophysics, Hyper Anthropocene, information theoretic statistics, mesh models, meteorology, model comparison, NCAR, NOAA, nor'easters, oceanography, probability, science, spatial statistics, state-space models, statistics, stochastic algorithms, stochastic search, stochastics, time series | 1 Comment

“The Bayesian Second Law of Thermodynamics” (Sean Carroll, and collaborators)

http://www.preposterousuniverse.com/blog/2015/08/11/the-bayesian-second-law-of-thermodynamics/ See also.

Posted in approximate Bayesian computation, Bayesian, bifurcations, Boltzmann, capricious gods, dynamical systems, ensembles, games of chance, Gibbs Sampling, information theoretic statistics, Josiah Willard Gibbs, mathematics, maths, physics, probability, rationality, reasonableness, science, statistics, stochastic algorithms, stochastics, thermodynamics, Wordpress | Leave a comment

“Cauchy Distribution: Evil or Angel?” (from Xian)

“Cauchy Distribution: Evil or Angel?” From Professor Christian Robert.

Posted in arXiv, Bayes, Bayesian, Cauchy distribution, information theoretic statistics, mathematics, maths, optimization, probabilistic programming, probability, rationality, reasonableness, statistics, stochastic algorithms, stochastics, Student t distribution | Leave a comment

Bayesian change-point analysis for global temperatures, 1850-2010

Professor Peter Congdon reports on two Bayesian models for global temperature shifts in his textbook, Applied Bayesian Modelling, as “Example 6.12: Global temperatures, 1850-2010”, on pages 252-253. A direct link is available online. The first is apparently original with Congdon, … Continue reading

Posted in Bayes, Bayesian, BUGS, climate, climate change, environment, forecasting, information theoretic statistics, mathematics, MCMC, meteorology, rationality, reasonableness, statistics, stochastic algorithms, Uncategorized | 1 Comment

engineering and understanding with stable models

Stable distributions, or Lévy $\alpha$-stable models, are a class of probability distributions which contains the Gaussian, the Cauchy (or Lorentz), and the Lévy distribution. They are parameterized by an $\alpha$ which is $0 < \alpha \le 2$. Values of $\alpha$ of 1 or less give distributions … Continue reading

Posted in approximate Bayesian computation, Bayesian, citizen science, climate, climate change, climate education, differential equations, diffusion processes, ecology, economics, forecasting, geophysics, information theoretic statistics, IPCC, mathematics, mathematics education, maths, meteorology, model comparison, NOAA, oceanography, physics, rationality, reasonableness, risk, science, science education, stochastic search, the right to know | Leave a comment
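As a hedged illustration of how the stability index (conventionally written α, with 0 < α ≤ 2) changes tail behavior, the sketch below draws from the symmetric members of the family using the Chambers-Mallows-Stuck construction. It covers only the symmetric, unit-scale case. The function name and parameter choices are assumptions for this example and are not taken from the post.

```python
import math
import random


def sample_symmetric_stable(alpha, rng):
    """Draw one value from a symmetric alpha-stable law with unit scale.

    Uses the Chambers-Mallows-Stuck construction for the symmetric case:
    alpha = 2 gives a Gaussian with variance 2, alpha = 1 gives a standard Cauchy.
    """
    if not 0.0 < alpha <= 2.0:
        raise ValueError("alpha must satisfy 0 < alpha <= 2")
    u = rng.uniform(-math.pi / 2, math.pi / 2)  # uniform angle
    w = rng.expovariate(1.0)                    # unit-rate exponential
    first = math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
    second = (math.cos(u - alpha * u) / w) ** ((1.0 - alpha) / alpha)
    return first * second


def largest_magnitude(alpha, n, rng):
    """Largest absolute value among n draws, a crude indicator of tail weight."""
    return max(abs(sample_symmetric_stable(alpha, rng)) for _ in range(n))


if __name__ == "__main__":
    rng = random.Random(0)
    for alpha in (2.0, 1.5, 1.0, 0.5):
        print(alpha, largest_magnitude(alpha, 10_000, rng))
```

Smaller values of α put more mass in the tails, so the largest draw in a sample of fixed size tends to grow sharply as α decreases.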

“… making a big assumption …”

“That’s making a big assumption.” (This post is a follow-on from an earlier one.) In the colloquial, the phrase means basing an argument on a precondition which is unusual or atypical or offends common sense. When applied to scientific hypotheses, … Continue reading

Posted in Bayes, Bayesian, climate, climate education, environment, geophysics, information theoretic statistics, mathematics, maths, meteorology, model comparison, oceanography, physics, rationality, reasonableness, risk, statistics | 1 Comment