# Category Archives: Bayesian

## A quick note on modeling operational risk from count data

The blog statcompute recently featured a proposal encouraging the use of ordinal models for difficult risk regressions involving count data. This is actually a second installment of a two-part post on this problem, the first dealing with flexibility in count … Continue reading

## These are ethical “AI Principles” from Google, but they might as well be ‘technological principles’

This is entirely adapted from this link, courtesy of Google and Alphabet. Objectives: Be socially beneficial. Avoid creating or reinforcing unfair bias. Be built and tested for safety. Be accountable to people. Incorporate privacy design principles. Uphold high standards of … Continue reading

## Less evidence for a global warming hiatus, and urging more use of Bayesian model averaging in climate science

(This post has been significantly updated midday 15th February 2018.) I’ve written about the supposed global warming hiatus of 2001-2014 before: “‘Overestimated global warming over the past 20 years’ (Fyfe, Gillett, Zwiers, 2013)”, 28 August 2013; “Warming Slowdown?”, Azimuth, Part … Continue reading

## perceptions of likelihood

That’s from this GitHub repository, maintained by Zoni Nation, having this description. The original data are from a study by Sherman Kent at the U.S. CIA, and are quoted in at least one outside source discussing the problem. In addition … Continue reading

## Confidence intervals and that IPCC: Why climate scientists need statistical help

At Andrew Gelman’s blog (Statistical Modeling, Causal Inference, and Social Science), Ben Goodrich makes the interesting observation in a lengthy discussion about confidence intervals, how they should be interpreted, whether or not they have any socially redeeming value, und so … Continue reading

## A “capacity for sustained muddle-headedness”

Hat tip to Paul Lauenstein, and his physician brother, suggesting the great insights of the late Dr Larry Weed: Great lines, great quotes, a lot of humor: “… a tolerance of ambiguity …” “Y’know, Pavlov said you must teach a … Continue reading

## Dikran Marsupial’s excellent bit on hypothesis testing applied to climate, or how it should be applied, if at all

Frankly, I wish some geophysicists and climate scientists wrote more as if they thoroughly understood this, let alone deniers who try to discredit climate disruption. See “What does statistically significant actually mean?”. Of course, while statistical power of a test … Continue reading

## “You don’t have that option.”

Dr Neil deGrasse Tyson. I think he’s awesome. Marvelous. I saw him in Boston. He and I did not get off well, at the start, because of my being awestruck, and feeling very awkward, and the short time we had … Continue reading

## Papers of the day

From the Machine Learning and Computational Modeling Lab, School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran: A. Ahmadian, K. Fouladi, B. N. Araabi, “Writer identification using a probabilistic model of handwritten digits and Approximate Bayesian Computation,” International … Continue reading

## David Spiegelhalter on ‘how to spot a dodgy statistic’

In this political season, it’s useful to brush up on rhetorical skills, particularly ones involving numbers and statistics, or what John Allen Paulos called numeracy. Professor David Spiegelhalter has written a guide to some of these tricks. Read the whole … Continue reading

## Newt Gingrich and Van Jones. Right on.

It’s the thing. And it addresses how media and people forget about the actual statistics, and focus on the White Hot Bright Light. A study by Gelman, Fagan, and Kiss; a study by Freyer; a counterpoint to the Freyer study … Continue reading

## On Smart Data

One of the things I find surprising, if not astonishing, is that in the rush to embrace Big Data, a lot of learning and statistical technique has been left apparently discarded along the way. I’m hardly the first to point … Continue reading

## Six cases of models

The previous post included an attempt to explain land surface temperatures as estimated by the BEST project using a dynamic linear model including regressions on both quarterly CO2 concentrations and ocean heat content. The idea was to check the explanatory … Continue reading

## Cory Lesmeister’s treatment of Simpson’s Paradox (at “Fear and Loathing in Data Science”)

(Updated 2016-05-08, to provide reference for plateaus of ML functions in vicinity of MLE.) Simpson’s Paradox is one of those phenomena of data which really give Statistics a substance and a role, beyond the roles it inherits from, say, theoretical … Continue reading

## “Lucky d20” (by Tamino, with my reblogging comments)

Originally posted on Open Mind:

What with talk of killer heat waves, droughts, floods, etc. etc., this blog tends to get pretty serious. When it does, we don’t deal with happy prospects, but with the danger of worldwide catastrophe. But…

## HadCRUT4 and GISTEMP series filtered and estimated with simple RTS model

Happy Vernal Equinox! This post has been updated today with some of the equations which correspond to the models. An assessment of whether or not there was a meaningful slowdown or “hiatus” in global warming was recently discussed by Tamino … Continue reading

## p-values and hypothesis tests: the Bayesian(s) rule

The American Statistical Association, of which I am a longtime member, issued an important statement today which will hopefully move statistical practice in engineering and especially in the sciences away from the misleading practice of using p-values and hypothesis tests. … Continue reading

## “Grid shading by simulated annealing” [Martyn Plummer]

Source: Grid shading by simulated annealing (or what I did on my holidays), aka “fun with GCHQ job adverts”, by Martyn Plummer, developer of JAGS. Excerpt: I wanted to solve the puzzle but did not want to sit down with … Continue reading

## high dimension Metropolis-Hastings algorithms

If attempting to simulate from a multivariate standard normal distribution in a large dimension, when starting from the mode of the target, i.e., its mean γ, leaving the mode γ is extremely unlikely, given the huge drop between the value of the density at the mode γ and at likely realisations … Continue reading
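
To get a feel for the size of that drop, here is a minimal sketch in Python with NumPy. It is not taken from the reblogged post: the function name `log_density_drop`, the seed, and the number of draws are all assumptions made for illustration. For a standard normal target in dimension d, the log density at the mode minus the log density at a point x is ‖x‖²/2, and for typical draws that quantity averages d/2.

```python
import numpy as np


def log_density_drop(dim, n_draws=2000, seed=0):
    """Return log phi(mode) - log phi(x) for draws x ~ N(0, I_dim).

    For the standard normal, the normalising constants cancel, so the
    drop in log density from the mode (the origin) to x is ||x||^2 / 2.
    The name, seed and n_draws are illustrative choices only.
    """
    rng = np.random.default_rng(seed)
    draws = rng.standard_normal((n_draws, dim))
    return 0.5 * np.sum(draws ** 2, axis=1)


if __name__ == "__main__":
    for dim in (1, 10, 100, 1000):
        drops = log_density_drop(dim)
        # The average drop grows like dim / 2, so the density ratio
        # between the mode and a typical point is roughly exp(dim / 2).
        print(f"dim={dim:5d}  mean log-density drop = {drops.mean():.1f}")
```

Since the density ratio between the mode and a typical realisation behaves like exp(d/2), it becomes astronomically large as the dimension grows, which is the "huge drop" the excerpt refers to.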

## Generating supports for classification rules in black box regression models

Inspired by the extensive and excellent work in approximate Bayesian computation (see also), especially that done by Professors Christian Robert and colleagues (see also), and Professor Simon Wood (see also), it occurred to me that the complaints regarding lack of … Continue reading

## R and “big data”

On 2nd November 2015, Wes McKinney, the developer of the highly useful Python pandas module (and other things, including books), wrote an amusing blog post, “The problem with the data science language wars”. I by no means disagree with him. … Continue reading

## dynamic linear model applied to sea-level-rise anomalies

I spent much of the day working up a function for level+trend dynamic linear modeling based upon the dlm package by Petris, Petrone, and Campagnoli, while trying some calculations and code for regime shift detection. One of the test cases … Continue reading

## Thoughts on “Regime Shift?”

John Baez at The Azimuth Project opened a discussion on the recent paper by Reid, et al.: Philip C. Reid et al., “Global impacts of the 1980s regime shift on the Earth’s climate and systems”, Global Change Biology, 2015. I … Continue reading

## reblog: “Tiny Data, Approximate Bayesian Computation and the Socks of Karl Broman”

It’s Rasmus Bååth, in a post and video of which I am very fond: http://www.sumsar.net/blog/2014/10/tiny-data-and-the-socks-of-karl-broman/.

## On differential localization of tumors using relative concentrations of ctDNA. Part 2.

Part 1 of this series introduced the idea of ctDNA and its use for detecting cancers or their resurgence, and proposed a scheme whereby relative concentrations of ctDNA at two or more sites after controlled disturbance might be used to … Continue reading

## On differential localization of tumors using relative concentrations of ctDNA. Part 1.

Like most mammalian tissue, tumors often produce shards of DNA as a byproduct of cell death and fracture. This circulating tumor DNA is being studied as a means of detecting tumors or their resurgence after treatment. (See also a Q&A … Continue reading

## Deep Recurrent Learning Networks

(Also known to statisticians as deep exponential families.) Large scale deep learning; Four easy lessons on Deep Learning from Google.

## “The Bayesian Second Law of Thermodynamics” (Sean Carroll, and collaborators)

http://www.preposterousuniverse.com/blog/2015/08/11/the-bayesian-second-law-of-thermodynamics/ See also.