# Category Archives: statistics

## What is the Tukey loss function?

Originally posted on Statistical Odds & Ends:
The Tukey loss function, also known as Tukey’s biweight function, is a loss function that is used in robust statistics. Tukey’s loss is similar to Huber loss in that…
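For readers who want to see the shape of the function before following the link, here is a minimal sketch of the standard textbook form of Tukey’s biweight loss, with Huber loss alongside for contrast. The function names and the default tuning constants (`c = 4.685`, `delta = 1.345`, the values conventionally quoted for 95% efficiency under Gaussian errors) are choices made for this illustration and are not taken from the original post.

```python
def tukey_loss(r, c=4.685):
    """Tukey biweight loss of a residual r with tuning constant c (assumed default)."""
    if abs(r) <= c:
        # Smooth, roughly quadratic near zero.
        return (c ** 2 / 6.0) * (1.0 - (1.0 - (r / c) ** 2) ** 3)
    # Constant beyond c: large residuals add no further penalty.
    return c ** 2 / 6.0


def huber_loss(r, delta=1.345):
    """Huber loss of a residual r with threshold delta (assumed default)."""
    if abs(r) <= delta:
        return 0.5 * r ** 2
    # Linear beyond delta: the penalty keeps growing, but slowly.
    return delta * (abs(r) - 0.5 * delta)


if __name__ == "__main__":
    for r in (0.0, 1.0, 3.0, 10.0, 100.0):
        print(r, round(tukey_loss(r), 4), round(huber_loss(r), 4))
```

In this form the Tukey loss is flat once the residual exceeds `c`, whereas the Huber loss continues to grow linearly.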

## a song in praise of data scientist Rebekah Jones

I linked to Rebekah Jones’ keynote address at the August 2020 Data Science Conference on COVID-19 sponsored by the National Institute for Statistical Science. Below is a song in tribute to her, wishing her well. (h/t Bill McKibben) We’re doing … Continue reading

Rationalists, wearing square hats, Think, in square rooms, Looking at the floor, Looking at the ceiling. They confine themselves To right-angled triangles. If they tried rhomboids, Cones, waving lines, ellipses– As, for example, the ellipse of the half-moon– Rationalists … Continue reading

## Calculating Derivatives from Random Forests

(Comment on prediction intervals for random forests, and links to a paper.) (Edits to repair smudges, 2020-06-28, about 0945 EDT. Closing comment, 2020-06-30, 1450 EDT.) There are lots of ways of learning about mathematical constructs, even about actual machines. One … Continue reading


## COVID-19 statistics, a caveat : Sources of data matter

There are a number of sources of COVID-19-related demographics, cases, deaths, numbers testing positive, numbers recovered, and numbers testing negative available. Many of these are not consistent with one another. One could hope at least rates would be consistent, but … Continue reading

## “There’s mourning in America”

“We are Republicans and we want Trump defeated.” And the Orange Mango apparently hates this advert. And that’s why it’s here. The Lincoln Project apparently introduced this advert on Twitter with the explanatory text: Since you are awake and trolling … Continue reading

## “Seasonality of COVID-19, Other Coronaviruses, and Influenza” (from Radford Neal’s blog)

Thorough review with documentation and technical criticism of claims of COVID-19 seasonality or its lack. Whichever way this comes down, the links are well worth the visit! Will the incidence of COVID-19 decrease in the summer? There is reason to … Continue reading

## Simplistic and Dangerous Models

Originally posted on Musings on Quantitative Palaeoecology:
A few weeks ago there were none. Three weeks ago, with an entirely inadequate search strategy, ten cases were found. Last Saturday there were 43! With three inaccurate data points, there is enough information…

## “Lockdown WORKS”

Originally posted on Open Mind:
Over 2400 Americans died yesterday from Coronavirus. Here are the new deaths per day (“daily mortality”) in the USA since March 10, 2020 (note: this is an exponential plot). As bad as that news is,…

## What happens when time sampling density of a series matches its growth

This is the newly updated map of COVID-19 cases in the United States, updated, presumably, because of the new emphasis upon testing: How do we know this is the result of recent testing? Look at the map of active cases: … Continue reading

## “Code for causal inference: Interested in astronomical applications”

Via “Code for causal inference: Interested in astronomical applications”, from Professor Ewan Cameron at his Another Astrostatistics Blog.

## Reanalysis of business visits from deployments of a mobile phone app

(Updated, 20th October 2020.) This reports a reanalysis of data from the deployment of a mobile phone app, as reported in: M. Yauck, L.-P. Rivest, G. Rothman, “Capture-recapture methods for data on the activation of applications on mobile phones”, Journal … Continue reading

## There’s Big Data, Tiny Data, and now Dead Data

You’ve heard of Big Data. You may have heard of Tiny Data. But now, in the Harvard Data Science Review, Professor Steve Stigler presents Dead Data. See: S. M. Stigler, “Data have a limited shelf life”, Harvard Data Science … Continue reading

## “Tensors in Algebraic Statistics” (Elizabeth Gross)

Professor Elizabeth Gross. Some notes: Segre variety, about (These will be updated as I make progress through the talk.)

## Review of “No … increase of Carbon sequestration from the greening Earth”

(As promised.) Introduction and Abstract. This is a review, re-presentation, and report on the August 2019 article, Y. Zhang, C. Song, L. E. Band, G. Sun, (2019), “No proportional increase of terrestrial gross Carbon sequestration from the greening Earth”, Journal … Continue reading

## “Bayesian replication analysis” (by John Kruschke)

“… the ability to express [hypotheses] as distributions over parameters …” Bayesian estimation supersedes the t-test: (Also by Professor Kruschke.)

## “Ten Fatal Flaws in Data Analysis” (Charles Kufs)

Professor Kufs has a fun book, Stats with Cats, and a blog. He also has a blog post titled “Ten Fatal Flaws in Data Analysis” which, in general, I like. But the presentation has some shortcomings, too, which I note … Continue reading

## A response to a post on RealClimate

(Updated 2342 EDT, 28 June 2019.) This is a response to a post on RealClimate which primarily concerned economist Ross McKitrick’s op-ed in the Financial Post condemning the geophysical community for disregarding Roger Pielke, Jr’s arguments. Pielke, in that link, … Continue reading

## Cumulants and the Cornish-Fisher Expansion

“Consider the following.” (Bill Nye the Science Guy) There are random variables drawn from the same kind of probability distribution, but with different parameters for each. In this example, I’ll consider random variables , that is, each drawn from a … Continue reading

## What’s good for each subgroup can be bad for the group: Simpson’s

Why? Simpson’s “paradox” or observation … There’s actually nothing odd about this. While interpretation depends upon the semantics of individual measurements, it should be expected that, at times, improving things for the overall group will mean as a matter of … Continue reading
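A small numerical illustration may help make the reversal concrete. The sketch below uses the often-cited kidney-stone treatment counts, which are not from the original post; the variable and function names are likewise invented for the example. Each entry is a pair of (successes, patients).

```python
from fractions import Fraction

# Illustrative counts (the widely quoted kidney-stone example), NOT data from the post.
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def rate(successes, total):
    """Exact success rate for one subgroup."""
    return Fraction(successes, total)


def pooled_rate(groups):
    """Success rate after pooling all subgroups of one treatment."""
    successes = sum(s for s, _ in groups.values())
    total = sum(n for _, n in groups.values())
    return Fraction(successes, total)


if __name__ == "__main__":
    for subgroup in ("small", "large"):
        a = rate(*counts["A"][subgroup])
        b = rate(*counts["B"][subgroup])
        print(subgroup, float(a), float(b), "A better" if a > b else "B better")
    a_all, b_all = pooled_rate(counts["A"]), pooled_rate(counts["B"])
    print("pooled", float(a_all), float(b_all), "A better" if a_all > b_all else "B better")
```

Treatment A has the higher rate within each subgroup, yet B has the higher pooled rate, because the two treatments were applied to very different mixes of small and large cases.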

## California Marine Debris Prevention: Banning Plastic Bags is Not Enough

NOAA has a full page of videos on marine debris and how to prevent it. The state of California has a 2018 plan on preventing marine debris. Here are some highlights. There is a good deal more in the report, … Continue reading

## Five Thirty Eight podcast: “Can Statistics solve gerrymandering?”

Great podcast, featuring Professor and geometer Moon Duchin, Nate Silver, and Galen Druke. If the link doesn’t work, listen from here or below: Professor Duchin has written extensively on this: M. Duchin, B. E. Tenner, “Discrete geometry for electoral geography”, … Continue reading

## On bag bans and sampling plans

Plastic bag bans are all the rage. It’s not the purpose of this post to take a position on the matter. Before you do, however, I’d recommend checking out this: and especially this: (Note: My lovely wife, Claire, presents this … Continue reading

## Repeating Bullshit

Originally posted on Open Mind:
Question: How does a dumb claim go from just a dumb claim, to accepted canon by the climate change denialati? Answer: Repetition. Yes, keep repeating it. If it’s contradicted by evidence, ignore that or insult…

## A look at an electricity consumption series using SNCDs for clustering

(Slightly amended with code and data link, 12th January 2019.) Prediction of electrical load demand or, in other words, electrical energy consumption is important for the proper operation of electrical grids, at all scales. RTOs and ISOs forecast demand based … Continue reading

## Series, symmetrized Normalized Compressed Divergences and their logit transforms

(Major update on 11th January 2019. Minor update on 16th January 2019.) On comparing things: The idea of calculating a distance between series for various purposes has received scholarly attention for quite some time. The most common application is … Continue reading

## Why Americans and Britons work such long hours

Why Americans and Britons work such long hours.