### Jan Galkowski

# Category Archives: big data

## There’s Big Data, Tiny Data, and now *Dead Data*

You’ve heard of Big Data. You may have heard of Tiny Data. But now, in the Harvard Data Science Review, Professor Steve Stigler presents Dead Data. See: S. M. Stigler, “Data have a limited shelf life”, Harvard Data Science … Continue reading

Posted in big data, dead data, statistics, tiny data

## “Hadoop is NOT ‘Big Data’ is NOT Analytics”

Arun Krishnan, CEO & Founder at Analytical Sciences, comments on this serious problem with the field. Short excerpt: … A person who is able to write code using Hadoop and the associated frameworks is not necessarily someone who can understand … Continue reading

## A new feature: Technical publications of the week

I’m beginning a new style of column, called technical publications of the week. While I can’t promise these will be weekly, I will, from time to time, highlight technical publications I’ve recently read which I consider to be noteworthy. I … Continue reading

Posted in Anthropocene, big data, climate change, climate disruption, data science, data streams, earthquakes, geophysics, global warming, Hyper Anthropocene, Locality Sensitive Hashing, LSH, MinHash, numerical algorithms, numerical analysis, random projections, seismology, subspace projection methods, SVD, the right to be and act stupid, the tragedy of our present civilization, the value of financial assets

## NextGen VOICES: ‘On data’, ‘On setbacks’, and ‘On discovery’

Science Magazine has a periodic column called Science in brief, and occasionally that column features a set of what they call “NextGen VOICES”, meaning young scientists. They gather the survey responses on Twitter (of course) via the hashtag #NextGenSci. For the … Continue reading

## data.table

R provides a helpful data structure called the “data frame” that gives the user an intuitive way to organize, view, and access data. Many of the functions that you would us… Source: Intro to The data.table Package

## On Smart Data

One of the things I find surprising, if not astonishing, is that in the rush to embrace Big Data, a lot of learning and statistical technique has been left apparently discarded along the way. I’m hardly the first to point … Continue reading

Posted in Akaike Information Criterion, Bayes, Bayesian, Bayesian inversion, big data, bigmemory package for R, changepoint detection, data science, data streams, dlm package, dynamic generalized linear models, dynamic linear models, dynamical systems, Generalized Additive Models, generalized linear models, information theoretic statistics, Kalman filter, linear algebra, logistic regression, machine learning, Markov Chain Monte Carlo, mathematics, mathematics education, maths, maximum likelihood, MCMC, Monte Carlo Statistical Methods, multivariate statistics, numerical analysis, numerical software, numerics, quantitative biology, quantitative ecology, rationality, reasonableness, sampling, smart data, state-space models, statistical dependence, statistics, the right to know, time series

## K-Nearest Neighbors: dangerously simple

Originally posted on mathbabe:

I spend my time at work nowadays thinking about how to start a company in data science. Since there are tons of companies now collecting tons of data, and they don’t know what to do…

Posted in big data, data science, evidence, machine learning

## R and “big data”

On 2nd November 2015, Wes McKinney, the developer of the highly useful Python pandas module (and other things, including books), wrote an amusing blog post, “The problem with the data science language wars”. I by no means disagree with him. … Continue reading