### Distributed Solar: The Democratization of Energy

# Category Archives: data science

## “Hadoop is NOT ‘Big Data’ is NOT Analytics”

Arun Krishnan, CEO & Founder at Analytical Sciences, comments on this serious problem with the field. Short excerpt: … A person who is able to write code using Hadoop and the associated frameworks is not necessarily someone who can understand … Continue reading

## Is the answer to the democratization of Science doing more Citizen Science?

I have been following, with keen interest, the post and comment thread pertaining to “Democratising science” at the blog I monitor daily, … and Then There’s Physics. I think the core subject being discussed is a little different from my … Continue reading

Posted in American Association for the Advancement of Science, American Meteorological Association, American Statistical Association, AMETSOC, astronomy, astrophysics, biology, citizen data, citizen science, citizenship, data science, ecology, education, environment, evidence, life purpose, local self reliance, marine biology, mathematics, mathematics education, maths, moral leadership, new forms of scientific peer review, open source scientific software, science, science education, statistics, the green century, the right to know

## A new feature: Technical publications of the week

I’m beginning a new style of column, called technical publications of the week. While I can’t promise these will be weekly, I will, from time to time, highlight technical publications I’ve recently read which I consider to be noteworthy. I … Continue reading

Posted in Anthropocene, big data, climate change, climate disruption, data science, data streams, earthquakes, geophysics, global warming, Hyper Anthropocene, Locality Sensitive Hashing, LSH, MinHash, numerical algorithms, numerical analysis, random projections, seismology, subspace projection methods, SVD, the right to be and act stupid, the tragedy of our present civilization, the value of financial assets
1 Comment

## Why scientific measurements need to be adjusted

There is an excellent piece in Ars Technica about why scientific measurements need to be adjusted, and the implications of this for climate data. It is written by Scott K Johnson and is called “Thorough, not thoroughly fabricated: The truth … Continue reading

Posted in American Association for the Advancement of Science, American Meteorological Association, American Statistical Association, AMETSOC, Berkeley Earth Surface Temperature project, Canettes Blues Band, citizen data, climate data, data science, environment, evidence, geophysics, GISTEMP, HadCRUT4, mathematics education, meteorological models, obfuscating data, open data, physics, science, spatial statistics, Tamino, the right to know, the tragedy of our present civilization, Variable Variability

## Sleeping Giant Awakening

Originally posted on Climate Denial Crock of the Week:

https://twitter.com/johnmyers/status/809097380456865792 Wikipedia: Isoroku Yamamoto’s sleeping giant quotation is a quote by the Japanese Admiral Isoroku Yamamoto regarding the 1941 attack on Pearl Harbor by forces of Imperial Japan. The quotation is portrayed at the very end of…

Posted in adaptation, American Association for the Advancement of Science, American Meteorological Association, American Solar Energy Society, American Statistical Association, AMETSOC, Anthropocene, California, Carbon Worshipers, citizen data, citizen science, climate, climate change, climate data, climate disruption, data science, Donald Trump, ecology, Ecology Action, geophysics, global warming, Hyper Anthropocene, ignorance, Jerry Brown, science, sustainability, the right to be and act stupid, the right to know, the stack of lies, the tragedy of our present civilization

## Cathy O’Neil’s WEAPONS OF MATH DESTRUCTION: A Review

(Revised and updated Monday, 24th October 2016.) Weapons of Math Destruction, Cathy O’Neil, published by Crown Random House, 2016. This is a thoughtful and very approachable introduction to, and review of, the societal and personal consequences of data mining, data science, … Continue reading

Posted in citizen data, citizen science, citizenship, civilization, compassion, complex systems, criminal justice, Daniel Kahneman, data science, deep recurrent neural networks, destructive economic development, economics, education, engineering, ethics, Google, ignorance, Joseph Schumpeter, life purpose, machine learning, Mathbabe, mathematics, mathematics education, maths, model comparison, model-free forecasting, numerical analysis, numerical software, open data, optimization, organizational failures, planning, politics, prediction, prediction markets, privacy, rationality, reason, reasonableness, risk, silly tech devices, smart data, sociology, Techno Utopias, testing, the value of financial assets, transparency

## NextGen VOICES: ‘On data’, ‘On setbacks’, and ‘On discovery’

Science Magazine has a periodic column called Science in brief and occasionally that column features a set of what they call “NextGen VOICES”, meaning young scientists. They gather the survey using Twitter (of course) via the hashtag #NextGenSci. For the … Continue reading

## “Holy crap – an actual book!”

Originally posted on mathbabe:

Yo, everyone! The final version of my book now exists, and I have exactly one copy! Here’s my editor, Amanda Cook, holding it yesterday when we met for beers: Here’s my son holding it: He’s offered…

Posted in American Association for the Advancement of Science, Buckminster Fuller, business, citizen science, citizenship, civilization, complex systems, confirmation bias, data science, data streams, deep recurrent neural networks, denial, economics, education, engineering, ethics, evidence, Internet, investing, life purpose, machine learning, mathematical publishing, mathematics, mathematics education, maths, moral leadership, multivariate statistics, numerical software, numerics, obfuscating data, organizational failures, politics, population biology, prediction, prediction markets, privacy, quantitative biology, quantitative ecology, rationality, reason, reasonableness, rhetoric, risk, Schnabel census, smart data, sociology, statistical dependence, statistics, the right to be and act stupid, the right to know, the value of financial assets, transparency, UU Humanists

## data.table

R provides a helpful data structure called the “data frame” that gives the user an intuitive way to organize, view, and access data. Many of the functions that you would us… Source: Intro to The data.table Package

## On Smart Data

One of the things I find surprising, if not astonishing, is that in the rush to embrace Big Data, a lot of learning and statistical technique has apparently been discarded along the way. I’m hardly the first to point … Continue reading

Posted in Akaike Information Criterion, Bayes, Bayesian, Bayesian inversion, big data, bigmemory package for R, changepoint detection, data science, data streams, dlm package, dynamic generalized linear models, dynamic linear models, dynamical systems, Generalized Additive Models, generalized linear models, information theoretic statistics, Kalman filter, linear algebra, logistic regression, machine learning, Markov Chain Monte Carlo, mathematics, mathematics education, maths, maximum likelihood, MCMC, Monte Carlo Statistical Methods, multivariate statistics, numerical analysis, numerical software, numerics, quantitative biology, quantitative ecology, rationality, reasonableness, sampling, smart data, state-space models, statistical dependence, statistics, the right to know, time series

## “Catching long tail distribution” (Ted Dunning)

One of the best presentations on what can happen if someone takes a naive approach to network data. It also highlights what is, to my mind, the greatly underappreciated t-distribution, which is typically only used in connection with frequentist Student … Continue reading

## Climate Denial Fails Pepsi Challenge

Originally posted on Climate Denial Crock of the Week:

Stephen Lewandowsky specializes in conducting research that pulls back the curtain on climate denial psychology. He’s done it again. Washington Post: Researchers have designed an inventive test suggesting that the arguments commonly used…

Posted in American Association for the Advancement of Science, American Statistical Association, card draws, card games, chance, climate, climate change, climate data, climate education, confirmation bias, data science, denial, disingenuity, education, false advertising, fear uncertainty and doubt, fossil fuels, games of chance, geophysics, global warming, ignorance, mathematics, mathematics education, maths, obfuscating data, rationality, reasonableness, risk, science, science education, sociology, the right to know

## Of my favorite things …

(Clarifying language added 4 Apr 2016, 12:26 EDT.) I just watched an episode from the last season of Star Trek: The Next Generation entitled “Force of Nature.” As anyone who pays the least attention to this blog knows, opposing human … Continue reading

Posted in Anthropocene, bridge to somewhere, bucket list, Buckminster Fuller, Carl Sagan, climate, climate change, climate disruption, climate education, compassion, data science, Earle Wilson, ecology, Ecology Action, environment, evolution, geophysics, George Sugihara, global warming, Hyper Anthropocene, life purpose, mathematics, mathematics education, maths, numerical analysis, optimization, philosophy, physical materialism, physics, population biology, population dynamics, proud dad, quantitative biology, quantitative ecology, rationality, reasonableness, science, sociology, statistics, stochastic algorithms
5 Comments

## HadCRUT4 and GISTEMP series filtered and estimated with simple RTS model

Happy Vernal Equinox! This post has been updated today with some of the equations which correspond to the models. An assessment of whether or not there was a meaningful slowdown or “hiatus” in global warming was recently discussed by Tamino … Continue reading

Posted in AMETSOC, anemic data, Bayesian, boosting, bridge to somewhere, cat1, changepoint detection, climate, climate change, climate data, climate disruption, climate models, complex systems, computation, data science, dynamical systems, geophysics, George Sugihara, global warming, hiatus, information theoretic statistics, machine learning, maths, meteorology, Michael Mann, multivariate statistics, physics, prediction, Principles of Planetary Climate, rationality, reasonableness, regime shifts, sea level rise, time series
2 Comments

## p-values and hypothesis tests: the Bayesian(s) rule

The American Statistical Association, of which I am a longtime member, issued an important statement today which will hopefully move statistical practice in engineering and especially in the sciences away from the misleading practice of using p-values and hypothesis tests. … Continue reading

Posted in approximate Bayesian computation, arXiv, Bayes, Bayesian, Bayesian inversion, bollocks, Christian Robert, climate, complex systems, data science, Frequentist, information theoretic statistics, likelihood-free, Markov Chain Monte Carlo, MCMC, Monte Carlo Statistical Methods, population biology, rationality, reasonableness, science, scientific publishing, statistical dependence, statistics, stochastics, Student t distribution

## K-Nearest Neighbors: dangerously simple

Originally posted on mathbabe:

I spend my time at work nowadays thinking about how to start a company in data science. Since there are tons of companies now collecting tons of data, and they don’t know what to do…

Posted in big data, data science, evidence, machine learning