Hat tip to Paul Lauenstein, and his physician brother, for suggesting the great insights of the late Dr Larry Weed:
Great lines, great quotes, a lot of humor:
Unfortunately, it’s not clear medicine — or statistics — has progressed much beyond 1971. Note the 1999 report from the National Academy of Sciences,
To Err is Human.
The last quote reminds me of something I was taught in graduate school (in 1973, noting the above video is from 1971), when I took 6.871, Knowledge-based application systems, in which medical decision support was covered as a subject. I distinctly recall that problems of software-physician interaction during the patient interview centered on the comparatively unstructured way in which physicians gathered information, and that physicians could not be constrained to using a diagnostic or taxonomic key, as is popular in, say, botany. The thought crossed my mind at the time: how do we know whether the approach the physicians are using is the most effective? However, being a student, and knowing next to nothing about diagnostic medicine, I suppressed my doubts and took the advice as definitive. Given Dr Weed’s comments, I should have been more assertive with my doubts. And, unfortunately and apparently, the methods of practice which Dr Weed criticized have become ingrained in decision support software for medicine. See E. H. Shortliffe, “Computer programs to support clinical decision making”, Journal of the American Medical Association, 258, 61-66, for their status as of 1986, some 15 years after Weed’s talk.
Incidentally, I have found two additional references by medical authors to the title phrase of this post, attributed by them to Alfred North Whitehead, that is, claims that Whitehead mentioned a “capacity for sustained muddle-headedness”. My online research has failed to turn up that reference. The closest I can find is a mention by Lomax in an article in the Journal of Psychiatric Practice, 17(1), January 2011, on page 46, where he quotes from Whitehead’s 1938 book Modes of Thought. Lomax writes:
Alfred Whitehead said that the job of the philosopher was “living with sustained muddle-headedness”.
Whitehead liked the term muddle-headed, even applying it to himself. I doubt he ever meant it, however, in exactly the way Dr Weed and colleagues used it.
The American Petroleum Institute has trotted out a commissioned study claiming an increase of two million jobs in 2040 off a base of four million (in 2015).
First, these are not all people working in the development or distribution of natural gas. 44% are “end user” workers, that is, someone someplace who works to “… convert natural gas and its associated liquids to electricity, petrochemical and other products and the industries that manufacture, sell, install and maintain gas-fired appliances and equipment used in the residential, commercial, vehicle and industrial sectors”. (API, “Key Observations and Findings”) The implication is that if natural gas went away, so would these 44% of jobs. That is not correct. The report further defines these as
The largest NAICS codes associated with the end-use segment are Chemical Manufacturing, Gas-fired Electric Power Generation, Power Boiler and Heat Exchanger Manufacturing, Household Appliance Repair and Maintenance, and Industrial Process Furnace and Oven Manufacturing. The end-use segment also includes portions of the jobs related to Industrial Equipment and Machinery Repair and Maintenance, Industrial Construction, Freight Trucks, Turbine and Turbine Generator Set Manufacturing, Iron and Steel Pipe and Tube Manufacturing, and Freight Rail.
While it is not clear what “largest” means here, nevertheless there is no notion of substitution or displacement. The natural gas industry is really responsible for jobs in “Household Appliance Repair and Maintenance” and “Chemical Manufacturing”?
Second, 30% of the claimed jobs are in fact directly associated with natural gas mining, production and distribution. These are fully tied to natural gas.
Third, 25% of the claimed jobs consist “…of oil and gas production companies and their suppliers of goods and services…”, or, to quote their NAICS descriptions,
The largest NAICS Codes primarily associated with the production segment include Support Activities for Oil and Gas Operations, Crude Petroleum and Natural Gas Extraction, Drilling Oil and Gas Wells, and Oil and Gas Field Machinery Manufacturing.
Natural gas can be credited with all of these jobs?
Fourth, the study limited itself to 2015-2016 natural gas growth scenarios, and cherry-picked data from the U.S. EIA Annual Energy Outlook for those years. In particular, here are the case studies the report chose:
Presumably because there is no increase in use of natural gas, there is also no increase in numbers of jobs.
So, I conclude, in a best case scenario, about 600,000 jobs could be added by 2040 for which the natural gas industry is responsible, and perhaps another 200,000, depending upon how the other categories are counted. There is no allocation that I can see to jobs fixing existing pipeline infrastructure, which don’t increase as production does. The method used to make these projections, as far as I can tell, is linear fits based upon historical employment numbers.
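To show how fragile that method is, here is a sketch of a least-squares linear fit extrapolated to 2040. The employment figures below are entirely hypothetical, invented only to show the mechanics, not taken from the API report:

```python
# Sketch of the projection method the report appears to use: an ordinary
# least-squares linear fit to historical employment, extrapolated to 2040.
# The employment figures below are hypothetical, for illustration only.

def linear_fit(xs, ys):
    """Ordinary least squares: return (intercept, slope)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope

years = [2010, 2011, 2012, 2013, 2014, 2015]
jobs_millions = [3.0, 3.2, 3.4, 3.5, 3.8, 4.0]  # hypothetical history

b0, b1 = linear_fit(years, jobs_millions)
projection_2040 = b0 + b1 * 2040
print(f"Projected jobs in 2040: {projection_2040:.1f} million")
```

Note how a short run of growth years, extrapolated 25 years out, more than doubles the base: the projection is entirely a creature of the fitting window chosen.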
Incidentally, the API report contains an interesting Appendix B which compares the jobs intensity for construction of natural gas plants, nuclear, coal, onshore wind, and two types of solar, sourcing modules from the U.S. or China. While there are plenty of examples of unfair comparison in that Appendix, including failing to account for additional decrease in price for solar modules between now and 2040, the study produces the result that for new wind, solar, and natural gas construction, the job creation intensity is about the same. For some reason they excluded offshore wind.
Finally, as I’ve noted before, there is nothing natural about natural gas. It is explosive methane. Natural gas ain’t granola.
Also see Bigger is Not Better: Grid Modernization and the Antiquated Concept of ‘Baseload’, and in particular the comment by Gene Grindle to that post.
As some of the coal and nuclear plants face retirement decisions, focusing on their status as “baseload” generation is not a useful perspective for ensuring the cost-effective and reliable supply of electricity. Instead, system planners and market administrators, such as regional transmission organizations, need a framework that (a) effectively and efficiently defines and measures system needs and (b) develops planning tools, scheduling processes, and market mechanisms to elicit and compensate the broad range of resources
that have become available to meet those needs. Fortunately, planners and operators have been hard at work at such innovations and have moved past the concept of “baseload” to focus on the attributes of resources and the services they provide to the system that help the modernized electricity system operate more reliably, efficiently, and nimbly. While coal and nuclear power plants—as well as a broad range of other resource types—are recognized for providing a wide range of reliability services to the grid, the traditional definition of power supply resource adequacy is being revisited by some system operators and planners. Still, additional work is needed in planning and markets to better recognize and compensate resources for the value they provide to the system, and to incorporate the environmental impacts of electricity generation, including resources’ ability to reduce the system’s greenhouse gas emissions, consistent with public policy goals.
Coal and nuclear plants do not provide unique operational services that are specifically identified by or correlated with the term “baseload” generation. The term does not reflect the broader range of services that various resources can provide. As system planning and electricity market design are modernized, it is becoming increasingly clear that the services and attributes most under-recognized by today’s markets are greenhouse gas emissions in some jurisdictions and operational flexibility. A resource is considered flexible when it can react to operational signals to ramp its power generation up and down to help meet the needs of the system over multiple hours and minute-to-minute. Flexible resources can cost-effectively assist with meeting changing system loads and integrating the variable output of renewable resources. These flexibility needs are rapidly expanding as a result of numerous industry trends: (a) recognition by policymakers that renewable energy resources are needed to meet long-term emissions reductions goals; (b) customers’ increasing desire to voluntarily procure renewable energy or generate electricity on-site; and (c) substantial technological improvements that have driven down the cost of renewable resources to the point where, even before accounting for tax incentives, they are the lowest-cost option for new generating plants in some regions of the country.
(From Advancing Past “Baseload” to a Flexible Grid)
The creatures from Trumpland are planning an Energy Week in the coming days, probably to lead up to the Fourth of July celebrations. Our Orange Leader
… will tout surging U.S. exports of oil and natural gas during a week of events aimed at highlighting the country’s growing energy dominance.
[He] also plans to emphasize that after decades of relying on foreign energy supplies, the U.S. is on the brink of becoming a net exporter of oil, gas, coal and other energy resources.
(Brief excerpt from the Bloomberg article on the subject)
Trouble is, this defies trends and the cost curves of wind and especially solar technology, as noted by Bloomberg New Energy Finance and Lazard. Worse, as energy supplies are more constrained in their sources and demand more exotic methods to extract, either the price per unit needs to increase, or there needs to be a greater subsidy from the federal government. Fossil fuels in the USA already receive huge subsidies: Consider the FERC-directed eminent domain takings which pipeline companies receive, rather than having to buy the land their pipes cross and despoil. (In contrast, consider what Amtrak has to do for its rights of way.) As additional supplies of fossil fuel are dumped into the marketplace, prices are depressed, especially explosive methane (“Natural gas ain’t granola”).
Worse, their prices are at best constant and, over time, increase as the reserves get rarer, whereas renewables — without subsidies — will be cheaper than the costs of transmission for electricity in the early 2020s.
This is matching a linear curve with an upslope against exponentially decaying curves. Trumpland wants overseas consumption, but at what price? Cheaper than, say, coal dug in China with fewer health and workplace protections? With lower transportation costs?
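The shape of that mismatch can be seen in a toy crossover calculation. Every number in it (a flat fossil cost, a 10% annual solar cost decline) is an assumption for illustration only, not a figure from BNEF or Lazard:

```python
# A toy comparison of a roughly constant cost curve (fossil) against an
# exponentially declining one (solar). All numbers are hypothetical,
# chosen only to illustrate the shape of the argument.

fossil_lcoe = 60.0      # $/MWh, assumed roughly constant
solar_lcoe_2017 = 85.0  # $/MWh starting point (assumption)
decline_rate = 0.10     # 10% cost decline per year (assumption)

year = 2017
solar = solar_lcoe_2017
while solar > fossil_lcoe:
    year += 1
    solar *= (1.0 - decline_rate)

print(f"Under these assumptions, solar undercuts fossil around {year}")
```

The point is structural: against an exponential decline, a flat or rising curve always loses, and the only question the assumptions change is the crossover date.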
What do the markets think? Trumpland is happy to quote, take credit for, and lie about increases in jobs (e.g., increases in numbers of jobs in coal as cited by EPA head Pruitt), but how have energy sources performed since the junta was in office?
(Click figures to see larger images, and use browser Back Button to return to blog.)
Not too well.
In contrast, have a look at wind and solar investment (*):
I daresay that solar via TAN:ETF has perked up recently, after a time of doldrums.
By Trumpland’s criteria, they aren’t doing too well. It is a tall order to provide convincing evidence of how the United States is going to defy headwinds, develop markets, and avoid having climate damage reparations set against it, let alone financially succeed. It’s possible that Trumpland is trying to set up a protection racket, consistent with organized crime, where if the rest of Earth doesn’t want their planet trashed, then the USA should get paid off. But, if that is the objective, it speaks for itself.
And it won’t work. I’ve detailed many times elsewhere here why.
From Kevin Book of the Center for Strategic & International Studies, in his “An energy policy of dominance”, 28th June 2017:
Governments in the United States can position the country for dominance by rationalizing disparate policies that muddy price signals for private industry. For example, the $1/gallon federal biodiesel tax credit implies a CO2 price of ~$196 per metric ton. By contrast, the ¢24.4/gallon federal diesel tax corresponds to a CO2 price of ~$24 per metric ton. Federal wind energy tax credits of $24/megawatt hour imply ~$54 per metric ton, but the nine-state Regional Greenhouse Gas Initiative auctioned carbon allowances in June for ~$2.80 per metric ton. Paying green energy producers 10 to 20 times more to abate greenhouse gases than we charge fossil fuel consumers for emitting them is distortion, not dominance.
That, incidentally, shows how silly the claim is that renewables are winning only because government subsidies support them.
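Book's implied-CO2-price arithmetic can be reproduced approximately. The emission and abatement factors below are my assumptions (an EPA-style 10.18 kg CO2 per gallon of diesel burned, and factors for biodiesel and wind back-solved to match his figures), included only to make the mechanics transparent:

```python
# Back-of-envelope reproduction of Book's implied-CO2-price arithmetic.
# The emission/abatement factors are assumptions; Book's own inputs may differ.

T_PER_KG = 1.0 / 1000.0  # tonnes per kilogram

diesel_tax_per_gal = 0.244         # $/gallon federal diesel tax
diesel_co2_kg_per_gal = 10.18      # kg CO2 per gallon burned (EPA-style figure)

biodiesel_credit_per_gal = 1.00    # $/gallon federal biodiesel tax credit
biodiesel_co2_avoided_kg = 5.1     # kg CO2 avoided per gallon (assumption)

wind_credit_per_mwh = 24.0         # $/MWh wind production tax credit
wind_co2_avoided_t_per_mwh = 0.444 # tonnes CO2 avoided per MWh (assumption)

implied_diesel = diesel_tax_per_gal / (diesel_co2_kg_per_gal * T_PER_KG)
implied_biodiesel = biodiesel_credit_per_gal / (biodiesel_co2_avoided_kg * T_PER_KG)
implied_wind = wind_credit_per_mwh / wind_co2_avoided_t_per_mwh

print(f"diesel tax       ~ ${implied_diesel:.0f}/t CO2")
print(f"biodiesel credit ~ ${implied_biodiesel:.0f}/t CO2")
print(f"wind PTC         ~ ${implied_wind:.0f}/t CO2")
```

Dividing the per-unit incentive or tax by the CO2 per unit gives the implicit carbon price, which is all Book's comparison amounts to.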
Also, outgoing FERC member Colette Honorable says “I don’t see any problems with reliability, and I say bring on more renewables”.
Today, the City of Boston released ClimateChangeData.Boston.gov. The City website features scrubbed information from the U.S. EPA Climate Change website.
Which bit of what Dikran said do you disagree with? It certainly seems reasonable to me; if you want to explain how something could cause something else, you need to use more than just statistics.
After consideration, I posted a long explanation, worthy of a blog post on its own. But I’m leaving it there, and just putting a link to it here.
From The Economist, 25th February 2017:
FROM his office window, Philipp Schröder points out over the Bavarian countryside and issues a Bond villain’s laugh: “In front of you, you can see the death of the conventional utility, all financed by Mr and Mrs Schmidt. It’s a beautiful sight.”
Someone’s going to cannibalize our business — it may as well be us. Someone’s going to eat our lunch. They’re lining up to do it. I think repositioning, redefining the grid from what it is as sort of a passive purveyor of energy to one that is an enabler of transactive energy at the grassroots level is a much more sustainable, profitable strategy.
Led by our own UU Needham Reverend Catie Scudera, with Reverend Daryn Bunce Stylianopoulos of the First Baptist Church of Needham, and Reverend Jim Mitulski of the Congregational Church of Needham, Sunday, 4th June 2017 saw a vigil of members of their combined congregations, singing songs of reflection and protest in response to Thursday’s announcement by President 45, that he would withdraw the United States from the Paris Climate agreement.
In response …
Why? Oh, why not try something from an episode on Tyson’s excellent Cosmos:
(properly, for pay) https://www.amazon.com/The-Lost-Worlds-Planet-Earth/dp/B00K5M962G
(YouTube, for pay) https://www.youtube.com/watch?v=Vcz_NYRQWEw
And, for background to this, read Ogden and Sleep.
In the world in excess of +2°C of warming, dragons prowl. The most obvious are methane clathrates, but who knows what else.
And, while solemn, the songs, the sun, and the moment were memorable.
And there is some good news, even if it is partial.
“Protecting our planet and driving economic growth are critical to our future, and they aren’t mutually exclusive,” he said in a statement. “I deeply disagree with the decision to withdraw from the Paris Agreement and, as a matter of principle, I’ve resigned from the President’s advisory council.”
— Disney’s CEO, Robert Iger
Story from Variety.
Flash from InsideClimate News:
ExxonMobil shareholders voted Wednesday to require the world’s largest oil and gas company to report on the impacts of climate change to its business—defying management, and marking a milestone in a 28-year effort by activist investors.
Sixty-two percent of shareholders voted for Exxon to begin producing an annual report that explains how the company will be affected by global efforts to reduce greenhouse gas emissions under the Paris climate agreement. The analysis should address the financial risks the company faces as nations slash fossil fuel use in an effort to prevent worldwide temperatures from rising more than 2 degrees Celsius.
… [I]nstitutional investors argue that climate risk is a long-term financial risk that should be integrated into financial reporting.
BlackRock, the world’s largest investment firm, with $5.1 trillion in assets under management, and several major global investors—including State Street, Aviva, and Legal & General—have signaled that they want more transparency on climate change risk. BlackRock’s first vote against corporate management on climate came this year against Occidental, where it was the largest institutional investor.
Patrick Doherty, director of corporate governance for the New York State Office of the State Comptroller, which spearheaded the Exxon resolution along with the Church of England, said that climate is a very real financial concern for the employees paying into state pension funds and looking to payouts decades into the future. The New York State Common Retirement Fund, one of the world’s largest public employees investment funds, holds more than $1 billion in Exxon stock.
“We have a very, very strong financial interest in the long-term health of the company,” Doherty said.
“The average CEO has a tenure of five years, and hedge funds are looking to maybe the next quarter,” he said. “Only institutional investors have this longer view. And one of the reasons that support for climate disclosure has been increasing over the years is more and more institutional shareholders are saying, hey, there can be large long-term risk and long-term damage.”
(The above is the Carbon Tracker 2014 Unburnable Carbon report.)
And regarding the claim that
Oil companies cannot predict the long-term impact of climate and climate policy with enough precision to provide the kind of risk analysis that shareholders are seeking, IHS Markit said. Financial disclosure under securities regulation looks ahead over a much shorter time frame.
what Markit is saying is that the management of fossil fuel companies does not know how to do their job. If that is correct, which I doubt, they should step aside and allow someone who knows how to do it, do it.
See the following links for additional news on this development:
Frankly, I wish some geophysicists and climate scientists wrote more as if they thoroughly understood this, let alone the deniers who try to use it to discredit climate disruption. See “What does statistically significant actually mean?”.
Of course, while statistical power of a test is important to keep in mind, as well as the effects of arbitrary alterations or recodings of data upon it (see also Andrew Gelman’s comment on this), people should really look at this from a purely Bayesian perspective, and there’s no longer a computational excuse to ignore that approach.
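As a minimal sketch of what the Bayesian alternative looks like, and how little computation it now requires, here is a known-variance normal model with a flat prior, on invented data, reporting the posterior probability that an effect is positive instead of a p-value:

```python
import math
import statistics

# Minimal Bayesian sketch: rather than ask whether an effect is
# "statistically significant", report the posterior probability that it is
# positive. With a flat prior and known observation noise sigma, the
# posterior for the mean is Normal(sample mean, sigma^2 / n).
# The data below are invented for illustration.

def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

data = [0.4, 1.1, -0.3, 0.9, 0.6, 1.4, 0.2, 0.8]
sigma = 1.0                      # assumed known observation noise
n = len(data)

post_mean = statistics.mean(data)
post_sd = sigma / math.sqrt(n)

p_positive = normal_cdf(post_mean / post_sd)
print(f"P(effect > 0 | data) = {p_positive:.3f}")
```

The posterior probability is a direct statement about the quantity of interest, conditional on the data, which is what a p-value is so often mistaken for; richer models need MCMC, but no longer an excuse.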
I testified at the Weymouth, Massachusetts hearing for the MA Senate Climate and Clean Energy Tour.
Here’s Senator Marc R Pacheco introducing the Tour:
The Weymouth hearing was recorded and is available on YouTube in three parts:
My written testimony is attached below:
A better version having hyperlinks intact is available here, but be sure to download before reading in order to get access to the links.
The President’s Corner in the May 2017 issue of Amstat News, the monthly newsletter of the American Statistical Association (“ASA”), features the interesting exposition by environmental statistician and President of the ASA, Barry Nussbaum, called “Bigger isn’t always better when it comes to data.” Key paragraph:
Notice a subtle nuance here. Normally, you have a population and you sample elements from the population. Here, we really didn’t know if the vehicle’s emissions belonged to the population, due to the maintenance and use restrictions, until we administered the questionnaire after the vehicle had been randomly selected.
Akamai (NASDAQ: AKAM) said it is making a 20-year investment in the planned Seymour Hills Wind Farm, which will be based outside of Dallas and is expected to begin operating next year. The project is being developed by Infinity Renewables, and the plan is to construct 38 wind turbines across about 8,000 acres, Akamai said in a news release. Akamai said it intends to pull enough energy from the wind farm to offset its aggregate data center operations based in Texas, which account for about 7 percent of Akamai’s global power load.
This is part of Akamai’s commitment to reduce Carbon emissions and cover 50% of its operating requirements for electrical energy by 2020. See the details in Akamai’s press release.
… By 2030, the report predicts that oil demand will drop to 70 million barrels per day. The resulting collapse in prices will be catastrophic for the industry, and these effects are likely to be felt as early as 2021.
The report suggests that oil demand from passenger road transport will drop by 90 percent by 2030; demand from the trucking industry will drop by 7 million barrels per day globally. This is, as the report says, an existential crisis for the industry. Current share prices and projections are based on the presumption of a system of individually owned vehicles.
As far as I’m concerned, it couldn’t happen to a “nicer” bunch of people, this economic catastrophe. And it can’t happen soon enough!
I see this nearly every week in the comedy that is progressive plans for energy sources in the Commonwealth of Massachusetts. Progressives, it seems, eschew cooperation with business and attorneys and, as a result, never get anything respectable done. They are, as I’ve sometimes remarked, in practice, liberal climate deniers, because they rate the survival of their collective political power more important than that of civilization.
(Hat tip to Climate Denial Crock of the Week)
I posted a response to a comment from the blog author at the ellipsis-loving … and Then There’s Physics. The figures didn’t make it into the comment, and, so, I am reproducing the intended comment in its entirety here.
ATTP, you were correctly pointing out I was partly incorrect, and certainly incomplete. Kudos to you, and apologies, and to the readers.
I hadn’t read Armour 2017. I have now. I did read ATTP’s assessment and, yes, it does mention Armour deals with nonlinearity. And, yes, it does mention that the histogram is from CMIP runs, but I interpreted it differently than it should have been interpreted. I have not read Richardson, and probably won’t. I also assumed that the Armour figure was something Stephens was using in his “criticism of excessive certainty” but have gone back and seen that there is another parse to this post which is consistent with Stephens not mentioning Armour at all.
I also have not read Stephens, and perhaps I should before commenting, but I won’t.
The point I tried to make was essentially that uncertainty and ignorance in a place where a decision ought to be made and when the consequences could be enormous is not the place to claim “It’s okay to remain ignorant.” Essentially, this is enshrining the “Do nothing until someone proves you have to do so” which might work for some common decisions, but taking a big ship into an iceberg-strewn sea because it hasn’t hit anything yet hardly seems prudent.
I also am not convinced, commenting with respect for Armour, that the adjustment for nonlinearity they attempt helps the argument much, and ATTP hinted at that in his previous post (beginning “… A few additional points. We don’t know that these adjustments are correct. However, we do have a situation where there is a mismatch between different climate sensitivity estimates …”). In the public discussion of climate change, highlighting these kinds of papers tends, I think, to convince people there’s more arbitrariness to this process than is correct. After all, there have been similar papers published by Meraner, Mauritsen, and Voigt, as well as Caballero and Huber, the latter focussing upon nonlinearity in ECS and having a good introduction. These emphasize Pierrehumbert’s comment “Here there (may) be dragons”, and, as of 2013,
…there have already been great strides in understanding the magnitude and pattern of warmth in hothouse climates, which have helped resolve some earlier modeling paradoxes, but much remains to be done. In particular, narrowing the broad error bars on past atmospheric CO2 is crucial to relating these climates to what is going on at present.
More recently there is the published work of Friedrich, Timmermann, Tigchelaar, Timm, and Ganopolski.
Consider Pierrehumbert’s equation (3.14) for temperature sensitivity (specifically of mean surface temperature $T_s$) with respect to some parameter $\Lambda$, where $\Lambda$ might be, as Pierrehumbert suggests, albedo, or CO2 concentration, or the solar constant:

$$\frac{dT_s}{d\Lambda} = -\frac{\partial G/\partial \Lambda}{\partial G/\partial T_s}, \qquad G(T_s, \Lambda) = \text{OLR}(T_s, \Lambda) - (1 - \alpha)\,\frac{S}{4}$$

Here $G$ is the top-of-atmosphere flux, and $\text{OLR}$ is outgoing longwave radiation evaluated at the surface temperature (*). This is pretty standard, even if it is very general, much more general than, say, Armour’s equations (1)-(3). From a statistical perspective, what’s striking about the above is that if $\partial G/\partial \Lambda$ and $\partial G/\partial T_s$ are each interpreted to be random variables worthy of estimation by whatever means, then that implies $dT_s/d\Lambda$ is a random variable which is drawn from a ratio distribution. And should the Highest Density Probability Interval for $\partial G/\partial T_s$ include zero, whatever the physical reason, the distribution of $dT_s/d\Lambda$ is pretty meaningless. A good physical imagination offers any number of ways this could happen, but Professor Pierrehumbert’s discussion in Section 3.4 of his book describes the possible (mathematical) range, irrespective of the geophysical details. And because what we are about is $dT_s$ as a total differential over all relevant parameters $\Lambda_i$, the excessive variability in any one such $\Lambda_i$ will dominate that of the rest. Note extreme variability is not our friend, no matter what vision of a cultural or economic future we might have.
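The ratio-distribution point is easy to demonstrate by simulation. The means and spreads below are arbitrary illustrative values, not estimates of the actual feedback terms:

```python
import random

# Sketch of the ratio-distribution point: if numerator and denominator are
# each normal random variables, the ratio behaves tamely when the
# denominator's interval excludes zero, and wildly when it straddles zero.
# All parameters are arbitrary illustrative values.

random.seed(42)

def ratio_samples(n, denom_mean, denom_sd):
    out = []
    for _ in range(n):
        num = random.gauss(1.0, 0.2)             # numerator: well away from zero
        den = random.gauss(denom_mean, denom_sd)  # denominator
        out.append(num / den)
    return out

safe = ratio_samples(10_000, denom_mean=2.0, denom_sd=0.2)   # zero excluded
risky = ratio_samples(10_000, denom_mean=0.3, denom_sd=0.5)  # zero well inside

print("max |ratio|, denominator away from zero:  ", max(abs(r) for r in safe))
print("max |ratio|, denominator straddling zero: ", max(abs(r) for r in risky))
```

When the denominator's distribution puts appreciable mass near zero, the ratio has no finite moments and its extreme values blow up, which is exactly why a sensitivity defined as such a ratio becomes meaningless there.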
If ECS is going to continue to be used as the basis of argument and policy, it seems to need to be made far more robust than it is. That’s the point of my argument for much more additional work. If we are to keep this troubled concept in the planning stables, we desperately need to understand the bounds on its applicability. Armour is a start, but Armour simply says there might be problems when we already know there are problems from theory. What we need are constraints. Otherwise, ECS is a “nice to have if the world were a different place.” But then we don’t really have it, except knowing that there could be “dragons” out there.
I think there are much better arguments, and there are much better problems to chase. For instance, here is the definitive plot from Fyfe, Gillett, and Zwiers:
I have noted (**; Section 7) that what’s wrong with this presentation is not that the Highest Density Probability Interval for the climate models fails to overlap the observational mean and cloud, it’s that there is such a big difference between the observational variance and that of the model ensemble. The specifics of the discrepancy, seen as a t-test based upon a difference in means, led to the later explanation by Cowtan and Way and then a rebuttal by Fyfe and Gillett. I say, rather, that the reason for the discrepancy is deep, having more to do with the difference in variances (***), and probably not something we can expect most of the public or most policymakers to understand, at least without understanding something like Leonard Smith’s Chaos: A Very Short Introduction. The climate ensemble simulates all possible futures, and Earth takes one future at a time. I have read all around this in the literature, and there seems to be a confusion about what internal variability means. Yes, there’s unexplained internal variability, but there’s a lot of evidence for stochastic variability even if all the phenomena in internal variability were deeply understood. That’s important, because it makes what Bret Stephens and others like Judith Curry want to do a fundamentally flawed project. This stochastic variability on top of everything could be enough to send us all over some kind of potential cliff, even if emissions were managed to some precalculated minimax loss-versus-economic-benefit point.
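The distinction between an ensemble of possible futures and the single realized one can be sketched numerically. The trend and noise levels below are invented for illustration, not fitted to CMIP output:

```python
import random
import statistics

# Toy illustration of the means-vs-variances point: an ensemble of model
# runs simulates many possible futures, while the Earth realizes only one.
# Averaging across runs washes out internal variability, so the ensemble
# mean is much smoother than any single realization.
random.seed(7)

n_models, n_years = 100, 20
trend = 0.02  # degrees per year, illustrative

# Each "model run" is the shared trend plus its own internal variability.
ensemble = [
    [trend * t + random.gauss(0.0, 0.2) for t in range(n_years)]
    for _ in range(n_models)
]

# Ensemble mean across runs, year by year.
ens_mean = [statistics.mean(run[t] for run in ensemble) for t in range(n_years)]
one_run = ensemble[0]  # stands in for the single observed realization

print("sd of ensemble-mean series:", statistics.stdev(ens_mean))
print("sd of a single realization:", statistics.stdev(one_run))
```

Both series share the same underlying trend, so a comparison of means finds nothing; the variances differ systematically, which is the quantity the Fyfe-style presentation glosses over.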
Here’s a rhetorical question when dealing with the public and policymakers: Why not go back to simple conservation of energy arguments, and point out that radiative forcing from CO2 is indisputable? The excess energy from forcing is going to go somewhere, and where it’s gone in the past may not be where it continues to go, ditto CO2 itself. Sure, this frustrates people who want a cost put on the phenomenon. But making up a cost is arguably worse than saying “We don’t have one.” Will the latter produce inaction? Possibly. But that’s what’s happening now, and people are trying to produce cost estimates.
Oh, and indeed, there are but 21 single socks in the Broman climate collection, per Armour’s count of the number of GCMs used reported at the top right of the second page of their article.
Hat tip to Climate Denial Crock of the Week, in their “Florida slowly confronting sea level nightmare.”
Dr Neil deGrasse Tyson. I think he’s awesome. Marvelous. I saw him in Boston. He and I did not get off well, at the start, because of my being awestruck, and feeling very awkward, and the short time we had in his meeting us backstage in Boston. I regret that, but I could not be other than what I was.
And he would be the first to challenge that.
Because of Science. And its values. “Prove it,” I think he’d say.
This is much better than Religion, although those are my feelings and thoughts, not Dr Tyson’s.
“This is Science. It’s not something to toy with.”
All this is about people, and the human situation. Science is a means of getting beyond that.
“Recognize what Science is, and allow it to be and what it can be in the service of civilization.”
… A person who is able to write code using Hadoop and the associated frameworks is not necessarily someone who can understand the underlying patterns in that data and come up with actionable insights. That is what a data scientist is supposed to do. Again, data scientists might not be able to write the code to convert “Big Data” into “actionable” data. That’s what a Hadoop practitioner does. These are very distinct job descriptions.
While the term analytics has become a catch-all phrase used across the entire value chain, I personally prefer to use it more for the job of actually working with the data to get analytical insights. That separates out upstream and downstream elements of the entire data mining workflow.
I have repeatedly observed practitioners and especially managers who treat — or would very much like to treat — tools and techniques from this area as if they were Magical Boxes, to which you can send arbitrary data and obtain wonderful results, like the elixir of the Alchemists. There is also a cynical aspect to the attitude of some managers — some seem indoctrinated by the old “Internet time” and “agile sprint” notions — that if something does not show tangible and substantial progress over the short term (on the order of a week or two), there is something fundamentally wrong with the process. Sure, progress needs to be shown and reportable, but some problems, especially those involving data which are not obviously meaningful (*), demand a deep familiarization with the data and a good deal of data cleansing (**). This is hard, especially when the data are large. And not all worthwhile problems can be solved in two weeks, even for a corporation. Consider the project and planning timelines which a Walt Disney Company does for their parks or an energy company like DONG does for their offshore wind projects.
This is unfortunate, and it is more than simply a matter of personal style. Projects which proceed with the magical thinking that the right tool or algorithm is going to solve all their issues typically fail, after expending large resources on computing assets, data licenses, and labor. When they do, they give analytics and “Big Data” a tarnished reputation, especially among upper management who blame and distrust new things rather than incompetent engineers or, perhaps, engineers without the integrity of explaining to their management that these tools have promise, but the project schedules for venturing into new sources of data are long, and best done with a very small team for the first portion.
In fact, one severe failing of the current suite of “Big Data” tools I see is that, while they are strong on certain modeling algorithms, and representational devices like Python pandas-esque and R-esque data frames, they offer little in the way of advanced data cleaning tools, ones which can marshal clusters to completely rewrite data in order for it to be useful for analysis and machine learning.
It is even harder to know what to do with semi-structured textual data, such as the headers of IETF RFC 2616. In these cases, while there is official guidance, there is no effective enforcement mechanism, and, so, instances of these headers are, by the criteria of the RFC, malformed, even if there are dialects in Internet communities which are self-consistent and practiced in breach of the RFC. The trouble is that, here, there is no computable definition of malformed, so what is meaningful is something which needs to be learned from the corpora available. This is not an easy task, and may be dependent not only upon the communities in question, but upon geographic origins and takeup, as well as Internet protocol and netblocks.
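As an illustration of the gap between the RFC's grammar and field practice, here is a deliberately lenient toy parser (not a compliant one) applied to a header block that a strict reading of RFC 2616 would reject; the header values are invented examples:

```python
# A header block that violates RFC 2616's grammar (whitespace before the
# colon, unconventional casing) yet is common enough in the wild that a
# practical parser must tolerate it. This parser is a lenient toy sketch,
# not a compliant HTTP implementation.

raw = (
    "Content-Type : text/html; charset=utf-8\r\n"  # space before colon: malformed
    "X-CUSTOM-FLAG:yes\r\n"                        # no space after colon
    "content-length: 42\r\n"                       # lowercase field name
)

def lenient_headers(text):
    """Split on the first colon, trimming and case-folding field names."""
    headers = {}
    for line in text.split("\r\n"):
        if not line or ":" not in line:
            continue
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers

parsed = lenient_headers(raw)
print(parsed)
```

A strict parser would reject the first line outright; the lenient one recovers all three fields, which is precisely the kind of dialect-tolerant behavior that must be learned from corpora rather than read off the specification.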