, , , , , , , , , ,

Intriguing theories regarding all kinds of natural and social phenomena abound, but few if any of those theories can be proven with certainty or even validated at a high level of statistical significance. Yet we constantly see reports in the media about scientific studies purporting to prove one thing or another. Naturally, journalists pounce on interesting stories, and they can hardly be blamed when scientists themselves peddle “findings” that are essentially worthless. Unfortunately, the scientific community is doing little to police this kind of malpractice. And incredible as it seems, even principled scientists can be so taken with their devices that they promote uncertain results with few caveats.

Warren Meyer coined the term “certainty laundering” to describe a common form of scientific malpractice. Observational data is often uncontrolled and/or too thin to test theories with any degree of confidence. What’s a researcher to do in the presence of such great uncertainties? Start with a theoretical model in which X is true by assumption and choose parameter values that seem plausible. In all likelihood, the sparse data that exist cannot be used to reject the model on statistical grounds. The data are therefore “consistent with a model in which X is true”. Dramatic headlines are then within reach. Bingo!

The parallel drawn by Meyer between “certainty laundering” and the concept of money laundering is quite suggestive. The latter is a process by which economic gains from illegal activities are funneled through legal entities in order to conceal their subterranean origins. Certainty laundering is a process that may encompass the design of the research exercise, its documentation, and its promotion in the media. It conceals from attention the noise inherent in the data upon which the theory of X presumably bears.

Another tempting exercise that facilitates certainty laundering is to ask how much a certain outcome would have changed under some counterfactual circumstance, call it Z. For example, while atmospheric CO2 concentration increased by roughly one part per 10,000 (0.01%) over the past 60 years, Z might posit that the change did not take place. Then, given a model that embodies a “plausible” degree of global temperature sensitivity to CO2, one can calculate how different global temperatures would be today under that counterfactual. This creates a juicy but often misleading form of attribution. Meyer refers to this process as a way of “writing history”:

Most of us are familiar with using computer models to predict the future, but this use of complex models to write history is relatively new. Researchers have begun to use computer models for this sort of retrospective analysis because they struggle to isolate the effect of a single variable … in their observational data.”

These “what-if-instead” exercises generally apply ceteris paribus assumptions inappropriately, presuming the dominant influence of a single variable while ignoring other empirical correlations which might have countervailing effects. The exercise usually culminates in a point estimate of the change “implied” by X, without any mention of possible errors in the estimated sensitivity nor any mention of the possible range of outcomes implied by model uncertainty. In many such cases, the actual model and its parameters have not been validated under strict statistical criteria.

Meyer goes on to describe a climate study from 2011 that was quite blatant about its certainty laundering approach. He provides the following quote from the study:

These question cannot be answered using observations alone, as the available time series are too short and the data not accurate enough. We therefore used climate model output generated in the ESSENCE project, a collaboration of KNMI and Utrecht University that generated 17 simulations of the climate with the ECHAM5/MPI-OM model to sample the natural variability of the climate system. When compared to the available observations, the model describes the ocean temperature rise and variability well.”

At the time, Meyer wrote the following critique:

[Note the first and last sentences of this paragraph] First, that there is not sufficiently extensive and accurate observational data to test a hypothesis. BUT, then we will create a model, and this model is validated against this same observational data. Then the model is used to draw all kinds of conclusions about the problem being studied.

This is the clearest, simplest example of certainty laundering I have ever seen. If there is not sufficient data to draw conclusions about how a system operates, then how can there be enough data to validate a computer model which, in code, just embodies a series of hypotheses about how a system operates?”

In “Imprecision and Unsettled Science“, I wrote about the process of calculating global surface temperatures. That process is plagued by poor quality and uncertainties, yet many climate scientists and the media seem completely unaware of these problems. They view global and regional temperature data as infallible, but in reality these aggregated readings should be recognized as point estimates with wide error bands. Those bands imply that the conclusions of any research utilizing aggregate temperature data are subject to tremendous uncertainty. Unfortunately, that fact doesn’t get much play.

As Ashe Schow explains, junk science is nothing new. Successful replication rates of study results in most fields are low, and the increasing domination of funding sources by government tends to promote research efforts supporting the preferred narratives of government bureaucrats.

But perhaps we’re not being fair to the scientists, or most scientists at any rate. One hopes that the vast majority theorize with the legitimate intention of explaining phenomena. The unfortunate truth is that adequate data for testing theories is hard to come by in many fields. Fair enough, but Meyer puts his finger on a bigger problem: One simply cannot count on the media to apply appropriate statistical standards in vetting such reports. Here’s his diagnosis of the problem in the context of the Fourth National Climate Assessment and its estimate of the impact of climate change on wildfires:

The problem comes further down the food chain:

  1. When the media, and in this case the US government, uses this analysis completely uncritically and without any error bars to pretend at certainty — in this case that half of the recent wildfire damage is due to climate change — that simply does not exist
  2. And when anything that supports the general theory that man-made climate change is catastrophic immediately becomes — without challenge or further analysis — part of the ‘consensus’ and therefore immune from criticism.”

That is a big problem for science and society. A striking point estimate is often presented without adequate emphasis on the degree of noise that surrounds it. Indeed, even given a range of estimates, the top number is almost certain to be stressed more heavily. Unfortunately, the incentives facing researchers and journalists are skewed toward this sort of misplaced emphasis. Scientists and other researchers are not immune to the lure of publicity and the promise of policy influence. Sensational point estimates have additional value if they support an agenda that is of interest to those making decisions about research funding. And journalists, who generally are not qualified to make judgements about the quality of scientific research, are always eager for a good story. Today, the spread of bad science, and bad science journalism, is all the more virulent as it is propagated by social media.

The degree of uncertainty underlying a research result just doesn’t sell, but it is every bit as crucial to policy debate as a point estimate of the effect. Policy decisions have expected costs and benefits, but the costs are often front-loaded and more certain than the hoped-for benefits. Any valid cost-benefit analysis must account for uncertainties, but once a narrative gains steam, this sort of rationality is too often cast to the wind. Cascades in public opinion and political momentum are all too vulnerable to the guiles of certainty laundering. Trends of this kind are difficult to reverse and are especially costly if the laundered conclusions are wrong.