Author’s Note: This post represents a follow-up to an earlier post on Type I and Type II Error Avoidance and its Possible Role in the Climate Change Debate. For those of you returning to this posting you will note that I have removed several technical paragraphs discussing Type I and Type II errors. In the comments thread, here and elsewhere, it has been pointed out that the introduction was overlong, potentially had issues and most importantly distracted from the topic at hand. The critical point being made was simply that Type I and Type II errors do not operate strictly in an “or” relationship because they address different hypotheses (null versus alternative) and that the tools used to avoid making Type I and Type II errors differ significantly.
The tools used in Type I error avoidance are centered around an understanding of the nature and characteristics of the populations under study and a general acceptance (by the scientific community) of what represents an acceptable risk of making an error with respect to hypotheses about those populations. The current gold standard in science is the 95% confidence level, which corresponds to a significance threshold of 0.05 for the p-value. In order to derive a reliable p-value, certain characteristics of the population must be understood. Is the population best described using the normal distribution? The lognormal distribution? Or is the population dynamic so poorly understood that nonparametric statistics are necessary to generate a p-value? I know at this point of the discussion my statistician friends are pretty much writhing on the floor in agony and for that I apologize. The details of how statisticians evaluate populations involve mathematics that make my eyes bleed and for the purposes of this blog posting are excessive to the task.
The tools of Type II error avoidance are less well-refined (but are getting better every day). They depend more on process knowledge, which can provide a clearer separation between the null and alternative hypotheses and improve our understanding of the nature of the distribution that is being tested. Where good process knowledge is lacking, an increase in sample size will still increase the power of an analysis. In order to avoid a Type II error one needs to understand the nature of the problem being investigated and must have developed some reasonable body of process knowledge that allows one to be sure one is looking in the right places. Lacking detailed process knowledge, Type II error avoidance depends on the brute force of replication and increased sampling density. As an analogy to my business, while not pretty, if you punch enough holes in the ground you can pretty much find any contamination that is out there.
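For readers who like to see the brute-force point in numbers, here is a minimal sketch in Python. It is an illustration only, not a method used in this post: it assumes a two-sided one-sample z-test with a known standard deviation, and the function names (z_test_power, type_ii_error_rate) and the effect sizes in the demo loop are hypothetical choices made for the example.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(effect_size, n, alpha=0.05):
    """Approximate power of a two-sided one-sample z-test.

    effect_size: true shift in the mean divided by the standard deviation
                 (assumed known; this is the simplifying assumption here).
    n:           number of samples collected.
    alpha:       accepted risk of a Type I error.
    """
    std_normal = NormalDist()
    # Critical value that the test statistic must exceed to reject the null
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Under the alternative, the test statistic is centred on this shift
    shift = effect_size * sqrt(n)
    # Probability of landing in either rejection tail
    return std_normal.cdf(-z_crit + shift) + std_normal.cdf(-z_crit - shift)


def type_ii_error_rate(effect_size, n, alpha=0.05):
    """Chance of missing a real effect: one minus the power."""
    return 1 - z_test_power(effect_size, n, alpha)


if __name__ == "__main__":
    # Illustrative values only: a small effect sampled at increasing density
    for n in (5, 20, 80, 320):
        print(n, round(z_test_power(0.3, n), 3))
```

Under these assumptions there are two levers. Holding the effect size fixed and raising n is the "punch more holes" approach, while raising effect_size stands in for process knowledge that separates the null and alternative hypotheses more clearly. Either one lowers the chance of a Type II error.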
So how does this explanation relate to my original thesis? It comes down to the nature of the communities under discussion. My initial hypothesis was not about how hypotheses are tested but rather about the nature of individuals who live and work in a world where the differing modes of error avoidance dominate risk-evaluation and resource allocation decisions.
As noted previously, the scientifically acceptable p-value is a feature of a scientific consensus. It involves a degree of cooperation and acceptance of scientific norms that is a part of the DNA of the academic community. In the general academic community there is an inherent trust in the process. Academics trust that the scientific process, when carried out according to the norms of science, will result in the most reliable outcomes and predictions. Perfection is not possible but once you reach a certain level of certainty, perfection is not necessary. This level of internal trust (trust that has been tested through peer review) results in a habit of trusting in a group consensus and, some might argue, insular thinking. It unfortunately also can accommodate individuals who take on the mantle of authority from the group to make predictions, even when the predictions may not be fully supported by the findings of the group. In essence the group will often protect its own and keep its arguments behind closed doors (or between the editor and writer). I readily admit that I am painting with a very broad brush and there are mavericks in every group but my observations are based on general characteristics of the community.
Sceptics, on the other hand, work on the underlying Type II error avoidance ethos that says that until you have a good grasp of process knowledge then you had better have a lot of data to back up your pronouncements. They are averse to trusting any process where they cannot see the sausages being made. They need to be able to test the ingredients and even then will want to evaluate the outputs. They view global climate models as “black boxes” into which data are fed and out of which predictions are made. Not being able to assess the contents of the black boxes they demand “more data” in the form of outputs that reflect actual conditions in the world. As we know, in the recent past the models’ predictions have been on increasingly shaky ground. This has triggered a risk-aversion response in the Type II community to ask for the collection of more data and not to act precipitously.
A colleague at work describes the difference as roughly the “trust me crowd” versus the “show me crowd”. The trust me crowd can show that some anthropogenic climate change has happened in the past and that models suggest that future conditions are going to get worse. They produce their documentation via the peer-reviewed press and in doing so address all the touchstones of the scientific method. Having met the high bar of “good science” they anticipate that their word will be taken as good.
The show me crowd looks at the “good science” and points out that many historical predictions of doom and gloom (that previously met the test of good science) have been shown to be overheated or just plain wrong. They also point out that the best models have not done a very good job with respect to the “pause”. Given this they ask for a demonstration that the next prediction is going to be better than the last one. This does not mean that they deny the reality of anthropogenic global warming. Rather they are not comfortable with cataclysmic predictions and calls for immediate action prior to a demonstration that those predictions can be supported with something approaching real data.
So once again it comes down to communication. The groups have to step out of their comfort zones and start re-learning how to communicate with each other. Warmists have to emerge from their back rooms and acknowledge publicly what they have been acknowledging privately all along: that these predictions represent just that, predictions. They are the best predictions possible given the limitations of the system and tools available, but not the certain outcomes suggested by many. Warmists also have to make a case for why, in a world with finite resources, substantial resources should be allocated to prevent low-probability, high-cost outcomes. Sceptics, on the other hand, have to trust that fairly reasonable predictions can be made of a complex and chaotic system. They have to listen to the case made by the warmists and maybe even give them the benefit of the doubt. Having read the comments at a number of blogs, that last part may well be the hardest but it is necessary if we are going to re-establish a reasonable dialogue and seek to address this impasse.