Thinking a bit this morning about the hoo-hah in the comments thread to my post on climate change meta-issues below, I started to surf around looking for people who were thinking about the same meta-issues. I don’t have a conclusion yet, but tripped over two interesting things.
The first was a blog supporting AGW and dismissing those who challenge it as ‘cranks, deniers, etc.’, while also taking on Creationists, 9/11 Truthers, AIDS deniers, and AGW skeptics. In effect, it supports mainstream thinking and – indirectly – arguments from authority.
Here we will discuss the problem of denialists, their standard arguing techniques, how to identify denialists and/or cranks, and discuss topics of general interest such as skepticism, medicine, law and science. I’ll be taking on denialists in the sciences, while my brother, Chris, will be geared more towards the legal and policy implications of industry groups using denialist arguments to prevent sound policies.
First of all, we have to get some basic terms defined for all of our new readers.
Denialism is the employment of rhetorical tactics to give the appearance of argument or legitimate debate, when in actuality there is none. These false arguments are used when one has few or no facts to support one’s viewpoint against a scientific consensus or against overwhelming evidence to the contrary. They are effective in distracting from actual useful debate using emotionally appealing, but ultimately empty and illogical assertions.
I’m generally sympathetic to this view; I hear from 9/11 truthers periodically on an email list I’m on, and I don’t have a lot of time for their claims.
But then, I also found an article on the importance of fact-checking scientific claims (pdf).
Yes, it’s sponsored by a libertarian, corporatist Canadian think tank – but discounting for that, the claims and conclusions of the article made a lot of sense to me.
In recent years, there has been considerable attention paid to the question of whether financial statements and other data from corporations are adequately reviewed prior to release. An analogous question concerns the data and findings in academic papers which sometimes influence public sector decisions. Disclosure of data and code for the purpose of permitting independent replication in no way intrudes on or imperils academic freedom; instead, it should be seen as essential to good scientific practice, as well as a contribution to better public decisionmaking.
The article cites a litany of scientific error and research malpractice, all shielded by stonewalling. Over the next few days, I’m going to dig into the cases I don’t know about and see what I can find (I’d welcome assistance…); if the point of the article is that we need to check the math in research that’s handed to us, we should extend the same level of scrutiny to the claims of the article itself.
But here are the stories it tells, and a few comments of my own.
The ‘Harvard Six Cities’ study
In 1993, a team of researchers led by D.W. Dockery and C.A. Pope published a study in the New England Journal of Medicine supposedly showing a statistically significant correlation between atmospheric fine particulate levels and premature mortality in six US cities (Dockery, Pope, et al., 1993). The “Harvard Six Cities” (HSC) study, as it came to be called, attracted considerable attention and has since been repeatedly cited in assessment reports, including those prepared for the Ontario government, the Toronto board of public health and the Ontario medical association. In each case the reports have used the HSC study to recommend tighter air quality standards or other costly pollution control measures.
…after continuing pressure, Dockery and Pope gave their data to a third party research group called the Health Effects Institute (HEI), which agreed to conduct an audit of the findings. In 2000, fully six years after the CASAC request, and three years after the new air quality regulations had been introduced, the HEI completed its reanalysis. The audit of the HSC data reported no material problems in replicating the original results, though there were a few coding errors (Health Effects Institute, 2000). However, their sensitivity analysis showed the risk originally attributed to particles became insignificant when sulphur dioxide was included in the model, and the estimated health effects differed by educational attainment and region, weakening the plausibility of the original findings (Heuss and Wolff, 2006).
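The statistical point in that last sentence – a coefficient that looks significant on its own can shrink toward zero once a correlated covariate enters the model – is easy to illustrate. Here is a minimal sketch using entirely synthetic, hypothetical data (the variable names are stand-ins; nothing here resembles the actual HSC data):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical stand-ins: 'so2' drives the outcome, while 'particles'
# merely correlates with so2.
so2 = rng.normal(size=n)
particles = 0.8 * so2 + 0.2 * rng.normal(size=n)
mortality = 1.0 * so2 + rng.normal(size=n)

def ols(y, *regressors):
    """Ordinary least squares with an intercept; returns coefficients."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

b_alone = ols(mortality, particles)[1]       # particles only
b_joint = ols(mortality, particles, so2)[1]  # particles plus so2

# The apparent 'particle effect' largely vanishes once so2 is included.
print(f"particles alone: {b_alone:.2f}, with so2 included: {b_joint:.2f}")
```

This is exactly why replication with the original data and code matters: which covariates were included is a modeling choice, and only a reanalysis can reveal how sensitive the headline result is to it.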
The Boston Fed Study
Although there had been political pressure on banks to increase lending to minorities, there was no legitimate justification for doing so until the Federal Reserve Bank of Boston released a now-famous working paper in 1992 entitled Mortgage Lending in Boston: Interpreting HMDA Data, which purported to show widespread discrimination against minorities in the Boston mortgage market. This led to a series of rapid rule changes affecting bank lending practices. These coincided with passage of the 1992 Federal Housing Enterprises Financial Safety and Soundness Act, which forced Fannie Mae and Freddie Mac to accept sub-prime loans, thus removing from the banks the risks associated with making bad loans.
Day and Liebowitz (1998) filed a Freedom of Information Act request to obtain identifiers for these observations so they could re-run the analysis without them. They also noted that the Boston Fed did not use the applicant’s credit score as generated by the bank, but had replaced it with three alternate indicators they themselves constructed, which Day and Liebowitz found had omitted many standard indicators of creditworthiness. Day and Liebowitz showed that simply reverting to the bank’s own credit score and correcting the 26 misclassified observations caused the discrimination coefficient to drop to zero.
I’ve looked a little bit into this one, and there are newer papers suggesting that race does have some impact on loan approvals for marginally qualified candidates, as well as other newer papers suggesting that it has none.
The “hockey stick” graph
OK, now I’m sure it’ll get ugly.
The Mann, Bradley, and Hughes (1998; 1999) “hockey stick” graph, shown in figure 1, was a key piece of evidence used by the Intergovernmental Panel on Climate Change in its 2001 Third Assessment Report to conclude that humans are causing climate change (Working Group I, IPCC, 2001, ch. 2, fig. 2.7c and ch. 2, fig. 2.20). The graph has a striking visual effect, suggesting the Earth’s climate (represented by the average northern hemisphere temperature) was stable for nine centuries prior to industrialization, then underwent a rapid warming in the 20th century. The hockey stick graph appeared five times in the Third Assessment Report, each time in an unusually large and colorful format compared to other data series. It was widely reproduced on government web sites around the world and played an influential role in the debates that took place in many countries between 2001 and 2004 over whether to ratify the Kyoto Protocol.
In 2005, the House Science Committee asked the National Research Council (NRC) to investigate the controversy over the hockey stick. Prior to beginning its work, the NRC revised its terms of reference to exclude any specific assessment of Mann’s work. The Energy and Commerce Committee then asked Edward Wegman, Professor of Statistics at George Mason University and Chairman of the National Academy of Sciences Committee on Theoretical and Applied Statistics, to assemble a separate panel to assess Mann’s methods and results. The NRC report ended up critiquing the hockey stick anyway, noting that it failed key statistical significance tests (National Research Council, 2006: 91), relied on invalid bristlecone data for its shape (pp. 50, 106-7), used a PC technique that biased the shape (p. 106), and, like other proxy reconstructions that followed it, systematically underestimated the associated uncertainties (p. 107). The Wegman panel report was published in July 2006 (Wegman et al., 2006). It upheld the findings of McIntyre and McKitrick (p. 4). Among other things, the panel reported that, despite downloading the materials from Mann’s web site, they were unable to replicate the hockey stick results (p. 29).
Given the controversy around this issue, it’s important to note the modesty of their concluding paragraph:
The hockey stick episode illustrates, among other things, the inability or unwillingness of granting agencies, academic societies, and journals to enforce disclosure to a degree sufficient for the purposes of replication. Government intervention in this case resulted in release of essential code. Unless granting agencies and journals deal with this issue forcefully, policy makers should be prepared to accept a responsibility to act if their decisions are going to be based on the findings of unreplicated academic research.
The US obesity epidemic
In March 2004, the Journal of the American Medical Association published a paper by Dr. Julie Gerberding, Director of the Centers for Disease Control and Prevention (CDC), and three other staff scientists, claiming that being overweight caused the deaths of 400,000 Americans annually, up from 300,000 in 1990 (Mokdad, Marks, Stroup, and Gerberding, 2004). This study, and the 400,000 deaths figure, was the subject of considerable media attention and was immediately cited by then-US Health and Human Services Secretary Tommy Thompson in a March 9, 2004 press release announcing a major new public policy initiative on obesity, a $20 million increase in funding for obesity-related programs and a further $40 million increase the following year (US Department of Health and Human Services, 2004).
The CDC soon found itself under intense criticism over the chaotic statistics and the issue of whether internal dissent was suppressed. In response, it appointed an internal review panel to investigate, but the resulting report has never been made public. Some portions were released after Freedom of Information requests were made. The report makes scathing comments about the poor quality of the Gerberding study, the lack of expertise of the authors, the use of outdated data, and the political overtones to the paper (Couzin, 2005). The report also found that the authors knew their work was flawed prior to publication but that since all the authors were attached to the Office of the Director, internal reviewers did not press for revisions.
The Arctic Climate Impact Assessment
In late 2004, a summary report entitled the Arctic Climate Impact Assessment (ACIA) was released by the Arctic Council, an intergovernmental organization formed to discuss policy issues related to the Arctic region. The council had convened a team of scientists to survey available scientific information related to climate change and the Arctic. Impacts of a Warming Arctic: Highlights (Arctic Council, 2004) was released to considerable international media fanfare, and prompted hearings before a US Senate committee on November 16, 2004 (the full report did not appear until August 2005). Among other things, the Highlights document stated that the Arctic region was warming faster than the rest of the world, that the Arctic was now warmer than at any time since the late 19th century, that sea-ice extent had declined 15 to 20 percent over the past 30 years and that the area of Greenland susceptible to melting had increased by 16 percent in the past 30 years.
Shortly after its publication, critics started noting on web sites that the main summary graph (Arctic Council, 2004, Highlights: 4) showing unprecedented warmth in the Arctic had never appeared in a peer-reviewed journal (Taylor, 2004; Soon, Baliunas, Legates, and Taylor, 2004), and the claims of unprecedented warming were at odds with numerous published Arctic climate histories in the peer-reviewed literature (Michaels, 2004). Neither the data used nor an explanation of the graph’s methodology were made available (Taylor, 2004; Soon, Baliunas, Legates, and Taylor, 2004). When the final report was released eight months later, it explained that they had used only land-based weather stations, even though the region is two-thirds ocean, and had re-defined the boundaries of the Arctic southwards to 60N, thereby including some regions of Siberia with poor quality data and anomalously strong warming trends. Other recently published climatology papers that used land- and ocean-based data had concluded that the Arctic was, on average, cooler than it had been in the late 1930s (Polyakov et al., 2002). But while these studies were cited in the full report, their findings were not mentioned as caveats against the dramatic conclusions of the ACIA summary, nor were their data sets presented graphically.
The Donato study of post-fire logging and forest regeneration
On January 5, 2006, an article entitled “Post-wildfire logging hinders regeneration and increases fire risk” appeared in Science Express, the pre-publication venue for accepted articles in Science (Donato, Fontaine, Campbell, Robinson, Kauffman, and Law, 2006a). The paper examined logging activity in Oregon’s Biscuit Forest following a 2002 fire. It argued that logging reduced by 71 percent the density of viable seedlings during the recovery period, and led to an accumulation of slash on the ground, increasing potential fuel levels for future fires. The article drew attention to legislation pending before the US Congress, H.R. 4200, which mandated rapid salvage logging on federal lands following a fire. The authors concluded that post-fire logging “can be counterproductive to stated goals of post-fire forest regeneration.” The article was quickly cited by opponents of H.R. 4200 as authoritative scientific evidence against it (eg., Earth Justice, 2006).
In their response to critics, Donato, Fontaine, Campbell, Robinson, Kauffman, and Law (2006c) acknowledged that their findings were less general than their title suggested, but they defended their sampling methodology and conclusions. At this point their critics asked to inspect the data and the sites where the data were gathered. The authors refused to disclose this information. Following publication of the exchange in Science, Newton and coauthors have repeatedly requested the underlying data collected at the measurement sites, as well as the locations of the specific sample sites, so they can examine how the seedling density measurements were done. These requests have been refused by Donato and coauthors (J. Sessions, pers. comm.), as have similar data requests from Congressman Baird (Skinner, 2006).
The Bellesiles affair
Here’s one I have some pretty intimate knowledge of.
In 2000, to great fanfare, Knopf Publishing released Arming America: The Origins of a National Gun Culture. Written by Michael A. Bellesiles, then a professor of history at Emory University, the book purported to show that prior to the Civil War, guns were rare in America and Americans had little interest in owning guns. Other history professors wrote glowing reviews of the book: Garry Wills in the New York Times Book Review, Edmund Morgan in the New York Review of Books, and Fred Anderson in the Los Angeles Times. The Washington Post did publish a critical review (Chambers, October 29, 2000), but it was a rarity. The book was promptly awarded Columbia University’s prestigious “Bancroft Prize” for its contribution to American history.
Despite the political importance of the topic, professional historians did not actively scrutinize Bellesiles’ thesis. Instead it was non-historians who began the process of due diligence. Stephen Halbrook, a lawyer, checked the probate records for Thomas Jefferson’s three estates (Halbrook, 2000). He found no record of any firearm, despite the fact that Jefferson is known to have been a lifelong owner of firearms, calling into question the usefulness of probate records for the purpose. Soon after, a software engineer named Clayton Cramer began checking Bellesiles’ sources. Cramer, who has a master’s degree in history, found dates changed and quotations substantively altered. However, Cramer was unable to get academic journals to publish his findings. Instead he began sending articles to magazines such as the National Review Online and Shotgun News. He compiled an extensive list of errors, numbering in the hundreds, and went so far as to scan original documents and post them on his website so historians could check them against the text of Bellesiles’ book (Cramer, 2006).
Here’s a case where the initial critics were belittled and dismissed as cranks by the author, and by many in the field – until the weight of evidence simply collapsed Bellesiles’ case completely.
Referring to Clayton Cramer, Bellesiles said, “It is not my intention to give an introductory history lesson, but as a non-historian, Mr. Cramer may not appreciate that historians do not just chronicle the past, but attempt to analyze events and ideas while providing contexts for documents” (Bellesiles, 2001).
The paper’s authors cite other cases, and miss citing still others – like that of Dr. Andrew Wakefield, who apparently falsified data in a study linking the MMR vaccine to autism – a study that led an unknown number of children to go without potentially life-saving vaccinations.
Personally, I side with both the anti-crank blog authors and the paper’s authors, who point out the deficiencies in widely publicized, honored, mainstream science that has become the root of policy.
But as a matter of principle and action, I think I side with the paper’s authors’ core agenda – which is that papers purporting to tell scientific truths through statistical analysis need to release both their raw data and their code or pseudocode, so that others can validate them.
My first serious science class – high school physics as a freshman – taught me that science was the art of making repeatable observations and drawing conclusions from them. Repeatable is a key word here, because it implies that science is, above all, empirical and intersubjective.
We need to base our policy decisions on science – and that means science that is, above all, repeatable. This implies a level of transparency in the scientific and academic establishment which is often lacking.
Let’s fix that. And then we can make decisions based on something at least somewhat empirical, and hold the cranks and denialists up to the light of the sun.