The ‘Nate Silver Effect’ Is Changing Journalism. Is That Good? – POLITICO Magazine
The near-year since Donald Trump’s surprise electoral victory has been filled with soul-searching and recriminations among those who research public opinion and those who write about it. A conversation around whether polls failed has hardened into two main camps: one blaming the data, the other blaming the media.
But this version of the debate misses the point. The problem isn’t simply flawed data or the media’s misuse of it; these problems cannot be separated. Political journalism has become infatuated with opinion polls—what some have called a “Nate Silver Effect”—and yet news organizations remain ill-equipped to make sense of the flood of data.
Aggregators and forecasting websites such as RealClearPolitics and FiveThirtyEight, which attracted a combined 200 million visits in October 2016, have altered the way political reporters cover American politics, but among journalists and survey researchers, considerable ambivalence remains over whether these changes have, on balance, been for the better.
Several years before the 2016 election, I set out to better understand the changes in how news organizations were using opinion data by interviewing journalists, polling analysts and research practitioners across a range of institutions, including major national news outlets and private industry. Some of my findings were recently published online in the journal Journalism, and have considerable bearing on debates over what went wrong last year.
The shock results of 2016 were not an aberration. In talking with people who study and report on public opinion, it was apparent that there have been major shifts in how data are evaluated for quality and disseminated publicly. Even if the 2016 polls were not nearly as far off as their detractors sometimes assume (and they weren’t, at least if you compare national polling averages in the closing days of the 2016 race to Clinton’s margin of victory in the popular vote), methodologies are changing rapidly, newsroom resources are shrinking, and it has become easier than ever for anyone to sponsor their own junk survey, pass it off as social science and disseminate results to sympathetic audiences. (Much of the public already believes that this is what pollsters do: A Marist poll from March found that six in 10 registered voters do not trust opinion surveys.)
And this is where the Nate Silver Effect gets complicated. Polls are popular fodder, but discretion when it comes to them is all too rare. The quality of the data underlying aggregators’ models is, in fact, more questionable than in the past. Was a new survey conducted using acceptable methodological rigor? Are assumptions about non-response and voting likelihoods defensible? Did question-wording or question-order put a thumb on the scale? Weighing these matters—let alone accounting for the vagaries of ordinary sampling error—requires a level of institutional knowledge and resources that most news organizations simply cannot afford.
Sites like FiveThirtyEight have drilled into readers the importance of averaging across polls as a corrective to people’s tendencies to “pick the poll numbers they like and disregard the rest,” as one reporter I interviewed put it. This very concern over outlier polls led this same reporter’s news organization to avoid citing individual poll results (“I always tell everyone I want to see three polls before I’ll quote them”). But in doing so, there’s a risk of learning the wrong lessons. Averaging across polls helps guard against some sources of error, such as those due to ordinary sampling error, but it does little to address the underlying problem of poor-quality data. It’s a garbage-in, garbage-out scenario: Averaging polls can be useful, but if the data being input are bad, then the averages will be tainted, too. In fact, in its postmortem on the performance of the 2016 polls, the American Association for Public Opinion Research found that the “large, problematic errors” observed in “key battleground states”—which fueled many forecasters’ overconfident models—were due in large part to a lack of high-quality state-level surveys in the final weeks of the race. Averaging using outdated or flawed data might have contributed to perceptions that Clinton’s lead was insurmountable.
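The garbage-in, garbage-out dynamic is easy to see with a bit of arithmetic. The following is a minimal sketch with hypothetical numbers: averaging cancels out random noise across sound polls, but polls that share a common methodological bias drag the whole average in one direction.

```python
# Minimal sketch: averaging reduces random sampling error,
# but systematically biased polls shift the average itself.
# All poll margins below are hypothetical, for illustration only.

def poll_average(margins):
    """Simple unweighted average of poll margins (candidate A minus candidate B)."""
    return sum(margins) / len(margins)

# Five sound polls whose errors are random noise around a true +1-point margin.
sound_polls = [2.0, 0.5, 1.5, -0.5, 1.0]  # margins in percentage points
print(round(poll_average(sound_polls), 1))  # noise largely cancels: 0.9

# Add two junk polls sharing the same +4-point methodological bias.
all_polls = sound_polls + [5.0, 4.5]
print(round(poll_average(all_polls), 1))  # the average is dragged to 2.0
```

No amount of averaging corrects the second result, because the junk polls' errors all point the same way; that is the scenario AAPOR's postmortem described in the battleground states.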
Many reporters and editors have taken to heart the importance of paying appropriate attention to polling data when handicapping races or describing candidates’ chances, but parsing and dissecting data are rarely as straightforward as plugging numbers into an algorithm. They require making judgment calls about a range of factors that are difficult to quantify. To their credit, Silver and his colleagues have tried to guide journalists with easily digestible tips for reading polls “like a pro” in an attempt to guard against the trap of false confidence, but the numbers themselves are often more compelling than the caveats.
Even in 2014 and 2015, I heard repeated concerns about whether the Nate Silver Effect on newsrooms might be causing some to embrace polling averages and forecasts as gospel, with election outcomes presumed to be preordained by the data weeks or months before votes are cast. One editor I spoke with blamed the “incessant desire of social scientists to pretend they’re physicists” when human behavior is “never going to be that precise.” But the expectation of “pinpoint” precision also comes with the territory; as one survey researcher pointed out: “If it’s a number, it’s precise—it’s $1.39; it’s 34 percent.” Ultimately, elections themselves are precise counts, creating a demand for decimal-point accuracy that no amount of aggregated survey data can responsibly offer.
This Nate Silver Effect is not merely a failure of interpretation, innumeracy or a misreading of probability, as Silver himself emphasized in an 11-part post-election series. Newsrooms do struggle with all of these things, but journalists are in the business of communicating, and as it turns out, it’s hard to characterize degrees of uncertainty without confusing an average reader. For example, one polling analyst I spoke with described having “fights with editors” over whether a “2-point lead” for one candidate constituted an actual lead, or a virtual dead heat due to normal polling error. Survey data are not newsworthy if all they ever suggest is that either candidate has a decent chance of winning.
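The analyst’s fight over that “2-point lead” has a textbook answer. Here is a hedged sketch, using the standard 95 percent margin-of-error formula and a hypothetical sample size, of why a 2-point edge in a single poll is usually statistical noise.

```python
import math

# Sketch of the standard 95% margin of error for one poll.
# The sample size and candidate share below are hypothetical.

def margin_of_error(p, n, z=1.96):
    """95% margin of error, in percentage points, for a share p among n respondents."""
    return z * math.sqrt(p * (1 - p) / n) * 100

n = 800                                 # hypothetical sample size
moe_share = margin_of_error(0.51, n)    # error on one candidate's share
# For the *lead* (the difference between two shares summing to roughly 1),
# the error is about double the single-share figure.
moe_lead = 2 * moe_share

print(round(moe_share, 1))  # about 3.5 points on each candidate's share
print(round(moe_lead, 1))   # about 6.9 points on the lead itself
```

With roughly a 7-point margin of error on the lead, a single poll showing one candidate up 2 is fully consistent with a dead heat, which is exactly the caveat editors resist printing.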
To many of those I interviewed, a still more troubling development tied to the advent of the aggregators has been the media’s diminishing role as gatekeepers of opinion data. In an earlier era, leading media organizations established editorial standards intended to weed out shoddy polls from their coverage. Critics charge that these policies contributed to myopic coverage that focused only on polls sponsored by news organizations themselves, but standards differentiating between firms of ill-repute and those using sound and transparent methods were meant to guard against the reporting of dubious data. Now, forecasting and aggregator sites, with the aid of social media, have provided survey firms a powerful platform for reaching readers hungry for their results—often regardless of the firms’ rigor or reputation.
In effect, gatekeeping around opinion polls has quietly shifted away from legacy media newsrooms altogether and into the hands of the aggregators and forecasters. Even media organizations that continue to employ strict polling standards cited numerous examples in recent elections in which polls otherwise deemed unfit for coverage could not be ignored because they drove larger campaign news cycles. The tendency of aggregator sites to “throw everything in” without distinguishing among firms, as Iowa pollster Ann Selzer pointed out in a 2015 interview with the Columbia Journalism Review, has contributed to a culture where, generally, fewer are passing judgments about data quality and saying, “This is a bad poll; we’re not going to mention it.”
My own interviews echoed Selzer’s lament that few reporters are “doing the work of looking at the methodology.” Many instead professed to relying on “brand names” and personal relationships with pollsters as a proxy for data quality. One reporter at a leading national newspaper admitted, “If you wanted to hoodwink me and you had an institution and a trusted name behind it, you probably could.”
Many younger, more digitally minded journalists I spoke with did not even necessarily believe they should serve as gatekeepers of polling information: Information was likely to circulate one way or another online; better to be involved in the conversation than to be on the outside of it. “In a lot of ways, Twitter is our ombudsman,” one such reporter told me. “You just want to get stuff out quickly” and rely on readers to critique the data and help decipher its reliability. “We have left-wingers and right-wingers who follow us, and they’ll call us out.”
Others suggested there is real value in having these debates about polling methods out in the open. When faced with competing results from firms with differing methods or approaches, one analyst told me his news organization would seek to “make sure that the reader knew about all of them” while helping to “guide the readers to understand why these two polls differ and which one we think may be … more accurate.” Or as another analyst put it, “It’s like with sources, I mean some of them are sketchy, and that’s true with polls, too. You know, we quote people sometimes who are sketchy people who have agendas. And still we quote them saying, ‘Look, this is a kind of sketchy dude, and he has a dog in this fight. And this is what he says.’” (A more senior reporter maintained a different view: “If you doubt the data, why would you tweet it? Why would you use it in any way, shape or form?”)
Polling has exploded even as the media has become less equipped to process it and convey it accurately. According to one estimate, by the end of 2012, more than 1,200 unique firms and institutions had conducted 37,000 separate public opinion polls in the United States, mostly since the 1990s. By my count, that number ballooned to 48,600 by the end of 2016. Yet many newsrooms now lack the expertise to evaluate and analyze raw polling data, particularly the underlying weighting and modeling assumptions employed, which can significantly shift results. While some organizations maintain a small team of staffers to plan and coordinate surveys and write up results, the process of sampling, fielding and analyzing survey data no longer occurs in-house at any American news organization. As one pollster said, “In this explosion of data, the irony is the resources to make sense of it aren’t there.”
Some cash-strapped newsrooms continue to fund expensive, in-depth telephone surveys using trained, live interviewers—once the undisputed gold standard in opinion research. But these polls are becoming rarer amid declining response rates, rising cellphone use and a yawning cost differential with online surveys. While some internet pollsters have notched impressive results, the statistical models they use to compensate for unrepresentative samples often remain shrouded in secrecy, making it difficult even for experts to distinguish between them—much less journalists. These complications make the explanatory journalism offered by sites such as FiveThirtyEight that much more valuable, but it’s their forecasts and the numbers (71.4 percent!) that get all the attention.
It’s not just pollsters and journalists who have a problem when people lose trust in survey research—the rest of us do, too. While the conversation over the media’s use of data has centered narrowly around horse-race coverage, polls remain among the most valuable tools available to systematically gauge public opinion and make it matter—to give all segments of the public an equal chance to state their preferences concerning how the country ought to be governed. Fair or not, the perceived failure of the polls in 2016 makes it that much easier for elected officials to dismiss and ignore the public’s expressed concerns altogether—a result that should worry all Americans.