Why the World Can’t Have a Nate Silver

After a presidential election that Nate Silver and a smattering of other statistical modelers forecast with remarkable accuracy, quantitative enthusiasts, or quants, are talking some hard-earned smack. "This is about the triumph of machines and software over gut instinct," Dan Lyons exulted at the tech blog ReadWrite. "The age of voodoo is over. The era of talking about something as a ‘dark art’ is done. In a world with big computers and big data, there are no dark arts."

If only. As a practicing forecaster who prefers algorithms to expert judgment, I’m thrilled to see statistical forecasting so publicly vindicated, but I’d also like to engage in a bit of expectations management about how quickly these methods might transform international politics. As sci-fi writer William Gibson famously said, "The future is already here — it’s just not very evenly distributed." As imperfect as they still are, statistical forecasts of U.S. elections are on the leading edge of that distribution. Meanwhile, most things foreign-policymakers care about are closer to the far edge.

To see why, it’s important to understand that Silver and his ilk didn’t succeed simply by using "math" instead of "gut." Yes, the method matters, but statistics isn’t alchemy. To build forecasting models that work well, you need reliable measures of things that are usefully predictive. Even tougher is that you need those measures not just for today, but also for a long- and broad-enough swath of history to be able to test your beliefs about what predicts what against some hard evidence before diving into prognostication.

Routine elections in rich countries like the United States are some of the softest targets in political forecasting. Rules are transparent; high-quality data, including surveys of would-be voters, are often available; and the connection between those data and the outcome of interest is fairly straightforward.

Even in these relatively easy cases, though, forecasting can still be challenging. In 2010, Silver — the man the Economist called "the finest soothsayer this side of Nostradamus" — tried to predict the outcome of parliamentary elections in Britain and missed pretty badly.

Of course, elections in obviously authoritarian regimes are even easier to forecast. Until Mikhail Gorbachev rolled around, no one needed a model to predict who was going to win election to the Supreme Soviet of the USSR. The task is much tougher in competitive authoritarian regimes, where subtler forms of coercion tilt the field in favor of one party, but don’t quite guarantee a specific outcome.

Take October’s legislative election in Georgia, where the Georgian Dream coalition upset President Mikheil Saakashvili’s ruling United National Movement after late opinion polls appeared to show a solid lead for the incumbents. As Mark Mullen, the chairman of Transparency International Georgia, pointed out, what simple readings of those pre-election polls overlooked was the large share of respondents — a whopping 46 percent — who refused to pick a favorite. According to Mullen, that refusal was probably driven by fear of "taking risks that could have put [respondents] on the wrong side of the authorities." In an atmosphere of fraud or intimidation, it is a lot harder to make accurate forecasts, even in the rare cases for which we have professional polling data.

When it comes to predicting major political crises like wars, coups, and popular uprisings, there are many plausible predictors for which we don’t have any data at all, and much of what we do have is too sparse or too noisy to incorporate into carefully designed forecasting models. In a perfect world, forecasters would routinely receive survey data that would shed light on the sentiments and intentions of the people who might engage in these activities. In the real world, it’s tough to get honest answers to questions about people’s willingness to participate in extralegal activities like protests or rebellion — and that’s assuming they could even be reached in the first place.

Absent direct measures of interests and intentions, we’re forced to rely on measures of structural conditions that might shape political behavior. This is what some forecasters of presidential elections do, using things like incumbency, job growth, and changes in income to generate predictions months ahead of the vote. These kinds of models perform pretty well, but the forecasts they produce are typically less accurate than their poll-averaging counterparts.

The same logic holds in international affairs. Pretty much every theory of domestic political instability starts from the assumption that, other things being equal, poorer countries are more susceptible to crisis than wealthier ones. Simple, right? Just toss per capita GDP in your algorithm and move on to the next predictor.

Not so fast. As it happens, GDP estimates are produced by government agencies whose data-making capacity is directly related to the thing they’re trying to measure. Some countries, including Cuba and North Korea, don’t even report national economic statistics to the international bodies that collect them. And that’s close to the best-case scenario. Reliable measures of many other oft-mentioned risk factors, like unemployment and income inequality, were simply unavailable for almost all countries until very recently, and coverage is still largely confined to richer parts of the world.

These gaping holes in the historical record don’t make it impossible to generate useful statistical forecasts of international affairs. They do mean, however, that the forecasts we can make are much less accurate than the ones the poll-averaging modelers can produce for U.S. elections.

For rare events like coups or outbreaks of civil war — in most years, only a few of these events will occur worldwide — it’s easy to be right almost all the time by saying nothing will happen anywhere, but that’s also not particularly useful. The harder task is identifying where and when the occasional exceptions will occur without crying wolf too often.
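
To put rough numbers on that point, here is a minimal sketch in Python. Every figure in it is invented for illustration (160 countries, 4 coups in a year, a watch list of 20 countries that catches 3 of the 4), and the `score` helper is hypothetical rather than part of any real forecasting system. It compares a forecaster who always says "nothing will happen" with one who publishes a watch list.

```python
# Hypothetical numbers, chosen only for illustration (not from any dataset).
N_COUNTRIES = 160  # countries observed in a single year
N_EVENTS = 4       # countries that actually suffer a coup that year


def score(flagged, hits, n=N_COUNTRIES, events=N_EVENTS):
    """Return (accuracy, recall, false_alarms) for one year's forecasts.

    flagged: number of countries forecast to have an event
    hits:    how many of the flagged countries really had one
    """
    false_alarms = flagged - hits      # flagged, but nothing happened
    misses = events - hits             # events nobody flagged
    correct = n - false_alarms - misses
    accuracy = correct / n             # share of all calls that were right
    recall = hits / events             # share of real events that were caught
    return accuracy, recall, false_alarms


# Forecaster A: always says "nothing will happen anywhere."
always_no = score(flagged=0, hits=0)

# Forecaster B: flags 20 countries and catches 3 of the 4 events.
watch_list = score(flagged=20, hits=3)

for name, (acc, rec, fa) in [("always no", always_no), ("watch list", watch_list)]:
    print(f"{name}: accuracy={acc:.1%}, events caught={rec:.0%}, false alarms={fa}")
```

Under these assumed numbers, the always-no forecaster is right 97.5 percent of the time while catching none of the events. The watch list scores lower on raw accuracy, at about 89 percent, but catches three of the four events at the cost of 17 false alarms. That tradeoff, rather than accuracy alone, is what a useful rare-event forecast has to manage.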

This problem bears some resemblance to forecasting U.S. presidential elections, in which most of the 50 states dependably vote Democratic or Republican; the hard part is predicting the dozen or so swing states. In international politics, there are many cases that seem reliably "immune" to certain crises, and there’s often also a small but self-evident set of usual suspects. It’s the small but critical set of cases in between those two extremes that makes us work to earn our paychecks.

Again, though, difficult does not mean impossible. As Pennsylvania State University political scientist Philip Schrodt has pointed out, well-designed models have achieved a respectable level of accuracy on a range of forecasting problems, including outbreaks of civil war and mass atrocities and the occurrence of coups d’état. Still, these models usually aren’t as precise as we’d like. For every high-risk case that suffers a crisis, there are usually at least a handful that don’t, and occasionally a supposedly low-risk case just plain surprises us.

Data gleaned from the deluge of information now pouring over the Internet may soon help fill some of these gaps, but we’re not there yet. In the meantime, we must create forecasts with the data we have, not the data we want. It’s great that statistical forecasters won wider respect for their methods by nailing the outcome of this year’s U.S. presidential election. But it’s important for people to appreciate that not every forecasting problem can be solved by sprinkling it with math and silicon.