Big data show that history does indeed repeat itself. What does that mean for foreign policymaking, and for tackling crises from Ukraine to Syria?
By Kalev Leetaru
Kalev H. Leetaru is the Yahoo! Fellow in Residence at the Institute for the Study of Diplomacy in the Edmund A. Walsh School of Foreign Service at Georgetown University. His work centers on the application of "big data" toward understanding global human society in new ways, and he is the creator of the GDELT Project.
Working in policymaking is a lot like being a real-life time traveler: It is an inherently forward-looking process in which decisions are made today based on an estimate of what tomorrow might look like. Yet as the finger-pointing that accompanies every global crisis, from Ukraine to Syria, makes clear, humanity’s ability to forecast the future is poor at best.
This doesn’t have to be the case. From the ancient Egyptian symbol of the Ouroboros (a serpent eating its own tail) to the Hindu concept of the Yuga (cyclical epochs) to the science-fiction world of Isaac Asimov’s “psychohistory,” thinkers have long reasoned that history repeats itself and can thus be measured and predicted. In the case of psychohistory, it was posited that, much like the weather, the future actions of human society could be forecast with high accuracy if only there were sufficient data and computing power.
So has the failure to accurately forecast what will happen in the world — and to make policies to prevent catastrophe — merely been due to a lack of information about historical cycles? The answer, it seems, is “yes.”
The Global Database of Events, Language, and Tone (GDELT) project, which I founded, monitors the world’s broadcast, print, and web news from nearly every country in more than 100 languages and compiles a daily list of more than 300 categories of events, from riots to appeals for peace, down to the city level. It currently has more than a quarter-billion records from 1979 to the present, all of them available in Google’s BigQuery analysis platform. The resolution, time scale, and geographic coverage of this database make it ideal for exploring whether history really does fall into regular patterns, and whether those patterns are sufficiently robust to forecast the future.
Turn back the clock to Jan. 27, 2011: Egypt is wracked with anti-government protests, and analysts are furiously trying to determine whether there will be a quick resolution or whether the protests will bring months of turmoil to the country. Using GDELT and BigQuery to answer this question, a timeline could be created measuring the intensity of unrest (defined as material, nonverbal conflict) in Egypt by day over the preceding two months. Then, the history — from 1979 to 2010 — of unrest in every country in the world could be searched and compared to it. This could be done using a rolling, 60-day window. Take Afghanistan, for instance: The timeline of that country’s unrest from Jan. 1, 1979, to Mar. 2, 1979, could be compared against the selected period in Egypt. Then, Afghanistan’s unrest from Jan. 2, 1979, to Mar. 3, 1979, could be analyzed — and so on. This would allow researchers to identify past periods in history that are most similar to Egypt’s tumultuous two months.
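The rolling-window search described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual BigQuery code: it assumes similarity is measured with Pearson correlation, and the names `pearson`, `rank_similar_windows`, `target`, and `history` are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def rank_similar_windows(target, history, window=60):
    """Slide a window one day at a time over each country's series and
    score every window against the target timeline.

    target  -- list of daily unrest intensities, length == window
    history -- dict mapping country name to its full daily series
    Returns (score, country, start_index) tuples, best match first.
    """
    if len(target) != window:
        raise ValueError("target must contain exactly `window` days")
    scored = []
    for country, series in history.items():
        # One comparison per possible start day in this country's history.
        for start in range(len(series) - window + 1):
            score = pearson(target, series[start:start + window])
            scored.append((score, country, start))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored
```

With real data, `target` would hold Egypt's 60 daily values and `history` one long daily series per country. Correlation compares the shape of two timelines rather than their absolute levels, which is one plausible reading of "relative intensity" but remains an assumption here.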
Then, the top two most similar historical periods could be selected, and the average of what happened within them — what events transpired, how, and in what time frame — could be computed as a possible forecast of what might happen in Egypt in the immediate future.
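The averaging step might look like the following minimal sketch. It assumes the matched periods have already been identified as (country, start day) pairs and that all series are on a comparable relative scale; the function name `forecast_from_matches` and its parameters are hypothetical, chosen only for illustration.

```python
def forecast_from_matches(history, matches, window=60, horizon=60):
    """Average what followed each matched window, day by day.

    history -- dict mapping country name to its full daily series
    matches -- list of (country, start_index) pairs for the chosen windows
    Returns a list of `horizon` values: the mean follow-on intensity per day.
    """
    if not matches:
        raise ValueError("at least one match is required")
    follow_ons = []
    for country, start in matches:
        # The follow-on period begins the day after the matched window ends.
        after = history[country][start + window:start + window + horizon]
        if len(after) < horizon:
            raise ValueError(f"{country} has fewer than {horizon} days after its window")
        follow_ons.append(after)
    # zip(*follow_ons) groups the values for day 1, day 2, ... across matches.
    return [sum(day_values) / len(follow_ons) for day_values in zip(*follow_ons)]
```

Passing the two best-matching periods as `matches` yields a 60-day forecast curve that can be plotted against what actually happened in the target country.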
Figure 1 here shows the results of this exact process, plotting the relative intensity of unrest on the y-axis. To the left of the vertical black bar is Egypt’s selected two months in red, overlaid on top of one of the most similar periods of past world history: events in Sweden, from Oct. 4, 2010 through Dec. 3, 2010, which are shown in green. Sweden’s unrest during this period stemmed from the (ultimately disproved) anti-Semitic swastika attack against politician David von Arnold Antoni, the Dec. 7 arrest of Julian Assange, and the Dec. 11 Stockholm bombings. Despite entirely different circumstances, the two countries show nearly identical relative intensities of unrest over their respective 60-day periods.
To the right of the vertical black bar is what then happened in Sweden over the two months after Dec. 3, 2010, compared with what ended up happening in Egypt in the two months after Jan. 27, 2011. Immediately it is clear that, while there are significant differences, the trend lines bear a strong resemblance to each other. If an analyst had been running this system in 2011, he or she could have used the Sweden data to forecast what was going to unfold in Egypt: that is, roughly two weeks of notable unrest, then a return to (relative) levels of calm.
Figure 1 – Sweden 10/4/2010 – 12/3/2010 (green left of black line) and 12/3/2010 – 2/1/2011 (green right of black line), compared with Egypt (red).
In other words, by searching for periods of history all over the world similar to what is happening now in a single country, one can look at what happened after each of those periods as a measurable forecast of what might happen in the target country in the future. And as this graph indicates, history indeed seems to repeat itself — at least as seen through the eyes of the world’s news media. (Those mathematically inclined can read the rest of the details of the forecasting, including the underlying code.)
Replicating this process for Ukraine starting on Feb. 22, 2014 (the day then-President Viktor Yanukovych fled the country), and averaging the two most similar historical periods — from Turkey and Lebanon — ultimately yields the following forecast (green in Figure 2) compared with what actually happened in Ukraine over the following two months (red). This is particularly noteworthy because, unlike with Egypt, the post-period here exhibits far more complex behavior, breaking from traditional “media fatigue” and other journalistic effects.
Figure 2 – Ukraine (red) compared with average of Turkey and Lebanon (green) in the following 60 days, after the correlated time periods.
Of course, where there is forecasting there is a need to see whether or not it is accurately predicting events. So in addition to timelines, GDELT has developed a daily, visual update that shows the current state of the world, letting analysts watch events as they unfold and compare them to projections. In collaboration with the U.S. Institute of Peace, the new GDELT Global Dashboard offers a rolling, 180-day map of conflict and protests across the globe. The Islamic State’s march across Iraq is instantly visible, civil war in Ukraine bursts into view, and the clustering of protests and violence in Nigeria also leaps from the screen. This map is updated every morning, offering both a clickable layer of the latest events and a six-month animated context of what preceded the present state of affairs.
Figure 3 – The GDELT Global Dashboard.
For thousands of years philosophers and historians have argued that the world moves in broad, predictable patterns that provide insight into the future. Humanity has finally reached a critical juncture where it has sufficient data and computing power to begin to tease apart these patterns and understand them in a way that was simply impossible before. Forecasting the future of Egypt or Ukraine, by looking to all of the history of the last three decades, required more than 2.5 million comparisons — yet it took Google’s BigQuery system just 2.5 minutes to run them.
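A back-of-the-envelope check suggests where a figure of that size could come from. The sketch below assumes one comparison per 60-day window, sliding one day at a time across Jan. 1, 1979, through Dec. 31, 2010; the exact date bounds and the variable names are assumptions for illustration. It then works backward from the reported 2.5 million to the number of country-level series that count would imply.

```python
from datetime import date

WINDOW_DAYS = 60
first_day = date(1979, 1, 1)
last_day = date(2010, 12, 31)

# Days of history, counting both endpoints.
days_of_history = (last_day - first_day).days + 1

# Number of distinct 60-day windows when sliding one day at a time.
windows_per_series = days_of_history - WINDOW_DAYS + 1

# Figure reported in the text; the implied series count is derived, not sourced.
reported_comparisons = 2_500_000
implied_series = reported_comparisons / windows_per_series

print(days_of_history, windows_per_series, round(implied_series))
```

Under these assumptions each series contributes 11,629 windows, so 2.5 million comparisons corresponds to roughly 215 series, which is in line with one timeline per country or territory.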
Whether the patterns identified capture some kind of new mathematical equation that governs all of human life or — perhaps far more likely — a more precise mathematical definition of how journalism shapes our understanding of global events, they demonstrate the unprecedented power of the new generation of “big-data” tools. The world may not quite be at the level of Isaac Asimov’s psychohistory, able to predict global events like meteorologists do the weather. But it is now possible to glimpse the future — and better understand how to shape it.