Half a Billion Clicks Can’t Be Wrong
What big data tells us about next year’s crisis zones.
Everyone likes to close the outgoing year with lists and rankings, and a particular favorite of the foreign policy world is the fragility index, which ranks every nation in the world by how much it destabilized or re-stabilized over the previous year, then estimates whether 2014 might be the year it finally unravels. Washington, a city where it seems you can’t sit down to lunch without hearing the neighboring table’s prognostications on the fate of the world, is of course no stranger to such rankings: every think tank, academic, and policy wonk around town seems to have their picks. What, then, could big data possibly contribute to this mix?
Most country instability rankings blend structural indicators like GDP or infant mortality with lists of conflict events like attacks, coups, and protests, perhaps with some subject-matter expert scores tossed in for good measure. The more sophisticated rankings usually incorporate a list of major conflicts from the preceding year compiled from news coverage, often curated by hand and including only the largest events that the compilers deem most important. Yet, while the lists of events fed to these models are traditionally drawn from news coverage, most current event databases view the news as merely a “daybook” of physical occurrences to be cataloged into a spreadsheet. In doing so, they ignore one of the media’s greatest information channels: the differences in the volume of coverage each event receives, which serve as a proxy for how broadly “significant” or “newsworthy” the news media judged the event to be.
The 2011 Egyptian Tahrir Square protest or the 2013 Euromaidan protest in Kiev would count as a single “protest event” in most datasets (or at least each day of the protest would count as an event), meaning there is no way to distinguish these two protests from the tens of thousands of other protests around the world that occurred at the same time. Yet, it takes only a single glance at the world’s headlines on those days to notice the near-global discussion centering on Tahrir Square or the Maidan, offering a rapid assessment that the world viewed them as particularly significant. Even in the case of more routine unrest, the global visibility an action receives through an avalanche of media coverage escalates its profile and potential impact, irrespective of what theory or past events might suggest. Indeed, my 2011 “Culturomics 2.0” study demonstrated the unique insights gleaned from looking at how the media covers an event, rather than just what it covers.
The Global Database of Events, Language, and Tone (GDELT) project is the largest event database in the world, capturing over a quarter-billion events in every country, down to the city level, across 300 categories, from 1979 to the present, with daily updates adding 100,000 events.
Moreover, GDELT is 100 percent free, with the entire dataset available for immediate download and a growing ecosystem of tools available to work with it. It uses a fully automated system to monitor the world’s news media across every country and compile a daily database not only of what’s happening, but, of greatest interest here, of how much media attention it is receiving. Thus, using GDELT’s “Material Conflict” category, which encompasses the wide range of conflict activities that nations undergo, one can quite easily compile a massive database of conflict in 2013 and how many articles covered each event. Then we can instantly compare it with 2012’s unrest to create a ranking of the biggest trends of 2013.
Of course, the notion of measuring news attention is not new to the field of instability, and there are other rankings that attempt to integrate media volume in some way. What is unique here, however, is the sheer scale and global coverage that GDELT enables. In total, 675 million references to 69 million events were processed to locate every Material Conflict event worldwide that GDELT recorded in 2012-2013, and each event was scored by the total number of media mentions it received.
The resulting report is likely the largest event-based annual country ranking ever created, dwarfing the 3 million events captured by the U.S. Department of Defense’s Integrated Crisis Early Warning System (ICEWS) database during this period, despite ICEWS’ price tag now exceeding $50 million. GDELT’s size and scope offer a unique opportunity to passively “crowdsource” the global news media and see what the most newsworthy conflict stories of 2013 were compared with 2012, while its open nature allows others to build on the ranking presented here and create their own rankings and deep dives into the data.
In summary, to create the map you see here, every Material Conflict event in 2013 was compiled by country and the total volume of news coverage each country received was tallied. Countries that had a reduction in the volume of global news coverage of their conflict in 2013 compared with 2012 are colored green, while those with increases are colored red. Yellow countries are those that did not experience a significant change in the volume of coverage of their conflict between the two years.
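The tallying and coloring described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method behind the report: the record fields (`country`, `year`, `mentions`), the placeholder country codes, the sample numbers, the use of relative change, and the 10 percent cutoff for a “significant” change are all hypothetical, since the article does not specify them.

```python
from collections import defaultdict

# Hypothetical cutoff: the article does not define a "significant" change.
SIGNIFICANT_CHANGE = 0.10

# Made-up sample records; country codes and mention counts are placeholders.
SAMPLE_EVENTS = [
    {"country": "AA", "year": 2012, "mentions": 100},
    {"country": "AA", "year": 2013, "mentions": 150},
    {"country": "AA", "year": 2013, "mentions": 50},
    {"country": "BB", "year": 2012, "mentions": 400},
    {"country": "BB", "year": 2013, "mentions": 200},
    {"country": "CC", "year": 2012, "mentions": 100},
    {"country": "CC", "year": 2013, "mentions": 105},
]


def tally_mentions(events):
    """Sum media mentions per (country, year) across all event records."""
    totals = defaultdict(int)
    for event in events:
        totals[(event["country"], event["year"])] += event["mentions"]
    return totals


def color_countries(events, base_year=2012, target_year=2013):
    """Assign red/green/yellow by relative change in coverage volume."""
    totals = tally_mentions(events)
    countries = sorted({country for country, _ in totals})
    colors = {}
    for country in countries:
        before = totals.get((country, base_year), 0)
        after = totals.get((country, target_year), 0)
        if before == 0:
            # No baseline coverage: treat any new coverage as an increase.
            colors[country] = "red" if after > 0 else "yellow"
            continue
        change = (after - before) / before
        if change > SIGNIFICANT_CHANGE:
            colors[country] = "red"      # more conflict coverage than before
        elif change < -SIGNIFICANT_CHANGE:
            colors[country] = "green"    # less conflict coverage than before
        else:
            colors[country] = "yellow"   # no significant change
    return colors


if __name__ == "__main__":
    for country, color in color_countries(SAMPLE_EVENTS).items():
        print(country, color)
```

Because the coloring depends only on the year-over-year change, a country with a very large raw volume of coverage can still come out green if that volume fell.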
It is important to note two critical things about this ranking. The first is that it focuses on change in coverage of conflict, not the raw volume of that coverage itself. It is obviously not news that Afghanistan, Iraq, Libya, and Syria are all still undergoing intense conflict; the policymaking question is whether they are getting any better (at least in the eyes of the news media). Thus, Syria, which ranks No. 2 out of all countries in terms of total raw volume of conflict coverage, actually had the greatest decrease in coverage of that conflict in 2013 (despite a major chemical weapons attack in August 2013) and thus is green in the map above. The second thing to keep in mind is that this ranking combines all forms of conflict, both domestic and foreign. France’s significant increase in conflict comes from a combination of domestic strife (increasing immigrant unrest, anti-Semitism, class wars, and societal fractionalization) and its foreign military interventions in Africa, from Mali to the Central African Republic.
The top entry on the list, Egypt, will likely surprise few. After 2011’s revolution, 2012 was a relatively calm year so to speak, but 2013 saw the country unravel at the seams. India has had a tumultuous year, from sexual violence to protests, while al-Shabab and radicalism continued to take root in Kenya, including the Westgate shopping mall attack. Terrorism is also back in the spotlight in Russia, including links to April’s Boston Marathon bombing and a growing domestic danger, while anti-Putin crackdowns, increasing anti-gay and anti-immigrant violence, and a rising neo-Nazi and nationalistic fervor cement it firmly in the top five of 2013. On the other end of the spectrum, for all of the talk about Iran this past year, in terms of actual material conflict, it had the fifth-greatest shift away from coverage of its unrest, and 2013 was a relatively quiet year for Israel, as well, compared to 2012’s military action against Hamas in Gaza.
Of course, the best part is being able to dive in yourself, so without further ado, download the complete 172-page report and take a look at the countries you are most interested in, or check out how they compare in the master ranking table at the end of the report. Take a look through what a big data view of 675 million mentions of conflict tells us about how the world is changing. When you’re done, sign up for the new GDELT Daily Trends Report email and get a miniature version of this delivered to your inbox each morning. Big data is giving us our first glimpse of a world in which we can map the Earth’s riots as well as we can its earthquakes and hurricanes, all from just reading the news a little more carefully.