How much can a superfast algorithm tell us about Iran? Quite a lot, actually.
By Kalev Leetaru
Kalev H. Leetaru is the Yahoo! Fellow in Residence at the Institute for the Study of Diplomacy in the Edmund A. Walsh School of Foreign Service at Georgetown University. His work centers on the application of "big data" towards understanding global human society in new ways and he is the creator of the GDELT Project.
Iran’s nuclear program has been one of the hottest topics in foreign policy for years, and attention has only intensified over the past few days, as an interim agreement was reached in Geneva to limit enrichment activity in pursuit of a more comprehensive deal. The details of the deal itself are of course interesting, but in aggregate the news stories about Iran can tell us far more than we can learn simply by reading each story on its own. By using “big data” analytics of the world’s media coverage, combined with network visualization techniques, we can study the web of relationships around any given storyline — whether it focuses on an individual, a topic, or even an entire country. Using these powerful techniques, we can move beyond specifics to patterns — and the patterns tell us that our understanding of Iran is both sharp and sharply limited.
In the diagram below, every global English-language news article monitored by the GDELT Global Knowledge Graph — a massive compilation of the world’s people, organizations, locations, themes, emotions, and events — has been analyzed to identify all people mentioned in articles referencing any location in Iran between April and October 2013. A list was compiled of every person mentioned in each article, and all names mentioned in an article together were connected. The end result was a network diagram of all of the people mentioned in the global news coverage of Iran over the last seven months and who has appeared with whom in that coverage.
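For readers who want to see the mechanics, the co-occurrence step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the GDELT pipeline itself: the function name build_cooccurrence and the toy article lists are hypothetical, and the sketch assumes the names in each article have already been extracted.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(articles):
    """Count, per name, the articles it appears in, and per pair of names,
    the articles in which both appear together.

    `articles` is an iterable of lists of person names, one list per article.
    """
    article_counts = Counter()  # name -> number of articles mentioning it
    edge_counts = Counter()     # (name_a, name_b) -> number of shared articles
    for names in articles:
        unique = sorted(set(names))  # count each name once per article
        article_counts.update(unique)
        edge_counts.update(combinations(unique, 2))
    return article_counts, edge_counts


# Hypothetical toy input, for illustration only.
articles = [
    ["Hassan Rouhani", "Barack Obama", "John Kerry"],
    ["Hassan Rouhani", "Ali Khamenei"],
    ["Barack Obama", "Hassan Rouhani", "Barack Obama"],
]
article_counts, edge_counts = build_cooccurrence(articles)
```

Sorting the names within each article keeps every pair in a single canonical order, so the same two people are never counted as two different connections.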
This network diagram was then visualized using a technique that groups individuals who are more closely connected with each other, placing them physically more proximate in the diagram, while placing individuals with fewer connections farther apart. Then, using an approach known as “community finding,” clusters of people who are more closely connected with each other than with the rest of the network were drawn in the same color. The specific color assigned to each group is not meaningful, only that people drawn in the same color are more closely connected to one another. Together, these two approaches make the overall macro-level structure of the network instantly clear, teasing apart the clusters and connections among the newsmakers defining Iranian news coverage.
(For the technical readers, the software used was Gephi, the layout algorithm was “Force Atlas 2,” and the community-finding tool was Blondel et al.’s implementation of modularity finding.)
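Modularity, the score that this kind of community finding tries to maximize, can be computed directly for any proposed grouping. The sketch below is not Gephi's implementation and does not search for the best grouping; it only scores a given one, using the standard definition of modularity for an undirected weighted network. The function name modularity and the toy network (two triangles joined by a single edge) are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community):
    """Score a partition of an undirected weighted network.

    `edges` maps (a, b) -> weight, with no self-loops and each pair listed once.
    `community` maps node -> community label.
    """
    m = sum(edges.values())        # total edge weight in the network
    internal = defaultdict(float)  # edge weight falling inside each community
    degree = defaultdict(float)    # summed node degrees for each community
    for (a, b), w in edges.items():
        degree[community[a]] += w
        degree[community[b]] += w
        if community[a] == community[b]:
            internal[community[a]] += w
    # Observed share of internal edges minus the share expected by chance.
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Hypothetical toy network: two triangles joined by one edge (C-D).
edges = {
    ("A", "B"): 1, ("B", "C"): 1, ("A", "C"): 1,
    ("D", "E"): 1, ("E", "F"): 1, ("D", "F"): 1,
    ("C", "D"): 1,
}
split = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1, "F": 1}
lumped = {node: 0 for node in split}

q_split = modularity(edges, split)    # the two triangles as separate groups
q_lumped = modularity(edges, lumped)  # everyone in a single group
```

Splitting the toy network into its two triangles scores higher than lumping everyone together, which is the sense in which same-colored clusters are "more closely connected with each other than with the rest of the network."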
Because most names in the news occur in just a handful of articles, the visual above shows the result of filtering the network to show only those names that occurred in 15 or more articles. This eliminates the vast majority of names, while preserving names that are more likely to be directly related to Iranian affairs and still capturing a broad swath of the discourse around Iran. The purple cluster is largely the United States and its allies, with Barack Obama right in the center, while the dark blue node toward the lower center of the entire network is Edward Snowden, capturing the way in which he has become one of the most prominent figures in discussion of U.S. foreign policy. This is a fascinating finding: While Snowden obviously has no part in the Iranian-U.S. nuclear talks, his outsized role in the global conversation about U.S. foreign policy has made him part of the context in which those talks are discussed. In particular, there has been substantial media coverage connecting the approaches Snowden used to defeat the NSA’s internal security procedures with some of those used by the United States in its attempts to sabotage Iran’s nuclear efforts. The media has also used the materials Snowden has released to reconstruct how U.S. spy agencies may have been involved in the Stuxnet attack on Iran.
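The filtering step is simple to express in code. The following is a minimal sketch, assuming per-name article counts and per-pair co-occurrence counts are already available; the function name filter_network, the placeholder names, and the toy counts are hypothetical and are not figures from the actual analysis.

```python
def filter_network(article_counts, edge_counts, min_articles=15):
    """Keep only names seen in at least `min_articles` articles,
    along with the connections between the surviving names."""
    keep = {name for name, n in article_counts.items() if n >= min_articles}
    edges = {
        (a, b): w
        for (a, b), w in edge_counts.items()
        if a in keep and b in keep
    }
    return keep, edges


# Hypothetical toy counts, for illustration only.
article_counts = {"Person A": 40, "Person B": 15, "Person C": 3}
edge_counts = {("Person A", "Person B"): 12, ("Person A", "Person C"): 2}

names, edges = filter_network(article_counts, edge_counts, min_articles=15)
```

Raising min_articles trades breadth for clarity: fewer names survive, but those that remain are the ones coverage returns to again and again.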
The blue-green cluster in the bottom right largely consists of Israeli reporters and commentators, while the light blue cluster at top left consists of international reporters. The yellow cluster along the left side of the graph is where all of the Iranian names appear, with key figures like Hassan Rouhani, Ali Khamenei, Mohammad Javad Zarif, and Mahmoud Ahmadinejad all playing prominent roles in bridging Iran to the other clusters. Iranian politicians like Esfandiar Rahim Mashaei, Mohammad-Reza Aref, and Gholam Ali Haddad-Adel play central roles within the cluster, reflecting both their importance inside Iran and their limited engagement and contextualization with the rest of the world over the last several months.
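One possible way to put a number on the difference between "bridging" figures and internally central ones is to measure what share of each person's connections lead outside their own cluster. This measure is an assumption introduced here for illustration and is not something the analysis above reports; the function name external_share and the toy network are likewise hypothetical.

```python
from collections import defaultdict


def external_share(edges, community):
    """For each node, return the fraction of its edge weight that
    connects to nodes in a different community."""
    total = defaultdict(float)
    outside = defaultdict(float)
    for (a, b), w in edges.items():
        total[a] += w
        total[b] += w
        if community[a] != community[b]:
            outside[a] += w
            outside[b] += w
    return {node: outside[node] / total[node] for node in total}


# Hypothetical toy network: two triangles joined by one edge (C-D).
edges = {
    ("A", "B"): 1, ("B", "C"): 1, ("A", "C"): 1,
    ("D", "E"): 1, ("E", "F"): 1, ("D", "F"): 1,
    ("C", "D"): 1,
}
community = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1, "F": 1}

shares = external_share(edges, community)
```

In the toy network, C and D each send a third of their connections across the divide while everyone else sends none, which mirrors the contrast between outward-facing figures and those who appear mostly alongside members of their own cluster.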
The fact that this network accurately distinguishes internal and external leaders is a critical finding. Such resolving power means that this approach of externally mapping the newsmaker network around a country using public news coverage is sufficiently accurate to capture the nuance between newsmakers who operate largely within a country and those who have a more external role, and the external newsmakers with whom they are most closely connected. That such a news-based network would be capable of perceiving such nuanced detail suggests this approach may have powerful applications for mapping the internal structure of countries and organizations that receive considerable media coverage, but for which policymakers lack the detailed leadership diagrams compiled for higher-profile subjects like Iran.
The visual also makes it clear that the discourse around Iran does not focus on Iran itself or its internal politics, but rather on its nuclear ambitions and how they fit into the rest of the world. In particular, there is a strong Western-centric narrative to the English-language coverage around Iran, emphasizing U.S. interests, with Iranian leaders mentioned only in passing as they relate to those interests. In other words, news coverage across the world focuses on what the United States wants from Iran and what Iran needs to do to satisfy those demands, rather than the Iranian perspective on its role in the world. This is a key finding, as it reflects Iran’s intense marginalization over its nuclear program and is in contrast to other nations like Egypt. (An interactive version of this network is available here.)
The visualization below displays the same network as above, but this time filters to include only names and connections appearing in at least 50 articles, reflecting the most dominant newsmakers in global coverage of Iran. As one might expect, this graph reflects a much simpler structure, with Iranian figures occupying the lower green segment and the United States, its allies, and related countries like Russia occupying the top yellow cluster. The orange cluster at far lower left consists of a set of major reporters and a few politicians connected back to the broader network through Edward Snowden. In a nod to Israel’s key influence, Benjamin Netanyahu is the central pink node that connects the U.S. and Iranian clusters, while other key European figures like Catherine Ashton and William Hague also reside in this interface layer. (An interactive version of this network is available here.)
In Iran, it is notable that the actual nexus of power, Supreme Leader Ali Khamenei, does not occupy a more central network role than current President Hassan Rouhani or former President Mahmoud Ahmadinejad. This reflects the fact that despite his ultimate authority over Iranian politics, he maintains a relatively low external profile, delegating most interactions with foreign dignitaries and formal public statements of policy. This kind of analysis can be used to better understand how a nation’s political elite view themselves and their internal and external roles, and especially how this may be changing over time.
Perhaps the most interesting finding is that when these “newsmaker networks” are constructed for a nation, the resulting layers of the network appear to at least largely match the broad contours of the political environment in that nation. Leaders most closely connected with external nations like the United States are those representing that nation’s foreign policy efforts, while those in successively inward layers are those who occupy progressively more internal roles in domestic politics.
This, however, raises the critical question: If data mining only tells us what we already know, is it actually useful? The ability of a network diagram, constructed automatically by computers and entirely from open global news coverage, to capture at least a semblance of the internal political structure of a nation, especially the separation of internal and external layers, is remarkable in what it enables. Our deep understanding of Iran’s internal power structure comes only through the breathtaking investment by the U.S. foreign policy community of decades of intense study by vast teams of analysts. The ability of a computer algorithm to arrive at even a remotely similar diagram in a matter of seconds based only on open news coverage represents a fundamental transformation of our ability to rapidly understand a world in constant change.