Germany’s World Cup win over Argentina on Sunday wasn’t just a victory for the German national team. It was also a victory for Microsoft’s big-data team, which correctly predicted the outcome of every match in the tournament’s knockout stage, including picking the Germans to win it all.
For Microsoft’s big-data team, soccer is just one of the many fields that it hopes number crunching can help it dominate. The team has trained its analytical powers on a variety of events, from selections in the NBA draft to the outcomes of reality TV shows like American Idol and The Voice. The predictions have been impressive and in some cases perfect. For the 2014 Oscars, the model correctly picked 21 of 24 award winners, including the winners in all of the major categories.
Meanwhile, the World Cup model has turned heads, beating rivals like Nate Silver’s FiveThirtyEight blog, which had host nation Brazil taking home the trophy. "I approach modeling the World Cup the same I would any other event," David Rothschild, an economist at Microsoft and an architect of its World Cup model, said in an interview. "The trick is to make a forecast that cuts out subjectivity and lets the data do the talking."
Predictive mathematical models are hardly new, but they are becoming increasingly accurate. As big data in sports continues to gain prominence, crossover into other fields is inevitable. "The exact same infrastructure that goes into sports forecasts is the same stuff that will allow us to answer business and international policy questions," argues Rothschild. Already, models similar to Microsoft’s World Cup predictor are being employed in other areas. "Sports are fun, but we can use the same techniques to predict elections or watch stocks. It’s all about analyzing the raw data," says Rothschild.
Rothschild was deliberately vague about how the algorithm works, but attributed its success to the mountains of data Microsoft was able to sponge up as the World Cup went on. By the time the knockout round began, the data had snowballed, giving Rothschild enough information on player and team performance to calibrate his model and adjust his forecasts for the coming matches. While other World Cup models remained fixed on pre-tournament statistics, Rothschild’s was updated after each match.
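Since Rothschild did not disclose the algorithm, the idea of a forecast that is revised after every match can only be illustrated with a stand-in. The sketch below uses a simple Elo-style rating update, which is not Microsoft's method; the team names, starting ratings, and the `k` step size are all hypothetical values chosen for illustration.

```python
def expected_score(rating_a, rating_b):
    """Elo-style probability-like score for team A against team B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(ratings, team_a, team_b, score_a, k=20.0):
    """Return new ratings after one match.

    score_a is 1.0 if team_a won, 0.5 for a draw, 0.0 if team_a lost.
    k (hypothetical value) controls how strongly one result moves the ratings.
    """
    exp_a = expected_score(ratings[team_a], ratings[team_b])
    delta = k * (score_a - exp_a)
    updated = dict(ratings)  # leave the caller's dict untouched
    updated[team_a] += delta
    updated[team_b] -= delta
    return updated


# Hypothetical pre-tournament ratings: both teams start level.
pre_tournament = {"Team A": 1500.0, "Team B": 1500.0}
forecast_before = expected_score(pre_tournament["Team A"], pre_tournament["Team B"])

# Team A wins a match, so the ratings and the next forecast shift.
in_tournament = update_ratings(pre_tournament, "Team A", "Team B", score_a=1.0)
forecast_after = expected_score(in_tournament["Team A"], in_tournament["Team B"])

print(f"Forecast for Team A before the match: {forecast_before:.3f}")
print(f"Forecast for Team A after its win:    {forecast_after:.3f}")
```

A model that stayed fixed on the pre-tournament numbers would keep reporting the first forecast, while the updated one moves with each result. That is the contrast drawn above, in miniature.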
In many ways, this type of modeling is the inevitable byproduct of the current information age, where the ability to analyze data is finally catching up to the ability to collect it. Not only do analysts have more information than ever to work with for their models, but they also have the technology to compute it into something coherent — in a fraction of the time. "A few years ago I would have to wait until each game is over to access all the stats," recalls Rothschild. "Now, it is being sent automatically in real time, which makes our models better adjusted and more accurate."
That brings Microsoft into competition with other stats wonks like Silver. Although his World Cup model didn’t fare too well, Silver shot to fame through his startlingly accurate predictions of the 2012 elections. Similar big-data analyses are being deployed in business, government, economics, and social science.
But perfection in predictive analytics remains elusive. Microsoft’s track record during the World Cup is a testament to the strength of the team’s model, and its success makes a compelling case for the power of big data.
Though, as Silver has learned, models are great only until they aren’t.