Should Researchers Trust the Mechanical Turk?

By Joshua Keating, a former associate editor at Foreign Policy.

Mechanical Turk (MT) is an online labor marketplace run by Amazon in which large groups of online workers ("Turkers") can be recruited to carry out routine tasks that, for whatever reason, can’t be automated, such as tagging content in online videos. (The service is named after a famous 18th-century fraud in which people were fooled into believing they were playing chess against a life-size, supposedly automated "Turk" that was in fact controlled by a skilled human player.)

In recent years, MT has become a popular way for social scientists, particularly psychologists, to recruit test subjects. Turkers come cheap, after all: they’re often paid as little as $1.50 an hour, and they are available in abundance. A number of the papers I’ve discussed on this site have relied in whole or in part on MT samples.

In a two-part blog post, Dan Kahan — a professor of law and psychology at Yale University — takes researchers to task for relying on Turk subjects in their research. Kahan discusses three primary flaws with Turk samples:

1. Selection bias. Given the types of tasks performed by MT workers, there’s good reason to suspect subjects recruited via MT differ in material ways from the people in the world whose dispositions we are interested in measuring, particularly conservative males.

2. Prior, repeated exposure to study measures. Many MT workers have participated multiple times in studies that use performance-based measures of cognition and have discussed among themselves what the answers are. Their scores are thus not valid.

3. MT subjects misrepresent their nationality. Some fraction of the MT work force participating in studies that are limited to "U.S. residents only" aren’t in fact U.S. residents, thereby defeating inferences about how psychological dynamics distinctive of U.S. citizens of diverse ideologies operate.

On the first point, Kahan notes that MT workers are 62 percent female, only 5 percent African-American (compared with 12 percent of the general U.S. population), and 53 percent self-identified liberals (compared with only 20 percent of the general population).

The problem of representativeness in test samples isn’t limited to MT, of course. A 2011 University of British Columbia study noted that "in the top international journals in six fields of psychology from 2003 to 2007, 68 percent of subjects came from the United States, and a whopping 96 percent from Western, industrialized countries. In one journal, 67 percent of American subjects and 80 percent of non-American subjects were undergraduates in psychology courses." This means that a good deal of what we know about the human brain comes from research on so-called WEIRD test subjects: people from Western, educated, industrialized, rich, and democratic societies.

Still, if Turk-recruited subjects are essentially becoming professional test-takers, misrepresenting their qualifications and even their nationality in order to be eligible for studies, that’s a bigger problem than making generalizations based on the responses of UC undergrads.

And the bad press continues for psych research.

Joshua Keating is a former associate editor at Foreign Policy. Twitter: @joshuakeating
