Early demographic studies by Ipeirotis revealed that while the majority of Turkers were located in the US, India accounted for a substantial 30% of Turkers. Follow-up research by Ross et al. suggested that the international presence on MTurk has been growing over time, with India accounting for 36% of workers at the time of the study. While there has not yet been a thorough investigation of Turkers' language abilities, Munro compiled survey responses from 2,000 Turkers, revealing that four of the six most represented languages come from India (the top six being Hindi, Malayalam, Tamil, Spanish, French, and Telugu).
NLP and ML researchers have shown an increasing interest in using MTurk as a means of data collection. Snow et al. describe the success of using redundant non-expert labels to substitute for professional annotations, achieving comparable quality at much lower cost. The tasks in Snow et al., however, are kept simple and accessible for the average English-speaking Turker. As NLP research advances, the level of expertise required from annotators advances as well. Callison-Burch et al. report success using MTurk to build parallel corpora for machine translation, a task that requires Turkers to speak two languages with a high level of proficiency. As the number of international and bilingual Turkers grows, particularly Turkers speaking low-resource languages, it is natural to ask to what extent we can rely on MTurk for accurate translations, and how confidently we can screen Turkers to ensure high-quality results.
- Callison-Burch, Chris, and Mark Dredze. "Creating speech and language data with Amazon's Mechanical Turk." Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk. Association for Computational Linguistics, 2010.
- Downs, Julie S., et al. "Are your participants gaming the system?: screening mechanical turk workers." Proceedings of the 28th international conference on Human factors in computing systems. ACM, 2010.
- Munro, Robert, and Hal Tily. "The Start of the Art: An Introduction to Crowdsourcing Technologies for Language and Cognition Studies."
- Ross, Joel, et al. "Who are the crowdworkers?: shifting demographics in mechanical turk." Proceedings of the 28th of the international conference extended abstracts on Human factors in computing systems. ACM, 2010.
- Snow, Rion, et al. "Cheap and fast---but is it good?: evaluating non-expert annotations for natural language tasks." Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2008.
The scatter plots below show quality against the number of assignments submitted, grouped by country and by reported native language. In the plots on the right, the points are resized according to how much the contributing Turkers worked. (More specifically, size is the number of assignments per Turker, so bigger circles mean that a few Turkers were performing a lot of HITs.)
(The x-axis is the number of individual controls graded, so it is directly proportional to the number of assignments, but blown up by about an order of magnitude.)
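As a rough illustration of how one point per country or language could be derived from per-assignment data, here is a minimal sketch. The record fields (`worker_id`, `country`, `language`, `controls_graded`, `controls_correct`) and the function name are hypothetical, and treating quality as the fraction of graded controls answered correctly is an assumption, not necessarily the measure used for the plots.

```python
from collections import defaultdict


def summarize_by_group(records, key):
    """Collapse per-assignment records into one scatter-plot point per group.

    records: iterable of dicts with the hypothetical fields
             'worker_id', 'country', 'language',
             'controls_graded', 'controls_correct'.
    key:     field to group by, e.g. 'country' or 'language'.
    """
    groups = defaultdict(
        lambda: {"workers": set(), "assignments": 0, "graded": 0, "correct": 0}
    )
    for r in records:
        g = groups[r[key]]
        g["workers"].add(r["worker_id"])
        g["assignments"] += 1
        g["graded"] += r["controls_graded"]
        g["correct"] += r["controls_correct"]

    points = {}
    for name, g in groups.items():
        points[name] = {
            # x: number of individual controls graded for the group
            "x": g["graded"],
            # y: assumed quality measure, fraction of controls correct
            "y": g["correct"] / g["graded"] if g["graded"] else None,
            # size: assignments per Turker in the group
            "size": g["assignments"] / len(g["workers"]),
        }
    return points


# Made-up records, for illustration only.
records = [
    {"worker_id": "w1", "country": "IN", "language": "Hindi",
     "controls_graded": 10, "controls_correct": 8},
    {"worker_id": "w1", "country": "IN", "language": "Hindi",
     "controls_graded": 10, "controls_correct": 9},
    {"worker_id": "w2", "country": "IN", "language": "Tamil",
     "controls_graded": 10, "controls_correct": 5},
    {"worker_id": "w3", "country": "US", "language": "English",
     "controls_graded": 10, "controls_correct": 10},
]

by_country = summarize_by_group(records, "country")
by_language = summarize_by_group(records, "language")
```

Grouping by `"country"` or `"language"` reuses the same aggregation, so the two sets of plots differ only in the grouping key.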
[Figure: scatter plots by country. Each point represents a country.]
[Figure: scatter plots by reported native language. Each point represents a language.]