Empirical Methodology for Crowdsourcing Ground Truth

Tracking #: 2004-3217

Authors: 
Anca Dumitrache
Oana Inel
Benjamin Timmermans
Carlos Ortiz
Robert-Jan Sips
Lora Aroyo
Chris Welty

Responsible editor: 
Guest Editors Human Computation and Crowdsourcing

Submission type: 
Full Paper

Abstract: 
The process of gathering ground truth data through human annotation is a major bottleneck in the use of information extraction methods for populating the Semantic Web. Crowdsourcing-based approaches are gaining popularity as a way to address both the volume of data and the lack of annotators. Typically, these practices use inter-annotator agreement as a measure of quality. However, in many domains, such as event detection, the data are ambiguous and allow for a multitude of perspectives on the information examples. We present an empirically derived methodology for efficiently gathering ground truth data across a diverse set of use cases covering a variety of domains and annotation tasks. Central to our approach is the use of CrowdTruth metrics, which capture inter-annotator disagreement. We show that measuring disagreement is essential for acquiring a high-quality ground truth. We demonstrate this by comparing the quality of data aggregated with CrowdTruth metrics against majority vote, over a set of diverse crowdsourcing tasks: Medical Relation Extraction, Twitter Event Identification, News Event Extraction and Sound Interpretation. We also show that an increased number of crowd workers leads to growth and stabilization in the quality of annotations, going against the usual practice of employing a small number of annotators.
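
To make the contrast with majority vote concrete, the sketch below is a simplified Python illustration with assumed function names and data; it is not the CrowdTruth metrics themselves, which additionally weight workers, annotations and media units by quality scores. It only shows how keeping per-label support preserves the ambiguity that a majority vote discards.

# Minimal sketch (assumed names; not the authors' CrowdTruth implementation).
from collections import Counter

def majority_vote(annotations):
    """Collapse the crowd's answers to the single most frequent label."""
    return Counter(annotations).most_common(1)[0][0]

def annotation_scores(annotations):
    """Score each label by the fraction of workers who chose it,
    keeping the disagreement signal instead of discarding it."""
    counts = Counter(annotations)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}

# Hypothetical example: 15 workers annotate one sentence for a medical relation.
crowd = ["cause"] * 7 + ["treat"] * 5 + ["none"] * 3
print(majority_vote(crowd))      # 'cause' -- the ambiguity is lost
print(annotation_scores(crowd))  # cause ~0.47, treat ~0.33, none ~0.20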

Tags: 
Reviewed

Decision/Status: 
Accept

Solicited Reviews:
Review #1
Anonymous submitted on 21/Nov/2018
Suggestion:
Accept
Review Comment:

I thank the authors for their effort in addressing my comments and criticisms. Overall, I think they can be happy, because the introduced modifications have definitely improved the readability and understanding of their methods and experiments. I have no further comments to add.

Review #2
By Maribel Acosta submitted on 16/Jan/2019
Suggestion:
Accept
Review Comment:

I would like to thank the authors for including the task timestamps in the *_raw.csv files on GitHub.