Crowdsourcing high-quality structured data

Publication date

2019

Authors

Halpin, Harry
Lykourentzou, Ioanna (ISNI 000000049291073X)

Editors

Muñante, Denisse
Alatrista-Salas, Hugo
Lossio-Ventura, Juan Antonio

Document Type

Part of book
Open Access

License

CC BY-NC-ND

Abstract

One of the most difficult problems faced by consumers of semi-structured and structured data on the Web is how to discover or create the data they need. Producers of Web data, in turn, have no (semi-)automated way to align their data production with consumer needs. In this paper we formalize the problem of a data marketplace and hypothesize that one can quantify the value of semi-structured and structured data given a set of consumers, and that this quantification can be applied to both existing data-sets and data-sets that need to be created. Furthermore, we provide an algorithm showing how the production of this data can be crowd-sourced while assuring the consumer a certain level of quality. Using real-world empirical data collected from data producers and consumers, we simulate a crowd-sourced data marketplace with quality guarantees.

Keywords

Crowdsourcing, Human computation, Resource allocation, Structured data, Taverne, General Computer Science, General Mathematics

Citation

Halpin, H & Lykourentzou, I 2019, Crowdsourcing high-quality structured data. in D Muñante, H Alatrista-Salas & J A Lossio-Ventura (eds), Information Management and Big Data - 5th International Conference, SIMBig 2018, Proceedings. Communications in Computer and Information Science, vol. 898, Springer, pp. 304-319, 5th International Conference on Information Management and Big Data, SIMBig 2018, Lima, Peru, 3/09/18. https://doi.org/10.1007/978-3-030-11680-4_29