Crowdsourcing high-quality structured data
Publication date
2019
Editors
Muñante, Denisse
Alatrista-Salas, Hugo
Lossio-Ventura, Juan Antonio
Document Type
Part of book
License
CC BY-NC-ND
Abstract
One of the most difficult problems faced by consumers of semi-structured and structured data on the Web is how to discover or create the data they need. Producers of Web data, in turn, have no (semi-)automated way to align their data production with consumer needs. In this paper we formalize the problem of a data marketplace and hypothesize that one can quantify the value of semi-structured and structured data given a set of consumers, and that this quantification can be applied both to existing datasets and to datasets that still need to be created. Furthermore, we provide an algorithm showing how the production of this data can be crowdsourced while assuring the consumer a certain level of quality. Using real-world empirical data collected from data producers and consumers, we simulate a crowdsourced data marketplace with quality guarantees.
Keywords
Crowdsourcing, Human computation, Resource allocation, Structured data, Taverne, General Computer Science, General Mathematics
Citation
Halpin, H & Lykourentzou, I 2019, Crowdsourcing high-quality structured data. in D Muñante, H Alatrista-Salas & J A Lossio-Ventura (eds), Information Management and Big Data - 5th International Conference, SIMBig 2018, Proceedings. Communications in Computer and Information Science, vol. 898, Springer, pp. 304-319, 5th International Conference on Information Management and Big Data, SIMBig 2018, Lima, Peru, 3 September 2018. https://doi.org/10.1007/978-3-030-11680-4_29