TuneR: Fine tuning of rule-based entity matchers
Publication date
2019-11-03
Document Type
Part of book
Abstract
A rule-based entity matching task requires the definition of an effective set of rules, which is a time-consuming and error-prone process. The typical approach is trial and error: rules are incrementally added and modified until satisfactory results are obtained. This approach requires significant human intervention, since a typical dataset needs a large number of rules, with possible interconnections among them, that cannot be managed manually. In this paper, we propose TuneR, a software library supporting developers (i.e., coders, scientists, and domain experts) in tuning sets of matching rules. It aims to reduce human intervention by offering a tool for optimizing rule sets according to user-defined criteria (such as effectiveness and interpretability). Our goal is to integrate the framework into the Magellan ecosystem, thus completing the functionality that developers require for performing Entity Matching tasks.
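To make the idea of tuning a matching rule concrete, the following is a minimal sketch and not TuneR's or Magellan's actual API. It assumes a single conjunctive rule over two hypothetical attributes (`name`, `city`), uses the standard-library `difflib.SequenceMatcher` as a stand-in similarity function, and picks thresholds by exhaustive grid search against a tiny hand-labeled sample, with F1 as the (assumed) user-defined criterion. All names, data, and the threshold grid are illustrative.

```python
from difflib import SequenceMatcher
from itertools import product


def sim(a, b):
    """Case-insensitive string similarity in [0, 1] (stand-in measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def rule_match(pair, name_t, city_t):
    """Hypothetical conjunctive rule: both similarities must reach their thresholds."""
    left, right = pair
    return (sim(left["name"], right["name"]) >= name_t
            and sim(left["city"], right["city"]) >= city_t)


def f1(pairs, labels, name_t, city_t):
    """F1 score of the rule on labeled record pairs."""
    preds = [rule_match(p, name_t, city_t) for p in pairs]
    tp = sum(1 for p, y in zip(preds, labels) if p and y)
    fp = sum(1 for p, y in zip(preds, labels) if p and not y)
    fn = sum(1 for p, y in zip(preds, labels) if not p and y)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def tune(pairs, labels, grid):
    """Grid search over threshold pairs; returns the first pair with the best F1."""
    return max(product(grid, grid), key=lambda t: f1(pairs, labels, *t))


# Illustrative labeled sample (made-up records, not from the paper).
pairs = [
    ({"name": "Acme Corp", "city": "Boston"},
     {"name": "ACME Corp.", "city": "Boston"}),
    ({"name": "Acme Corp", "city": "Boston"},
     {"name": "Apex Group", "city": "Denver"}),
    ({"name": "Globex", "city": "Springfield"},
     {"name": "Globex", "city": "Springfield"}),
]
labels = [True, False, True]

best = tune(pairs, labels, grid=[0.1, 0.5, 0.9])
print("best thresholds:", best, "F1:", f1(pairs, labels, *best))
```

Even in this toy setting, overly permissive thresholds admit the non-matching pair, which is the kind of manual trial and error the abstract describes; a real rule set would involve many rules and a search strategy more economical than a full grid.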
Keywords
Data deduplication, Data integration, Entity resolution, General Business, Management and Accounting, General Decision Sciences
Citation
Paganelli, M, Guerra, F, Sottovia, P & Velegrakis, Y 2019, TuneR: Fine tuning of rule-based entity matchers. in CIKM 2019 - Proceedings of the 28th ACM International Conference on Information and Knowledge Management. Association for Computing Machinery, pp. 2945-2948, 28th ACM International Conference on Information and Knowledge Management, CIKM 2019, Beijing, China, 3 November 2019. https://doi.org/10.1145/3357384.3357854