Working paper

Entity matching with similarity encoding: A supervised learning recommendation framework for linking (big) data

In this study, we introduce a novel entity matching (EM) framework. It combines state-of-the-art EM approaches based on Artificial Neural Networks (ANN) with a new similarity encoding derived from matching techniques that are prevalent in finance and economics. Our framework is on par with or outperforms alternative end-to-end frameworks in standard benchmark cases. Because similarity encoding is constructed using (edit) distances instead of semantic similarities, it avoids out-of-vocabulary problems when matching dirty data. We highlight this property with an EM application to dirty financial firm-level data extracted from historical archives.
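
The idea of encoding record pairs through edit distances rather than semantic embeddings can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`levenshtein`, `similarity`, `encode_pair`), the length normalization, and the example records are assumptions chosen purely for illustration. The sketch only shows why a misspelled token still yields a usable feature without any vocabulary lookup.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Map the edit distance to [0, 1]; 1.0 means identical strings.

    Dividing by the longer length is an assumed normalization.
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def encode_pair(left: dict, right: dict, fields: list) -> list:
    """Hypothetical encoder: one similarity score per compared field."""
    return [similarity(left.get(f, ""), right.get(f, "")) for f in fields]


# Made-up records with a transposition typo in the second name.
record_a = {"name": "Deutsche Bank AG", "city": "Frankfurt"}
record_b = {"name": "Deutsche Bnak AG", "city": "Frankfurt"}
features = encode_pair(record_a, record_b, ["name", "city"])
print(features)  # [0.875, 1.0]
```

A vector like `features` could then serve as input to a supervised classifier that decides whether the two records refer to the same entity. The misspelled "Bnak" lowers the name score only slightly, whereas a word-embedding lookup would treat it as an unknown token.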

Language
English

Bibliographic citation
Series: SAFE Working Paper ; No. 398

Classification
Economics
Subject
Entity matching
Entity resolution
Database linking
Machine learning
Record resolution
Similarity encoding

Event
Intellectual creation
(who)
Karapanagiotis, Pantelis
Liebald, Marius
Event
Publication
(who)
Leibniz Institute for Financial Research SAFE
(where)
Frankfurt a. M.
(when)
2023

Last update
10.03.2025, 11:46 AM CET

Data provider

This object is provided by:
ZBW - Deutsche Zentralbibliothek für Wirtschaftswissenschaften - Leibniz-Informationszentrum Wirtschaft. If you have any questions about the object, please contact the data provider.
