Working paper
Entity matching with similarity encoding: A supervised learning recommendation framework for linking (big) data
In this study, we introduce a novel entity matching (EM) framework. It combines state-of-the-art EM approaches based on Artificial Neural Networks (ANN) with a new similarity encoding derived from matching techniques that are prevalent in finance and economics. Our framework is on par with or outperforms alternative end-to-end frameworks in standard benchmark cases. Because similarity encoding is constructed using (edit) distances instead of semantic similarities, it avoids out-of-vocabulary problems when matching dirty data. We highlight this property with an EM application to dirty financial firm-level data extracted from historical archives.
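To make the idea of an edit-distance-based encoding concrete, here is a minimal sketch. It is not the authors' implementation: the function names, the choice of Levenshtein distance, the length normalization, and the sample firm records are all assumptions made for illustration. It turns a pair of records into a vector of per-field similarities that a downstream classifier could consume. Because it compares characters rather than looking words up in a vocabulary, misspelled or unseen tokens still yield a usable score.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Rescale edit distance to [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def encode_pair(left: dict, right: dict, fields: list) -> list:
    """Hypothetical encoder: one similarity score per compared field."""
    return [similarity(left.get(f, ""), right.get(f, "")) for f in fields]


# Illustrative records only; not taken from the paper's data.
record_a = {"name": "Frankfurter Bank AG", "city": "Frankfurt a. M."}
record_b = {"name": "Frankfurter Bnak A.G.", "city": "Frankfurt"}
features = encode_pair(record_a, record_b, ["name", "city"])
```

The resulting `features` list has one entry per field, each between 0 and 1, and could serve as input to any supervised matcher. Other string distances could be swapped in for `levenshtein` without changing the rest of the sketch.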
- Language
-
English
- Bibliographic citation
-
Series: SAFE Working Paper ; No. 398
- Classification
-
Economics
- Subject
-
Entity matching
Entity resolution
Database linking
Machine learning
Record resolution
Similarity encoding
- Event
-
Intellectual creation
- (who)
-
Karapanagiotis, Pantelis
Liebald, Marius
- Event
-
Publication
- (who)
-
Leibniz Institute for Financial Research SAFE
- (where)
-
Frankfurt a. M.
- (when)
-
2023
- Handle
- Last update
-
10.03.2025, 11:46 AM CET
Data provider
ZBW - Deutsche Zentralbibliothek für Wirtschaftswissenschaften - Leibniz-Informationszentrum Wirtschaft. If you have any questions about the object, please contact the data provider.
Object type
- Working paper
Associated
- Karapanagiotis, Pantelis
- Liebald, Marius
- Leibniz Institute for Financial Research SAFE
Time of origin
- 2023