Working paper
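O desafio do pareamento de grandes bases de dados: Mapeamento de métodos de record linkage probabilístico e diagnóstico de sua viabilidade empírica [The challenge of linking large databases: A survey of probabilistic record linkage methods and an assessment of their empirical feasibility]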
This paper assesses the predictive performance of probabilistic record linkage algorithms for integrating large real-world databases. It evaluates how the definition of the blocking key, the choice of string metric functions, and the choice of phonetic-code matching algorithms affect prediction quality and computational complexity. It also surveys the literature on the main deterministic and probabilistic record linkage methods, on recent advances that combine them with machine learning techniques, and on the main packages and implementations available in the open-source R language. The results can provide heuristics for integrating administrative records at the national level and have potential value for the formulation and evaluation of public policies.
- Language
-
Portuguese
- Published in
-
Series: Texto para Discussão ; No. 2420
- Classification
-
Economics
Model Evaluation, Validation, and Selection
Large Data Sets: Modeling and Analysis
Miscellaneous Mathematical Tools
Data Collection and Data Estimation Methodology; Computer Programs: General
Data Collection and Data Estimation Methodology; Computer Programs: Other Computer Software
- Subject
-
pairs linking
blocking
administrative records
Big Data
- Event
-
Intellectual creation
- (who)
-
Peng, Yaohao
Mation, Lucas Ferreira
- Event
-
Publication
- (who)
-
Instituto de Pesquisa Econômica Aplicada (IPEA)
- (where)
-
Brasília
- (when)
-
2018
- Handle
- Last updated
-
10.03.2025, 11:43 CET
Data partner
ZBW - Deutsche Zentralbibliothek für Wirtschaftswissenschaften - Leibniz-Informationszentrum Wirtschaft. For questions about this object, please contact the data partner.
Object type
- Working paper
Contributors
- Peng, Yaohao
- Mation, Lucas Ferreira
- Instituto de Pesquisa Econômica Aplicada (IPEA)
Created
- 2018