Working paper

O desafio do pareamento de grandes bases de dados: Mapeamento de métodos de record linkage probabilístico e diagnóstico de sua viabilidade empírica

This paper assessed the predictive performance of probabilistic record linkage algorithms for integrating large real-world databases, evaluating how the definition of the blocking key, the string metric functions, and the phonetic encoding algorithms used for pairing affect prediction quality and computational complexity. It also surveyed the main deterministic and probabilistic record linkage methods, recent advances that combine them with machine learning techniques, and the main packages and implementations available in the open-source R language. The results can provide heuristics for integrating administrative records at the national level and have potential value for the formulation and evaluation of public policies.

Language
Portuguese

Published in
Series: Texto para Discussão ; No. 2420

Classification
Economics
Model Evaluation, Validation, and Selection
Large Data Sets: Modeling and Analysis
Miscellaneous Mathematical Tools
Data Collection and Data Estimation Methodology; Computer Programs: General
Data Collection and Data Estimation Methodology; Computer Programs: Other Computer Software
Subject
pairs linking
blocking
administrative records
Big Data

Event
Intellectual creation
(who)
Peng, Yaohao
Mation, Lucas Ferreira
Event
Publication
(who)
Instituto de Pesquisa Econômica Aplicada (IPEA)
(where)
Brasília
(when)
2018

Handle
Last updated
10 March 2025, 11:43 CET

Data partner

This object is provided by:
ZBW - Deutsche Zentralbibliothek für Wirtschaftswissenschaften - Leibniz-Informationszentrum Wirtschaft. If you have questions about this object, please contact the data partner.

Object type

  • Working paper

Contributors

  • Peng, Yaohao
  • Mation, Lucas Ferreira
  • Instituto de Pesquisa Econômica Aplicada (IPEA)

Created

  • 2018
