A Systematic Approach to Reconciling Data Quality Failures: Investigation Using Spinal Cord Injury Data

Abstract:
Background: Secondary use of electronic health record (EHR) data requires evaluating data quality (DQ) for fitness for use. While multiple frameworks exist for quantifying DQ, there are no guidelines for evaluating the DQ failures identified through such frameworks.
Objectives: This study proposes a systematic approach to evaluating DQ failures through an understanding of data provenance, to support exploratory modeling in machine learning.
Methods: Our study is based on the EHR of spinal cord injury inpatients in a state spinal care center in Australia who were admitted between 2011 and 2018 (inclusive) and were aged over 17 years. As a prerequisite step, we applied a DQ framework to the EHR data through rules that quantified DQ dimensions. DQ was measured as the percentage of values per field that met the criteria, or as Krippendorff's α for agreement between variables. The resulting DQ failures were then assessed using semistructured interviews with purposively sampled domain experts.
Results: The DQ of the fields in our dataset ranged from 0% to 100% adherent. Understanding the data provenance of fields with DQ failures enabled us to ascertain whether each DQ failure was fatal, recoverable, or not relevant to the field's inclusion in our study. We also identified the themes of data provenance from a DQ perspective as systems, processes, and actors.
Conclusion: A systematic approach to understanding data provenance through the context of data generation helps in the reconciliation or repair of DQ failures and is a necessary step in the preparation of data for secondary use.
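
The two measures named in the abstract (percentage of adherent values per field, and Krippendorff's α between variables) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the example age rule, and the sample values are hypothetical, and the α shown is the nominal variant for two fully observed variables, which is an assumption because the abstract does not say which variant was used.

```python
from collections import Counter
from typing import Callable, Hashable, Iterable, Sequence


def percent_adherent(values: Iterable[object], rule: Callable[[object], bool]) -> float:
    """Percentage of values in one field that satisfy a DQ rule (hypothetical helper)."""
    values = list(values)
    if not values:
        raise ValueError("field has no values")
    passed = sum(1 for v in values if rule(v))
    return 100.0 * passed / len(values)


def krippendorff_alpha_nominal(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Krippendorff's alpha for two nominal variables with no missing values.

    Assumption: nominal difference function, exactly two values per record.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("both variables need the same, non-zero number of records")
    n = 2 * len(a)                    # total number of paired values
    totals = Counter(a) + Counter(b)  # n_c: how often each category occurs overall
    # Each disagreeing record contributes two ordered pairs to the coincidence matrix.
    observed = 2 * sum(1 for x, y in zip(a, b) if x != y)
    # Sum over c != k of n_c * n_k, written as n^2 minus the sum of squares.
    expected = n * n - sum(c * c for c in totals.values())
    if expected == 0:
        raise ValueError("alpha is undefined when only one category occurs")
    return 1.0 - (n - 1) * observed / expected


# Hypothetical rule: a recorded age must be present and over 17 years.
ages = [25, 16, 40, None]
age_score = percent_adherent(ages, lambda v: v is not None and v > 17)

# Hypothetical pair of fields that should agree record by record.
alpha = krippendorff_alpha_nominal([1, 1, 2, 2], [1, 1, 2, 1])
```

A field scoring below 100% under such a rule, or a pair of variables with low α, would be the kind of DQ failure that the study then examines through interviews about data provenance.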

Location
Deutsche Nationalbibliothek Frankfurt am Main
Extent
Online-Ressource
Language
English

Bibliographic citation
A Systematic Approach to Reconciling Data Quality Failures: Investigation Using Spinal Cord Injury Data ; volume:05 ; number:02 ; year:2021 ; pages:e94-e103
ACI Open ; 05, issue 02 (2021), e94-e103

Contributor
Anantharama, Nandini
Buntine, Wray
Nunn, Andrew

DOI
10.1055/s-0041-1735975
URN
urn:nbn:de:101:1-2021120211250930517477
Rights
Open Access; access to the object is unrestricted.
Last update
15.08.2025, 7:35 AM CEST

Data provider

This object is provided by:
Deutsche Nationalbibliothek. If you have any questions about the object, please contact the data provider.

