How to Handle Health-Related Small Imbalanced Data in Machine Learning?

Abstract: When discussing interpretable machine learning results, researchers need to compare them and check their reliability, especially for health-related data. The reason is the negative impact that wrong results can have on a person, such as a wrong cancer prediction, an incorrect assessment of the COVID-19 pandemic situation, or a missed early screening of dyslexia. Often, only small datasets exist for these complex interdisciplinary research projects. Hence, it is essential that researchers in this field understand different methodologies and mindsets, such as the Design Science Methodology, Human-Centered Design, or Data Science approaches, to ensure interpretable and reliable results. Therefore, we present various recommendations and design considerations for experiments that help to avoid over-fitting and biased interpretation of results when working with small, imbalanced health-related data. We also present two very different use cases: early screening of dyslexia and event prediction in multiple sclerosis.

Location
Deutsche Nationalbibliothek Frankfurt am Main
Extent
Online resource
Language
English

Bibliographic citation
How to Handle Health-Related Small Imbalanced Data in Machine Learning? ; volume:19 ; number:3 ; year:2021 ; pages:215-226 ; extent:12
i-com ; 19, issue 3 (2021), 215-226 (12 pages in total)

Creator
Rauschenberger, Maria
Baeza-Yates, Ricardo

DOI
10.1515/icom-2020-0018
URN
urn:nbn:de:101:1-2023032814454273690275
Rights
Open Access; access to the object is unrestricted.
Last update
14.08.2025, 10:47 AM CEST

Data provider

This object is provided by:
Deutsche Nationalbibliothek. If you have any questions about the object, please contact the data provider.
