Working paper
Building predictive models for feature selection in genomic mining
Building predictive models for genomic mining requires feature selection as an essential preliminary step to reduce the large number of available variables. Feature selection is the process of selecting the subset of features that is most relevant to the intended task, such as classification, clustering or regression analysis. In gene expression microarray data, selecting a few genes not only makes data analysis more efficient but also helps the biological interpretation of the results. Microarray data typically has several thousand genes (features) but only tens of samples. Problems that can arise from the small sample size have not been addressed well in the literature. Our aim is to discuss some issues of feature selection in microarray data in order to select the most predictive genes. We compare classical approaches based on statistical tests with a new approach based on marker selection. Finally, we compare the best predictive model with a model derived from a boosting method.
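As an illustration of the kind of classical, test-based selection the abstract refers to, the following is a minimal sketch of a chi-square filter that ranks binarized genes by their association with a binary class label. It is not the authors' procedure: the function names `chi_square_2x2` and `select_top_genes`, the binarization of expression values, and the synthetic data (20 samples, 500 genes) are all assumptions made for this example.

```python
import random

def chi_square_2x2(feature, labels):
    """Pearson chi-square statistic for two binary sequences (no continuity correction)."""
    n = len(labels)
    # Observed counts of the 2x2 contingency table: obs[feature value][class label].
    obs = [[0, 0], [0, 0]]
    for f, y in zip(feature, labels):
        obs[f][y] += 1
    row = [sum(r) for r in obs]
    col = [obs[0][j] + obs[1][j] for j in range(2)]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            if expected > 0:  # skip empty margins (constant feature or label)
                stat += (obs[i][j] - expected) ** 2 / expected
    return stat

def select_top_genes(X, y, k):
    """Rank genes (columns of X) by chi-square score; return the k best indices and all scores."""
    n_genes = len(X[0])
    scores = [chi_square_2x2([sample[g] for sample in X], y) for g in range(n_genes)]
    ranked = sorted(range(n_genes), key=lambda g: scores[g], reverse=True)
    return ranked[:k], scores

# Hypothetical synthetic data: 20 samples, 500 binarized genes (1 = high expression).
random.seed(0)
y = [i % 2 for i in range(20)]
X = [[random.randint(0, 1) for _ in range(500)] for _ in y]
for i, label in enumerate(y):
    X[i][0] = label  # gene 0 is made perfectly informative by construction

top, scores = select_top_genes(X, y, k=5)
print(top)
```

With so few samples, some of the pure-noise genes can also receive fairly high scores by chance, which is one way the small-sample problem mentioned in the abstract can show up in a filter of this kind.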
- Language
-
English
- Published in
-
Series: Quaderni di Dipartimento - EPMQ ; No. 184
- Classification
-
Economics
- Subject
-
Association models
Boosting
Feature selection
Gene expression
Marker Selection
Model Assessment
Predictive models
Chi-square selection
Learning
Forecasting methods
Theory
Data Mining
- Event
-
Intellectual creation
- (who)
-
Figini, Silvia
Giudici, Paolo
- Event
-
Publication
- (who)
-
Università degli Studi di Pavia, Dipartimento di Economia Politica e Metodi Quantitativi (EPMQ)
- (where)
-
Pavia
- (when)
-
2006
- Handle
- Last updated
-
10.03.2025, 11:43 CET
Data partner
ZBW - Deutsche Zentralbibliothek für Wirtschaftswissenschaften - Leibniz-Informationszentrum Wirtschaft. If you have questions about this object, please contact the data partner.
Object type
- Working paper
Contributors
- Figini, Silvia
- Giudici, Paolo
- Università degli Studi di Pavia, Dipartimento di Economia Politica e Metodi Quantitativi (EPMQ)
Date of creation
- 2006