Working paper

Variable selection bias in classification trees based on imprecise probabilities

Classification trees based on imprecise probabilities are an advancement of classical classification trees. The Gini Index is the default splitting criterion in classical classification trees, while in classification trees based on imprecise probabilities, an extension of the Shannon entropy has been introduced as the splitting criterion. However, the use of these empirical entropy measures as split selection criteria can lead to a bias in variable selection, such that variables are preferred for features other than their information content. This bias is not eliminated by the imprecise probability approach. The source of variable selection bias for the estimated Shannon entropy and possible corrections are outlined. The variable selection performance of the biased and corrected estimators is evaluated in a simulation study. Additional results from research on variable selection bias in classical classification trees are incorporated, suggesting that alternative split selection criteria in classification trees based on imprecise probabilities warrant further investigation.

Keywords: Classification trees; credal classification; variable selection bias; attribute selection error; Shannon entropy; entropy estimation
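
The bias described in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper: it uses the plain plug-in Shannon entropy rather than the imprecise-probability extension, assumes a multiway split on every category of a predictor, and draws predictors that are independent of a binary response. All function names and parameter values (`mean_null_gain`, `n_obs`, `n_reps`, `seed`) are hypothetical choices made for illustration. Under these assumptions, the average empirical information gain of an uninformative predictor should grow with its number of categories, so a predictor may be preferred for a feature other than its information content.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Plug-in (empirical) Shannon entropy of a list of labels, in nats."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def information_gain(x, y):
    """Empirical entropy reduction in y after splitting on each category of x."""
    n = len(y)
    groups = {}
    for xi, yi in zip(x, y):
        groups.setdefault(xi, []).append(yi)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(y) - conditional


def mean_null_gain(n_categories, n_obs=100, n_reps=1000, seed=1):
    """Average information gain of a predictor that is independent of y.

    Assumption for illustration: uniform categories, balanced binary response.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_reps):
        y = [rng.randrange(2) for _ in range(n_obs)]
        x = [rng.randrange(n_categories) for _ in range(n_obs)]
        total += information_gain(x, y)
    return total / n_reps


gain_2 = mean_null_gain(2)  # uninformative predictor with 2 categories
gain_8 = mean_null_gain(8)  # uninformative predictor with 8 categories
print(f"mean gain, 2 categories: {gain_2:.4f}")
print(f"mean gain, 8 categories: {gain_8:.4f}")
```

Neither predictor carries any information about the response, yet the one with more categories is expected to show the larger average gain, because the plug-in entropy underestimates the true entropy more strongly in small cells.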

Language
English

Published in
Series: Discussion Paper ; No. 419

Event
Intellectual creation
(who)
Strobl, Carolin
Event
Publication
(who)
Ludwig-Maximilians-Universität München, Sonderforschungsbereich 386 - Statistische Analyse diskreter Strukturen
(where)
München
(when)
2005

DOI
doi:10.5282/ubm/epub.1788
URN
urn:nbn:de:bvb:19-epub-1788-0
Last updated
10 March 2025, 11:46 CET

Data partner

This object is provided by:
ZBW - Deutsche Zentralbibliothek für Wirtschaftswissenschaften - Leibniz-Informationszentrum Wirtschaft. If you have questions about the object, please contact the data partner.

Object type

  • Working paper

Contributors

  • Strobl, Carolin
  • Ludwig-Maximilians-Universität München, Sonderforschungsbereich 386 - Statistische Analyse diskreter Strukturen

Created

  • 2005
