Conference paper
Evaluating the Impact of Coder Errors on Active Learning
Active Learning (AL) has been proposed as a technique to reduce the amount of annotated data needed in the context of supervised classification. While various simulation studies for a number of NLP tasks have shown that AL works well on gold-standard data, there is some doubt whether the approach can be successful when applied to noisy, real-world data sets. This paper presents a thorough evaluation of the impact of annotation noise on AL and shows that systematic noise resulting from biased coder decisions can seriously harm the AL process. We present a method to filter out inconsistent annotations during AL and show that this makes AL far more robust when applied to noisy data.
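As a generic illustration of the setting the abstract describes (not the authors' actual method), the following is a minimal sketch of pool-based active learning with uncertainty sampling, extended with a simple filtering heuristic that discards annotations contradicting a confident model prediction. The 1-D threshold classifier, the `margin` parameter, and the `keep_label` heuristic are all illustrative assumptions.

```python
def train_threshold(labeled):
    """Fit a 1-D threshold classifier (x > t -> class 1) by picking the
    candidate threshold with the highest accuracy on the labeled set."""
    xs = sorted(x for x, _ in labeled)
    candidates = ([xs[0] - 1.0]
                  + [(a + b) / 2 for a, b in zip(xs, xs[1:])]
                  + [xs[-1] + 1.0])
    best_t, best_acc = candidates[0], -1.0
    for t in candidates:
        acc = sum((x > t) == bool(y) for x, y in labeled) / len(labeled)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

def keep_label(x, y, t, margin=0.2):
    """Illustrative noise filter: reject an annotation that contradicts a
    confident model prediction, i.e. one made far from the boundary."""
    if abs(x - t) < margin:
        return True                # uncertain region: trust the coder
    return (x > t) == bool(y)      # confident region: label must agree

def active_learn(pool, oracle, seed, rounds=10):
    """Pool-based AL loop: query the most uncertain point each round,
    ask the (possibly noisy) coder, and keep only consistent labels."""
    labeled, pool = list(seed), list(pool)
    for _ in range(rounds):
        if not pool:
            break
        t = train_threshold(labeled)
        x = min(pool, key=lambda p: abs(p - t))  # closest to boundary
        pool.remove(x)
        y = oracle(x)                            # coder decision
        if keep_label(x, y, t):                  # drop inconsistent labels
            labeled.append((x, y))
    return train_threshold(labeled)

# Toy usage: true class is x > 0.5; a noiseless coder for demonstration.
pool = [i / 20 for i in range(21)]
t = active_learn(pool, lambda x: int(x > 0.5), seed=[(0.0, 0), (1.0, 1)])
```

The filter only fires on queries far from the current decision boundary, so ordinary uncertainty-sampled queries are always accepted while systematically biased labels in confidently classified regions are dropped.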
- Language: English
- Topic: Linguistics
- Event: Intellectual creation (Geistige Schöpfung)
- (who): Rehbein, Ines; Ruppenhofer, Josef
- Event: Publication
- (who): Stroudsburg : Association for Computational Linguistics
- (when): 2016-09-22
- URN: urn:nbn:de:bsz:mh39-52929
- Last updated: 06.03.2025, 09:00 CET
Data partner
Leibniz-Institut für Deutsche Sprache - Bibliothek