Conference paper

POS error detection in automatically annotated corpora

Recent work on error detection has shown that the quality of manually annotated corpora can be substantially improved by applying consistency checks to the data and automatically identifying incorrectly labelled instances. These methods, however, cannot be used for automatically annotated corpora, where errors are systematic and cannot easily be identified by looking at the variance in the data. This paper targets the detection of part-of-speech (POS) errors in automatically annotated corpora, so-called silver standards, and shows that combining different measures sensitive to annotation quality identifies a large part of the errors and yields a substantial increase in accuracy.

Creator: Rehbein, Ines

Attribution 4.0 International

Language: English

Subject: Corpus (linguistics)
Automatic language analysis
Annotation
Language

Event: Intellectual creation

(who): Rehbein, Ines

Event: Publication

(who): Stroudsburg, PA : ACL

(when): 2016-11-21

URN: urn:nbn:de:bsz:mh39-55986

Last update: 06.03.2025, 9:00 AM CET

Data provider

This object is provided by:
Leibniz-Institut für Deutsche Sprache - Bibliothek. If you have any questions about the object, please contact the data provider.



Other Objects (12)

POS error detection in automatically annotated corpora
Book chapter: Detecting annotation noise in automatically labelled data
Book chapter: Sprucing up the trees – error detection in treebanks
Dissertation or habilitation thesis: Treebank-Based Grammar Acquisition for German
Conference paper: Data point selection for self-training
Article: Der Einfluss der Dependenzgrammatik auf die Computerlinguistik
Data point selection for self-training
Conference paper: A New Resource for German Causal Language
Conference paper: Treebank Annotation Schemes and Parser Evaluation for German
Book chapter: Metaphor detection for German poetry
Conference paper: There’s no Data like More Data? Revisiting the Impact of Data Size on a Classification Task
