Conference paper

Language Independent Named Entity Recognition using Distant Supervision

While good results have been achieved for named entity recognition (NER) in supervised settings, little or no labelled data is available for low-resource languages and less-studied domains. As NER is a crucial preprocessing step for many natural language processing tasks, overcoming this lack of data remains of great interest. We propose a distant supervision approach to NER that is both language- and domain-independent: we automatically generate labelled training data using gazetteers previously extracted from Wikipedia. We test our approach on English, German and Estonian data sets and further contribute several successful methods for reducing the noise in the generated training data. The tested models beat baseline systems, and our results show that distant supervision can be a promising approach for NER when no labelled data is available. For the English model, we also show, by comparing it against a supervised model on a different test set, that the distant supervision model generalizes better within the same domain of news texts.
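
The core idea in the abstract, generating labelled training data automatically from gazetteers, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy `GAZETTEER` entries, the `distant_label` function, the greedy longest-match strategy, and the BIO tag scheme are all assumptions made here for illustration, and the noise-reduction methods mentioned in the abstract are not shown.

```python
from typing import Dict, List, Tuple

# Hypothetical toy gazetteer (an assumption for illustration only):
# maps a lowercased token sequence to an entity type.
GAZETTEER: Dict[Tuple[str, ...], str] = {
    ("angela", "merkel"): "PER",
    ("tallinn",): "LOC",
    ("european", "union"): "ORG",
}

# Longest gazetteer entry, measured in tokens.
MAX_LEN = max(len(key) for key in GAZETTEER)


def distant_label(tokens: List[str]) -> List[str]:
    """Assign BIO tags by greedy longest-match lookup in the gazetteer."""
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span starting at position i first.
        for n in range(min(MAX_LEN, len(tokens) - i), 0, -1):
            key = tuple(t.lower() for t in tokens[i:i + n])
            etype = GAZETTEER.get(key)
            if etype is not None:
                tags[i] = f"B-{etype}"
                for j in range(i + 1, i + n):
                    tags[j] = f"I-{etype}"
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return tags


if __name__ == "__main__":
    sentence = "Angela Merkel visited Tallinn .".split()
    for token, tag in zip(sentence, distant_label(sentence)):
        print(token, tag)
```

Labels produced this way are noisy: ambiguous names are tagged wrongly and entities missing from the gazetteer are left as `O`, which is the kind of noise the abstract says the authors' methods aim to reduce.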

Author: Dembowski, Julia; Wiegand, Michael; Klakow, Dietrich

Protected by copyright

Language
English

Subject
Machine learning
Information Extraction
Computational linguistics
Text Mining
Name
Language

Event
Intellectual creation
(who)
Dembowski, Julia
Wiegand, Michael
Klakow, Dietrich
Event
Publication
(who)
Poznań : Fundacja Uniwersytetu im. Adama Mickiewicza
(when)
2019-03-19

URN
urn:nbn:de:bsz:mh39-86198
Last updated
2025-03-06, 09:00 CET

Data partner

This object is provided by:
Leibniz-Institut für Deutsche Sprache - Bibliothek. If you have questions about this object, please contact the data partner.

Object type

  • Conference paper

Contributors

  • Dembowski, Julia
  • Wiegand, Michael
  • Klakow, Dietrich
  • Poznań : Fundacja Uniwersytetu im. Adama Mickiewicza

Created

  • 2019-03-19
