Conference paper
Discovering Subtle Word Relations in Large German Corpora
With an increasing amount of text data available, it is possible to automatically extract a variety of information about language. One way to obtain knowledge about subtle relations and analogies between words is to observe words that are used in the same context. Recently, Mikolov et al. proposed a method to efficiently compute Euclidean word representations that seem to capture subtle relations and analogies between words in the English language. We demonstrate that this method also captures analogies in the German language. Furthermore, we show that we can transfer information extracted from large non-annotated corpora into small annotated corpora, which are then, in turn, used for training NLP systems.
- Language
-
English
- Subject
-
Corpus <linguistics>
Database system
Annotation
Linguistics
- Event
-
Intellectual creation
- (who)
-
Buschjäger, Sebastian
Pfahler, Lukas
Morik, Katharina
- Event
-
Publication
- (who)
-
Mannheim : Institut für Deutsche Sprache
- (when)
-
2015-07-02
- URN
-
urn:nbn:de:bsz:mh39-38317
- Last update
-
06.03.2025, 9:00 AM CET
Data provider
Leibniz-Institut für Deutsche Sprache - Bibliothek
Object type
- Conference paper