Book chapter
Towards a multilingual dictionary of discourse markers. Automatic extraction of units from parallel corpus
This paper presents a project for a multilingual dictionary of discourse markers (DMs). In its first stage, which consisted of collecting the list of headwords, we used a parallel corpus to automatically extract units from texts written in Spanish, Catalan, English, French and German. We also applied a method for creating a taxonomy that automatically organises the markers into clusters. The result is an extensive, corpus-driven list of headwords. We present a prototype of the dictionary's microstructure in the form of a standard XML database and describe the procedure for automatically filling in most of its fields (e.g., the type of DM and the equivalents in other languages) before human intervention.
- Language
English
- Subject
Corpus (linguistics)
Lexicography
Electronic dictionary
Discourse marker
Multilingual dictionary
English, Old English
- Event
Intellectual creation
- (who)
Renau, Irene
Nazar, Rogelio
- Event
Publication
- (who)
Mannheim : IDS-Verlag
Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)
- (when)
2022-08-18
- URN
urn:nbn:de:bsz:mh39-111830
- Last update
06.03.2025, 9:00 AM CET
Data provider
Leibniz-Institut für Deutsche Sprache - Bibliothek
Object type
- Book chapter