Conference paper

Recent developments in DeReKo

This paper gives an overview of recent developments in the German Reference Corpus DeReKo in terms of growth, maximising relevant corpus strata, metadata, legal issues, and its current and future research interface. Due to the recent acquisition of new licenses, DeReKo grew by a factor of four in the first half of 2014, mostly in the area of newspaper text, and presently contains over 24 billion word tokens. Other strata, such as fictional texts, web corpora (in particular CMC texts), and spoken but conceptually written texts, have also increased significantly. We report on the newly acquired corpora that led to the major increase, on the principles and strategies behind our corpus acquisition activities, and on our solutions for the emerging legal, organisational, and technical challenges.

Author: Kupietz, Marc; Lüngen, Harald

In copyright

Language
English

Subject
German
Corpus <Linguistics>
Text corpus

Event
Intellectual creation
(who)
Kupietz, Marc
Lüngen, Harald
Event
Publication
(who)
Reykjavik : European Language Resources Association (ELRA)
(when)
2014-10-13

URN
urn:nbn:de:bsz:mh39-31353
Last update
06.03.2025, 9:00 AM CET

Data provider

This object is provided by:
Leibniz-Institut für Deutsche Sprache - Bibliothek. If you have any questions about the object, please contact the data provider.

Object type

  • Conference paper

Associated

  • Kupietz, Marc
  • Lüngen, Harald
  • Reykjavik : European Language Resources Association (ELRA)

Time of origin

  • 2014-10-13
