Conference paper

The German Reference Corpus: New developments building on almost 50 years of experience

This paper describes the sustainability efforts of the Institut für Deutsche Sprache (IDS) in Mannheim with respect to DEREKO (Deutsches Referenzkorpus), the Archive of General Reference Corpora of Contemporary Written German. With a focus on re-usability and sustainability, we discuss its history and our future plans. We describe legal challenges related to the creation of a large and sustainable resource, sketch out the pipeline used to convert raw texts to the final corpus format, and outline migration plans to TEI P5. Because the current version of the corpus management and query system is being pushed to its limits, we discuss the requirements for a new version that will be able to handle current and future DEREKO releases. Furthermore, we outline the institute’s plans in the field of digital preservation.

Creator: Kupietz, Marc; Schonefeld, Oliver; Witt, Andreas

Protected by copyright

Language
English

Subject
Corpus <Linguistics>
Long-term archiving
Linguistics

Event
Intellectual creation
(who)
Kupietz, Marc
Schonefeld, Oliver
Witt, Andreas
Event
Publication
(who)
Paris : European Language Resources Association (ELRA)
(when)
2015-12-18

URN
urn:nbn:de:bsz:mh39-45002
Last updated
2025-03-06, 09:00 CET

Data partner

This object is provided by:
Leibniz-Institut für Deutsche Sprache - Bibliothek. If you have questions about the object, please contact the data partner.

Object type

  • Conference paper

Contributors

  • Kupietz, Marc
  • Schonefeld, Oliver
  • Witt, Andreas
  • Paris : European Language Resources Association (ELRA)

Created

  • 2015-12-18