Conference paper

Interaction of technology and methodology in building and sharing an annotated learner corpus of spoken German

This paper discusses the technological and methodological challenges in creating and sharing HAMATAC, the Hamburg Map Task Corpus. The first version of the corpus, consisting of 24 recordings with orthographic transcriptions and metadata, is publicly available. A second version featuring different types of linguistic annotation is in progress. I will describe how the various software tools and data formats of the EXMARaLDA system were used for transcription and multi-level annotation, for compiling recordings and transcriptions into a corpus and managing metadata, and for publishing the corpus, and how they can be used for carrying out corpus queries (KWIC) and analyses. Some recurrent issues in corpus building and sharing and the interaction of technological and methodological aspects will be illustrated using HAMATAC.


Author: Hedeland, Hanna

Protected by copyright

Language
English

Subject
Spoken language
Annotation
Transcription
Corpus <Linguistics>
Methodology
Language

Event
Intellectual creation
(who)
Hedeland, Hanna
Event
Publication
(who)
València : Editorial Universitat Politècnica de València
(when)
2020-03-22

URN
urn:nbn:de:bsz:mh39-97212
Last updated
06.03.2025, 09:00 CET

Data partner

This object is provided by:
Leibniz-Institut für Deutsche Sprache - Bibliothek. If you have questions about the object, please contact the data partner.

Object type

  • Conference paper

Contributors

  • Hedeland, Hanna
  • València : Editorial Universitat Politècnica de València

Created

  • 2020-03-22
