Conference paper

Modelling large parallel corpora. The Zurich Parallel Corpus Collection

Text corpora come in many different shapes and sizes and carry heterogeneous annotations, depending on their purpose and design. The true benefit of corpora is rooted in their annotation, and the method by which this data is encoded is an important factor in their interoperability. We have accumulated a large collection of multilingual and parallel corpora and encoded it in a unified format that is compatible with a broad range of NLP tools and corpus linguistic applications. In this paper, we present our corpus collection and describe a data model and the extensions to the popular CoNLL-U format that enable us to encode it.

Authors: Graën, Johannes; Kew, Tannon; Shaitarova, Anastassia; Volk, Martin

Attribution 4.0 International (CC BY 4.0)

Language: English

Subject: Corpus <Linguistics>
Language

Event: Intellectual creation

(who): Graën, Johannes
Kew, Tannon
Shaitarova, Anastassia
Volk, Martin

Event: Publication

(who): Mannheim : Leibniz-Institut für Deutsche Sprache

(when): 2019-07-04

URN: urn:nbn:de:bsz:mh39-90207

Last updated: 06.03.2025, 09:00 CET

Data partner

This object is provided by:
Leibniz-Institut für Deutsche Sprache - Bibliothek. For questions about the object, please contact the data partner.

Similar objects (12)

Modelling large parallel corpora. The Zurich Parallel Corpus Collection

Annotation, exploitation and evaluation of parallel corpora: TC3 I

Morphological knowledge and alignment of English-German parallel corpora

Collection of essays

Annotation, exploitation and evaluation of parallel corpora : TC3 I

ISO-based annotated multilingual parallel corpus for discourse markers

Thesis

A framework for processing and presenting parallel text corpora

Conference paper

Challenges in the Alignment, Management and Exploitation of Large and Richly Annotated Multi-Parallel Corpora

Conference proceedings | Congress 1999

Large scale parallel data mining

CzEng 1.6: Enlarged Czech-English Parallel Corpus with Processing Tools Dockered

Thesis

Modelling parallel information processing in the retina

Parallel FDTD modelling of nonlocality in plasmonics
