Conference paper

Processing and querying large web corpora with the COW14 architecture

In this paper, I present the COW14 tool chain, which comprises a web corpus creation tool called texrex, wrappers for existing linguistic annotation tools, and an online query tool called Colibri2. Through detailed descriptions of the implementation and systematic evaluations of the software's performance on different types of systems, I show that the COW14 architecture can handle the creation of corpora of at least 100 billion tokens. I also introduce our running demo system, which currently serves corpora of up to roughly 20 billion tokens in Dutch, English, French, German, Spanish, and Swedish.

Author: Schäfer, Roland

Attribution - NonCommercial - NoDerivatives 4.0 International

Language: English

Subject: Corpus <Linguistics>
Annotation
Database system
Linguistics

Event: Intellectual creation

(who): Schäfer, Roland

Event: Publication

(who): Mannheim : Institut für Deutsche Sprache

(when): 2015-07-02

URN: urn:nbn:de:bsz:mh39-38367

Last update: 06.03.2025, 9:00 AM CET

Data provider

This object is provided by:
Leibniz-Institut für Deutsche Sprache - Bibliothek. If you have any questions about the object, please contact the data provider.



Other Objects (12)

Processing and querying large web corpora with the COW14 architecture

Thesis

Querying a Web of Linked Data

Querying Repetitions in Spoken Language Corpora

Querying semantic web resources using TRIPLE views

Article

A Web-Platform for Preserving, Exploring, Visualising, and Querying Linguistic Corpora and other Resources

A web-platform for preserving, exploring, visualising, and querying linguistic corpora and other resources

A Web-Platform for Preserving, Exploring, Visualising, and Querying Linguistic Corpora and other Resources

Book chapter

Example-based querying for linguistic specialist corpora

Querying and Efficiently Searching Large, Temporal Text Corpora

SPARQLGraph: a web-based platform for graphically querying biological Semantic Web databases

Thesis

Querying a web of linked data : foundations and query execution


