Towards a Gold Standard Corpus for Variable Detection and Linking in Social Science Publications

Abstract: In this paper, we describe our effort to create a new corpus for the evaluation of detecting and linking so-called survey variables in social science publications (e.g., "Do you believe in Heaven?"). The task is to recognize survey variable mentions in a given text, disambiguate them, and link them to the corresponding variable within a knowledge base. Since there are generally hundreds of candidates to link to and due to the wide variety of forms they can take, this is a challenging task within NLP. The contribution of our work is the first gold standard corpus for the variable detection and linking task. We describe the annotation guidelines and the annotation process. The produced corpus is multilingual - German and English - and includes manually curated word and phrase alignments. Moreover, it includes text samples that could not be assigned to any variables, denoted as negative examples. Based on the new dataset, we conduct an evaluation of several state-of-the-art text class

Location: Deutsche Nationalbibliothek Frankfurt am Main

Extent: Online resource

Language: English

Notes: Published version
Peer reviewed
In: Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC). 2018. ISBN 979-10-95546-00-9

Classification: Language, Linguistics

Event: Publication

(where): Mannheim

(when): 2018

Creator: Zielinski, Andrea
Mutschke, Peter

Contributor: European Language Resources Association (ELRA)

URN: urn:nbn:de:0168-ssoar-57723-2

Rights: Open Access; access to the object is unrestricted.

Last update: 15.08.2025, 7:26 AM CEST

Data provider

This object is provided by:
Deutsche Nationalbibliothek. If you have any questions about the object, please contact the data provider.

Other Objects (12)

Conference paper

Towards a Gold Standard Corpus for Variable Detection and Linking in Social Science Publications

The Kassel Corpus of Clause Linking

Conference paper

Mining Social Science Publications for Survey Variables

An argument-annotated corpus of scientific publications

Standard-relevant publications: evidence, processes and influencing factors

Linking the supersymmetric standard model to the cosmological constant

Section

New Publications and standard works in theology and miscellaneous literature

Ranking of disease gene associations from large corpora of scientific publications

Minimum Information Standards for Essential Biodiversity Variables

ISBD(A) : international standard bibliographic description for older monographic publications (antiquarian)

Working paper

MD*ReX: Linking XploRe to standard spread-sheet applications

