Book chapter

Creating the lexicon of multi-word expressions for Slovene. Methodology and structure

This paper describes a method for automatically identifying sentences in the Gigafida corpus that contain multi-word expressions (MWEs) from a list of 5,242 phraseological units, which was compiled from several existing open-access lexical resources for Slovene. The method relies on a definition of MWEs that draws on two levels of corpus annotation, syntax (dependency parsing) and morphology (POS tagging), together with some additional statistical parameters. The resulting lexicon contains 12,358 sentences with MWEs extracted from the corpus. The extracted sentences were analysed from a lexicographic point of view, with the aim of establishing canonical forms of MWEs and the semantic relations between them in terms of variation, synonymy, and antonymy.

Author: Gantar, Polona; Krek, Simon

Attribution - ShareAlike 4.0 International

Language
English

Subject
Multi-word expression
Sorbian
Minority language
Historical lexicography
English, Old English

Event
Intellectual creation
(who)
Gantar, Polona
Krek, Simon
Event
Publication
(who)
Mannheim : Ids-Verlag
Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)
(when)
2022-09-08

URN
urn:nbn:de:bsz:mh39-112270
Last update
06.03.2025, 9:00 AM CET

Data provider

This object is provided by:
Leibniz-Institut für Deutsche Sprache - Bibliothek. If you have any questions about the object, please contact the data provider.

Object type

  • Book chapter

Associated

  • Gantar, Polona
  • Krek, Simon
  • Mannheim : Ids-Verlag
  • Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)

Time of origin

  • 2022-09-08
