Conference paper

Data Mining with Shallow vs. Linguistic Features to Study Diversification of Scientific Registers

We present a methodology for analyzing the linguistic evolution of scientific registers with data mining techniques, comparing the insights gained from shallow vs. linguistic features. The focus is on selected scientific disciplines at the boundary with computer science (computational linguistics, bioinformatics, digital construction, microelectronics). The data come from the English Scientific Text Corpus (SCITEX), which covers a time range of roughly thirty years (1970/80s to early 2000s) (Degaetano-Ortlieb et al., 2013; Teich and Fankhauser, 2010). In particular, we investigate the diversification of scientific registers over time. Our theoretical basis is Systemic Functional Linguistics (SFL) and its specific incarnation of register theory (Halliday and Hasan, 1985). Methodologically, we combine corpus-based feature extraction with data mining techniques.

Authors: Degaetano-Ortlieb, Stefania; Fankhauser, Peter; Kermes, Hannah; Lapshinova-Koltunski, Ekaterina; Ordan, Noam; Teich, Elke

In copyright

Language
English

Subject
Corpus <Linguistics>
Linguistics

Event
Intellectual creation
(who)
Degaetano-Ortlieb, Stefania
Fankhauser, Peter
Kermes, Hannah
Lapshinova-Koltunski, Ekaterina
Ordan, Noam
Teich, Elke
Event
Publication
(who)
Reykjavik : European Language Resources Association (ELRA)
(when)
2014-06-13

URN
urn:nbn:de:bsz:mh39-26178
Last update
06.03.2025, 9:00 AM CET

Data provider

This object is provided by:
Leibniz-Institut für Deutsche Sprache - Bibliothek. If you have any questions about the object, please contact the data provider.

Object type

  • Conference paper

Associated

  • Degaetano-Ortlieb, Stefania
  • Fankhauser, Peter
  • Kermes, Hannah
  • Lapshinova-Koltunski, Ekaterina
  • Ordan, Noam
  • Teich, Elke
  • Reykjavik : European Language Resources Association (ELRA)

Time of origin

  • 2014-06-13
