Dissertation or Habilitation

Danish Stød and Automatic Speech Recognition

Stød is a prosodic feature of spoken Danish that can distinguish lexemes. This distinction can also identify word class and has the potential to improve the performance of automatic speech recognisers for spoken Danish. The manifestation of stød is highly variable and may be perceptual in nature, because in some cases stød can be heard yet not be visible in a spectrogram. This variability is the primary reason there is currently no agreed-upon acoustic or phonetic definition of stød. The working definition of stød is “. . . a kind of creaky voice” (Grønnum, 2005) and “stød is not just creak” (Hansen, 2015). In the present work, we investigate whether stød can be exploited in automatic speech recognition (ASR). To exploit stød without an acoustic or phonetic definition, we need an (almost) zero-knowledge, data-driven approach, which rests on a number of assumptions that we investigate before conducting ASR experiments. We assume that stød can be detected in audio input using acoustic features. To detect stød, we need to identify features that signal stød, which requires annotated data. To select the right features, the stød annotation must be reliable and accurate. We therefore conduct a reliability study of stød annotation using inter-annotator agreement measures, rank acoustic features for stød detection by feature importance using a forest of randomised decision trees, and experiment with stød detection as both a binary and a multi-class classification task. The experiments identify a set of features important for stød detection and confirm that we can detect stød in audio. Lastly, we model stød in automatic speech recognition and show that significant improvements in word error rate can be gained simply by annotating stød in the phonetic dictionary, at the expense of decoding speed. Extending the acoustic feature vectors with pitch-related features and other features of voice quality also gives a significant performance improvement on both read-aloud and spontaneous speech. Decoding speed increases when we extend the acoustic feature vectors, and actually improves over the baseline where stød is not modelled.
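The abstract reports its recognition results as word error rate (WER). As a minimal sketch of how that metric is conventionally computed (word-level edit distance divided by the number of reference words), the following may help; the function name `word_error_rate` and the toy token strings are illustrative assumptions, and the record does not state which scoring tool the thesis used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper (not taken from the thesis): counts the minimum
    number of substitutions, deletions and insertions needed to turn the
    hypothesis into the reference, normalised by the reference word count.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Toy tokens only: one substitution (b -> x) and one deletion (d).
    print(word_error_rate("a b c d", "a x c"))
```

Because the count is normalised by the reference length, WER can exceed 1.0 when the hypothesis contains many inserted words.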

ISBN: 9788793483132

Language: English

Bibliographic citation: Series: PhD Series ; No. 24.2016

Classification: Management

Event: Intellectual creation

(who): Kirkedal, Andreas Søeborg

Event: Publication

(who): Copenhagen Business School (CBS)

(where): Frederiksberg

(when): 2016

Handle: http://hdl.handle.net/10419/208978

Last update: 10.03.2025, 11:41 AM CET

Data provider

This object is provided by:
ZBW - Deutsche Zentralbibliothek für Wirtschaftswissenschaften - Leibniz-Informationszentrum Wirtschaft. If you have any questions about the object, please contact the data provider.


Other Objects (12)

University thesis

Automatic speech recognition for Amharic

Speech signal evaluation using automatic speech recognition systems

Self-conducted speech audiometry using automatic speech recognition

On Representation Learning in Speech Processing and Automatic Speech Recognition

Quantization of automatic speech recognition networks

Compensating hyperarticulation for automatic speech recognition

University thesis

Compensating hyperarticulation for automatic speech recognition

University thesis

Automatic speech recognition for dialectal Arabic

University thesis

Contributions to turbo automatic speech recognition

University thesis

Speech Recognition based Automatic Earthquake Detection and Classification

Prediction of Human Listeners' Speech Recognition Performance Based on Automatic Speech Recognition

Automatic speech recognition in adverse acoustic conditions

University thesis
