Working paper

Topic Modeling for Analyzing Open-Ended Survey Responses

Open-ended responses are widely used in market research studies. Processing such responses requires labor-intensive human coding. This paper focuses on unsupervised topic models and tests their ability to automate the analysis of open-ended responses. Since state-of-the-art topic models struggle with the shortness of open-ended responses, the paper considers three novel short text topic models: Latent Feature Latent Dirichlet Allocation, Biterm Topic Model, and Word Network Topic Model. The models are fitted and evaluated on a set of real-world open-ended responses provided by a market research company. Multiple components such as topic coherence and document classification are quantitatively and qualitatively evaluated to appraise whether topic models can replace human coding. The results suggest that topic models are a viable alternative for open-ended response coding. However, their usefulness is limited when a correct one-to-one mapping of responses and topics or the exact topic distribution is needed.

Language
English

Bibliographic citation
Series: IRTG 1792 Discussion Paper ; No. 2018-054

Classification
Economics
Mathematical and Quantitative Methods: General
Subject
Market research
open-ended responses
text analytics
short text topic models

Event
Intellectual creation
(who)
Pietsch, Andra-Selina
Lessmann, Stefan
Event
Publication
(who)
Humboldt-Universität zu Berlin, International Research Training Group 1792 "High Dimensional Nonstationary Time Series"
(where)
Berlin
(when)
2018

Last update
10.03.2025, 11:46 AM CET

Data provider

This object is provided by:
ZBW - Deutsche Zentralbibliothek für Wirtschaftswissenschaften - Leibniz-Informationszentrum Wirtschaft. If you have any questions about the object, please contact the data provider.

