Working paper

Conditional complexity of compression for authorship attribution

We introduce new stylometry tools based on the sliced conditional compression complexity of literary texts. They are inspired by the nearly optimal application of the incomputable Kolmogorov conditional complexity, which they presumably approximate. Whereas other stylometry statistics can occasionally be very close for different authors, ours is apparently strictly minimal for the true author, provided that the query and training texts are sufficiently large, the compressor is sufficiently good, and sampling bias is avoided (as in poll sampling). We tune the method and test its performance on attributing the Federalist Papers (Madison vs. Hamilton). Our results confirm the attribution of the Federalist Papers to Madison made earlier by Mosteller and Wallace (1964) using the Naive Bayes classifier, as well as the same attribution obtained with alternative classifiers such as SVM and the second-order Markov model of language. We then apply our method to the attribution of the early poems from the Shakespeare canon and of the continuation of Marlowe's poem 'Hero and Leander' ascribed to G. Chapman.
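The idea behind a conditional compression statistic can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that the conditional complexity of a query text given a training text is approximated by the growth in compressed size when the query is appended to the training text, and it uses Python's standard zlib compressor as a stand-in for whatever compressor the paper tunes. The function names and the fixed-length slicing are hypothetical and serve only as illustration.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data (maximum compression level)."""
    return len(zlib.compress(data, 9))


def conditional_complexity(query: bytes, training: bytes) -> int:
    """Assumed approximation: extra compressed bytes needed for the query
    once the compressor has already seen the training text."""
    return compressed_size(training + query) - compressed_size(training)


def sliced_scores(query: bytes, training: bytes, slice_len: int) -> list:
    """Hypothetical slicing: score fixed-length slices of the query separately,
    so that the per-slice values can later be compared statistically."""
    slices = [query[i:i + slice_len] for i in range(0, len(query), slice_len)]
    return [conditional_complexity(s, training) for s in slices]


def attribute(query: bytes, corpora: dict):
    """Pick the candidate author whose training text gives the smallest score."""
    scores = {author: conditional_complexity(query, text)
              for author, text in corpora.items()}
    return min(scores, key=scores.get), scores
```

Note that zlib only looks back over a 32 KB window, so with realistic training texts a compressor with a longer memory would be needed for the appended query to benefit from the whole training corpus; this is one reason the sketch should be read as a toy illustration only.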

Language
English

Bibliographic citation
Series: SFB 649 Discussion Paper ; No. 2007,057

Classification
Economics
Hypothesis Testing: General
Statistical Simulation Methods: General
Computational Techniques; Simulation Modeling
Subject
compression complexity
authorship attribution
Publication analysis
Statistical test
Simulation
Theory

Event
Intellectual creation
(who)
Malyutov, Mikhail B.
Wickramasinghe, Chammi Irosha
Li, Sufeng
Event
Publication
(who)
Humboldt University of Berlin, Collaborative Research Center 649 - Economic Risk
(where)
Berlin
(when)
2007

Handle
Last update
10.03.2025, 11:45 AM CET

Data provider

This object is provided by:
ZBW - Deutsche Zentralbibliothek für Wirtschaftswissenschaften - Leibniz-Informationszentrum Wirtschaft. If you have any questions about the object, please contact the data provider.
