Journal ArticleDOI

Understanding and explaining Delta measures for authorship attribution

TL;DR: It is shown that feature vector normalization, that is, the transformation of the feature vectors to a uniform length of 1 (implicit in the cosine measure), is the decisive factor for the improvement of Delta proposed recently.
Abstract: This article builds on a mathematical explanation of one of the most prominent stylometric measures, Burrows’s Delta (and its variants), to understand and explain how it works. Starting with the conceptual separation between feature selection, feature scaling, and distance measures, we have designed a series of controlled experiments in which we used the kind of feature scaling (various types of standardization and normalization) and the type of distance measure (notably Manhattan, Euclidean, and Cosine) as independent variables and the correct authorship attributions as the dependent variable indicative of the performance of each of the methods proposed. In this way, we are able to describe in some detail how these two variables interact with each other and how they influence the results. Thus we can show that feature vector normalization, that is, the transformation of the feature vectors to a uniform length of 1 (implicit in the cosine measure), is the decisive factor for the improvement of Delta proposed recently. We are also able to show that the information particularly relevant to the identification of the author of a text lies in the profile of deviation across the most frequent words rather than in the extent of the deviation or in the deviation of specific words only.
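The conceptual separation the abstract describes (feature scaling versus distance measure) can be made concrete in a few lines of code. The following is a minimal sketch, not the authors' implementation: it assumes a matrix of relative frequencies of the most frequent words, z-standardizes each word column, and then compares the Manhattan (classic Burrows's Delta), Euclidean, and Cosine variants; all names are illustrative.

```python
import numpy as np

def delta_distances(freqs):
    """freqs: (n_texts, n_mfw) array of relative frequencies of the most
    frequent words. Returns Manhattan (classic Delta), Euclidean, and
    Cosine distance matrices over the z-standardized feature vectors."""
    # Feature scaling: standardize each word column over the whole corpus.
    z = (freqs - freqs.mean(axis=0)) / freqs.std(axis=0)
    # Feature vector normalization to length 1 (implicit in the cosine measure).
    unit = z / np.linalg.norm(z, axis=1, keepdims=True)

    n, m = z.shape
    manhattan = np.zeros((n, n))
    euclidean = np.zeros((n, n))
    cosine = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            manhattan[i, j] = np.abs(z[i] - z[j]).sum() / m   # Burrows's Delta
            euclidean[i, j] = np.linalg.norm(z[i] - z[j])
            cosine[i, j] = 1.0 - unit[i] @ unit[j]            # Cosine Delta
    return manhattan, euclidean, cosine
```

Attribution then assigns a disputed text to the candidate whose texts lie closest under the chosen measure; the article's finding is that normalizing the feature vectors to unit length is the decisive improvement, rather than the particular choice of distance.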


Citations
Journal ArticleDOI
TL;DR: A hybrid deep learning model for fine-grained sentiment prediction in real-time multimodal data that reinforces the strengths of deep learning nets in combination with machine learning to deal with two specific semiotic systems, namely the textual and visual systems.
Abstract: Detecting sentiments in natural language is tricky even for humans, making its automated detection more complicated. This research proffers a hybrid deep learning model for fine-grained sentiment prediction in real-time multimodal data. It reinforces the strengths of deep learning nets in combination with machine learning to deal with two specific semiotic systems, namely the textual (written text) and visual (still images), and their combination within online content using decision-level multimodal fusion. The proposed contextual ConvNet-SVMBoVW model has four modules, namely the discretization, text analytics, image analytics, and decision modules. The input to the model is multimodal text, m ∈ {text, image, info-graphic}. The discretization module uses Google Lens to separate the text from the image, which is then processed as discrete entities and sent to the respective text analytics and image analytics modules. The text analytics module determines the sentiment using a hybrid of a convolutional neural network (ConvNet) enriched with the contextual semantics of SentiCircle. An aggregation scheme is introduced to compute the hybrid polarity. A support vector machine (SVM) classifier trained using bag-of-visual-words (BoVW) predicts the sentiment of the visual content. A Boolean decision module with a logical OR operation is augmented to the architecture, which validates and categorizes the output on the basis of five fine-grained sentiment categories (truth values), namely ‘highly positive,’ ‘positive,’ ‘neutral,’ ‘negative’ and ‘highly negative.’ The accuracy achieved by the proposed model is nearly 91%, which is an improvement over the accuracy obtained by the text and image modules individually.
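As a rough illustration of the decision-level fusion described above, the sketch below combines a text polarity score and an image polarity score into the five fine-grained categories. The two score arguments stand in for the ConvNet-SentiCircle and SVM-BoVW modules, which are not reproduced here, and the thresholds and averaging rule are assumptions for illustration, not the paper's exact aggregation scheme.

```python
def to_label(score):
    # Map a polarity score in [-1, 1] onto five fine-grained categories
    # (thresholds are illustrative assumptions).
    if score <= -0.6:
        return 'highly negative'
    if score <= -0.2:
        return 'negative'
    if score < 0.2:
        return 'neutral'
    if score < 0.6:
        return 'positive'
    return 'highly positive'

def fuse(text_score=None, image_score=None):
    # OR-style decision: use whichever modality is present; when both are
    # present, average their polarities (a simplification of the model).
    present = [s for s in (text_score, image_score) if s is not None]
    if not present:
        return 'neutral'
    return to_label(sum(present) / len(present))
```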

96 citations

BookDOI
01 Jan 2018
TL;DR: In this paper, the authors give some illustrative insights into the spectrum of methods and model types from Computational Linguistics that one could in principle apply in the analysis of literary texts.
Abstract: In its first part, this article gives some illustrative insights into the spectrum of methods and model types from Computational Linguistics that one could in principle apply in the analysis of literary texts. The idea is to indicate the considerable potential that lies in a targeted refinement and extension of the analysis procedures, as they have been typically developed for newspaper texts and other everyday texts. The second part is a personal assessment of some key challenges for the integration of working practices from Computational Linguistics and Literary Studies, which ultimately leads to a plea for an approach that derives the validity of model-based empirical text analysis from the annotation of reference corpus data. This approach should make it possible, in perspective, to refine modeling techniques from Computational Linguistics in such a way that even complex hypotheses from Literary Theory can be addressed with differential, data-based experiments, which one should ideally be able to integrate into a hermeneutic argumentation.

39 citations

Journal ArticleDOI
TL;DR: This article proposes to revisit this authorship attribution problem by considering two effective methods (Burrows' Delta, Labbé's intertextual distance); a hierarchical clustering is applied showing that four clusters can be derived.
Abstract: The name Paul appears in 13 epistles, but is he the real author? According to different biblical scholars, the number of letters really attributed to Paul varies from 4 to 13, with a majority agreeing on seven. This article proposes to revisit this authorship attribution problem by considering two effective methods (Burrows' Delta, Labbé's intertextual distance). Based on these results, a hierarchical clustering is then applied showing that four clusters can be derived, namely: {Colossians‐Ephesians}, {1 and 2 Thessalonians}, {Titus, 1 and 2 Timothy}, and {Romans, Galatians, 1 and 2 Corinthians}. Moreover, a verification method based on the impostors' strategy indicates clearly that the group {Colossians‐Ephesians} is written by the same author who seems not to be Paul. The same conclusion can be found for the cluster {Titus, 1 and 2 Timothy}. The Letter to Philemon stays as a singleton, without any close stylistic relationship with the other epistles. Finally, a group of four letters {Romans, Galatians, 1 and 2 Corinthians} is certainly written by the same author (Paul), but the verification protocol also indicates that 2 Corinthians is related to 1 Thessalonians, rendering a clear and simple interpretation difficult.
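The clustering step of such a study can be sketched as follows, assuming a precomputed matrix of pairwise stylistic distances (e.g. Burrows's Delta or Labbé's intertextual distance between the epistles); the function and parameter names are illustrative.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def cluster_texts(distance_matrix, labels, n_clusters=4):
    """Agglomerative clustering of texts from a symmetric (n, n) matrix of
    pairwise stylistic distances; returns {cluster id: [text labels]}."""
    condensed = squareform(np.asarray(distance_matrix), checks=False)
    tree = linkage(condensed, method='average')
    assignments = fcluster(tree, t=n_clusters, criterion='maxclust')
    clusters = {}
    for label, cluster_id in zip(labels, assignments):
        clusters.setdefault(cluster_id, []).append(label)
    return clusters
```

Cutting the resulting dendrogram at four clusters would yield a partition comparable to the four groups of epistles reported above.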

17 citations


Cites methods from "Understanding and explaining Delta ..."

  • ...\( \sum_{i=1}^{m} \max(rtf_{iA}, rtf_{iB}) \) (1) With the Burrows’ Delta model, the relative term frequency \( rtf_{iA} \) of each term \( t_i \) in Text A is computed, as well as the mean (\( mean_i \)) and standard deviation (\( s_i \)) of that term over all texts belonging to the corpus.... (A code sketch of this computation follows this list.)

    [...]

  • ...As well-known strategies, one can mention Burrows’ Delta (Burrows, 2002; Evert et al., 2017) using the top m most frequent word-tokens (with m = 40 to 1,000), the Kullback–Leibler divergence (Zhao & Zobel, 2007) using a predefined set of 363 English words, or Labbé’s method (Labbé, 2014) based on…...

    [...]

  • ...This article proposes to revisit this authorship attribution problem by considering two effective methods (Burrows’ Delta, Labbé’s intertextual distance)....

    [...]

  • ...In this article, two computer-based authorship methods (Burrows’ Delta, Burrows, 2002), and intertextual distance (Labbé, 2014), have been applied....

    [...]

  • ...As well-known strategies, one can mention Burrows’ Delta (Burrows, 2002; Evert et al., 2017) using the top m most frequent word-tokens (with m = 40 to 1,000), the Kullback–Leibler divergence (Zhao & Zobel, 2007) using a predefined set of 363 English words, or Labbé’s method (Labbé, 2014) based on the entire vocabulary and opting for a variant of the Tanimoto distance, an approach found effective for Authorship Attribution (AA; Kocher & Savoy, 2017b)....

    [...]
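The equation quoted in the first snippet above, labelled (1), is the denominator of the distance used there: a sum of per-term maxima of relative term frequencies, which the last snippet describes as a variant of the Tanimoto distance. The sketch below is one plausible reading of that formula under the Tanimoto/Soergel interpretation; it is not the cited paper's code, and the function names are illustrative.

```python
from collections import Counter

def relative_term_frequencies(tokens):
    # rtf_i: occurrences of term t_i divided by the total text length.
    counts = Counter(tokens)
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()}

def tanimoto_style_distance(tokens_a, tokens_b):
    """Sum of absolute rtf differences divided by the sum of per-term
    maxima, i.e. the denominator quoted in Eq. (1); an interpretation of
    the snippet, not the cited paper's exact implementation."""
    rtf_a = relative_term_frequencies(tokens_a)
    rtf_b = relative_term_frequencies(tokens_b)
    vocabulary = set(rtf_a) | set(rtf_b)
    numerator = sum(abs(rtf_a.get(t, 0.0) - rtf_b.get(t, 0.0)) for t in vocabulary)
    denominator = sum(max(rtf_a.get(t, 0.0), rtf_b.get(t, 0.0)) for t in vocabulary)
    return numerator / denominator if denominator else 0.0
```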

References
Journal ArticleDOI
TL;DR: In this article, the authors discuss the problem of estimating the sampling distribution of a pre-specified random variable R(X, F) on the basis of the observed data x.
Abstract: We discuss the following problem: given a random sample \( X = (X_1, X_2, \ldots, X_n) \) from an unknown probability distribution F, estimate the sampling distribution of some prespecified random variable R(X, F) on the basis of the observed data x. (Standard jackknife theory gives an approximate mean and variance in the case \( R(X, F) = \theta(\hat{F}) - \theta(F) \), θ some parameter of interest.) A general method, called the “bootstrap”, is introduced, and shown to work satisfactorily on a variety of estimation problems. The jackknife is shown to be a linear approximation method for the bootstrap. The exposition proceeds by a series of examples: variance of the sample median, error rates in a linear discriminant analysis, ratio estimation, estimating regression parameters, etc.
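As a concrete illustration of the procedure summarized in this abstract, a minimal bootstrap estimate of the variability of the sample median (one of the examples the paper works through) might look as follows; the data and number of resamples are arbitrary.

```python
import numpy as np

def bootstrap_median_se(x, n_resamples=2000, seed=0):
    """Estimate the standard error of the sample median by repeatedly
    resampling the observed data x with replacement (Efron's bootstrap)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    medians = np.array([
        np.median(rng.choice(x, size=len(x), replace=True))
        for _ in range(n_resamples)
    ])
    return medians.std(ddof=1)

# Example: standard error of the median of a small sample.
print(bootstrap_median_se([2.1, 3.4, 1.8, 5.0, 4.2, 2.7, 3.9]))
```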

14,483 citations

Book
01 Jan 1974
TL;DR: This fourth edition of the highly successful Cluster Analysis represents a thorough revision of the third edition and covers new and developing areas such as classification likelihood and neural networks for clustering.
Abstract: Cluster analysis comprises a range of methods for classifying multivariate data into subgroups. By organising multivariate data into such subgroups, clustering can help reveal the characteristics of any structure or patterns present. These techniques are applicable in a wide range of areas such as medicine, psychology and market research. This fourth edition of the highly successful Cluster Analysis represents a thorough revision of the third edition and covers new and developing areas such as classification likelihood and neural networks for clustering. Real life examples are used throughout to demonstrate the application of the theory, and figures are used extensively to illustrate graphical techniques. The book is comprehensive yet relatively non-mathematical, focusing on the practical aspects of cluster analysis.

9,857 citations

Book ChapterDOI
01 Jan 2008

6,615 citations


"Understanding and explaining Delta ..." refers methods in this paper

  • ...For example, bootstrapping approaches (Efron 1979) cannot easily be applied because the clustering quality is not based on individual measurements for the texts in the sample but rather on the sample as a whole; permutation tests (Hunter & McCoy 2004) can only be used to show that a clustering is significantly better than chance, which is entirely obvious given the excellent ARI in our experiments; and calculating p-values for clusters (Suzuki & Shimodaira 2006) assumes that features are independent and identically distributed, which is clearly not the case for language data due to Zipf’s law....

    [...]


Journal ArticleDOI
TL;DR: Pvclust is an add-on package for a statistical software R to assess the uncertainty in hierarchical cluster analysis to perform the bootstrap analysis of clustering, which has been popular in phylogenetic analysis.
Abstract: Pvclust is an add-on package for the statistical software R that assesses the uncertainty in hierarchical cluster analysis. Pvclust can be used easily for general statistical problems, such as DNA microarray analysis, to perform the bootstrap analysis of clustering, which has been popular in phylogenetic analysis. Pvclust calculates probability values (p-values) for each cluster using bootstrap resampling techniques. Two types of p-values are available: the approximately unbiased (AU) p-value and the bootstrap probability (BP) value. Multiscale bootstrap resampling is used for the calculation of the AU p-value, which has less bias than the BP value calculated by ordinary bootstrap resampling. In addition, the computation time can be greatly decreased with the parallel computing option. Availability: The program is freely distributed under the GNU General Public License (GPL) and can be installed directly from CRAN (http://cran.r-project.org/), the official R package archive. The instructions and program source code are available at http://www.is.titech.ac.jp/~shimo/prog/pvclust Contact: ryota.suzuki@is.titech.ac.jp

2,155 citations


"Understanding and explaining Delta ..." refers methods in this paper

  • ...…better than chance, which is entirely obvious given the excellent ARI in our experiments; and calculating p-values for clusters (Suzuki & Shimodaira 2006) assumes that features are independent and identically distributed, which is clearly not the case for language data due to…...

    [...]

Journal IssueDOI
TL;DR: A survey of recent advances of the automated approaches to attributing authorship is presented, examining their characteristics for both text representation and text classification.
Abstract: Authorship attribution supported by statistical or computational methods has a long history starting from the 19th century and is marked by the seminal study of Mosteller and Wallace (1964) on the authorship of the disputed “Federalist Papers.” During the last decade, this scientific field has been developed substantially, taking advantage of research advances in areas such as machine learning, information retrieval, and natural language processing. The plethora of available electronic texts (e.g., e-mail messages, online forum messages, blogs, source code, etc.) indicates a wide variety of applications of this technology, provided it is able to handle short and noisy text from multiple candidate authors. In this article, a survey of recent advances of the automated approaches to attributing authorship is presented, examining their characteristics for both text representation and text classification. The focus of this survey is on computational requirements and settings rather than on linguistic or literary issues. We also discuss evaluation methodologies and criteria for authorship attribution studies and list open questions that will attract future work in this area. © 2009 Wiley Periodicals, Inc.

1,186 citations

Trending Questions (1)
What should be measured in authorship attribution?

The article discusses the importance of feature vector normalization and the profile of deviation across frequent words in authorship attribution.