Journal Article
‘Delta’: a Measure of Stylistic Difference and a Guide to Likely Authorship
TL;DR: A new way of using the relative frequencies of the very common words for comparing written texts and testing their likely authorship, which offers a simple but comparatively accurate addition to current methods of distinguishing the most likely author of texts exceeding about 1,500 words in length.
Abstract:
This paper is a companion to my 'Questions of authorship: attribution and beyond', in which I sketched a new way of using the relative frequencies of the very common words for comparing written texts and testing their likely authorship. The main emphasis of that paper was not on the new procedure but on the broader consequences of our increasing sophistication in making such comparisons and the increasing (although never absolute) reliability of our inferences about authorship. My present objects, accordingly, are to give a more complete account of the procedure itself; to report the outcome of an extensive set of trials; and to consider the strengths and limitations of the new procedure. The procedure offers a simple but comparatively accurate addition to our current methods of distinguishing the most likely author of texts exceeding about 1,500 words in length. It is of even greater value as a method of reducing the field of likely candidates for texts of as little as 100 words in length. Not unexpectedly, it works least well with texts of a genre uncharacteristic of their author and, in one case, with texts far separated in time across a long literary career. Its possible use for other classificatory tasks has not yet been investigated.
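The abstract does not spell out the computation, so the following is only a minimal sketch of how a Delta-style comparison is commonly formulated: take the most frequent words of the candidate corpus, convert each text's relative frequencies to z-scores, and average the absolute z-score differences between the test text and each candidate. The whitespace tokenizer, the function names, and the default of 150 words are assumptions made for illustration, not details taken from this page.

```python
from collections import Counter
from statistics import mean, pstdev


def tokenize(text):
    # Naive tokenizer (an assumption): lowercase, split on whitespace.
    return text.lower().split()


def relative_frequencies(text, vocabulary):
    # Share of the text's tokens taken up by each vocabulary word.
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid dividing by zero on empty input
    return [counts[word] / total for word in vocabulary]


def burrows_delta(test_text, candidate_texts, n_words=150):
    """Return {author: delta}; a lower value means a closer stylistic match.

    candidate_texts maps an author label to a sample of that author's writing.
    n_words (hypothetical default) caps the list of very common words.
    """
    # The very common words are taken from the candidate corpus as a whole.
    corpus_counts = Counter()
    for text in candidate_texts.values():
        corpus_counts.update(tokenize(text))
    vocabulary = [word for word, _ in corpus_counts.most_common(n_words)]

    profiles = {
        author: relative_frequencies(text, vocabulary)
        for author, text in candidate_texts.items()
    }

    # Mean and standard deviation of each word's frequency across candidates.
    columns = list(zip(*profiles.values()))
    means = [mean(column) for column in columns]
    stdevs = [pstdev(column) for column in columns]

    # Words that never vary across candidates cannot be standardized.
    usable = [i for i, s in enumerate(stdevs) if s > 0]
    if not usable:
        raise ValueError("need at least two candidates with differing word use")

    def z_scores(freqs):
        return [(freqs[i] - means[i]) / stdevs[i] for i in usable]

    test_z = z_scores(relative_frequencies(test_text, vocabulary))
    return {
        author: mean([abs(t - c) for t, c in zip(test_z, z_scores(freqs))])
        for author, freqs in profiles.items()
    }


if __name__ == "__main__":
    # Toy samples, invented purely to exercise the function.
    samples = {
        "A": "the cat and the dog and the bird and the fish " * 20,
        "B": "a cat or a dog or a bird or a fish " * 20,
        "C": "the cat or the dog but a bird and a fish " * 20,
    }
    disputed = "the horse and the cow and the sheep " * 10
    scores = burrows_delta(disputed, samples)
    print(min(scores, key=scores.get), scores)
```

The candidate with the lowest score is the nearest stylistic match. As the abstract cautions, such a score is better read as a way of narrowing the field of candidates than as proof of authorship, especially for short texts.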
Citations
Journal Issue
A survey of modern authorship attribution methods
TL;DR: A survey of recent advances in automated approaches to attributing authorship is presented, examining their characteristics for both text representation and text classification.
Book
Authorship Attribution
TL;DR: This review shows that the authorship attribution discipline is quite successful, even in difficult cases involving small documents in unfamiliar and less studied languages; it further analyzes the types of analysis and features used and tries to determine characteristics of well-performing systems, finally formulating these in a set of recommendations for best practices.
Journal Issue
Computational methods in authorship attribution
TL;DR: Three scenarios are considered here for which solutions to the basic attribution problem are inadequate; for each, it is shown how machine learning methods can be adapted to handle the special challenges of that variant.
Proceedings Article
Authorship attribution
TL;DR: In this paper, the authors explore information retrieval methods for authorship attribution, such as tf-idf representations with support vector machines, and parametric and nonparametric methods with supervised and unsupervised classification techniques.
References
Journal Article
Feature-Finding for Text Classification
Richard Forsyth, David I. Holmes
TL;DR: Results of a benchmark test on ten representative text-classification problems suggest that the technique here designated Monte-Carlo Feature-Finding has certain advantages that deserve consideration by future workers in this area.
Journal Article
The application of principal component analysis to stylometry
Jose N. Binongo, M. W. A. Smith
TL;DR: In recent years principal component analysis has become popular for investigations in computational stylistics, particularly for studies of authorship, but the mathematical nature of the theory that underpins the method makes it rather inaccessible to linguists and literary scholars.
Journal Article
A widow and her soldier: Stylometry and the American Civil War
TL;DR: This investigation strongly suggests that Pickett's widow, LaSalle Corbell Pickett, did compose the published letters, whose authenticity has been questioned, at least in part, by writers and historians of the Civil War.
Journal Article
The Provenance of De Doctrina Christiana, attributed to John Milton: A Statistical Investigation
TL;DR: In this paper, a stylometric analysis of De Doctrina Christiana, a theological treatise attributed to John Milton, was performed, and the authors found that frequently occurring words are effective authorial discriminators between the treatise and the control texts.