
Papers by Ted Underwood published in 2020


Journal ArticleDOI
TL;DR: This article shows that machine learning algorithms are actually bad at being objective and rather good at absorbing the human perspectives implicit in the evidence used to train them, which makes them useful for a discipline more concerned with differences of interpretation than with objective description.
Abstract: Numbers appear to have limited value for literary study, since our discipline is usually more concerned with exploring differences of interpretation than with describing the objective features of literary works. But it may be time to reexamine the assumption that numbers are useful only for objective description. Machine learning algorithms are actually bad at being objective and rather good at absorbing human perspectives implicit in the evidence used to train them. To dramatize perspectival uses of machine learning, I train models of genre on groups of books categorized by historical actors who range from Edwardian advertisers to contemporary librarians. Comparing the perspectives implicit in their choices casts new light on received histories of genre. Scientific romance and science fiction—whose shifting names have often suggested a fractured history—turn out to be more stable across two centuries than the genre we call fantasy. (TU)
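A minimal sketch of the modeling move described above, assuming a simple bag-of-words classifier: train a genre model on labels from one group of historical actors, then ask how well that learned perspective reproduces another group's choices. The corpus, labels, and feature choices below are hypothetical stand-ins, not the paper's actual data or pipeline.

```python
# Minimal sketch of a "perspectival" genre model: fit a classifier on one
# historical group's genre labels, then measure how often it reproduces a
# different group's labeling of the same books. All data is hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

books = [
    "rocket ships crossed the void between dying worlds",
    "the detective lit a cigarette and studied the ransom note",
    "a machine intelligence awoke inside the lunar colony",
    "she inherited the manor and its long-buried family secrets",
]
advertiser_labels = [1, 0, 1, 0]  # genre as one set of historical actors saw it
librarian_labels = [1, 0, 1, 0]   # genre as contemporary librarians see it

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(books)

# Model one perspective...
model = LogisticRegression().fit(X, advertiser_labels)

# ...then ask how often it agrees with the other perspective's choices.
agreement = model.score(X, librarian_labels)
print(f"advertiser-trained model matches librarians {agreement:.0%} of the time")
```

Comparing such agreement rates across label sources is one way to quantify how stable a genre's boundaries remain from one historical perspective to another.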

10 citations


Journal ArticleDOI
TL;DR: This article argues that critics of computation disagree with practitioners even about math. It suggests that internal critiques can sometimes bridge the gap between hardened positions, and it credits Da for trying to produce one.
Abstract: Quantitative literary research has a history stretching back to the early twentieth century and has attracted criticism for almost as long. But most critics of the project have argued, along with Stanley Fish, that numbers are useless because they fail to produce humanistic meaning. By contrast, Nan Z. Da’s “Computational Case against Computational Literary Studies” takes its stand inside the world of numbers in order to argue that mathematical approaches to literature must fail on mathematical grounds (see Nan Z. Da, “Computational Case against Computational Literary Studies,” Critical Inquiry 45 [Spring 2019]: 601–39). Internal critiques can sometimes bridge the gap between hardened positions, and Da deserves credit for trying to produce one. But it appears that critics of computation disagree with practitioners even about math. An online forum held shortly after the article’s publication brought together eight scholars, several of whom had escaped Da’s criticism. Of that group, only one (whose work doesn’t emphasize computation) was persuaded by Da’s quantitative argument.

4 citations


01 Jan 2020
TL;DR: This work studies the strength of textual differentiation between genres, and between genre fiction collectively and the rest of the fiction market, in a collection of English-language books stretching from 1860 to 2009. Both sources of genre labels support an account in which genre differentiation rises to the middle of the twentieth century and declines by the end of the century.
Abstract: The organization of fiction into genres is a relatively recent innovation. We cast new light on the history of this practice by studying the strength of the textual differentiation between genres, and between genre fiction collectively and the rest of the fiction market, in a collection of English-language books stretching from 1860 to 2009. To measure differentiation, we adapt distance measures that have been used to evaluate the strength of clustering. We use genre labels from two different sources: the Library of Congress headings assigned by librarians, and the genre categories implicit in book reviews published by Kirkus Reviews, covering books from 1928 to 2009. Both sources support an account that has genre differentiation rising to (roughly) the middle of the twentieth century, and declining by the end of the century.
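One plausible way to adapt cluster-evaluation distance measures to genre differentiation, sketched below, is to compare the average cosine distance between books across genres to the average distance within genres. The paper's exact measure may differ, and the vectors and labels here are invented for illustration.

```python
# Hedged sketch: genre differentiation as the ratio of mean between-genre
# to mean within-genre cosine distance, in the spirit of cluster-strength
# measures. This exact formula is an illustrative assumption, not the
# paper's published measure.
import numpy as np
from sklearn.metrics.pairwise import cosine_distances

def differentiation(X, labels):
    """Ratio of mean between-genre to mean within-genre distance.
    Values well above 1.0 suggest strongly differentiated genres."""
    labels = np.asarray(labels)
    D = cosine_distances(X)
    same_genre = labels[:, None] == labels[None, :]
    off_diagonal = ~np.eye(len(labels), dtype=bool)
    within = D[same_genre & off_diagonal].mean()
    between = D[~same_genre].mean()
    return between / within

# Hypothetical usage: rows are word-count vectors, one per book.
X = np.array([[5, 0, 1], [4, 1, 0], [0, 5, 2], [1, 4, 1]], dtype=float)
labels = ["science fiction", "science fiction", "mystery", "mystery"]
print(f"differentiation: {differentiation(X, labels):.2f}")
```

Tracking such a ratio across publication decades would yield the kind of rise-and-fall curve the abstract describes.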

3 citations


Proceedings ArticleDOI
01 Aug 2020
TL;DR: This paper presents a use case in which an English-literature dataset of 178,381 volumes, curated by the HathiTrust Research Center (HTRC), is used to measure change in three literary genres.
Abstract: This paper investigates the limitations and challenges of the curated datasets that digital libraries provide in support of digital humanities (DH) research. Our work presents a use case that employs an English-literature dataset of 178,381 volumes, curated by the HathiTrust Research Center (HTRC), to measure change in three literary genres. These volumes were selected from over 17 million digitized items in the HathiTrust Digital Library. We demonstrate our methods and workflow for improving the representativeness and scholarly usability of the existing datasets. We analyze and effectively overcome three common limitations: duplicate volumes, uneven distribution of data, and OCR errors. We suggest that stakeholders of digital libraries should flag and address these limitations to improve the usability of the datasets they provide for digital humanities research.
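As an illustration of the deduplication step, one common approach (not necessarily the one used in this paper's workflow) collapses volumes whose normalized author and title metadata match; the record fields and normalization rules below are assumptions.

```python
# Illustrative deduplication sketch: collapse HathiTrust volumes of the same
# work by normalizing author/title metadata into a shared key. Field names
# and normalization rules are hypothetical, not the HTRC workflow's own.
import re
from collections import defaultdict

def work_key(author: str, title: str) -> str:
    """Crude key: lowercase, then drop everything but letters and digits."""
    clean = lambda s: re.sub(r"[^a-z0-9]", "", s.lower())
    return f"{clean(author)}|{clean(title)}"

def deduplicate(volumes):
    """Keep one volume per normalized work key (here, the first seen)."""
    groups = defaultdict(list)
    for vol in volumes:
        groups[work_key(vol["author"], vol["title"])].append(vol)
    return [duplicates[0] for duplicates in groups.values()]

volumes = [
    {"id": "uc1.b1", "author": "Wells, H. G.", "title": "The Time Machine"},
    {"id": "mdp.002", "author": "Wells, H.G.", "title": "The time machine."},
]
print(len(deduplicate(volumes)))  # -> 1
```

Real-world deduplication usually also has to decide which copy to keep, for instance preferring the volume with the cleanest OCR.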