Showing papers by "J. Stephen Downie" published in 2021

Open Access

Journal Article•DOI

Giving shape to large digital libraries through exploratory data analysis

Peter Organisciak¹, Benjamin M. Schmidt², J. Stephen Downie³•Institutions (3)

University of Denver¹, New York University², University of Illinois at Urbana–Champaign³

13 Jul 2021-Journal of the Association for Information Science and Technology

TL;DR: This work examines the role that exploratory data analysis and visualization tools may play in understanding large bibliographic datasets and presents one such tool, HathiTrust+Bookworm, which allows multifaceted exploration of the multimillion work HathiTrust Digital Library.

Abstract: The emergence of large multi‐institutional digital libraries has opened the door to aggregate‐level examinations of the published word. Such large‐scale analysis offers a new way to pursue traditional problems in the humanities and social sciences, using digital methods to ask routine questions of large corpora. However, inquiry into multiple centuries of books is constrained by the burdens of scale, where statistical inference is technically complex and limited by hurdles to access and flexibility. This work examines the role that exploratory data analysis and visualization tools may play in understanding large bibliographic datasets. We present one such tool, HathiTrust+Bookworm, which allows multifaceted exploration of the multimillion work HathiTrust Digital Library, and center it in the broader space of scholarly tools for exploratory data analysis.

10 citations

Journal Article•DOI

Synthetic Biology Knowledge System.

Jeanet Mante¹, Yikai Hao², Jacob Jett³, Udayan Joshi², Kevin W Keating⁴, Xiang Lu², Gaurav Nakum², Nicholas E. Rodriguez⁵, Jiawei Tang², Logan Terry⁶, Xuanyu Wu², Eric Yu⁶, J. Stephen Downie³, Bridget T. McInnes⁵, Mai H. Nguyen², Brandon Sepulvado⁷, Eric M. Young⁴, Chris J. Myers¹•Institutions (7)

University of Colorado Boulder¹, University of California, San Diego², University of Illinois at Urbana–Champaign³, Worcester Polytechnic Institute⁴, Virginia Commonwealth University⁵, University of Utah⁶, University of Chicago⁷

17 Sep 2021-ACS Synthetic Biology

TL;DR: The Synthetic Biology Knowledge System (SBKS) is an instance of the SynBioHub repository that includes text and data information mined from papers published in ACS Synthetic Biology.

Abstract: The Synthetic Biology Knowledge System (SBKS) is an instance of the SynBioHub repository that includes text and data information that has been mined from papers published in ACS Synthetic Biology. This paper describes the SBKS curation framework that is being developed to construct the knowledge stored in this repository. The text mining pipeline performs automatic annotation of the articles using natural language processing techniques to identify salient content such as key terms, relationships between terms, and main topics. The data mining pipeline performs automatic annotation of the sequences extracted from the supplemental documents with the genetic parts used in them. Together these two pipelines link genetic parts to papers describing the context in which they are used. Ultimately, SBKS will reduce the time necessary for synthetic biologists to find the information necessary to complete their designs.

6 citations

The Gutenberg-HathiTrust Parallel Corpus: A Real-World Dataset for Noise Investigation in Uncorrected OCR Texts
