Jouni Sirén
Researcher at University of California, Santa Cruz
Publications - 57
Citations - 2083
Jouni Sirén is an academic researcher at the University of California, Santa Cruz. The author has contributed to research on topics including Population and Compressed suffix array. The author has an h-index of 19 and has co-authored 52 publications receiving 1558 citations. Previous affiliations of Jouni Sirén include the University of Chile and the Helsinki Institute for Information Technology.
Papers
Journal Article
Variation graph toolkit improves read mapping by representing genetic variation in the reference.
Erik Garrison, Jouni Sirén, Adam M. Novak, Glenn Hickey, Jordan M. Eizenga, Eric T. Dawson, William J. Jones, Shilpa Garg, Charles Markello, Michael F. Lin, Benedict Paten, Richard Durbin +13 more
TL;DR: vg is a toolkit of computational methods for creating, manipulating, and using variation graphs as references at the scale of the human genome. It provides an efficient approach to mapping reads onto arbitrary variation graphs using generalized compressed suffix arrays, with improved accuracy over alignment to a linear reference.
Journal Article
Indexing graphs for path queries with applications in genome research
TL;DR: The Burrows-Wheeler transform is extended from strings to acyclic directed labeled graphs to support path queries as an extension of substring searching, and several applications of such extensions are studied.
Journal Article
Storage and Retrieval of Highly Repetitive Sequence Collections
TL;DR: New static and dynamic full-text indexes are developed that are capable of capturing the fact that a collection is highly repetitive, and that require space roughly proportional to the length of one typical sequence plus the total number of edit operations.
Journal Article
Genotyping structural variants in pangenome graphs using the vg toolkit.
Glenn Hickey, David Heller, Jean Monlong, Jonas Andreas Sibbesen, Jouni Sirén, Jordan M. Eizenga, Eric T. Dawson, Erik Garrison, Adam M. Novak, Benedict Paten +10 more
TL;DR: It is shown that variation graphs, as implemented in the vg toolkit, provide an effective means for leveraging SV catalogs in short-read SV genotyping experiments. vg is benchmarked against state-of-the-art SV genotypers using three sequence-resolved SV catalogs generated by recent long-read sequencing studies.
Book Chapter
Run-Length Compressed Indexes Are Superior for Highly Repetitive Sequence Collections
TL;DR: It is shown that the state-of-the-art entropy-bound full-text self-indexes do not yet provide satisfactory space bounds for this specific task. New structures that use run-length encoding are engineered, and empirical evidence is given that these structures are superior to the current ones.