Jouni Sirén

Researcher at University of California, Santa Cruz

Publications -  57
Citations -  2083

Jouni Sirén is an academic researcher from University of California, Santa Cruz. The author has contributed to research in topics: Population & Compressed suffix array. The author has an h-index of 19, co-authored 52 publications receiving 1558 citations. Previous affiliations of Jouni Sirén include University of Chile & Helsinki Institute for Information Technology.

Papers
Journal Article

Variation graph toolkit improves read mapping by representing genetic variation in the reference.

TL;DR: vg is a toolkit of computational methods for creating, manipulating, and using variation graphs as references at the scale of the human genome. It provides an efficient approach to mapping reads onto arbitrary variation graphs using generalized compressed suffix arrays, with improved accuracy over alignment to a linear reference.
Journal Article

Indexing graphs for path queries with applications in genome research

TL;DR: The Burrows-Wheeler transform is extended from strings to acyclic directed labeled graphs to support path queries as an extension of substring searching, and several applications of such extensions are studied.
Journal Article

Storage and Retrieval of Highly Repetitive Sequence Collections

TL;DR: New static and dynamic full-text indexes are developed that are capable of capturing the fact that a collection is highly repetitive, and require space basically proportional to the length of one typical sequence plus the total number of edit operations.
Journal Article

Genotyping structural variants in pangenome graphs using the vg toolkit.

TL;DR: It is shown that variation graphs, as implemented in the vg toolkit, provide an effective means for leveraging SV catalogs for short-read SV genotyping experiments. The toolkit is benchmarked against state-of-the-art SV genotypers using three sequence-resolved SV catalogs generated by recent long-read sequencing studies.
Book Chapter

Run-Length Compressed Indexes Are Superior for Highly Repetitive Sequence Collections

TL;DR: It is shown that the state-of-the-art entropy-bound full-text self-indexes do not yet provide satisfactory space bounds for this specific task. New structures that use run-length encoding are engineered, and empirical evidence is given that these structures are superior to the current ones.