Author

Rob Patro

Other affiliations: Carnegie Mellon University, Stony Brook University

Bio: Rob Patro is an academic researcher from the University of Maryland, College Park. The author has contributed to research on topics including De Bruijn graphs and computer science. The author has an h-index of 28 and has co-authored 103 publications receiving 7,105 citations. Previous affiliations of Rob Patro include Carnegie Mellon University and Stony Brook University.

Topics: De Bruijn graph, Computer science, Medicine, De Bruijn sequence, Bioconductor

Papers published on a yearly basis (chart covering 2009 to 2023)

Papers

Journal Article•DOI•

Salmon provides fast and bias-aware quantification of transcript expression

Rob Patro¹, Geet Duggal, Michael I. Love², Rafael A. Irizarry², Carl Kingsford³•Institutions (3)

Stony Brook University¹, Harvard University², Carnegie Mellon University³

01 Apr 2017-Nature Methods

TL;DR: Salmon is the first transcriptome-wide quantifier to correct for fragment GC-content bias, which substantially improves the accuracy of abundance estimates and the sensitivity of subsequent differential expression analysis.

Abstract: We introduce Salmon, a lightweight method for quantifying transcript abundance from RNA-seq reads. Salmon combines a new dual-phase parallel inference algorithm and feature-rich bias models with an ultra-fast read mapping procedure. It is the first transcriptome-wide quantifier to correct for fragment GC-content bias, which, as we demonstrate here, substantially improves the accuracy of abundance estimates and the sensitivity of subsequent differential expression analysis.

6,095 citations

Journal Article•DOI•

Sailfish enables alignment-free isoform quantification from RNA-seq reads using lightweight algorithms

Rob Patro¹, Stephen M. Mount², Carl Kingsford¹•Institutions (2)

Carnegie Mellon University¹, University of Maryland, College Park²

01 May 2014-Nature Biotechnology

TL;DR: Sailfish, a computational method for quantifying the abundance of previously annotated RNA isoforms from RNA-seq data, exemplifies the potential of lightweight algorithms for efficiently processing sequencing reads.

Abstract: A new algorithm speeds up the quantification of transcripts from RNA-seq data by doing away with read mapping.

612 citations

Journal Article•DOI•

TransRate: reference-free quality assessment of de novo transcriptome assemblies

Richard Smith-Unna¹, Chris Boursnell¹, Rob Patro², Julian M. Hibberd¹, Steven L. Kelly³•Institutions (3)

University of Cambridge¹, Stony Brook University², University of Oxford³

01 Jun 2016-Genome Research

TL;DR: TransRate is a tool for reference-free quality assessment of de novo transcriptome assemblies that uses only the sequenced reads and the assembly as input. The authors find that variance in the quality of the input data explains 43% of the variance in the quality of published de novo transcriptome assemblies.

Abstract: TransRate is a tool for reference-free quality assessment of de novo transcriptome assemblies. Using only the sequenced reads and the assembly as input, we show that multiple common artifacts of de novo transcriptome assembly can be readily detected. These include chimeras, structural errors, incomplete assembly, and base errors. TransRate evaluates these errors to produce a diagnostic quality score for each contig, and these contig scores are integrated to evaluate whole assemblies. Thus, TransRate can be used for de novo assembly filtering and optimization as well as comparison of assemblies generated using different methods from the same input reads. Applying the method to a data set of 155 published de novo transcriptome assemblies, we deconstruct the contribution that assembly method, read length, read quantity, and read quality make to the accuracy of de novo transcriptome assemblies and reveal that variance in the quality of the input data explains 43% of the variance in the quality of published de novo transcriptome assemblies. Because TransRate is reference-free, it is suitable for assessment of assemblies of all types of RNA, including assemblies of long noncoding RNA, rRNA, mRNA, and mixed RNA samples.

585 citations

Journal Article•DOI•

Sailfish: Alignment-free Isoform Quantification from RNA-seq Reads using Lightweight Algorithms

Rob Patro¹, Stephen M. Mount², Carl Kingsford¹•Institutions (2)

Carnegie Mellon University¹, University of Maryland, College Park²

16 Aug 2013-arXiv: Genomics

TL;DR: Sailfish is a novel computational method for quantifying the abundance of previously annotated RNA isoforms from RNA-seq data. It avoids mapping reads, which is a time-consuming step in all current methods.

Abstract: RNA-seq has rapidly become the de facto technique to measure gene expression. However, the time required for analysis has not kept up with the pace of data generation. Here we introduce Sailfish, a novel computational method for quantifying the abundance of previously annotated RNA isoforms from RNA-seq data. Sailfish entirely avoids mapping reads, which is a time-consuming step in all current methods. Sailfish provides quantification estimates much faster than existing approaches (typically 20-times faster) without loss of accuracy.

399 citations

Journal Article•DOI•

Global network alignment using multiscale spectral signatures

Rob Patro¹, Carl Kingsford¹•Institutions (1)

University of Maryland, College Park¹

01 Dec 2012-Bioinformatics

TL;DR: This work introduces GHOST, a global pairwise network aligner that uses a novel spectral signature to measure topological similarity between subnetworks and is able to recover larger and more biologically significant shared subnetworks between species.

Abstract: Motivation: Protein interaction networks provide an important system-level view of biological processes. One of the fundamental problems in biological network analysis is the global alignment of a pair of networks, which puts the proteins of one network into correspondence with the proteins of another network in a manner that conserves their interactions while respecting other evidence of their homology. By providing a mapping between the networks of different species, alignments can be used to inform hypotheses about the functions of unannotated proteins, the existence of unobserved interactions, the evolutionary divergence between the two species and the evolution of complexes and pathways. Results: We introduce GHOST, a global pairwise network aligner that uses a novel spectral signature to measure topological similarity between subnetworks. It combines a seed-and-extend global alignment phase with a local search procedure and exceeds state-of-the-art performance on several network alignment tasks. We show that the spectral signature used by GHOST is highly discriminative, whereas the alignments it produces are also robust to experimental noise. When compared with other recent approaches, we find that GHOST is able to recover larger and more biologically significant, shared subnetworks between species. Availability: An efficient and parallelized implementation of GHOST, released under the Apache 2.0 license, is available at http://cbcb.umd.edu/kingsford_group/ghost Contact: rob@cs.umd.edu

227 citations

Cited by

Journal Article•DOI•

Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2

Michael I. Love¹,², Wolfgang Huber, Simon Anders•Institutions (2)

Harvard University¹, Max Planck Society²

05 Dec 2014-Genome Biology

TL;DR: This work presents DESeq2, a method for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates, which enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression.

Abstract: In comparative high-throughput sequencing assays, a fundamental task is the analysis of count data, such as read counts per gene in RNA-seq, for evidence of systematic changes across experimental conditions. Small replicate numbers, discreteness, large dynamic range and the presence of outliers require a suitable statistical approach. We present DESeq2, a method for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates. This enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression. The DESeq2 package is available at http://www.bioconductor.org/packages/release/bioc/html/DESeq2.html .

47,038 citations

Pattern Recognition and Machine Learning

Christopher M. Bishop¹•Institutions (1)

Microsoft¹

01 Jan 2006

TL;DR: This book covers probability distributions and linear models for regression and classification, along with a discussion of combining models in the context of machine learning.

Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

SPAdes, a new genome assembly algorithm and its applications to single-cell sequencing (7th Annual SFAF Meeting, 2012)

Glenn Tesler

01 Jun 2012

TL;DR: SPAdes is a new assembler for both single-cell and standard (multicell) assembly. It is shown to improve on the recently released E+V-SC assembler and on popular assemblers Velvet and SoapDeNovo (for multicell data).

Abstract: The lion's share of bacteria in various environments cannot be cloned in the laboratory and thus cannot be sequenced using existing technologies. A major goal of single-cell genomics is to complement gene-centric metagenomic data with whole-genome assemblies of uncultivated organisms. Assembly of single-cell data is challenging because of highly non-uniform read coverage as well as elevated levels of sequencing errors and chimeric reads. We describe SPAdes, a new assembler for both single-cell and standard (multicell) assembly, and demonstrate that it improves on the recently released E+V-SC assembler (specialized for single-cell data) and on popular assemblers Velvet and SoapDeNovo (for multicell data). SPAdes generates single-cell assemblies, providing information about genomes of uncultivatable bacteria that vastly exceeds what may be obtained via traditional metagenomics studies. SPAdes is available online ( http://bioinf.spbau.ru/spades ). It is distributed as open source software.

10,124 citations

Journal Article•DOI•

Near-optimal probabilistic RNA-seq quantification

Nicolas Bray¹, Harold Pimentel¹, Páll Melsted², Lior Pachter¹•Institutions (2)

University of California, Berkeley¹, University of Iceland²

01 May 2016-Nature Biotechnology

TL;DR: Kallisto pseudoaligns reads to a reference, producing a list of transcripts that are compatible with each read while avoiding alignment of individual bases, which removes a major computational bottleneck in RNA-seq analysis.

Abstract: We present kallisto, an RNA-seq quantification program that is two orders of magnitude faster than previous approaches and achieves similar accuracy. Kallisto pseudoaligns reads to a reference, producing a list of transcripts that are compatible with each read while avoiding alignment of individual bases. We use kallisto to analyze 30 million unaligned paired-end RNA-seq reads in <10 min on a standard laptop computer. This removes a major computational bottleneck in RNA-seq analysis.

6,468 citations