De novo Identification of DNA Modifications Enabled by Genome-Guided Nanopore Signal Processing

doi:10.1101/094672

Open Access, Posted Content

Marcus H. Stoiber, +8 more

- 15 Dec 2016 -

bioRxiv

- pp 094672

TL;DR:

The first algorithm for the identification of modified nucleotides without the need for prior training data is presented along with the open source software implementation, nanoraw, which accurately assigns contiguous raw nanopore signal to genomic positions, enabling novel data visualization and increasing power and accuracy for the discovery of covalently modified bases in native DNA.

Abstract:

Advances in nanopore sequencing technology have enabled investigation of the full catalogue of covalent DNA modifications. We present the first algorithm for the identification of modified nucleotides without the need for prior training data, along with the open-source software implementation, nanoraw. Nanoraw accurately assigns contiguous raw nanopore signal to genomic positions, enabling novel data visualization and increasing power and accuracy for the discovery of covalently modified bases in native DNA. Ground-truth case studies utilizing synthetically methylated DNA show the capacity to identify three distinct methylation marks, 4mC, 5mC, and 6mA, in seven distinct sequence contexts without any changes to the algorithm. We demonstrate quantitative reproducibility, simultaneously identifying 5mC and 6mA in native E. coli across biological replicates processed in different labs. Finally, we propose a pipeline for the comprehensive discovery of DNA modifications in any genome without a priori knowledge of their chemical identities.
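The abstract does not spell out the statistical procedure. As a rough illustration only, the sketch below assumes that once raw signal has been assigned to genomic positions, a training-free comparison could be made by testing, position by position, whether signal levels in a native sample are shifted relative to a modification-free control. The choice of a rank test is suggested only by the Mann and Whitney entry in the reference list; the function names `mann_whitney_u` and `flag_shifted_positions`, the dictionary layout, the `alpha` threshold, and the toy numbers are all hypothetical and are not taken from nanoraw.

```python
import math
from typing import Dict, List, Sequence, Tuple


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U statistic for x, two-sided p-value).

    The p-value uses the large-sample normal approximation without a tie
    correction, so it is only a rough guide for very small samples.
    """
    n, m = len(x), len(y)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")
    # Pool both samples, remembering which sample each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at index i.
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of ranks i+1 .. j (1-based).
        midrank = (i + 1 + j) / 2.0
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n * (n + 1) / 2.0
    mean = n * m / 2.0
    sd = math.sqrt(n * m * (n + m + 1) / 12.0)
    z = (u - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p


def flag_shifted_positions(
    native: Dict[int, List[float]],
    control: Dict[int, List[float]],
    alpha: float = 1e-3,
) -> List[int]:
    """List positions whose native signal levels differ from the control.

    Both inputs map a genomic position to the signal levels observed there
    across reads (a hypothetical layout chosen for this sketch).
    """
    flagged = []
    for pos in sorted(set(native) & set(control)):
        _, p = mann_whitney_u(native[pos], control[pos])
        if p < alpha:
            flagged.append(pos)
    return flagged


if __name__ == "__main__":
    # Toy data: position 10 is shifted in the native sample, position 11 is not.
    base = [80.0 + 0.1 * i for i in range(20)]
    native = {10: [v + 10.0 for v in base], 11: list(base)}
    control = {10: list(base), 11: list(base)}
    print(flag_shifted_positions(native, control))
```

A rank-based test is used here because it makes no assumption about the shape of the signal distribution. A real analysis would also need multiple-testing control across positions, which this sketch omits.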

Citations

Journal Article

The Architecture of SARS-CoV-2 Transcriptome.

Dong Wan Kim, +5 more

- 14 May 2020 -

Cell

TL;DR: Functional investigation of the unknown transcripts and RNA modifications discovered in this study will open new directions to the understanding of the life cycle and pathogenicity of SARS-CoV-2.

Journal Article

Opportunities and challenges in long-read sequencing data analysis.

Shanika L. Amarasinghe, +10 more

- 07 Feb 2020 -

Genome Biology

TL;DR: This review covers the current landscape of available tools, focusing on the principles of error correction, base modification detection, and long-read transcriptomics analysis, and highlights the challenges that remain.

Journal Article

From squiggle to basepair: computational approaches for improving nanopore sequencing read accuracy.

Franka J. Rang, +2 more

- 13 Jul 2018 -

Genome Biology

TL;DR: Computational approaches determining the nanopore sequencing error rate are reviewed, and strategies for translation of raw sequencing data into base calls for detection of base modifications and for obtaining consensus sequences are outlined.

Journal Article

Long-read human genome sequencing and its applications.

Glennis A. Logsdon, +2 more

- 05 Jun 2020 -

Nature Reviews Genetics

TL;DR: The currently available platforms, how the technologies are being applied to assemble and phase human genomes, and their impact on the understanding of human genetic variation are discussed.

Journal Article

The RNA modification landscape in human disease.

Nicky Jonkhout, +5 more

- 30 Aug 2017 -

RNA

TL;DR: This work summarizes the state of knowledge and provides a catalog of RNA modifications and their links to neurological disorders, cancers, and other diseases, expecting that this catalog will help prioritize those RNA modifications for transcriptome-wide maps.

References

Book

Statistical Methods for Research Workers

R. A. Fisher

TL;DR: The prime object of this book is to put into the hands of research workers, and especially of biologists, the means of applying statistical tests accurately to numerical data accumulated in their own laboratories or available in the literature.

Journal Article

On a Test of Whether one of Two Random Variables is Stochastically Larger than the Other

Henry B. Mann, +1 more

- 01 Mar 1947 -

Annals of Mathematical Statistics

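TL;DR: In this paper, the authors propose a rank-based statistic U for testing whether two samples of sizes n and m come from the same distribution, tabulate its distribution for samples up to n = m = 8, and show that the limit distribution is normal if m and n go to infinity in any arbitrary manner.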

Posted Content

Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM

Heng Li

- 16 Mar 2013 -

arXiv: Genomics

TL;DR: BWA-MEM automatically chooses between local and end-to-end alignments, supports paired-end reads, and performs chimeric alignment. The algorithm is robust to sequencing errors and applicable to a wide range of sequence lengths, from 70 bp to a few megabases.

Proceedings Article

Fitting a mixture model by expectation maximization to discover motifs in biopolymers.

Timothy L. Bailey, +1 more

TL;DR: The algorithm described in this paper discovers one or more motifs in a collection of DNA or protein sequences by using the technique of expectation maximization to fit a two-component finite mixture model to the set of sequences.

Journal Article

Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation.

Sergey Koren, +5 more

- 15 Mar 2017 -

Genome Research

TL;DR: Canu, a successor of Celera Assembler that is specifically designed for noisy single-molecule sequences, is presented, demonstrating that Canu can reliably assemble complete microbial genomes and near-complete eukaryotic chromosomes using either Pacific Biosciences or Oxford Nanopore technologies.
