Optimal gap-affine alignment in O(s) space

TLDR
The bidirectional WFA algorithm (BiWFA), the first gap-affine algorithm capable of computing optimal alignments in O(s) memory while retaining WFA’s time complexity of O(ns), is presented.
Abstract
Motivation: Pairwise sequence alignment remains a fundamental problem in computational biology and bioinformatics. Recent advances in genomics and sequencing technologies demand faster and scalable algorithms that can cope with the ever-increasing sequence lengths. Classical pairwise alignment algorithms based on dynamic programming are strongly limited by quadratic requirements in time and memory. The recently proposed wavefront alignment algorithm (WFA) introduced an efficient algorithm to perform exact gap-affine alignment in O(ns) time, where s is the optimal score and n is the sequence length. Notwithstanding these bounds, WFA's O(s^2) memory requirements become computationally impractical for genome-scale alignments, leading to a need for further improvement.

Results: In this paper, we present the bidirectional WFA algorithm (BiWFA), the first gap-affine algorithm capable of computing optimal alignments in O(s) memory while retaining WFA's time complexity of O(ns). As a result, this work improves the lowest known memory bound O(n) to compute gap-affine alignments. In practice, our implementation never requires more than a few hundred MBs aligning noisy Oxford Nanopore Technologies reads up to 1 Mbp long while maintaining competitive execution times.

Availability: All code is publicly available at https://github.com/smarco/BiWFA-paper

Contact: santiagomsola@gmail.com
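
The wavefront idea behind the O(ns) bound can be illustrated with a much simpler problem than the one the paper solves. The sketch below is not WFA or BiWFA: it assumes unit-cost edit distance instead of gap-affine penalties, returns only the score rather than an alignment, and uses a hypothetical function name (`wavefront_edit_distance`) chosen for this example. For each score s it stores, per diagonal, the furthest text offset reachable with s edits, and it keeps only the current wavefront. Recovering the alignment by a plain traceback would require keeping the earlier wavefronts as well, which is the memory cost a bidirectional scheme is meant to avoid.

```python
def wavefront_edit_distance(pattern: str, text: str) -> int:
    """Unit-cost edit distance via furthest-reaching wavefronts (score only).

    Illustrative simplification: edit distance, not gap-affine scoring,
    and no traceback. Diagonal k = h - v, where h indexes text and v
    indexes pattern; each wavefront maps k to the furthest offset h.
    """
    n, m = len(pattern), len(text)
    target_k = m - n  # diagonal that contains the end cell (n, m)

    def extend(k: int, h: int) -> int:
        # Slide along diagonal k for free while characters match.
        v = h - k
        while v < n and h < m and pattern[v] == text[h]:
            v += 1
            h += 1
        return h

    def valid(k: int, h: int) -> bool:
        # The offset must stay inside the (n + 1) x (m + 1) grid.
        v = h - k
        return 0 <= h <= m and 0 <= v <= n

    wavefront = {0: extend(0, 0)}
    score = 0
    while wavefront.get(target_k, -1) < m:
        score += 1
        nxt = {}
        for k in range(min(wavefront) - 1, max(wavefront) + 2):
            candidates = []
            if k - 1 in wavefront:            # insertion: consume a text char
                candidates.append(wavefront[k - 1] + 1)
            if k in wavefront:                # mismatch: consume one of each
                candidates.append(wavefront[k] + 1)
            if k + 1 in wavefront:            # deletion: consume a pattern char
                candidates.append(wavefront[k + 1])
            candidates = [h for h in candidates if valid(k, h)]
            if candidates:
                nxt[k] = extend(k, max(candidates))
        wavefront = nxt  # earlier wavefronts are discarded
    return score
```

Each wavefront spans at most 2s + 1 diagonals, so the score-only computation above holds O(s) offsets at a time. A call such as `wavefront_edit_distance("kitten", "sitting")` processes one wavefront per unit of score until the end cell is reached.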

