scispace - formally typeset
Search or ask a question
JournalISSN: 1748-7188

Algorithms for Molecular Biology 

BioMed Central
About: Algorithms for Molecular Biology is an academic journal published by BioMed Central. The journal publishes majorly in the area(s): Computer science & Phylogenetic tree. It has an ISSN identifier of 1748-7188. It is also open access. Over the lifetime, 425 publications have been published receiving 13700 citations. The journal is also known as: AMB.


Papers
More filters
Journal ArticleDOI
TL;DR: In this article, exact dynamic programming algorithms can be used to compute ground states, base pairing probabilities, as well as thermodynamic properties of nucleic acids based on carefully measured thermodynamic parameters.
Abstract: Background Secondary structure forms an important intermediate level of description of nucleic acids that encapsulates the dominating part of the folding energy, is often well conserved in evolution, and is routinely used as a basis to explain experimental findings. Based on carefully measured thermodynamic parameters, exact dynamic programming algorithms can be used to compute ground states, base pairing probabilities, as well as thermodynamic properties.

3,620 citations

Journal ArticleDOI
TL;DR: Jane is shown to be a useful tool for cophylogenetic reconstruction and its functionality complements existing tools and it is therefore likely to be of use to researchers in the areas of parasitology, molecular systematics, and biogeography.
Abstract: This paper describes the theory and implementation of a new software tool, called Jane, for the study of historical associations. This problem arises in parasitology (associations of hosts and parasites), molecular systematics (associations of orderings and genes), and biogeography (associations of regions and orderings). The underlying problem is that of reconciling pairs of trees subject to biologically plausible events and costs associated with these events. Existing software tools for this problem have strengths and limitations, and the new Jane tool described here provides functionality that complements existing tools. The Jane software tool uses a polynomial time dynamic programming algorithm in conjunction with a genetic algorithm to find very good, and often optimal, solutions even for relatively large pairs of trees. The tool allows the user to provide rich timing information on both the host and parasite trees. In addition the user can limit host switch distance and specify multiple host switch costs by specifying regions in the host tree and costs for host switches between pairs of regions. Jane also provides a graphical user interface that allows the user to interactively experiment with modifications to the solutions found by the program. Jane is shown to be a useful tool for cophylogenetic reconstruction. Its functionality complements existing tools and it is therefore likely to be of use to researchers in the areas of parasitology, molecular systematics, and biogeography.

354 citations

Journal ArticleDOI
TL;DR: A new encoding of the de Bruijn graph, which occupies an order of magnitude less space than current representations, is proposed, which performed a complete de novo assembly of human genome short reads using 5.7 GB of memory in 23 hours.
Abstract: The de Bruijn graph data structure is widely used in next-generation sequencing (NGS). Many programs, e.g. de novo assemblers, rely on in-memory representation of this graph. However, current techniques for representing the de Bruijn graph of a human genome require a large amount of memory (≥30 GB). We propose a new encoding of the de Bruijn graph, which occupies an order of magnitude less space than current representations. The encoding is based on a Bloom filter, with an additional structure to remove critical false positives. An assembly software implementing this structure, Minia, performed a complete de novo assembly of human genome short reads using 5.7 GB of memory in 23 hours.

345 citations

Journal ArticleDOI
TL;DR: A program, RNAcofold, that computes the hybridization energy and base pairing pattern of a pair of interacting RNA molecules, and provides an extension of McCaskill's partition function algorithm to compute base pairing probabilities, realistic interaction energies, and equilibrium concentrations of duplex structures.
Abstract: RNA has been recognized as a key player in cellular regulation in recent years. In many cases, non-coding RNAs exert their function by binding to other nucleic acids, as in the case of microRNAs and snoRNAs. The specificity of these interactions derives from the stability of inter-molecular base pairing. The accurate computational treatment of RNA-RNA binding therefore lies at the heart of target prediction algorithms. The standard dynamic programming algorithms for computing secondary structures of linear single-stranded RNA molecules are extended to the co-folding of two interacting RNAs. We present a program, RNAcofold, that computes the hybridization energy and base pairing pattern of a pair of interacting RNA molecules. In contrast to earlier approaches, complex internal structures in both RNAs are fully taken into account. RNAcofold supports the calculation of the minimum energy structure and of a complete set of suboptimal structures in an energy band above the ground state. Furthermore, it provides an extension of McCaskill's partition function algorithm to compute base pairing probabilities, realistic interaction energies, and equilibrium concentrations of duplex structures. RNAcofold is distributed as part of the Vienna RNA Package, http://www.tbi.univie.ac.at/RNA/ . Stephan H. Bernhart – berni@tbi.univie.ac.at

290 citations

Journal ArticleDOI
TL;DR: DIALIGN-TX is presented, a substantial improvement of DIAL IGN-T that combines the previous greedy algorithm with a progressive alignment approach and produces significantly better alignments, especially on globally related sequences, without increasing the CPU time and memory consumption exceedingly.
Abstract: DIALIGN-T is a reimplementation of the multiple-alignment program DIALIGN. Due to several algorithmic improvements, it produces significantly better alignments on locally and globally related sequence sets than previous versions of DIALIGN. However, like the original implementation of the program, DIALIGN-T uses a a straight-forward greedy approach to assemble multiple alignments from local pairwise sequence similarities. Such greedy approaches may be vulnerable to spurious random similarities and can therefore lead to suboptimal results. In this paper, we present DIALIGN-TX, a substantial improvement of DIALIGN-T that combines our previous greedy algorithm with a progressive alignment approach. Our new heuristic produces significantly better alignments, especially on globally related sequences, without increasing the CPU time and memory consumption exceedingly. The new method is based on a guide tree; to detect possible spurious sequence similarities, it employs a vertex-cover approximation on a conflict graph. We performed benchmarking tests on a large set of nucleic acid and protein sequences For protein benchmarks we used the benchmark database BALIBASE 3 and an updated release of the database IRMBASE 2 for assessing the quality on globally and locally related sequences, respectively. For alignment of nucleic acid sequences, we used BRAliBase II for global alignment and a newly developed database of locally related sequences called DIRM-BASE 1. IRMBASE 2 and DIRMBASE 1 are constructed by implanting highly conserved motives at random positions in long unalignable sequences. On BALIBASE3, our new program performs significantly better than the previous program DIALIGN-T and outperforms the popular global aligner CLUSTAL W, though it is still outperformed by programs that focus on global alignment like MAFFT, MUSCLE and T-COFFEE. On the locally related test sets in IRMBASE 2 and DIRM-BASE 1, our method outperforms all other programs while MAFFT E-INSi is the only method that comes close to the performance of DIALIGN-TX.

261 citations

Performance
Metrics
No. of papers from the Journal in previous years
YearPapers
20234
202226
202121
202019
201925
201816