Showing papers on "Smith–Waterman algorithm" published in 2015


Proceedings ArticleDOI
02 May 2015
TL;DR: This paper proposes an architecture, tailored for varied input sizes as well as harnessing software pruning strategies, to accelerate S-W, and demonstrates a 26.4x speedup over a 24-thread Intel Haswell Xeon server, and outperforms wavefront-based implementations by up to 6x with the same FPGA resource.
Abstract: The Smith-Waterman (S-W) algorithm is widely adopted by the state-of-the-art DNA sequence aligners. Existing wavefront-based methods ignored the fact that the S-W algorithm is fed with significantly varied-size inputs in modern aligners, in which the S-W algorithm is further optimized by exerting extensive pruning. In this paper, we propose an architecture, tailored for varied input sizes as well as harnessing software pruning strategies, to accelerate S-W. Our implementation demonstrates a 26.4x speedup over a 24-thread Intel Haswell Xeon server, and outperforms wavefront-based implementations by up to 6x with the same FPGA resource.

56 citations


Journal ArticleDOI
TL;DR: A fast and scalable local network alignment tool called LocalAli is developed for the identification of functionally conserved modules in multiple networks based on a maximum-parsimony evolutionary model and suggests that LocalAli outperforms all existing algorithms in terms of coverage, consistency and scalability.
Abstract: Motivation: Sequences and protein interaction data are significant for understanding the underlying molecular mechanisms of organisms. Local network alignment is one of the key systematic ways of predicting protein functions, identifying functional modules and understanding the phylogeny from these data. Most currently existing tools, however, have limitations, mainly concerning scoring scheme, speed and scalability. Therefore, there is a growing demand for sophisticated network evolution models and efficient local alignment algorithms. Results: We developed a fast and scalable local network alignment tool called LocalAli for the identification of functionally conserved modules in multiple networks. In this algorithm, we first proposed a new framework to reconstruct the evolution history of conserved modules based on a maximum-parsimony evolutionary model. By relying on this model, LocalAli facilitates interpretation of the resulting local alignments in terms of conserved modules, which have evolved from a common ancestral module through a series of evolutionary events. A meta-heuristic method, simulated annealing, was used to search for the optimal or near-optimal inner nodes (i.e. ancestral modules) of the evolutionary tree. To evaluate the performance and the statistical significance, LocalAli was tested on 26 real datasets and 1040 randomly generated datasets. The results suggest that LocalAli outperforms all existing algorithms in terms of coverage, consistency and scalability, while retaining a high precision in the identification of functionally coherent subnetworks. Availability: The source code and test datasets are freely available for download under the GNU GPL v3 license at https://code.google.com/p/localali/. Contact: jialu.hu@fu-berlin.de or knut.reinert@fu-berlin.de. Supplementary information: Supplementary data are available at Bioinformatics online.

37 citations


Journal ArticleDOI
TL;DR: This work investigates a general tile‐based approach to facilitating fast alignment by deeply exploring the powerful compute capability of CUDA‐enabled GPUs and presents GSWABE, a graphics processing unit (GPU)‐accelerated pairwise sequence alignment algorithm for a collection of short DNA sequences.
Abstract: In this paper, we present GSWABE, a graphics processing unit (GPU)-accelerated pairwise sequence alignment algorithm for a collection of short DNA sequences. This algorithm supports all-to-all pairwise global, semi-global and local alignment, and retrieves optimal alignments on Compute Unified Device Architecture (CUDA)-enabled GPUs. All of the three alignment types are based on dynamic programming and share almost the same computational pattern. Thus, we have investigated a general tile-based approach to facilitating fast alignment by deeply exploring the powerful compute capability of CUDA-enabled GPUs. The performance of GSWABE has been evaluated on a Kepler-based Tesla K40 GPU using a variety of short DNA sequence datasets. The results show that our algorithm can yield a performance of up to 59.1 billion cell updates per second (GCUPS), 58.5 GCUPS and 50.3 GCUPS for global, semi-global and local alignment, respectively. Furthermore, on the same system GSWABE runs up to 156.0 times faster than the Streaming SIMD Extensions (SSE)-based SSW library and up to 102.4 times faster than the CUDA-based MSA-CUDA (the first stage) in terms of local alignment. Compared with the CUDA-based gpu-pairAlign, GSWABE demonstrates stable and consistent speedups with a maximum speedup of 11.2, 10.7, and 10.6 for global, semi-global, and local alignment, respectively. Copyright © 2014 John Wiley & Sons, Ltd.
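
The GCUPS figures quoted above count dynamic-programming cell updates per second. As a rough illustration of the arithmetic (not the benchmark code used by the authors), the sketch below assumes one cell update per query/database residue pair; the function name and the workload numbers are hypothetical.

```python
def gcups(query_len: int, db_residues: int, seconds: float) -> float:
    """Billions of DP cell updates per second for one query against a database."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # Assumption: one DP cell is filled per (query residue, database residue) pair.
    cells = query_len * db_residues
    return cells / seconds / 1e9


# Hypothetical workload: a 100-base query against 5,000,000,000 database bases in 10 s.
print(gcups(100, 5_000_000_000, 10.0))  # 50.0
```

For all-to-all alignment of many short reads, the cell count would instead be the sum of the length products over all pairs.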

36 citations


Journal ArticleDOI
TL;DR: Efficient interpair pruning and band optimization makes it possible to complete the all-pairs comparisons of the sequences of the same species 1.2 times faster than the intrapair pruning method.
Abstract: The Smith-Waterman algorithm is known to be a more sensitive approach than heuristic algorithms for local sequence alignment. Despite its sensitivity, the greater time complexity associated with the Smith-Waterman algorithm prevents its application to the all-pairs comparisons of base sequences, which aids in the construction of accurate phylogenetic trees. The aim of this study is to achieve greater acceleration using the Smith-Waterman algorithm (by realizing interpair block pruning and band optimization) compared with that achieved using a previous method that performs intrapair block pruning on graphics processing units (GPUs). We present an interpair optimization method for the Smith-Waterman algorithm with the aim of accelerating the all-pairs comparison of base sequences. Given the results of the pairs of sequences, our method realizes efficient block pruning by computing a lower bound for other pairs that have not yet been processed. This lower bound is further used for band optimization. We integrated our interpair optimization method into SW#, a previous GPU-based implementation that employs variants of a banded Smith-Waterman algorithm and a banded Myers-Miller algorithm. Evaluation using the six genomes of Bacillus anthracis shows that our method pruned 88 % of the matrix cells on a single GPU and 73 % of the matrix cells on two GPUs. For the genomes of the human chromosome 21, the alignment performance reached 202 giga-cell updates per second (GCUPS) on two Tesla K40 GPUs. Efficient interpair pruning and band optimization makes it possible to complete the all-pairs comparisons of the sequences of the same species 1.2 times faster than the intrapair pruning method. This acceleration was achieved at the first phase of SW#, where our method significantly improved the initial lower bound. However, our interpair optimization was not effective for the comparison of the sequences of different species such as comparing human, chimpanzee, and gorilla. Consequently, our method is useful in accelerating the applications that require optimal local alignment scores for the same species. The source code is available for download from http://www-hagi.ist.osaka-u.ac.jp/research/code/ .
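
To illustrate what a banded Smith-Waterman computation looks like, here is a minimal sketch that fills only the cells within a fixed distance of the main diagonal. It is not the SW# implementation and does not include the interpair lower bound or block pruning described above; the function name, the linear gap penalty and the scoring values are assumptions chosen for illustration.

```python
def banded_sw_score(a, b, band, match=1, mismatch=-1, gap=-2):
    """Best local alignment score using only cells with |i - j| <= band."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        lo = max(1, i - band)        # first column inside the band
        hi = min(len(b), i + band)   # last column inside the band
        for j in range(lo, hi + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Cells outside the band keep the value 0, which can never beat
            # the local-alignment floor of 0 once a gap penalty is applied.
            curr[j] = max(0, prev[j - 1] + s, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


print(banded_sw_score("ACGTTGCA", "ACGTGCA", band=0))  # 4: ungapped diagonal only
print(banded_sw_score("ACGTTGCA", "ACGTGCA", band=1))  # 5: one gap fits in the band
```

A narrower band skips more cells but may miss alignments that drift off the diagonal, which is why a good lower bound on the score is useful for choosing the band width.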

21 citations


Journal ArticleDOI
01 Apr 2015-PLOS ONE
TL;DR: With the Parallel SW Alignment Software (PaSWAS) it is possible to have easy access to the computational power of NVIDIA-based general purpose graphics processing units (GPGPUs) to perform high-speed sequence alignments, and to retrieve relevant information such as score, number of gaps and mismatches.
Abstract: Motivation: To obtain large-scale sequence alignments in a fast and flexible way is an important step in the analyses of next generation sequencing data. Applications based on the Smith-Waterman (SW) algorithm are often either not fast enough, limited to dedicated tasks or not sufficiently accurate due to statistical issues. Current SW implementations that run on graphics hardware do not report the alignment details necessary for further analysis. Results: With the Parallel SW Alignment Software (PaSWAS) it is possible (a) to have easy access to the computational power of NVIDIA-based general purpose graphics processing units (GPGPUs) to perform high-speed sequence alignments, and (b) to retrieve relevant information such as score, number of gaps and mismatches. The software reports multiple hits per alignment. The added value of the new SW implementation is demonstrated with two test cases: (1) tag recovery in next generation sequence data and (2) isotype assignment within an immunoglobulin 454 sequence data set. Both cases show the usability and versatility of the new parallel Smith-Waterman implementation. (...)

14 citations


Journal ArticleDOI
TL;DR: The results showed that the proposed method significantly improves the Smith-Waterman algorithm on CUDA-enabled GPUs, given a proper allocation of block and thread numbers.
Abstract: Sequence alignment lies at the heart of bioinformatics. The Smith-Waterman algorithm is one of the key sequence search algorithms and has gained popularity due to improved implementations and rapidly increasing compute power. Recently, the Smith-Waterman algorithm has been successfully mapped onto the emerging general-purpose graphics processing units (GPUs). In this paper, we focused on how to improve the mapping, especially for short query sequences, by better usage of shared memory. We performed and evaluated the proposed method on two different platforms (Tesla C1060 and Tesla K20) and compared it with two classic methods in CUDASW++. Further, the performance on different numbers of threads and blocks has been analyzed. The results showed that the proposed method significantly improves the Smith-Waterman algorithm on CUDA-enabled GPUs, given a proper allocation of block and thread numbers.

14 citations


Proceedings ArticleDOI
09 Nov 2015
TL;DR: This paper presents new parallelization techniques for searching large-scale biological sequence databases with the Smith-Waterman algorithm on Xeon Phi-based neoheterogenous architectures using a collaborative computing scheme as well as hybrid parallelism.
Abstract: In this paper we present new parallelization techniques for searching large-scale biological sequence databases with the Smith-Waterman algorithm on Xeon Phi-based neoheterogenous architectures. In order to make full use of the compute power of both the multi-core CPU and the many-core Xeon Phi hardware, we use a collaborative computing scheme as well as hybrid parallelism. At the CPU side, we employ SSE intrinsics and multi-threading to implement SIMD parallelism. At the Xeon Phi side, we use Knights Corner vector instructions to gain more data parallelism. We have presented two dynamic task distribution schemes (thread level and device level) in order to achieve better load balancing. Furthermore, a multi-threaded asynchronous scheme is used to overlap communication and computation between CPUs and Xeon Phis. Evaluations on real protein sequence databases show that our method achieves a peak overall performance up to 220 GCUPS on a neo-heterogeneous platform consisting of two Intel E5-2620 CPUs and two Intel Xeon Phi 7110P cards. It also exhibits good scalability in terms of database size and query length. Our implementation is available at: http://turbo0628.github.io/LSBDS/.

12 citations


Proceedings ArticleDOI
05 Mar 2015
TL;DR: A novel structure of the Smith Waterman algorithm is presented that takes fewer cycles at the cost of utilizing a minimal amount of extra hardware resources as compared to its existing form, and achieves up to 25% performance gain.
Abstract: The emergence of bioinformatics has led to many new discoveries in living organisms. These discoveries would not have been possible without the developments made in the sequence alignment techniques. Many sequence alignment algorithms were developed to make the alignment process fast and accurate. However, the more precise algorithms take longer than their less precise counterparts. Researchers came up with innovative approaches to combat the time-consuming constraint. Their aim was to speed up the computational process by using more efficient implementations of the algorithms on state-of-the-art hardware platforms. The Smith Waterman (SW) algorithm, being the most accurate in the alignment process, has been implemented on various high performance computing platforms for the same purpose. However, the intrinsic structure of the algorithm has received little attention. In this paper, we present a novel structure of the SW algorithm that takes fewer cycles at the cost of utilizing a minimal amount of extra hardware resources as compared to its existing form. The newly proposed architecture achieves up to 25% performance gain.

12 citations


Journal ArticleDOI
31 Dec 2015-PLOS ONE
TL;DR: The SW#db tool and a library for fast exact similarity search are presented, primarily intended to be used for exact local alignment phase in which the database of sequences has already been reduced.
Abstract: In recent years we have witnessed a growth in sequencing yield, the number of samples sequenced, and as a result, the growth of publicly maintained sequence databases. The increase of data present all around has put high requirements on protein similarity search algorithms with two ever-opposite goals: how to keep the running times acceptable while maintaining a high-enough level of sensitivity. The most time-consuming step of similarity search is the local alignment between query and database sequences. This step is usually performed using exact local alignment algorithms such as Smith-Waterman. Due to its quadratic time complexity, alignments of a query to the whole database are usually too slow. Therefore, the majority of the protein similarity search methods apply heuristics, prior to doing the exact local alignment, to reduce the number of possible candidate sequences in the database. However, there is still a need for the alignment of a query sequence to a reduced database. In this paper we present the SW#db tool and a library for fast exact similarity search. Although its running times, as a standalone tool, are comparable to the running times of BLAST, it is primarily intended to be used for the exact local alignment phase in which the database of sequences has already been reduced. It uses both GPU and CPU parallelization and was 4–5 times faster than SSEARCH, 6–25 times faster than CUDASW++ and more than 20 times faster than SSW at the time of writing, using multiple queries on Swiss-prot and Uniref90 databases.

8 citations


Journal ArticleDOI
01 May 2015
TL;DR: A novel hybrid model for solving the multiple pair-wise sequence alignment problem combining message passing interface and CUDA, the parallel computing platform and programming model invented by NVIDIA is provided.
Abstract: This paper provides a novel hybrid model for solving the multiple pair-wise sequence alignment problem combining message passing interface and CUDA, the parallel computing platform and programming model invented by NVIDIA. The proposed model targets homogeneous cluster nodes equipped with similar Graphical Processing Unit (GPU) cards. The model consists of the Master Node Dispatcher (MND) and the Worker GPU Nodes (WGN). The MND distributes the workload among the cluster working nodes and then aggregates the results. The WGN performs the multiple pair-wise sequence alignments using the Smith-Waterman algorithm. We also propose a modified implementation to the Smith-Waterman algorithm based on computing the alignment matrices row-wise. The experimental results demonstrate a considerable reduction in the running time by increasing the number of the working GPU nodes. The proposed model achieved a performance of about 12 Giga cell updates per second when we tested against the SWISS-PROT protein knowledge base running on four nodes.

6 citations


Book ChapterDOI
04 Aug 2015
TL;DR: This paper explores parallelizing the Smith-Waterman algorithm using the OpenSHMEM model and interfaces in OpenSHMEM 1.2 as well as the one-sided communication interfaces in MPI-3, and evaluates the parallel implementation on Titan, a Cray XK7 system at the Oak Ridge Leadership Computing Facility (OLCF).
Abstract: The Smith-Waterman algorithm is used for determining the similarity between two very long data streams. A popular application of the Smith-Waterman algorithm is for sequence alignment in DNA sequences. Like many computational algorithms, the Smith-Waterman algorithm is constrained by the memory resources and the computational capacity of the system. As such, it can be accelerated and run at larger scales by parallelizing the implementation, allowing the work to be distributed to exploit HPC systems. A central part of the algorithm is computing the similarity matrix which is the mechanism that evaluates the quality of the matching sequences. This access pattern to the matrix to compute the similarity is non-uniform; as such, it better suits the Partitioned Global Address Space (PGAS) programming model. In this paper, we explore parallelizing the Smith-Waterman algorithm using the OpenSHMEM model and interfaces in OpenSHMEM 1.2 as well as the one-sided communication interfaces in MPI-3. Further, we also explore the advantages of using non-blocking communication interfaces, which are proposed as extensions for a future OpenSHMEM specification. We evaluate the parallel implementation on Titan, a Cray XK7 system at the Oak Ridge Leadership Computing Facility (OLCF). Our results demonstrate good weak and strong scaling characteristics for both of the OpenSHMEM and MPI-3 implementations.

Journal ArticleDOI
TL;DR: This paper proposes an efficient SW alignment method, called CUDA-SWfr, for the protein database search by using the intratask parallelization technique based on a CPU-GPU collaborative system.
Abstract: The Smith-Waterman (SW) algorithm has been widely utilized for searching biological sequence databases in bioinformatics. Recently, several works have adopted the graphic card with Graphic Processing Units (GPUs) and their associated CUDA model to enhance the performance of SW computations. However, these works mainly focused on the protein database search by using the intertask parallelization technique, and only using the GPU capability to do the SW computations one by one. Hence, in this paper, we will propose an efficient SW alignment method, called CUDA-SWfr, for the protein database search by using the intratask parallelization technique based on a CPU-GPU collaborative system. Before doing the SW computations on GPU, a procedure is applied on CPU by using the frequency distance filtration scheme (FDFS) to eliminate the unnecessary alignments. The experimental results indicate that CUDA-SWfr runs 9.6 times and 96 times faster than the CPU-based SW method without and with FDFS, respectively.
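
The abstract does not spell out how the frequency distance filtration scheme works, so the following is only a sketch of one common definition of frequency distance: compare residue counts of two sequences and take the larger of the total surplus and the total deficit, which is a lower bound on their edit distance. The function names, the threshold parameter and the example sequences are hypothetical, and the actual CUDA-SWfr filter may differ.

```python
from collections import Counter


def frequency_distance(u: str, v: str) -> int:
    """Lower bound on the edit distance between u and v from residue counts."""
    fu, fv = Counter(u), Counter(v)
    # Residues that u has in excess of v, and residues v has in excess of u.
    surplus = sum(max(0, fu[c] - fv[c]) for c in fu)
    deficit = sum(max(0, fv[c] - fu[c]) for c in fv)
    # One edit operation lowers each total by at most 1, so at least
    # max(surplus, deficit) operations are needed.
    return max(surplus, deficit)


def filter_candidates(query, database, max_distance):
    """Keep only database sequences whose frequency distance is small enough."""
    return [s for s in database if frequency_distance(query, s) <= max_distance]


# Hypothetical toy database; only the survivors would go on to full SW alignment.
print(filter_candidates("ACGT", ["ACGG", "TTTTTTTT", "ACGT"], max_distance=1))
# ['ACGG', 'ACGT']
```

Because counting residues is linear in the sequence length, such a filter is cheap to run on the CPU before the quadratic alignment step.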

Journal ArticleDOI
TL;DR: Experimental results show that SpecAlign improves the performance by about 32 % on average when compared with CUDASW++2.0 and DOPA with Ssearch trace for 100 shortlisted sequences on NVIDIA GTX295.
Abstract: Finding regions of similarity between two data streams is a computationally intensive and memory-consuming problem, which is referred to as sequence alignment for biological sequences. The Smith-Waterman algorithm is an optimal method of finding the local sequence alignment. It requires a large amount of computation and memory space, and is also constrained by the memory access speed of the Graphics Processing Unit (GPU) global memory when accelerated using GPUs. Since biologists are commonly concerned with one or a few species in their research areas, SpecAlign is proposed to accelerate Smith-Waterman alignment of species-based protein sequences within the available GPU memory. It is designed to provide the best alignments of all the database sequences aligned on GPU. The new implementation improves performance by optimizing the organization of database, increasing GPU threads for every database sequence, and reducing the number of memory accesses to alleviate memory bandwidth bottleneck. Experimental results show that SpecAlign improves the performance by about 32 % on average when compared with CUDASW++2.0 and DOPA with Ssearch trace for 100 shortlisted sequences on NVIDIA GTX295. It also outperforms CUDASW++2.0 with Ssearch trace for 100 shortlisted sequences by about 52 % on NVIDIA GTX460.

Proceedings ArticleDOI
02 Nov 2015
TL;DR: This paper has proposed a k-mer based database searching and local alignment tool using box queries on BoND-SD-tree indexing, which is quite efficient for indexing and searching in Non-Ordered Discrete Data Space (NDDS).
Abstract: In the past, genome sequence databases had used main memory indexing, such as the suffix tree, for fast sequence searches. With next generation sequencing technologies, the amount of sequence data being generated is huge and main memory indexing is limited by the amount of memory available. K-mer based techniques are increasingly used for various genome sequence database applications such as local alignment. K-mers can also provide an excellent basis for creating efficient disk based indexing. In this paper, we have proposed a k-mer based database searching and local alignment tool using box queries on BoND-SD-tree indexing. BoND-tree is quite efficient for indexing and searching in Non-Ordered Discrete Data Space (NDDS). We have conducted experiments on searching DNA sequence databases using back translated protein query sequences and have compared with existing methods. We have also implemented local alignment of back translated protein query sequences with large DNA sequence databases using this index based k-mer search. The performance of this local alignment approach has been compared with that of Tblastn of NCBI. The results are quite promising and justify the significance of the proposed approach.
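
As a rough picture of index-based k-mer search, the sketch below builds a plain in-memory dictionary from k-mers to positions and looks up exact seed hits for a query. This is an assumption made for illustration: it stands in for, and is much simpler than, the disk-based BoND-SD-tree with box queries described above, and the function names are hypothetical.

```python
from collections import defaultdict


def build_kmer_index(sequence, k):
    """Map every k-mer of the sequence to the list of positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(sequence) - k + 1):
        index[sequence[pos:pos + k]].append(pos)
    return index


def find_seeds(index, query, k):
    """Return (query_pos, db_pos) pairs where a query k-mer occurs in the index."""
    seeds = []
    for qpos in range(len(query) - k + 1):
        # .get avoids inserting empty entries into the defaultdict on a miss.
        for dpos in index.get(query[qpos:qpos + k], []):
            seeds.append((qpos, dpos))
    return seeds


index = build_kmer_index("ACGTACGT", 3)
print(find_seeds(index, "TACG", 3))  # [(0, 3), (1, 0), (1, 4)]
```

Seed hits found this way would typically be extended by a local alignment step restricted to the neighbourhood of each hit.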

Journal ArticleDOI
TL;DR: This paper proposes an algorithm which solves the problem of input data wild cards, offers a highly flexible set of parameters and displays a detailed alignment output and a compact representation of the mutated positions of the alignment.
Abstract: Optimal string alignment is used to discover evolutionary relationships or mutations in DNA/RNA or protein sequences. Errors, missing parts or uncertainty in such a sequence can be covered with wild cards, so-called wild bases. This makes an alignment possible even when the data are corrupted or incomplete. The extended pairwise local alignment of wild card DNA/RNA sequences requires additional calculations in the dynamic programming algorithm and necessitates a subsequent best- and worst-case analysis for the wild card positions. In this paper, we propose an algorithm which solves the problem of input data wild cards, offers a highly flexible set of parameters and displays a detailed alignment output and a compact representation of the mutated positions of the alignment. An implementation of the algorithm can be obtained at https://github.com/sysbio-bioinf/swat+ and http://sysbio.uni-ulm.de/?Software:Swat+.

Journal ArticleDOI
01 Dec 2015
TL;DR: The main research in this project is to align the DNA sequences by using the Needleman-Wunsch algorithm for global alignment and Smith-Waterman algorithm for local alignment based on the Dynamic Programming algorithm.
Abstract: The fundamental procedure of analyzing sequence content is sequence comparison. Sequence comparison can be defined as the problem of finding which parts of the sequences are similar and which parts are different, namely comparing two sequences to identify similarities and differences between them. A typical approach to solve this problem is to find a good and reasonable alignment between the two sequences. The main research in this project is to align the DNA sequences by using the Needleman-Wunsch algorithm for global alignment and Smith-Waterman algorithm for local alignment based on the Dynamic Programming algorithm. The Dynamic Programming Algorithm is guaranteed to find optimal alignment by exploring all possible alignments and choosing the best through the scoring and traceback techniques. The algorithms proposed and evaluated are to reduce the gaps in aligning sequences as well as the length of the sequences aligned without compromising the quality or correctness of results. In order to verify the accuracy and consistency of measurements obtained in Needleman-Wunsch and Smith-Waterman algorithms the data is compared with Emboss (global) and Emboss (local) with 600 strands test data.
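
The two dynamic-programming recurrences named above differ mainly in initialization and in the zero floor that lets a local alignment restart anywhere. The sketch below shows both with a simple linear gap penalty; the scoring values and function names are illustrative assumptions, not the parameters used in the study.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch: optimal global alignment score."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        H[i][0] = i * gap            # leading gaps are penalized
    for j in range(1, cols):
        H[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(H[i - 1][j - 1] + s, H[i - 1][j] + gap, H[i][j - 1] + gap)
    return H[-1][-1]


def sw_align(a, b, match=1, mismatch=-1, gap=-1):
    """Smith-Waterman: best local alignment score plus the aligned substrings."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # The 0 floor lets an alignment start fresh at any cell.
            H[i][j] = max(0, H[i - 1][j - 1] + s, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Traceback from the best cell until a zero is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


print(nw_score("ACGT", "AGT"))           # 2
print(sw_align("TTACGTTT", "GGACGTGG"))  # (4, 'ACGT', 'ACGT')
```

Both functions use quadratic time and memory; the full matrix is kept here only because the traceback needs it.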

Journal ArticleDOI
TL;DR: An inexact mapping algorithm based on pruning strategies for search tree exploration over genomic data that achieves a 13x speed-up over similar algorithms when allowing 6 base errors, including insertions, deletions and mismatches.
Abstract: Short sequence mapping methods for Next Generation Sequencing consist of a combination of seeding techniques followed by local alignment based on dynamic programming approaches. Most seeding algorithms are based on backward search alignment, using the Burrows Wheeler Transform, the Ferragina and Manzini Index or Suffix Arrays. All these backward search algorithms have excellent performance, but their computational cost increases sharply when allowing errors. In this paper, we discuss an inexact mapping algorithm based on pruning strategies for search tree exploration over genomic data. The proposed algorithm achieves a 13x speed-up over similar algorithms when allowing 6 base errors, including insertions, deletions and mismatches. This algorithm can deal with 400 bp reads with up to 9 errors in a high quality Illumina dataset. In this example, the algorithm works as a preprocessor that reduces by 55% the number of reads to be aligned. Depending on the aligner, the overall execution time is reduced by 20–40%. Although not intended as a complete sequence mapping tool, the proposed algorithm could be used as a preprocessing step to modern sequence mappers. This step significantly reduces the number of reads to be aligned, accelerating overall alignment time. Furthermore, this algorithm could be used for accelerating the seeding step of already available sequence mappers. In addition, an out-of-core index has been implemented for working with large genomes on systems without expensive memory configurations.

Journal ArticleDOI
TL;DR: This paper presents parallel programming approaches to calculate the values of the cells in the scoring matrix used in the Smith-Waterman algorithm for sequence alignment, using a formulation based on the anti-diagonal structure of the data.
Abstract: In this paper, we present parallel programming approaches to calculate the values of the cells in the scoring matrix used in the Smith-Waterman algorithm for sequence alignment. This algorithm, well known in bioinformatics for its applications, is unfortunately time-consuming on a serial computer. We use a formulation based on the anti-diagonal structure of the data. This representation focuses on the parallelizable parts of the algorithm without changing its initial formulation. Approaching the data in this way gives us a more flexible formulation. To examine this approach, we encode it in OpenMP and CUDA C. The performance obtained demonstrates the value of this approach.
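
The anti-diagonal formulation relies on the fact that every cell with the same i + j depends only on cells from the two previous anti-diagonals. The sketch below walks the scoring matrix in that order; it is sequential Python meant to show the traversal, not the authors' OpenMP or CUDA C code, and the function name and scoring values are assumptions.

```python
def sw_score_antidiagonal(a, b, match=1, mismatch=-1, gap=-1):
    """Smith-Waterman score computed one anti-diagonal (i + j = d) at a time."""
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best = 0
    for d in range(2, n + m + 1):
        i_lo = max(1, d - m)   # keeps j = d - i at most m
        i_hi = min(n, d - 1)   # keeps j = d - i at least 1
        # Every cell in this inner loop reads only anti-diagonals d-1 and d-2,
        # so the iterations are independent and could run in parallel.
        for i in range(i_lo, i_hi + 1):
            j = d - i
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + s, H[i - 1][j] + gap, H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best


print(sw_score_antidiagonal("TTACGTTT", "GGACGTGG"))  # 4
```

The amount of available parallelism grows and shrinks with the length of each anti-diagonal, which is one reason short sequences tend to use parallel hardware less efficiently.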

Journal ArticleDOI
TL;DR: This paper discusses and evaluates an OpenMP based implementation of Smith-Waterman (SW) algorithm and demonstrates that the parallel version of the SW algorithm runs 2.63 times faster than its sequential counterpart.
Abstract: In bioinformatics, sequence alignment is a common and insistent task. Biologists align genome sequences to find important similarities and dissimilarities in them. Multiple heuristics and dynamic programming based approaches are available for sequence alignment. Smith-Waterman (SW), an exact algorithm for local alignment, is the most accurate of them all. However, the space and time complexity of the SW algorithm is quadratic. It is imperative to use parallelism and distributed computing techniques in order to speed up this process. In this paper, we discuss and evaluate an OpenMP based implementation of SW algorithm. All the experiments have been performed on a Linux based multi-core machine thereby reducing the overall complexity of the SW algorithm from quadratic to linear. The results obtained with various input sequences demonstrate that the parallel version of the SW algorithm runs 2.63 times faster than its sequential counterpart.

Proceedings ArticleDOI
09 Sep 2015
TL;DR: A parallel Fast and Accurate Mapping Assembly (FAMA) algorithm that segments the reference genome with a variant of the divide and conquer technique introduced by Hirschberg in his algorithm for finding the longest common subsequence of two sequences and maps the reads in the genome segments with the Smith-Waterman local alignment algorithm.
Abstract: Massively parallel sequencing technologies deliver thousands of short reads of a genome sample that are the building blocks for its computational reconstruction. Genome reconstruction algorithms are grouped in two broad classes, namely de novo assembly, and mapping (or assembly) to a reference genome. This paper introduces a parallel Fast and Accurate Mapping Assembly (FAMA) algorithm. The method segments the reference genome with a variant of the divide and conquer technique introduced by Hirschberg in his algorithm for finding the longest common subsequence of two sequences, and maps the reads in the genome segments with the Smith-Waterman local alignment algorithm. Despite the divisions of the reference genome, the algorithm retains all the accuracy of Smith-Waterman while keeping all the characteristics of a massively parallel algorithm. Accuracy benchmarks show that the algorithm is superior to one of the most popular assemblers used today.

02 Mar 2015
TL;DR: PaSWAS is a very effective parallel implementation of the Smith-Waterman algorithm, which delivers excellent results for GPUs (in both CUDA and OpenCL), and can be quite effective on CPUs, too.
Abstract: Detecting similarities between (RNA, DNA, and protein) sequences is an important part of bioinformatics. Among the algorithms used to accomplish this, the Smith-Waterman algorithm is very popular. A sequential implementation of Smith-Waterman requires quadratic running time with respect to the length of the sequences. As the amount of data in this field is continuously increasing, quick analysis through a sequential implementation is no longer feasible. One way to reduce the running time is by using parallelism and parallel platforms. There is a great diversity of hardware platforms that enable parallelism in different ways, each favoring different types of computations. The subject of this thesis is to understand the performance of the state-of-the-art parallel implementation of the Smith-Waterman algorithm, PaSWAS, on different parallel hardware platforms. PaSWAS has been designed and implemented using CUDA, the proprietary framework from NVIDIA. This choice limits the PaSWAS functionality to NVIDIA GPUs. By using OpenCL, a platform independent, standard programming model for many-cores, we enable PaSWAS to run in parallel on other hardware platforms. We show that, for NVIDIA GPUs, the portability enabled by OpenCL comes at the expense of performance. We further define a set of platform-specific parameters that have a high performance impact for the OpenCL implementation, and demonstrate empirically that their values are different for different hardware platforms. We also demonstrate that proper partitioning of the sequences can increase parallelism, which leads to better performance. Lastly, we create a performance estimator which is able to predict the execution time of the PaSWAS algorithm on different hardware platforms for given dataset. This enables us to determine a-priori which hardware platform to use for a given dataset. 
We conclude that PaSWAS is a very effective parallel implementation of the Smith-Waterman algorithm, which delivers excellent results for GPUs (in both CUDA and OpenCL), and can be quite effective on CPUs, too. The performance vs. portability tradeoff of OpenCL is relevant for PaSWAS, and it is ultimately the choice of the end user which of the two is more relevant.
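The quadratic running time of a sequential Smith-Waterman implementation, mentioned in the abstract above, can be seen in a minimal sketch. This is not PaSWAS code; the function name and the scoring values (match, mismatch, and a linear gap penalty) are assumptions chosen for illustration only.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score of strings a and b.

    Illustrative scoring (assumption): +2 match, -1 mismatch, -1 per gap.
    The two nested loops visit every (i, j) cell once, which is the
    O(len(a) * len(b)) cost of the sequential algorithm.
    """
    prev = [0] * (len(b) + 1)  # previous row of the score matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

Only two rows are kept in memory because this sketch returns the score alone; recovering the alignment itself would require storing the full matrix or traceback pointers.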

DOI
04 May 2015
TL;DR: The purpose of this survey is to study various parallel models that perform sequence alignment as fast as possible, which is a big challenge for both engineers and biologists.
Abstract: The volume of biological data is growing steeply, so smart ways are needed to handle and process these data and extract meaningful biological information. The purpose of this survey is to study various parallel models that perform sequence alignment as fast as possible, which is a big challenge for both engineers and biologists. The parallel models discussed in this paper are: implementations using associative massive parallelism, covering architectures such as associative computing, the ClearSpeed coprocessor, and the Convey Computer; parallel programming models such as MPI, OpenMP, and a hybrid of the two; implementations of alignment using systolic arrays; and, lastly, implementations using single and multiple graphics processing units (GPUs).

Journal ArticleDOI
TL;DR: The PBSalign method is introduced, which integrates techniques in graph theory, 3D localized shape analysis, geometric scoring, and utilization of physicochemical and geometrical properties to provide an efficient and accurate solution to binding-site alignment while striking the balance between topological details and computational complexity.
Abstract: Accurate alignment of protein-protein binding sites can aid in protein docking studies and constructing templates for predicting structure of protein complexes, along with in-depth understanding of evolutionary and functional relationships. However, over the past three decades, structural alignment algorithms have focused predominantly on global alignments with little effort on the alignment of local interfaces. In this paper, we introduce the PBSalign (Protein-protein Binding Site alignment) method, which integrates techniques in graph theory, 3D localized shape analysis, geometric scoring, and utilization of physicochemical and geometrical properties. Computational results demonstrate that PBSalign is capable of identifying similar homologous and analogous binding sites accurately and performing alignments with better geometric match measures than existing protein-protein interface comparison tools. The proportion of better alignment quality generated by PBSalign is 46, 56, and 70 percent more than iAlign as judged by the average match index (MI), similarity index (SI), and structural alignment score (SAS), respectively. PBSalign provides the life science community an efficient and accurate solution to binding-site alignment while striking the balance between topological details and computational complexity.

DOI
01 Jan 2015
TL;DR: PhyLAT, the Phylogenetic Local Alignment Tool, is introduced, to compute local alignments of a query sequence against a fixed multiple-genome alignment of closely related species, and its alignments are more accurate than those of other commonly used programs, including BLAST, POY, MAFFT, MUSCLE, and CLUSTAL.
Abstract: Abstract of the dissertation "Integration of Alignment and Phylogeny in the Whole-Genome Era" by Hongtao Sun, Doctor of Philosophy in Computer Science, Washington University in St. Louis, 2015 (Professor Jeremy Buhler, Chair). With the development of new sequencing techniques, whole genomes of many species have become available. This huge amount of data gives rise to new opportunities and challenges. These new sequences provide valuable information on relationships among species, e.g. genome rearrangement and conservation. One of the principal ways to investigate such information is multiple sequence alignment (MSA). Currently, there is a large amount of MSA data on the internet, such as the UCSC genome database, but how to effectively use this information to solve classical and new problems is still an area lacking exploration. In this thesis, we explored how to use this information in four problems, i.e. sequence similarity search, multiple alignment improvement, short read mapping, and genome rearrangement inference. The first problem is sequence similarity search, i.e., given a query sequence, search for similar sequences in a database. The expansion of DNA sequencing capacity has enabled the sequencing of whole genomes from a number of related species. These genomes can be combined in a multiple alignment that provides useful information about the evolutionary history at each genomic locus. One area in which evolutionary information can productively be exploited is in aligning a new sequence to a database of existing, aligned genomes. However, existing high-throughput alignment tools are not designed to work effectively with multiple genome alignments. We introduce PhyLAT, the Phylogenetic Local Alignment Tool, to compute local alignments of a query sequence against a fixed multiple-genome alignment of closely related species.
PhyLAT uses a known phylogenetic tree on the species in the multiple alignment to improve the quality of its computed alignments while also estimating the placement of the query on this tree. It combines a probabilistic approach to alignment with seeding and expansion heuristics to accelerate discovery of significant alignments. We provide evidence, using alignments of human chromosome 22 against a 5-species alignment from the UCSC Genome Browser database, that PhyLAT's alignments are more accurate than those of other commonly used programs, including BLAST, POY, MAFFT, MUSCLE, and CLUSTAL. PhyLAT also identifies more alignments in coding DNA than does pairwise alignment alone. Finally, our tool determines the evolutionary relationship of query sequences to the database more accurately than do POY, RAxML, EPA, or pplacer. The second problem is multiple alignment quality improvement, i.e., given a multiple alignment, correct any wrong matches, i.e., matches between non-orthologous characters (bases or residues). This is important to all other data analysis based on multiple alignments. However, existing methods either compute alignments non-iteratively or use complex models which are very time-consuming and have the risk of overfitting. We developed an optimization algorithm to iteratively refine the multiple alignment quality. In each iteration, we take out one sequence from the multiple alignment, and realign it to the rest of the sequences using our phylogeny-aware alignment framework. We tested several strategies for picking sequences, i.e., picking the species most distant from the remaining species, picking the species closest to the remaining species, and picking a sequence at random. Experimental results showed that different picking strategies gave very similar results. In other words, our method is largely insensitive to the sequence picking strategy, which makes it a stable algorithm for improving alignments of any number of sequences.
The results showed that our method is more accurate than existing methods, i.e. MAFFT, Clustal-O, and MAVID, on test data from three sets of species from the UCSC genome database. The third problem is phylogeny-aware short read mapping using multiple informant sequences. Given a set of short reads from next-generation sequencing results, mapping them back to their orthologous locations in a reference genome is called short read mapping. This is a new problem arising with the development of next-generation sequencing techniques. Existing methods cannot deal with indels in alignments, and cannot do interspecies mapping. We developed a model, PhyMap, to align a read to a multiple alignment allowing mismatches and indels. PhyMap computes local alignments of a query sequence against a fixed multiple-genome alignment of closely related species. PhyMap uses a known phylogenetic tree on the species in the multiple alignment to improve the quality of its computed alignments while also estimating the placement of the query on this tree. We showed theoretically that our model can differentiate orthologous sequences from paralogous sequences. Thus our algorithm can align short reads to their homologous positions in reference sequences. Our experimental results confirmed this and showed that our model can differentiate between orthologous and paralogous alignments. Furthermore, we compared our method with other popular short read mapping tools (BWA, BOWTIE and BLAST) on simulated data, and found that our method can map more reads to their orthologous locations in their closely-related species' genomes than any one of them. The fourth problem is genome rearrangement inference, i.e., given a set of orthologous alignments along with the genomic orders in each aligned sequence and a set of new sequences

Book ChapterDOI
01 Jan 2015
TL;DR: This paper proposes an approach based on sequence alignment to compute the similarity between any two sequences; the results obtained were acceptable to some extent compared with the previous studies surveyed.
Abstract: Enzymes are important because of their role in most biological processes. Thus, classification of an enzyme's function is vital to save effort and time in the lab. In this paper, we propose an approach based on sequence alignment to compute the similarity between any two sequences. In the proposed approach, two different sequence alignment methods are used, namely, local and global sequence alignment. Different score matrices, such as BLOSUM and PAM, are used in the local and global alignment to calculate the similarity between the unknown sequence and each of the training sequences. The results obtained were acceptable to some extent compared with the previous studies surveyed.
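The idea of scoring an unknown sequence against each training sequence and reusing the label of the best match can be sketched as follows. This is an illustrative assumption, not the paper's implementation: the function names, the toy sequences, and the class labels are hypothetical, and a flat match/mismatch score stands in for a BLOSUM or PAM matrix.

```python
def global_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    Assumption: flat match/mismatch scores replace a real BLOSUM/PAM lookup.
    """
    prev = [j * gap for j in range(len(b) + 1)]  # first row: leading gaps in a
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)          # first column: leading gaps in b
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[-1]


def classify(query, training):
    """Return the label of the training sequence most similar to query.

    training is a list of (sequence, label) pairs (hypothetical data layout).
    """
    best_sequence, best_label = max(
        training, key=lambda item: global_score(query, item[0])
    )
    return best_label


# Toy usage with made-up sequences and labels.
training_set = [("MKVI", "hydrolase"), ("GGGG", "ligase")]
predicted = classify("MKVL", training_set)
```

Swapping `global_score` for a local alignment score would give the local variant the abstract mentions; which of the two works better for enzyme classification is a question the sketch does not address.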

Proceedings ArticleDOI
28 Nov 2015
TL;DR: Three commonly used parallel computing algorithms based on Compute Unified Device Architecture (CUDA) for SW algorithm acceleration are introduced and their advantages and disadvantages are illustrated.
Abstract: The Smith-Waterman (SW) algorithm, which calculates the similarity between two given sequences, is broadly used in bioinformatics research. However, the time complexity of the SW algorithm prevents it from being used for long sequence alignment. Since the SW algorithm is based on dynamic programming, single instruction, multiple data (SIMD) parallel computing can significantly reduce the computing cost. For this reason, this review introduces three commonly used parallel computing algorithms based on Compute Unified Device Architecture (CUDA) for SW algorithm acceleration and illustrates their advantages and disadvantages.
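The abstract does not name the three CUDA algorithms it reviews. One widely used way to expose data parallelism in the SW dynamic program is to process the score matrix by anti-diagonals (a wavefront), since all cells on one anti-diagonal are independent of each other. The sketch below shows that traversal order serially in Python; the function name and scoring values are assumptions, and on a GPU the inner loop would be spread over threads rather than run in sequence.

```python
def sw_score_wavefront(a, b, match=2, mismatch=-1, gap=-1):
    """Smith-Waterman score computed anti-diagonal by anti-diagonal.

    Illustrative scoring (assumption): +2 match, -1 mismatch, -1 per gap.
    """
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best = 0
    # Anti-diagonal d contains every cell (i, j) with i + j == d.
    for d in range(2, n + m + 1):
        i_lo = max(1, d - m)
        i_hi = min(n, d - 1)
        # Cells in this loop read only anti-diagonals d-1 and d-2,
        # so they could all be computed at the same time.
        for i in range(i_lo, i_hi + 1):
            j = d - i
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + sub, H[i - 1][j] + gap, H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best
```

The number of independent cells grows and then shrinks along the traversal, so a wavefront scheme leaves some parallel lanes idle near the matrix corners.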

Journal ArticleDOI
TL;DR: The problem of identifying conserved patterns of protein interaction networks as a graph optimization problem is formulated, and a fast heuristic algorithm for this problem is developed that discovers conserved modules with a larger number of proteins in an order of magnitude less time.
Abstract: Recently, researchers seeking to understand, modify, and create beneficial traits in organisms have looked for evolutionarily conserved patterns of protein interactions. Their conservation likely means that the proteins of these conserved functional modules are important to the trait's expression. In this paper, we formulate the problem of identifying these conserved patterns as a graph optimization problem, and develop a fast heuristic algorithm for this problem. We compare the performance of our network alignment algorithm to that of the MaWISh algorithm [Koyuturk M, Kim Y, Topkara U, Subramaniam S, Szpankowski W, Grama A, Pairwise alignment of protein interaction networks, J Comput Biol 13(2):182-199, 2006], which bases its search algorithm on a related decision problem formulation. We find that our algorithm discovers conserved modules with a larger number of proteins in an order of magnitude less time. The protein sets found by our algorithm correspond to known conserved functional modules at comparable precision and recall rates as those produced by the MaWISh algorithm.

Proceedings ArticleDOI
01 Dec 2015
TL;DR: The local alignment is made for pairwise molecular sequences by applying a wavelet transform based approach (WMSA) to solve the alignment problem.
Abstract: The first step of sequence analysis is sequence alignment, which in turn paves the way for structural and functional analysis of the molecular sequence. Owing to the increase in biological data, alignment approaches take more time for computation. To address this issue, in this work local alignment is performed on pairwise molecular sequences by applying a wavelet transform based approach (WMSA) to solve the alignment problem. Like global alignment, local alignment also gives more particular information about molecular sequences. Here, the sequence is converted into numerical values using the electron-ion interaction potential model. The resulting signal is then decomposed using one of the wavelet transform types, and the similarity between the sequences is found using the cross-correlation measure. The significance of the similarity is evaluated using two scoring functions, namely Position Weight Matrix (PSM) and a new function called Count Score. The work is compared with the standard Smith-Waterman algorithm. The results show that the proposed approach improves the speed without sacrificing the accuracy.
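The pipeline outlined in the abstract (numerical encoding, wavelet decomposition, cross-correlation) can be sketched in a few lines. This is not the WMSA implementation: the abstract does not say which wavelet is used, so a one-level Haar transform is assumed here, the helper names are hypothetical, and the EIIP values are the commonly tabulated ones for DNA nucleotides.

```python
import math

# Commonly tabulated electron-ion interaction potential (EIIP) values for DNA.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}


def to_signal(seq):
    """Map a DNA string to its EIIP numerical signal."""
    return [EIIP[base] for base in seq.upper()]


def haar_approximation(signal):
    """One-level Haar approximation coefficients (assumed wavelet choice)."""
    if len(signal) % 2:
        signal = signal + [signal[-1]]  # pad odd lengths by repeating the last sample
    scale = 1 / math.sqrt(2)
    return [(signal[k] + signal[k + 1]) * scale for k in range(0, len(signal), 2)]


def best_lag(x, y):
    """Return (lag, value) maximizing the mean-removed cross-correlation.

    The correlation at a lag is the sum over k of x[k + lag] * y[k],
    restricted to positions where both signals are defined.
    """
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    xc = [v - mean_x for v in x]
    yc = [v - mean_y for v in y]
    best = None
    for lag in range(-(len(yc) - 1), len(xc)):
        total = sum(
            xc[k + lag] * yc[k] for k in range(len(yc)) if 0 <= k + lag < len(xc)
        )
        if best is None or total > best[1]:
            best = (lag, total)
    return best
```

A typical use would call `best_lag(haar_approximation(to_signal(s1)), haar_approximation(to_signal(s2)))` and treat the returned lag as a candidate offset for the local alignment; how WMSA turns that offset into a scored alignment is not described in the abstract.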

01 Jan 2015
TL;DR: Inspired by CUDA streams, which enable concurrent kernel execution on Nvidia's GPUs, this thesis proposes a socket-based mechanism to enable concurrent task execution on the Xeon Phi.
Abstract: The Intel Xeon Phi coprocessor is a new choice for the high performance computing industry and it needs to be tested. In this thesis, we compared the performance of the Xeon Phi and the GPU. The Smith-Waterman algorithm is a widely used algorithm for solving the sequence alignment problem. We implemented two versions of the parallel Smith-Waterman algorithm, one for the Xeon Phi and one for the GPU. Inspired by CUDA streams, which enable concurrent kernel execution on Nvidia's GPUs, we propose a socket-based mechanism to enable concurrent task execution on the Xeon Phi. We then compared our socket implementation with Intel's offload mode and with an Nvidia GPU. The results showed that our socket implementation performs better than the offload mode but is still not as good as the GPU.