Open Access Book

Flexible Pattern Matching in Strings: Practical On-Line Search Algorithms for Texts and Biological Sequences

Abstract
This book presents a practical approach to string matching problems, focusing on the algorithms and implementations that perform best in practice. It covers searching for simple, multiple, and extended strings, as well as regular expressions, both exactly and approximately. It includes all of the most significant new developments in complex pattern searching. The clear explanations, step-by-step examples, algorithm pseudocode, and implementation efficiency maps will enable researchers, professionals, and students in bioinformatics, computer science, and software engineering to choose the most appropriate algorithms for their applications.


Citations
Journal Article

PatMaN: rapid alignment of short sequences to large databases.

TL;DR: A tool suited for searching for many short nucleotide sequences in large databases, allowing for a predefined number of gaps and mismatches, using a non-deterministic automata matching algorithm on a keyword tree of the search strings.
Journal Article

Fast exact string matching algorithms

TL;DR: A very fast new family of string matching algorithms based on hashing q-grams is proposed; these algorithms are the fastest in many cases, in particular on small alphabets.
Journal Article

Analysis of Genomic Sequence Motifs for Deciphering Transcription Factor Binding and Transcriptional Regulation in Eukaryotic Cells.

TL;DR: The usefulness of motif analysis is exemplified in this review by how motif discovery improves peak calling in ChIP-seq and ChIP-exo experiments and, when coupled with information on gene expression, allows insights into physical mechanisms of transcriptional modulation.
Journal Article

Lightweight natural language text compression

TL;DR: End-Tagged Dense Code and (s, c)-Dense Code are described, two new semistatic statistical methods for compressing natural language texts that permit simpler and faster encoding and obtain better compression ratios than Tagged Huffman Code, while maintaining its fast direct search and random access capabilities.
Book Chapter

An efficient compression code for text databases

TL;DR: A new compression format for natural language texts, allowing both exact and approximate search without decompression, and new upper and lower bounds for the redundancy of d-ary Huffman codes are presented.
References
Journal Article

Basic Local Alignment Search Tool

TL;DR: A new approach to rapid sequence comparison, basic local alignment search tool (BLAST), directly approximates alignments that optimize a measure of local similarity, the maximal segment pair (MSP) score.
Book

Introduction to Algorithms

TL;DR: The updated new edition of the classic Introduction to Algorithms is intended primarily for use in undergraduate or graduate courses in algorithms or data structures and presents a rich variety of algorithms and covers them in considerable depth while making their design and analysis accessible to all levels of readers.
Journal Article

Improved tools for biological sequence comparison.

TL;DR: Three computer programs for comparisons of protein and DNA sequences can be used to search sequence data bases, evaluate similarity scores, and identify periodic structures based on local sequence similarity.
Journal Article

Identification of common molecular subsequences.

TL;DR: This letter extends the heuristic homology algorithm of Needleman & Wunsch (1970) to find a pair of segments, one from each of two long sequences, such that there is no other pair of segments with greater similarity (homology).
Book

Modern Information Retrieval

TL;DR: A rigorous and complete textbook for a first course on information retrieval from the computer science (as opposed to a user-centred) perspective, providing an up-to-date, student-oriented treatment of the subject.