Showing papers by "Ming-Yang Kao" published in 2011

Open Access

Journal Article

Discovering almost any hidden motif from multiple sequences

Bin Fu¹, Ming-Yang Kao², Lusheng Wang³

University of Texas–Pan American¹, Northwestern University², City University of Hong Kong³

01 Mar 2011, ACM Transactions on Algorithms

TL;DR: An efficient algorithm is developed that can discover a hidden motif from a set of sequences over any alphabet Σ with |Σ| ≥ 2; it is applicable to DNA motif discovery.

Abstract: We study a natural probabilistic model for motif discovery. In this model, there are $k$ background sequences, and each character in a background sequence is a random character from an alphabet $\Sigma$. A motif $G = g_1 g_2 \ldots g_m$ is a string of $m$ characters. Each background sequence is implanted with a probabilistically generated approximate copy of $G$. For a probabilistically generated approximate copy $b_1 b_2 \ldots b_m$ of $G$, every character is probabilistically generated such that the probability for $b_i \neq g_i$ is at most $\alpha$. In this article, we develop an efficient algorithm that can discover a hidden motif from a set of sequences for any alphabet $\Sigma$ with $|\Sigma| \geq 2$ and is applicable to DNA motif discovery. We prove that, for $\alpha$ below a fixed threshold, there exist positive constants $c_0$, $\epsilon$, and $\delta_2$ such that if there are at least $c_0 \log n$ input sequences, then in $O(n^2/h \, (\log n)^{O(1)})$ time this algorithm finds the motif with probability at least $3/4$ for every $G \in \Sigma^{\rho} - \Psi_{\rho,h,\epsilon}(\Sigma)$, where $n$ is the length of the longest sequence, $\rho$ is the length of the motif, $h$ is a parameter with $\rho \geq 4h \geq \delta_2 \log n$, and $\Psi_{\rho,h,\epsilon}(\Sigma)$ is a small subset containing at most a $2^{-\Theta(\epsilon^2 h)}$ fraction of the sequences in $\Sigma^{\rho}$.
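The input model in the abstract can be illustrated with a small generator. This is a minimal sketch of the model only, not the paper's discovery algorithm. The function name `implant_motif`, the uniform background distribution, the uniformly random implant position, and the choice to overwrite a window of the background (rather than splice the copy in) are all assumptions made for illustration. Each motif character is changed with probability exactly `alpha`, which satisfies the "at most α" condition.

```python
import random


def implant_motif(motif, k, n, alpha, alphabet="ACGT", seed=0):
    """Generate k length-n sequences, each hiding a noisy copy of `motif`.

    Hypothetical helper: every modelling choice below is an assumption,
    not something specified in the abstract.
    """
    rng = random.Random(seed)
    m = len(motif)
    sequences, positions = [], []
    for _ in range(k):
        # Background: each character drawn uniformly from the alphabet.
        background = [rng.choice(alphabet) for _ in range(n)]
        # Approximate copy: b_i differs from g_i with probability alpha.
        copy = []
        for g in motif:
            if rng.random() < alpha:
                copy.append(rng.choice([c for c in alphabet if c != g]))
            else:
                copy.append(g)
        # Assumption: the copy overwrites a uniformly chosen window.
        start = rng.randrange(n - m + 1)
        background[start:start + m] = copy
        sequences.append("".join(background))
        positions.append(start)
    return sequences, positions


if __name__ == "__main__":
    seqs, starts = implant_motif("ACGTACGT", k=4, n=40, alpha=0.1)
    for s, p in zip(seqs, starts):
        print(p, s)
```

A motif-discovery algorithm would receive only `sequences`; `positions` is returned here so that a recovered motif can be checked against the ground truth.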

3 citations