Showing papers by "Matej Lexa" published in 2010


Proceedings Article
02 May 2010
TL;DR: This paper investigates the possibilities for hardware acceleration of approximate tandem repeat searching and describes a parametrized architecture suitable for FPGA chips that can detect tandems with both types of errors (mismatches and indels) and does not limit the length of the detected tandem.
Abstract: Understanding the structure and function of DNA sequences is an important area of research in modern biology. Unfortunately, analysis of such data is often complicated by mutations introduced by evolutionary processes. At the lowest scale, these usually occur in biological sequences as character substitutions, insertions or deletions (indels). They increase the time complexity of sequence analysis algorithms by introducing an element of uncertainty, which complicates their practical use. One class of such algorithms searches for tandem repeats with possible errors, known as approximate tandem repeats. This paper investigates the possibilities for hardware acceleration of approximate tandem repeat searching and describes a parametrized architecture suitable for FPGA chips. The proposed architecture can detect tandems with both types of errors (mismatches and indels) and does not limit the length of the detected tandem. A prototype of the circuit was implemented in VHDL and synthesized for Virtex-5 technology. Application to test sequences shows that the circuit can speed up tandem searching by a factor on the order of thousands compared with the best-known software method, which relies on suffix arrays.
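To make the notion of an approximate tandem repeat concrete, the following is a minimal software sketch, not the paper's FPGA architecture and not the suffix-array method it is compared against. It assumes a fixed repeat unit length and a brute-force scan: a position is reported when the unit is followed directly by a copy that differs from it by at most a given number of mismatches or indels, measured with Levenshtein distance. The function names `edit_distance` and `find_approx_tandems` and the parameters `period` and `max_errors` are hypothetical and chosen only for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: substitution, insertion and deletion each cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # match or substitute
            ))
        prev = curr
    return prev[-1]


def find_approx_tandems(seq: str, period: int, max_errors: int):
    """Brute-force scan (illustrative assumption, not the paper's method).

    Returns a list of (start, copy_length, errors) tuples, one per start
    position where seq[start:start+period] is immediately followed by a copy
    within max_errors edits. The copy length may differ from period because
    indels are allowed.
    """
    hits = []
    for start in range(len(seq) - period + 1):
        unit = seq[start:start + period]
        best = None
        shortest = max(1, period - max_errors)
        longest = period + max_errors
        for copy_len in range(shortest, longest + 1):
            end = start + period + copy_len
            if end > len(seq):
                break  # the candidate copy would run past the sequence end
            errors = edit_distance(unit, seq[start + period:end])
            if errors <= max_errors and (best is None or errors < best[1]):
                best = (copy_len, errors)
        if best is not None:
            hits.append((start, best[0], best[1]))
    return hits
```

For example, `find_approx_tandems("ACGTACGT", 4, 0)` reports the exact tandem starting at position 0, while raising `max_errors` to 1 also admits copies with a single mismatch or indel. The scan costs roughly one edit-distance computation per position and candidate copy length, which illustrates why tolerating errors makes the search expensive in software.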

3 citations


Book Chapter
01 Sep 2010
TL;DR: A novel method for clustering of protein substructures that was developed to study the relationships between protein sequences and their corresponding structures is proposed and compared to other commonly used methods.
Abstract: In this paper, we propose a novel method for clustering protein substructures, which we developed to study the relationships between protein sequences and their corresponding structures. We compare it with other commonly used methods for clustering protein structures. Finally, we outline a procedure, based on our clustering method, for finding sequence profiles that occur in more than one structural conformation but only in a limited number of them.