Top 2 papers published in the topic of Hybrid genome assembly in 1997

[...]

01 Jan 1997

TL;DR: This thesis introduces a heuristic for fragment assembly, implemented with a suffix array, that speeds up overlap detection by up to 1,000 times while maintaining high accuracy, and it shows that data structures are powerful in many pattern matching applications.


Abstract: This thesis is concerned with computational approaches to genome analysis. We discuss three biological applications: genomic rearrangements, gene recognition, and genome sequencing, all of whose practical solutions involve interesting algorithmic problems. In genomic rearrangements, we seek to reconstruct the evolutionary history of the genome. We study the distance between genomes using fixed-length inversions and give a complete theoretical characterization for both linear and circular genomes. We also prove upper and lower bounds on the minimum distance. Pattern recognition is central to many gene recognition systems. We apply linear discriminant analysis in a special program called Pombe to identify protein coding regions in the Schizosaccharomyces pombe genome. Under cross-validation, the predicted gene structures reach a correlation coefficient of 97.2% at the nucleotide level. In a large-scale genome sequencing project, we show that data structures are powerful in many pattern matching applications. We introduce a heuristic to speed up fragment assembly and implement it using a data structure called a suffix array, which speeds up overlap detection by up to 1,000 times while maintaining high accuracy. Finally, we report recent progress on this sequencing project and on the assembly program STROLL. Compared with other widely used assemblers, STROLL is significantly faster and handles repeat regions more reliably. In the last chapter, we point our future research toward some open problems that are of great interest to both computer scientists and biologists.
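
The abstract does not spell out the heuristic itself, so the following is only a minimal sketch of how a suffix array can narrow down overlap candidates between reads. It assumes exact (error-free) matches, and the names `build_suffix_array`, `find_overlaps`, and `min_overlap` are hypothetical, not taken from the thesis or from STROLL. All proper suffixes of the reads are sorted once; for each read, a binary search finds the suffixes that share its first `min_overlap` characters, and only those candidates are verified in full.

```python
from bisect import bisect_left, bisect_right


def build_suffix_array(reads, min_overlap):
    """Return (read_index, start) pairs for every proper suffix of length
    >= min_overlap, sorted lexicographically by the suffix text."""
    positions = [
        (idx, start)
        for idx, read in enumerate(reads)
        for start in range(1, len(read) - min_overlap + 1)
    ]
    positions.sort(key=lambda p: reads[p[0]][p[1]:])
    return positions


def find_overlaps(reads, min_overlap):
    """Return sorted (a, b, length) triples where a suffix of reads[a]
    equals a prefix of reads[b]. Exact matching only (an assumption)."""
    positions = build_suffix_array(reads, min_overlap)
    # The first min_overlap characters of each sorted suffix are themselves
    # in sorted order, so they can serve as binary-search keys.
    seeds = [reads[i][s:s + min_overlap] for i, s in positions]
    overlaps = []
    for b_idx, b in enumerate(reads):
        if len(b) < min_overlap:
            continue
        seed = b[:min_overlap]
        lo = bisect_left(seeds, seed)
        hi = bisect_right(seeds, seed)
        # Only suffixes that share the seed are verified in full.
        for a_idx, start in positions[lo:hi]:
            if a_idx == b_idx:
                continue
            suffix = reads[a_idx][start:]
            if b.startswith(suffix):
                overlaps.append((a_idx, b_idx, len(suffix)))
    return sorted(overlaps)


if __name__ == "__main__":
    reads = ["ACGTACGGA", "ACGGATTCA", "TTCAGGG"]
    print(find_overlaps(reads, 4))
```

Sorting the suffixes once replaces an all-against-all comparison with one binary search per read. Real sequencing reads contain errors, so a practical assembler would need inexact matching on top of a seed lookup like this one.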


9 citations

Patent

Efficient method to conduct large-scale genome sequencing

[...]

Bruce E. Kimmel, Michael C. Ellis, David A. Ruddy

23 Sep 1997

TL;DR: This patent describes an efficient method for sequencing large fragments of DNA: a subclone path through the fragment is first identified, and the collection of subclones that define this path is then sequenced using transposon-mediated direct sequencing techniques to an extent sufficient to provide the complete sequence of the fragment.


Abstract: An efficient method for sequencing large fragments of DNA is described. A subclone path through the fragment is first identified; the collection of subclones that define this path is then sequenced using transposon-mediated direct sequencing techniques to an extent sufficient to provide the complete sequence of the fragment.
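
The abstract does not say how the subclone path is identified. As one possible illustration, the sketch below assumes that each subclone has already been mapped to a start and end coordinate on the fragment, and it picks a small covering set with a greedy interval cover. The function name `tiling_path`, the tuple layout, and the coordinates are all hypothetical and are not taken from the patent.

```python
def tiling_path(subclones, fragment_length):
    """Greedy interval cover over a fragment of the given length.

    subclones: list of (name, start, end) with half-open coordinates
    [start, end). Returns the chosen subclones in path order.
    The mapped coordinates are an assumption of this sketch.
    """
    ordered = sorted(subclones, key=lambda c: c[1])
    path = []
    covered = 0  # every position below this is already covered
    i = 0
    while covered < fragment_length:
        best = None
        # Among subclones starting at or before the covered frontier,
        # keep the one that extends furthest to the right.
        while i < len(ordered) and ordered[i][1] <= covered:
            if best is None or ordered[i][2] > best[2]:
                best = ordered[i]
            i += 1
        if best is None or best[2] <= covered:
            raise ValueError(f"no subclone covers position {covered}")
        path.append(best)
        covered = best[2]
    return path


if __name__ == "__main__":
    clones = [("a", 0, 40), ("b", 10, 60), ("c", 35, 90),
              ("d", 55, 100), ("e", 80, 120)]
    print([name for name, _, _ in tiling_path(clones, 120)])
```

Each chosen subclone would then be sequenced in full; the greedy choice only keeps the number of subclones on the path small under the stated assumptions.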


1 citation

Showing papers on "Hybrid genome assembly" published in 1997