Prediction of probable genes by Fourier analysis of genomic sequences
Shrish Tiwari, Srinivasan Ramachandran, Alok Bhattacharya, Sudha Bhattacharya, Ramakrishna Ramaswamy
TLDR
The authors use Fourier techniques to analyse the three-base periodicity of coding regions and to develop a tool that recognizes coding regions in genomic DNA. They find that the relative height of the peak at f = 1/3 in the Fourier spectrum is a good discriminator of coding potential.
Abstract:
Motivation: The major signal in coding regions of genomic sequences is a three-base periodicity. Our aim is to use Fourier techniques to analyse this periodicity, and thereby to develop a tool to recognize coding regions in genomic DNA.
Result: The three-base periodicity in the nucleotide arrangement is evidenced as a sharp peak at frequency f = 1/3 in the Fourier (or power) spectrum. From extensive spectral analysis of DNA sequences of total length over 5.5 million base pairs from a wide variety of organisms (including the human genome), and by separately examining coding and non-coding sequences, we find that the relative height of the peak at f = 1/3 in the Fourier spectrum is a good discriminator of coding potential. We use this feature to detect probable coding regions in DNA sequences by examining the local signal-to-noise ratio of the peak within a sliding window. While the overall accuracy is comparable to that of other techniques currently in use, the proposed measure is independent of training sets or existing database information, and can thus find general application.
Availability: A computer program, GeneScan, which locates coding open reading frames and exonic regions in genomic sequences, has been developed and is available on request.
Contact: E-mail: rama@jnuniv.ernet.in
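The measure described in the abstract can be illustrated with a minimal Python sketch. It is not the GeneScan program: the function names `peak_to_noise` and `scan`, the window of 300 bases, the step of 3, and the choice to estimate the noise as the mean power over all nonzero frequencies (obtained from Parseval's identity) are all assumptions made for illustration. Each base is turned into a 0/1 indicator sequence, the power at f = 1/3 is summed over the four bases, and the ratio to the mean power is reported for each window.

```python
import cmath

BASES = "ACGT"


def peak_to_noise(seq):
    """Ratio of spectral power at f = 1/3 to the mean power over nonzero frequencies.

    Assumption: the "noise" is the average of the summed indicator-sequence
    power spectrum over the n - 1 nonzero DFT frequencies.
    """
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        return 0.0
    w = cmath.exp(2j * cmath.pi / 3)
    phases = (1, w, w * w)  # exp(2*pi*i*k/3) depends only on k mod 3
    peak = 0.0
    total = 0
    sum_sq_counts = 0
    for base in BASES:
        acc = 0j
        count = 0
        for i, ch in enumerate(seq):
            if ch == base:
                acc += phases[i % 3]
                count += 1
        peak += abs(acc) ** 2
        total += count
        sum_sq_counts += count * count
    # Parseval: the power summed over all n frequencies is n * count per base;
    # removing the zero-frequency term (count squared) leaves the nonzero part.
    noise = (n * total - sum_sq_counts) / (n - 1)
    return peak / noise if noise > 0 else 0.0


def scan(seq, window=300, step=3):
    """Slide a window along seq and return (start, peak_to_noise) pairs."""
    return [
        (start, peak_to_noise(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]
```

Calling `scan(sequence)` on a genomic sequence gives one score per window position; windows with a pronounced three-base periodicity score high, and a cutoff for calling a window coding would have to be chosen separately. Keeping the window length a multiple of 3 makes f = 1/3 an exact DFT frequency of the window.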