Journal Article

Order and correlations in genomic DNA sequences. The spectral approach

01 Jan 2000-Physics-Uspekhi (IOP Publishing)-Vol. 43, Iss: 1, pp 55-78
TL;DR: In this paper, the structural analysis of genomic DNA sequences is discussed in the framework of the spectral approach, which is sufficiently universal due to the reciprocal correspondence and mutual complementarity of Fourier transform length scales.
Abstract: The structural analysis of genomic DNA sequences is discussed in the framework of the spectral approach, which is sufficiently universal due to the reciprocal correspondence and mutual complementarity of Fourier transform length scales. The spectral characteristics of random sequences of the same nucleotide composition possess the property of self-averaging for relatively short sequences of length M≥100–300. Comparison with the characteristics of random sequences determines the statistical significance of the structural features observed. Apart from traditional applications to the search for hidden periodicities, spectral methods are also efficient in studying mutual correlations in DNA sequences. By combining spectra for structure factors and correlation functions, not only integral correlations can be estimated but also their origin identified. Using the structural spectral entropy approach, the regularity of a sequence can be quantitatively assessed. A brief introduction to the problem is also presented and other major methods of DNA sequence analysis described.
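As a rough illustration of the search for hidden periodicities mentioned in the abstract, the sketch below computes a Fourier structure factor for one nucleotide type and reads a period off the strongest harmonic. It is a minimal sketch and not the paper's implementation: the function name `structure_factor`, the 0/1 indicator encoding, the 1/sqrt(M) normalization, and the toy sequence are all assumptions made for this example.

```python
import cmath
import math


def structure_factor(seq, base):
    """Return F(q_n) = |rho(q_n)|^2 for n = 1..M-1 (hypothetical helper).

    rho(q_n) = M**-0.5 * sum_m x_m * exp(-1j * q_n * m), with q_n = 2*pi*n/M
    and x_m = 1 where seq[m] == base, else 0 (assumed indicator encoding).
    """
    M = len(seq)
    x = [1.0 if c == base else 0.0 for c in seq]
    spectrum = []
    for n in range(1, M):
        q = 2.0 * math.pi * n / M
        rho = sum(x[m] * cmath.exp(-1j * q * m) for m in range(M)) / math.sqrt(M)
        spectrum.append(abs(rho) ** 2)
    return spectrum


# Toy sequence with an exact period of 3 (illustrative data, not from the paper).
seq = "ATG" * 20
M = len(seq)
F = structure_factor(seq, "A")

# For a real-valued indicator the spectrum is symmetric, F(q_n) = F(q_{M-n}),
# so the peak search is restricted to harmonics n = 1..M//2.
half = F[: M // 2]
peak_n = 1 + max(range(len(half)), key=half.__getitem__)
period = M / peak_n  # spatial period corresponding to the strongest harmonic

# With this normalization, Parseval's identity gives the sum rule
# sum_{n=1}^{M-1} F(q_n) = N * (M - N) / M, where N counts occurrences of base.
N = seq.count("A")
sum_rule = N * (M - N) / M
```

The direct sum costs O(M^2) operations and is used only to keep the sketch dependency-free; an FFT routine would be the usual choice for long sequences. Dividing each harmonic by the mean value implied by the sum rule, N(M - N)/(M(M - 1)), is one way to compare a spectrum against sequences of the same composition.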
Citations
Journal Article
TL;DR: The objectives of BIOS 781 are to present basic population and quantitative genetic principles, including classical genetics, chromosomal theory of inheritance, and meiotic recombination, and methods for genome-wide association and stratification control.
Abstract: Learning objectives: The objectives of BIOS 781 are to present: (1) basic population and quantitative genetic principles, including classical genetics, chromosomal theory of inheritance, and meiotic recombination; (2) an exposure to QTL mapping methods of complex quantitative traits and linkage methods to detect co-segregation with disease; (3) methods for assessing marker-disease linkage disequilibrium, including case-control approaches; (4) methods for genome-wide association and stratification control.

1,516 citations

Journal Article
TL;DR: The Drosophila literature is reviewed, and the authors agree with the proposal that pseudogenes be considered potogenes, i.e., DNA sequences with a potentiality for becoming new genes.
Abstract: Pseudogenes have been defined as nonfunctional sequences of genomic DNA originally derived from functional genes. It is therefore assumed that all pseudogene mutations are selectively neutral and have equal probability to become fixed in the population. Rather, pseudogenes that have been suitably investigated often exhibit functional roles, such as gene expression, gene regulation, generation of genetic (antibody, antigenic, and other) diversity. Pseudogenes are involved in gene conversion or recombination with functional genes. Pseudogenes exhibit evolutionary conservation of gene sequence, reduced nucleotide variability, excess synonymous over nonsynonymous nucleotide polymorphism, and other features that are expected in genes or DNA sequences that have functional roles. We first review the Drosophila literature and then extend the discussion to the various functional features identified in the pseudogenes of other organisms. A pseudogene that has arisen by duplication or retroposition may, a...

460 citations

Journal Article
TL;DR: This review covers several aspects of IT applications, ranging from genome global analysis and comparison, including block-entropy estimation and resolution-free metrics based on iterative maps, to local analysis, comprising the classification of motifs, prediction of transcription factor binding sites and sequence characterization based on linguistic complexity and entropic profiles.
Abstract: Information theory (IT) addresses the analysis of communication systems and has been widely applied in molecular biology. In particular, alignment-free sequence analysis and comparison greatly benefited from concepts derived from IT, such as entropy and mutual information. This review covers several aspects of IT applications, ranging from genome global analysis and comparison, including block-entropy estimation and resolution-free metrics based on iterative maps, to local analysis, comprising the classification of motifs, prediction of transcription factor binding sites and sequence characterization based on linguistic complexity and entropic profiles. IT has also been applied to high-level correlations that combine DNA, RNA or protein features with sequence-independent properties, such as gene mapping and phenotype analysis, and has also provided models based on communication systems theory to describe information transmission channels at the cell level and also during evolutionary processes. While not exhaustive, this review attempts to categorize existing methods and to indicate their relation with broader transversal topics such as genomic signatures, data compression and complexity, time series analysis and phylogenetic classification, providing a resource for future developments in this promising area.

119 citations

Journal Article
TL;DR: Application of the Fourier transform algorithm to the GCR2 family revealed a strongly predicted sevenfold periodicity in hydrophobicity, suggesting why GCR2 has been reported to be a GPCR despite negative indications from most transmembrane prediction algorithms.

61 citations

Journal Article
01 Jun 2003-Genetics
TL;DR: The data indicate that psiEst-6 exhibits some characteristics that are typical of nonfunctional genes, while other characteristics are typically attributed to functional genes; the same situation has been observed in other pseudogenes (including Drosophila).
Abstract: We have analyzed nucleotide polymorphism within a 5.3-kb region encompassing the functional Est-6 gene and the psiEst-6 putative pseudogene in 28 strains of Drosophila melanogaster and one of D. simulans. Two divergent sequence types were detected, which are not perfectly associated with Est-6 allozyme variation. The level of variation (pi) is very close in the 5'-flanking region (0.0059) and Est-6 gene (0.0057), but significantly higher in the intergenic region (0.0141) and putative pseudogene (0.0122). The variation in the 3'-flanking region is intermediate (0.0083). These observations may reflect different levels of purifying selection in the different regions. Strong linkage disequilibrium occurs within the region studied, with the largest values revealed in the putative pseudogene and 3'-flanking region. Moreover, recombination is restricted within psiEst-6. Gene conversion is detected both within and (to a lesser extent) between Est-6 and psiEst-6. The data indicate that psiEst-6 exhibits some characteristics that are typical of nonfunctional genes, while other characteristics are typically attributed to functional genes; the same situation has been observed in other pseudogenes (including Drosophila). The results of structural entropy analysis demonstrate higher structural ordering in Est-6 than in psiEst-6, in accordance with expectations if psiEst-6 is indeed a pseudogene. Taking into account that the function of psiEst-6 is not known (but could exist) and following the terminology of J. Brosius and S. J. Gould, we suggest that the term "potogene" may be appropriate for psiEst-6, indicating that it is a potential gene that may have acquired some distinctive but unknown function.

40 citations

References
Book
14 Sep 1984
TL;DR: This book covers the multivariate normal distribution, estimation of the mean vector and the covariance matrix, the generalized T²-statistic, and tests of independence of sets of variates, among other topics in multivariate analysis.
Abstract: Preface to the Third Edition. Preface to the Second Edition. Preface to the First Edition. 1. Introduction. 2. The Multivariate Normal Distribution. 3. Estimation of the Mean Vector and the Covariance Matrix. 4. The Distributions and Uses of Sample Correlation Coefficients. 5. The Generalized T²-Statistic. 6. Classification of Observations. 7. The Distribution of the Sample Covariance Matrix and the Sample Generalized Variance. 8. Testing the General Linear Hypothesis: Multivariate Analysis of Variance. 9. Testing Independence of Sets of Variates. 10. Testing Hypotheses of Equality of Covariance Matrices and Equality of Mean Vectors and Covariance Matrices. 11. Principal Components. 12. Canonical Correlations and Canonical Variables. 13. The Distributions of Characteristic Roots and Vectors. 14. Factor Analysis. 15. Patterns of Dependence; Graphical Models. Appendix A: Matrix Theory. Appendix B: Tables. References. Index.

9,693 citations

Book
01 Jan 1981
TL;DR: This book covers stochastic processes, including Markov processes, the master equation and its expansion, the Fokker-Planck equation, the Langevin approach, and the statistics of jump events.
Abstract: Preface to the first edition. Preface to the second edition. Abbreviated references. I. Stochastic variables. II. Random events. III. Stochastic processes. IV. Markov processes. V. The master equation. VI. One-step processes. VII. Chemical reactions. VIII. The Fokker-Planck equation. IX. The Langevin approach. X. The expansion of the master equation. XI. The diffusion type. XII. First-passage problems. XIII. Unstable systems. XIV. Fluctuations in continuous systems. XV. The statistics of jump events. XVI. Stochastic differential equations. XVII. Stochastic behavior of quantum systems.

7,858 citations

Book
01 Oct 1986
TL;DR: This book covers the chemical and conformational properties of polypeptides, the biosynthesis and evolutionary origins of proteins, the folded conformations of globular proteins, their interactions with other molecules, and enzyme catalysis.
Abstract: Chemical Properties of Polypeptides; Biosynthesis of Proteins; Evolutionary and Genetic Origins of Protein Sequences; Physical Interactions that Determine the Properties of Proteins; Conformational Properties of Polypeptide Chains; The Folded Conformations of Globular Proteins; Proteins in Solution and in Membranes; Interactions with Other Molecules; Enzyme Catalysis; Degradation; Appendix: References to Protein Structures Determined Crystallographically to High Resolution.

4,285 citations