Journal ArticleDOI

Understanding Long-range Correlations in DNA Sequences

TLDR
This paper reviews the literature on statistical long-range correlation in DNA sequences and concludes that a mixture of many length scales (including some relatively long ones) is responsible for the observed 1/f-like spectral component.
Abstract
In this paper, we review the literature on statistical long-range correlation in DNA sequences. We examine the current evidence for these correlations, and conclude that a mixture of many length scales (including some relatively long ones) in DNA sequences is responsible for the observed 1/f-like spectral component. We note the complexity of the correlation structure in DNA sequences. The observed complexity often makes it hard, or impossible, to decompose the sequence into a few statistically stationary regions. We suggest that, based on the complexity of DNA sequences, a fruitful approach to understanding long-range correlation is to model duplication, and other rearrangement processes, in DNA sequences. One model, called the "expansion-modification system", contains only point duplication and point mutation. Though simplistic, this model is able to generate sequences with 1/f spectra. We emphasize the importance of DNA duplication in its contribution to the observed long-range correlation in DNA sequences.
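
The expansion-modification system mentioned in the abstract can be illustrated with a minimal sketch. The abstract only states that the model contains point duplication and point mutation; the binary alphabet, the per-symbol mutation probability `p_mutate`, the single-symbol starting sequence, and the function name `expansion_modification` are assumptions made here for illustration and are not taken from the paper.

```python
import random


def expansion_modification(steps, p_mutate=0.1, seed=0, start=(0,)):
    """Grow a binary sequence by repeated point duplication and point mutation.

    Assumed rule (illustrative, not quoted from the paper): at every step,
    each symbol independently either mutates (0 <-> 1) with probability
    p_mutate, or is duplicated (s -> s s) with probability 1 - p_mutate.
    """
    rng = random.Random(seed)  # local generator so runs are reproducible
    seq = list(start)
    for _ in range(steps):
        next_seq = []
        for s in seq:
            if rng.random() < p_mutate:
                next_seq.append(1 - s)      # point mutation: flip the symbol
            else:
                next_seq.extend((s, s))     # point duplication: copy the symbol
        seq = next_seq
    return seq


if __name__ == "__main__":
    seq = expansion_modification(steps=16, p_mutate=0.1)
    print(len(seq), "symbols; first 60:", "".join(map(str, seq[:60])))
```

With a small mutation probability the sequence roughly doubles at each step, so runs of identical symbols created early are stretched to long length scales while later mutations add structure at short scales. A power spectrum of the resulting sequence (for example via a discrete Fourier transform) could then be inspected for the 1/f-like behaviour the abstract describes; that step is left out of this sketch.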


Citations
Book ChapterDOI

Power Law Correlations in DNA Sequences

TL;DR: In this paper, the authors review the degree to which power laws can characterize the fluctuating nucleotide content of DNA sequences; see also a critical review of W. Li's work.
Journal ArticleDOI

Isochores Merit the Prefix 'Iso'

TL;DR: This work argues that a statement in the IGHSC analysis concerning the existence of isochores is incorrect because that analysis applied an inappropriate statistical test, and proposes using another statistical test: the analysis of variance (ANOVA).
Journal ArticleDOI

Measuring complexity, nonextensivity and chaos in the DNA sequence of the Major Histocompatibility Complex

TL;DR: Analysis of 4 Mb sequences of the Major Histocompatibility Complex, which is a DNA segment on chromosome 6 with high gene density, controlling many immunological functions and associated with many diseases, revealed that the DNA complexity and self-organization can be related to fractional dynamical nonlinear processes with low-dimensional deterministic chaotic and non-extensive statistical character.
Journal ArticleDOI

Investigating long range correlation in DNA sequences using significance tests of conditional mutual information

TL;DR: Markov chain order estimation from symbol sequences of systems exhibiting long memory or long-range correlations, such as DNA sequences, shows a different dependence on the DNA sequence length for bacteria, the plant Arabidopsis thaliana, and the human chromosome, indicating a different long-memory structure in their DNA.
Journal ArticleDOI

A new method to study genome mutations using the information entropy

TL;DR: It is shown that the information entropy spectrum of genomes contains sufficient information to allow detection of genetic mutations, and possibly prediction of future ones; the best m-block size is 2, and the optimal window should contain more than 9 and fewer than 33 nucleotides.
References
Journal ArticleDOI

A mathematical theory of communication

TL;DR: This final installment of the paper considers the case where the signals or the messages or both are continuously variable, in contrast with the discrete nature assumed until now.
Journal Article

The mathematical theory of communication

TL;DR: The Mathematical Theory of Communication (MTOC) as discussed by the authors was originally published as a paper on communication theory more than fifty years ago and has since gone through four hardcover and sixteen paperback printings.