Author

Christian J. Michel

Other affiliations: Centre national de la recherche scientifique, University of Franche-Comté, University of Paris-Sud ...

Bio: Christian J. Michel is an academic researcher at the University of Strasbourg. The author has contributed to research on topics including Genetic code and Genome. The author has an h-index of 25 and has co-authored 96 publications that have received 1,786 citations. Previous affiliations of Christian J. Michel include Centre national de la recherche scientifique and University of Franche-Comté.

Topics: Genetic code, Genome, Population, Gene, Transfer RNA ...

Papers published on a yearly basis (chart; years shown range from 1986 to 2023)

Papers

Journal Article•DOI•

A complementary circular code in the protein coding genes.

Didier Arquès¹, Christian J. Michel²•Institutions (2)

University of Marne-la-Vallée¹, University of Franche-Comté²

07 Sep 1996-Journal of Theoretical Biology

TL;DR: Surprisingly, the same three subsets of trinucleotides are identified in these two gene populations: T0, T1 and T2 replaced in the three two-letter genetic alphabets purine/pyrimidine, amino/keto and strong/weak interaction.

185 citations

Journal Article•DOI•

Characterization of accessory genes in coronavirus genomes.

Christian J. Michel¹, Claudine Mayer¹,²,³, Olivier Poch¹, Julie Dawn Thompson¹•Institutions (3)

University of Strasbourg¹, Pasteur Institute², Paris Diderot University³

27 Aug 2020-Virology Journal

TL;DR: The analysis provides evidence supporting the presence of overlapping ORFs 7b, 9b and 9c in all the genomes and thus helps to resolve some differences in current genome annotations and predicts that ORF3b is not functional in all genomes.

Abstract: The COVID-19 infection is caused by the SARS-CoV-2 virus, a novel member of the coronavirus (CoV) family. CoV genomes code for an ORF1a/ORF1ab polyprotein and four structural proteins widely studied as major drug targets. The genomes also contain a variable number of open reading frames (ORFs) coding for accessory proteins that are not essential for virus replication, but appear to have a role in pathogenesis. The accessory proteins have been less well characterized and are difficult to predict by classical bioinformatics methods. We propose a computational tool, GOFIX, to characterize potential ORFs in virus genomes. In particular, ORF coding potential is estimated by searching for enrichment in motifs of the X circular code, which is known to be over-represented in the reading frames of viral genes. We applied GOFIX to study the SARS-CoV-2 and related genomes including SARS-CoV and SARS-like viruses from bat, civet and pangolin hosts, focusing on the accessory proteins. Our analysis provides evidence supporting the presence of overlapping ORFs 7b, 9b and 9c in all the genomes and thus helps to resolve some differences in current genome annotations. In contrast, we predict that ORF3b is not functional in all genomes. Novel putative ORFs were also predicted, including a truncated form of the ORF10 previously identified in SARS-CoV-2 and a little-known ORF overlapping the Spike protein in Civet-CoV and SARS-CoV. Our findings contribute to characterizing sequence properties of accessory genes of SARS coronaviruses, and especially the newly acquired genes making use of overlapping reading frames.

131 citations

Journal Article•DOI•

Circular code motifs in transfer and 16S ribosomal RNAs: A possible translation code in genes

Christian J. Michel¹•Institutions (1)

University of Strasbourg¹

01 Apr 2012-Computational Biology and Chemistry

TL;DR: These identified X circular code motifs and their mathematical properties may constitute a translation code involved in retrieval, maintenance and synchronization of reading frames in genes.

77 citations

Posted Content•DOI•

Characterization of accessory genes in coronavirus genomes

Christian J. Michel¹, Claudine Mayer¹,²,³, Olivier Poch¹, Julie Dawn Thompson¹•Institutions (3)

University of Strasbourg¹, Paris Diderot University², Pasteur Institute³

30 May 2020-bioRxiv

TL;DR: The GOFIX analysis provides evidence supporting the presence of overlapping ORFs 7b, 9b and 9c in all the genomes and thus helps to resolve some differences in current genome annotations, and predicts that ORF3b is not functional in all genomes.

62 citations

Journal Article•DOI•

Circular code motifs in transfer RNAs.

Christian J. Michel¹•Institutions (1)

University of Strasbourg¹

01 Aug 2013-Computational Biology and Chemistry

TL;DR: The identification of X motifs and a gene circular code property in tRNAs strengthens the concept proposed in Michel (2012) of a possible translation (framing) code based on a circular code.

59 citations

Cited by

Journal Article•DOI•

Evolution of Protein Molecules

S. Jeffery

01 Apr 1979-Biochemical Society Transactions

3,734 citations

Journal Article•DOI•

BMGE (Block Mapping and Gathering with Entropy): a new software for selection of phylogenetic informative regions from multiple sequence alignments

Alexis Criscuolo¹, Simonetta Gribaldo¹•Institutions (1)

Pasteur Institute¹

13 Jul 2010-BMC Evolutionary Biology

TL;DR: Simulation analyses show that the character trimming performed by BMGE produces datasets leading to accurate trees, especially with alignments including distantly-related sequences.

Abstract: The quality of multiple sequence alignments plays an important role in the accuracy of phylogenetic inference. It has been shown that removing ambiguously aligned regions, but also other sources of bias such as highly variable (saturated) characters, can improve the overall performance of many phylogenetic reconstruction methods. A current scientific trend is to build phylogenetic trees from a large number of sequence datasets (semi-)automatically extracted from numerous complete genomes. Because these approaches do not allow a precise manual curation of each dataset, there exists a real need for efficient bioinformatic tools dedicated to this alignment character trimming step. Here is presented a new software, named BMGE (Block Mapping and Gathering with Entropy), that is designed to select regions in a multiple sequence alignment that are suited for phylogenetic inference. For each character, BMGE computes a score closely related to an entropy value. Calculation of these entropy-like scores is weighted with BLOSUM or PAM similarity matrices in order to distinguish among biologically expected and unexpected variability for each aligned character. Sets of contiguous characters with a score above a given threshold are considered as not suited for phylogenetic inference and then removed. Simulation analyses show that the character trimming performed by BMGE produces datasets leading to accurate trees, especially with alignments including distantly-related sequences. BMGE also implements trimming and recoding methods aimed at minimizing phylogeny reconstruction artefacts due to compositional heterogeneity. BMGE is able to perform biologically relevant trimming on a multiple alignment of DNA, codon or amino acid sequences. Java source code and executable are freely available at ftp://ftp.pasteur.fr/pub/GenSoft/projects/BMGE/ .

1,080 citations

Journal Article•DOI•

Assessment of protein coding measures

James W. Fickett¹, Chang-Shung Tung¹•Institutions (1)

Los Alamos National Laboratory¹

25 Dec 1992-Nucleic Acids Research

TL;DR: This paper reviews and synthesizes the underlying coding measures from published algorithms and concludes that a very simple and obvious measure--counting oligomers--is more effective than any of the more sophisticated measures.

Abstract: A number of methods for recognizing protein coding genes in DNA sequence have been published over the last 13 years, and new, more comprehensive algorithms, drawing on the repertoire of existing techniques, continue to be developed. To optimize continued development, it is valuable to systematically review and evaluate published techniques. At the core of most gene recognition algorithms is one or more coding measures--functions which produce, given any sample window of sequence, a number or vector intended to measure the degree to which a sample sequence resembles a window of 'typical' exonic DNA. In this paper we review and synthesize the underlying coding measures from published algorithms. A standardized benchmark is described, and each of the measures is evaluated according to this benchmark. Our main conclusion is that a very simple and obvious measure--counting oligomers--is more effective than any of the more sophisticated measures. Different measures contain different information. However there is a great deal of redundancy in the current suite of measures. We show that in future development of gene recognition algorithms, attention can probably be limited to six of the twenty or so measures proposed to date.

407 citations

Journal Article•

Genome-scale phylogeny and the detection of systematic biases

Matthew J. Phillips, Frédéric Delsuc, David Penny

01 Jan 2004-Science & Engineering Faculty

TL;DR: A compositional bias is identified as responsible for this inconsistency and it is reduced effectively by coding the nucleotides as purines and pyrimidines (RY-coding), reinforcing the original tree.

Abstract: Phylogenetic inference from sequences can be misled by both sampling (stochastic) error and systematic error (nonhistorical signals where reality differs from our simplified models). A recent study of eight yeast species using 106 concatenated genes from complete genomes showed that even small internal edges of a tree received 100% bootstrap support. This effective negation of stochastic error from large data sets is important, but longer sequences exacerbate the potential for biases (systematic error) to be positively misleading. Indeed, when we analyzed the same data set using minimum evolution optimality criteria, an alternative tree received 100% bootstrap support. We identified a compositional bias as responsible for this inconsistency and showed that it is reduced effectively by coding the nucleotides as purines and pyrimidines (RY-coding), reinforcing the original tree. Thus, a comprehensive exploration of potential systematic biases is still required, even though genome-scale data sets greatly reduce sampling error.

392 citations

Journal Article•DOI•

A new challenge for compression algorithms: genetic sequences

Stéphane Grumbach, Fariza Tahi

01 Oct 1994-Information Processing and Management

TL;DR: A lossless algorithm is presented, biocompress-2, to compress the information contained in DNA and RNA sequences, based on the detection of regularities, such as the presence of palindromes, which leads to the highest compression of DNA.

Abstract: Universal data compression algorithms fail to compress genetic sequences. It is due to the specificity of this particular kind of “text.” We analyze in some detail the properties of the sequences, which cause the failure of classical algorithms. We then present a lossless algorithm, biocompress-2, to compress the information contained in DNA and RNA sequences, based on the detection of regularities, such as the presence of palindromes. The algorithm combines substitutional and statistical methods, and to the best of our knowledge, leads to the highest compression of DNA. The results, although not satisfactory, give insight to the necessary correlation between compression and comprehension of genetic sequences.

302 citations
