Showing papers by "Christophe Klopp" published in 2017


Journal Article
TL;DR: The reconstruction of the genome of the most recent common ancestor (MRCA) of modern monocots and eudicots, accounting for 95% of extant angiosperms, with its potential repertoire of 22,899 ancestral genes conserved in present-day crops is described.
Abstract: We describe here the reconstruction of the genome of the most recent common ancestor (MRCA) of modern monocots and eudicots, accounting for 95% of extant angiosperms, with its potential repertoire of 22,899 ancestral genes conserved in present-day crops. The MRCA provides a starting point for deciphering the reticulated evolutionary plasticity between species (rapidly versus slowly evolving lineages), subgenomes (pre- versus post-duplication blocks), genomic compartments (stable versus labile loci), genes (ancestral versus species-specific genes) and functions (gained versus lost ontologies), the key mutational forces driving the success of polyploidy in crops. The estimation of the timing of angiosperm evolution, based on MRCA genes, suggested that this group emerged 214 million years ago during the late Triassic era, before the oldest recorded fossil. Finally, the MRCA constitutes a unique resource for scientists to dissect major agronomic traits in translational genomics studies extending from model species to crops.

168 citations


Journal Article
16 Feb 2017-PeerJ
TL;DR: The de novo RNA-Seq Assembly Pipeline (DRAP) is an easy-to-use software package that produces a compact and corrected transcript set, reducing the number of contigs needed to represent the transcriptome.
Abstract: BACKGROUND: De novo transcriptome assembly of short reads is now a common step in expression analysis of organisms lacking a reference genome sequence. Several software packages are available to perform this task. Even if their results are of good quality, it is still possible to improve them in several ways, including redundancy reduction and error correction. Trinity and Oases are two commonly used de novo transcriptome assemblers. The contig sets they produce are of good quality. Still, their compaction (number of contigs needed to represent the transcriptome) and their quality (chimera and nucleotide error rates) can be improved. RESULTS: We built a de novo RNA-Seq Assembly Pipeline (DRAP) which wraps these two assemblers (Trinity and Oases) in order to improve their results regarding the above-mentioned criteria. DRAP reduces the number of resulting contigs by 1.3- to 15-fold, depending on the read set and the assembler used. This article presents seven assembly comparisons showing, in some cases, drastic improvements when using DRAP. DRAP does not significantly impair assembly quality metrics such as read realignment rate or protein reconstruction counts. CONCLUSION: Transcriptome assembly is a challenging computational task. Even though good solutions are already available to end-users, these solutions can still be improved while conserving the overall representation and quality of the assembly. The de novo RNA-Seq Assembly Pipeline (DRAP) is an easy-to-use software package that produces a compact and corrected transcript set. DRAP is free, open-source and available under the GPL v3 license at http://www.sigenae.org/drap.

88 citations


Journal Article
TL;DR: By transiently enhancing 3D chromatin interactions, stable and isogenic Drosophila epilines are established that carry alternative epialleles, as defined by differential levels of Polycomb-dependent trimethylation of histone H3 Lys27 (forming H3K27me3).
Abstract: Transgenerational epigenetic inheritance (TEI) describes the transmission of alternative functional states through multiple generations in the presence of the same genomic DNA sequence. Very little is known about the principles and the molecular mechanisms governing this type of inheritance. Here, by transiently enhancing 3D chromatin interactions, we established stable and isogenic Drosophila epilines that carry alternative epialleles, as defined by differential levels of Polycomb-dependent trimethylation of histone H3 Lys27 (forming H3K27me3). After being established, epialleles can be dominantly transmitted to naive flies and can induce paramutation. Importantly, epilines can be reset to a naive state by disruption of chromatin interactions. Finally, we found that environmental changes modulate the expressivity of the epialleles, and we extended our paradigm to naturally occurring phenotypes. Our work sheds light on how nuclear organization and Polycomb group (PcG) proteins contribute to epigenetically inheritable phenotypic variability.

84 citations


Journal Article
TL;DR: The availability of large data sets of whole-genome sequences, high-density SNP chip genotypes and extensive recording of phenotype offers an unprecedented opportunity to quickly dissect the genetic architecture of severe dominant conditions in livestock.
Abstract: In humans, the clinical and molecular characterization of sporadic syndromes is often hindered by the small number of patients and the difficulty in developing animal models for severe dominant conditions. Here we show that the availability of large data sets of whole-genome sequences, high-density SNP chip genotypes and extensive recording of phenotype offers an unprecedented opportunity to quickly dissect the genetic architecture of severe dominant conditions in livestock. We report on the identification of seven dominant de novo mutations in CHD7, COL1A1, COL2A1, COPA, and MITF and exploit the structure of cattle populations to describe their clinical consequences and map modifier loci. Moreover, we demonstrate that the emergence of recessive genetic defects can be monitored by detecting de novo deleterious mutations in the genome of bulls used for artificial insemination. These results demonstrate the attractiveness of cattle as a model species in the post genomic era, particularly to confirm the genetic aetiology of isolated clinical case reports in humans.

58 citations


Journal Article
TL;DR: An extensive profiling of the lncRNA transcriptome in the chicken liver and adipose tissue by RNA-Seq contributes to improving the structural and functional annotation of the chicken genome and provides a basis for further studies on energy storage and mobilization traits in the chicken.
Abstract: Background: Improving functional annotation of the chicken genome is a key challenge in bridging the gap between genotype and phenotype. Among all transcribed regions, long noncoding RNAs (lncRNAs) are a major component of the transcriptome and its regulation, and whole-transcriptome sequencing (RNA-Seq) has greatly improved their identification and characterization. We performed an extensive profiling of the lncRNA transcriptome in the chicken liver and adipose tissue by RNA-Seq. We focused on these two tissues because of their importance in various economical traits for which energy storage and mobilization play key roles and also because of their high cell homogeneity. To predict lncRNAs, we used a recently developed tool called FEELnc, which also classifies them with respect to their distance and strand orientation to the closest protein-coding genes. Moreover, to confidently identify the genes/transcripts expressed in each tissue (a complex task for weakly expressed molecules such as lncRNAs), we probed a particularly large number of biological replicates (16 per tissue) compared to common multi-tissue studies with a larger set of tissues but less sampling. Results: We predicted 2193 lncRNA genes, among which 1670 were robustly expressed across replicates in the liver and/or adipose tissue and which were classified into 1493 intergenic and 177 intragenic lncRNAs located between and within protein-coding genes, respectively. We observed similar structural features between chickens and mammals, with strong synteny conservation but without sequence conservation. As previously reported, we confirm that lncRNAs have a lower and more tissue-specific expression than mRNAs. Finally, we showed that adjacent lncRNA-mRNA genes in divergent orientation have a higher co-expression level when separated by less than 1 kb compared to more distant divergent pairs. Among these, we highlighted for the first time a novel lncRNA candidate involved in lipid metabolism, lnc_DHCR24, which is highly correlated with the DHCR24 gene that encodes a key enzyme of cholesterol biosynthesis. Conclusions: We provide a comprehensive lncRNA repertoire in the chicken liver and adipose tissue, which shows interesting patterns of co-expression between mRNAs and lncRNAs. It contributes to improving the structural and functional annotation of the chicken genome and provides a basis for further studies on energy storage and mobilization traits in the chicken.

56 citations


Journal Article
TL;DR: It is shown that SRF drastically reduces the chimera content and computational time, enabling the analysis of a complete MiSeq run in just a few minutes, and accurately determines the actual community diversity: the differences in α‐ and β‐community diversity obtained with SRF and standard procedures are much smaller than the intrinsic variability of technical and biological replicates.
Abstract: Next-generation sequencing technologies give access to large sets of data, which are extremely useful in the study of microbial diversity based on the 16S rRNA gene. However, the production of such large data sets is not only marred by technical biases and sequencing noise but also increases computation time and disc space use. To improve the accuracy of OTU predictions and overcome computation, storage and noise issues, recent studies and tools have suggested removing all single reads and low-abundance OTUs, considering them as noise. Although the effect of applying an OTU abundance threshold on α- and β-diversity has been well documented, the consequences of removing single reads have been poorly studied. Here, we test the effect of singleton read filtering (SRF) on microbial community composition using in silico simulated data sets as well as sequencing data from synthetic and real communities displaying different levels of diversity and abundance profiles. Scalability to large data sets is also assessed using a complete MiSeq run. We show that SRF drastically reduces the chimera content and computational time, enabling the analysis of a complete MiSeq run in just a few minutes. Moreover, SRF accurately determines the actual community diversity: the differences in α- and β-community diversity obtained with SRF and standard procedures are much smaller than the intrinsic variability of technical and biological replicates.
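The idea of singleton read filtering can be illustrated with a minimal Python sketch. The abstract does not say exactly how a singleton is defined, so this sketch assumes that a singleton is a read whose exact sequence occurs only once in the pooled data set; the function name `filter_singleton_reads` and the toy reads are hypothetical and are not taken from the paper or from any published tool.

```python
from collections import Counter


def filter_singleton_reads(reads):
    """Return the reads whose exact sequence occurs at least twice.

    Assumption (not from the paper): a "singleton" is a read whose
    sequence is observed exactly once across the pooled data set.
    """
    counts = Counter(reads)  # abundance of each distinct sequence
    # Keep the original read order; drop sequences seen only once.
    return [read for read in reads if counts[read] > 1]


# Toy data set: "TTGA" is the only sequence observed a single time.
toy_reads = ["ACGT", "ACGT", "TTGA", "GGCC", "GGCC", "GGCC"]
kept_reads = filter_singleton_reads(toy_reads)
```

In this toy example the single occurrence of "TTGA" is discarded and the five remaining reads are kept. A real pipeline would read FASTQ files and typically dereplicate before clustering, which this sketch leaves out.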

53 citations


Journal Article
TL;DR: It is shown that most TGD duplicates gained their current status relatively rapidly after TGD, and novel cases of TGD ohnolog subfunctionalization and neofunctionalization are reported that further illustrate the importance of these processes.
Abstract: Whole-genome duplications (WGDs) are important evolutionary events. Our understanding of underlying mechanisms, including the evolution of duplicated genes after WGD, however, remains incomplete. Teleost fish experienced a common WGD (teleost-specific genome duplication, or TGD) followed by a dramatic adaptive radiation leading to more than half of all vertebrate species. The analysis of gene expression patterns following TGD at the genome level has been limited by the lack of suitable genomic resources. The recent concomitant release of the genome sequence of spotted gar (a representative of holosteans, the lineage most closely related to teleosts that lacks the TGD) and the tissue-specific gene expression repertoires of over 20 holostean and teleostean fish species, including spotted gar, zebrafish, and medaka (the PhyloFish project), offers a unique opportunity to study the evolution of gene expression following TGD in teleosts. We show that most TGD duplicates gained their current status (loss of one duplicate gene or retention of both duplicates) relatively rapidly after TGD (i.e., prior to the divergence of medaka and zebrafish lineages). The loss of one duplicate is the most common fate after TGD with a probability of approximately 80%. In addition, the fate of duplicate genes after TGD, including subfunctionalization, neofunctionalization, or retention of two "similar" copies, occurred not only before but also after the divergence of species tested, consistent with a role of the TGD in speciation and/or evolution of gene function. Finally, we report novel cases of TGD ohnolog subfunctionalization and neofunctionalization that further illustrate the importance of these processes.

42 citations


Journal Article
TL;DR: The first transcriptomic approach aimed at identifying genetic variants of the genes expressed in the lactating mammary gland of sheep is presented, which could be included as suitable markers in genotyping platforms or custom SNP arrays to perform association analyses in commercial populations and apply genomic selection protocols in the dairy production industry.
Abstract: The identification of genetic variation underlying desired phenotypes is one of the main challenges of current livestock genetic research. High-throughput transcriptome sequencing (RNA-Seq) offers new opportunities for the detection of transcriptome variants (SNPs and short indels) in different tissues and species. In this study, we used RNA-Seq on Milk Sheep Somatic Cells (MSCs) with the goal of characterizing the genetic variation within the coding regions of the milk transcriptome in Churra and Assaf sheep, two common dairy sheep breeds farmed in Spain. A total of 216,637 variants were detected in the MSCs transcriptome of the eight ewes analyzed. Among them, a total of 57,795 variants were detected in the regions harboring Quantitative Trait Loci (QTL) for milk yield, protein percentage and fat percentage, of which 21.44% were novel variants. Among the total variants detected, 561 (2.52%) and 1,649 (7.42%) were predicted to produce high or moderate impact changes in the corresponding transcriptional unit, respectively. In the functional enrichment analysis of the genes positioned within selected QTL regions harboring novel relevant functional variants (high and moderate impact), the KEGG pathway with the highest enrichment was “protein processing in endoplasmic reticulum”. Additionally, a total of 504 and 1,063 variants were identified in the genes encoding principal milk proteins and molecules involved in the lipid metabolism, respectively. Of these variants, 20 mutations were found to have putative relevant effects on the encoded proteins. We present herein the first transcriptomic approach aimed at identifying genetic variants of the genes expressed in the lactating mammary gland of sheep. Through the transcriptome analysis of variability within regions harboring QTL for milk yield, protein percentage and fat percentage, we have found several pathways and genes that harbor mutations that could affect dairy production traits. 
Moreover, remarkable variants were also found in candidate genes coding for major milk proteins and proteins related to milk fat metabolism. Several of the SNPs found in this study could be included as suitable markers in genotyping platforms or custom SNP arrays to perform association analyses in commercial populations and apply genomic selection protocols in the dairy production industry.

37 citations


Journal Article
TL;DR: The ileal mucosa-associated microbiota encompasses the enzymatic potential for PCW polysaccharide degradation in the small intestine.
Abstract: The digestion of dietary fibers is a major function of the human intestinal microbiota. So far this function has been attributed to the microorganisms inhabiting the colon, and many studies have focused on this distal part of the gastrointestinal tract using easily accessible fecal material. However, microbial fermentations, supported by the presence of short-chain fatty acids, are suspected to occur in the upper small intestine, particularly in the ileum. Using a fosmid library from the human ileal mucosa, we screened 20,000 clones for their activities against carboxymethylcellulose and xylans, chosen as models of the major plant cell wall (PCW) polysaccharides from dietary fibers. Eleven positive clones revealed a broad range of CAZyme-encoding genes from Bacteroides and Clostridiales species, as well as Polysaccharide Utilization Loci (PULs). The functional glycoside hydrolase genes were identified, and oligosaccharide breakdown products were examined from different polysaccharides, including mixed-linkage β-glucans. CAZymes and PULs were also examined for their prevalence in the human gut microbiome. Several clusters of genes of low prevalence in the fecal microbiome suggested that they belong to unidentified strains rather specifically established upstream of the colon, in the ileum. Thus, the ileal mucosa-associated microbiota encompasses the enzymatic potential for PCW polysaccharide degradation in the small intestine.

36 citations


Journal Article
TL;DR: Grafted grapevines showed common and rootstock-genotype-specific root transcriptome responses under different nitrogen regimes, and functional categories and potential hub genes involved in genotype-dependent responses were identified.
Abstract: In many fruit species, including grapevine, grafting is used to improve scion productivity and quality and to adapt the plant to environmental conditions. However, the mechanisms underlying the rootstock control of scion development are still poorly understood. The ability of rootstocks to regulate nitrogen uptake and assimilation may contribute to this control. A split-root system was used to grow heterografted grapevines and to investigate the molecular responses to changes in nitrate availability of two rootstocks known to affect scion growth differently. Transcriptome profiling by RNA sequencing was performed on root samples collected 3 and 24 h after nitrogen supply. The results demonstrated a common response involving nitrogen-related genes, as well as a more pronounced transcriptomic reprogramming in the genotype conferring the lower scion growth. A weighted gene co-expression network analysis allowed the identification of co-regulated gene modules, suggesting a role for nitrate transporter 2 family genes and some transcription factors as main actors controlling this genotype-dependent response to heterogeneous nitrogen supply. The relationship between nitrate, ethylene, and strigolactone hormonal pathways was found to differ between the two genotypes. These findings indicated that the genotypes responded differently to heterogeneous nitrogen availability, and this may contribute to their contrasting effect on scion growth.

36 citations


Journal Article
TL;DR: Results show that G. boninense possesses a high degree of genetic diversity and no detectable genetic structure at the scale of Sumatra and peninsular Malaysia, and approximate Bayesian computation (ABC) modelling indicates that the fungus has undergone a demographic expansion in the past, probably before the oil palm was introduced into Southeast Asia.

Journal Article
TL;DR: This study reports the identification and characterization of CNV in eight French beef and dairy breeds using whole-genome sequence data from 200 animals and validated a subset of the CNV by both in silico and experimental approaches.
Abstract: Background: Copy number variations (CNV) are known to play a major role in genetic variability and disease pathogenesis in several species including cattle. In this study, we report the identification and characterization of CNV in eight French beef and dairy breeds using whole-genome sequence data from 200 animals. Bioinformatics analyses to search for CNV were carried out using four different but complementary tools and we validated a subset of the CNV by both in silico and experimental approaches. Results: We report the identification and localization of 4178 putative deletion-only, duplication-only and CNV regions, which cover 6% of the bovine autosomal genome; they were validated by two in silico approaches and/or experimentally validated using array-based comparative genomic hybridization and single nucleotide polymorphism genotyping arrays. The size of these variants ranged from 334 bp to 7.7 Mb, with an average size of ~54 kb. Of these 4178 variants, 3940 were deletions, 67 were duplications and 171 corresponded to both deletions and duplications, which were defined as potential CNV regions. Gene content analysis revealed that, among these variants, 1100 deletions and duplications encompassed 1803 known genes, which affect a wide spectrum of molecular functions, and 1095 overlapped with known QTL regions. Conclusions: Our study is a large-scale survey of CNV in eight French dairy and beef breeds. These CNV will be useful to study the link between genetic variability and economically important traits, and to improve our knowledge on the genomic architecture of cattle.

Journal Article
14 Dec 2017-PLOS ONE
TL;DR: The potential of highly diverse microbiota such as the ruminal one for the discovery of promiscuous enzymes, whose versatility could be exploited for industrial uses is highlighted.
Abstract: Bioremediation of pollutants is a major concern worldwide, leading to the search for new processes to break down and recycle xenobiotics and environment-contaminating polymers. Among them, carbamates have a very broad spectrum of uses, such as toxinogenic pesticides or elastomers. In this study, we mined the bovine rumen microbiome for carbamate-degrading enzymes. We isolated 26 hit clones that exhibited esterase activity and were able to degrade at least one of the targeted polyurethane and pesticide carbamate compounds. The most active clone was characterized in depth. In addition to Impranil, this clone was active on Tween 20, pNP-acetate, butyrate and palmitate, and on the insecticide fenobucarb. Sequencing and sub-cloning of the best target revealed a novel carboxyl-ester hydrolase belonging to the lipolytic family IV, named CE_Ubrb. This study highlights the potential of highly diverse microbiota such as the ruminal one for the discovery of promiscuous enzymes, whose versatility could be exploited for industrial uses.

Journal Article
TL;DR: In young, middle-aged and old animals, transcription levels were mainly explained by Cu, Zn and age, respectively, which suggests differences in the molecular responses of this species to metals during its lifetime that must be better assessed in future ecotoxicology studies.
Abstract: The freshwater pearl mussel Margaritifera margaritifera is one of the most threatened freshwater bivalves worldwide. In this study, we aimed (i) to study the processes by which water quality might affect freshwater mussels in situ and (ii) to provide insights into the ecotoxicological significance of water pollution to natural populations in order to provide necessary information to enhance conservation strategies. M. margaritifera specimens were sampled in two close sites located upstream or downstream from an illegal dumping site. The renal transcriptome of these animals was assembled and gene transcription determined by RNA-seq. Correlations between transcription levels of each single transcript and the bioaccumulation of nine trace metals, age (estimated by sclerochronology), and condition index were determined in order to identify genes likely to respond to a specific factor. Amongst the studied metals, Cr, Zn, Cd, and Ni were the main factors correlated with transcription levels, with effects on translation, apoptosis, immune response, response to stimulus, and transport pathways. However, the main factor explaining changes in gene transcription appeared to be the age of individuals with a negative correlation with the transcription of retrotransposon-related genes. To investigate this effect further, mussels were classified into three age classes. In young, middle-aged and old animals, transcription levels were mainly explained by Cu, Zn and age, respectively. This suggests differences in the molecular responses of this species to metals during its lifetime that must be better assessed in future ecotoxicology studies.

Journal Article
TL;DR: It is shown that in the model strain ORS285, nifV is required for free-living and symbiotic dinitrogen fixation with NF-independent Aeschynomene species, and these data indicate that efficient symbiotic nitrogen fixation in many of the tested Aeschynomene species requires rhizobial homocitrate synthesis.
Abstract: In the most studied rhizobium-legume interactions, the host plant supplies the symbiont with homocitrate, an essential co-factor of the nitrogenase enzyme complex, via the expression of a nodule-specific homocitrate synthase FEN1. Photosynthetic bradyrhizobia interacting with Nod factor (NF) dependent and NF-independent Aeschynomene legumes are able to synthesize homocitrate themselves as they contain a nifV gene encoding a homocitrate synthase. Here, we show that in the model strain ORS285, nifV is required for free-living and symbiotic dinitrogen fixation with NF-independent Aeschynomene species. In contrast, in symbiosis with NF-dependent Aeschynomene species, the nifV requirement for efficient nitrogen fixation was found to be host plant dependent. Interestingly, orthologs of FEN1 were found in both NF-dependent and NF-independent Aeschynomene species. However, a high nodule specific induction of FEN1 expression was only observed in A. afraspera, a host plant in which nifV is not required for symbiotic dinitrogen fixation. These data indicate that efficient symbiotic nitrogen fixation in many of the tested Aeschynomene species requires rhizobial homocitrate synthesis. Considering that more than 10% of the fully sequenced rhizobium strains do contain a nifV gene, the Aeschynomene/photosynthetic Bradyrhizobium interaction is likely not the only rhizobium/legume symbiosis where rhizobial nifV expression is required.

Journal Article
TL;DR: It is concluded that lack of Rio1p allows premature entry of pre-40S particles in the translation process and that the presence of Nob1p and of the 18S rRNA 3′ extension in the 20S pre-rRNA is not incompatible with translation elongation.
Abstract: Cytoplasmic maturation of precursors to the small ribosomal subunit in yeast requires the intervention of a dozen assembly factors (AFs), the precise roles of which remain elusive. One of these is Rio1p that seems to intervene at a late step of pre-40S particle maturation. We have investigated the role played by Rio1p in the dynamic association and dissociation of AFs with and from pre-40S particles. Our results indicate that Rio1p depletion leads to the stalling of at least 4 AFs (Nob1p, Tsr1p, Pno1p/Dim2p and Fap7p) in 80S-like particles. We conclude that Rio1p is important for the timely release of these factors from 80S-like particles. In addition, we present immunoprecipitation and electron microscopy evidence suggesting that when Rio1p is depleted, a subset of Nob1p-containing pre-40S particles associate with translating polysomes. Using Nob1p as bait, we purified pre-40S particles from cells lacking Rio1p and performed ribosome profiling experiments which suggest that immature 40S subunits can carry out translation elongation. We conclude that lack of Rio1p allows premature entry of pre-40S particles in the translation process and that the presence of Nob1p and of the 18S rRNA 3' extension in the 20S pre-rRNA is not incompatible with translation elongation.

Journal Article
TL;DR: The genome sequence of the Clavispora lusitaniae type strain CBS 6936 is reported; comparison with sequences of strain ATCC 42720 indicates conservation of chromosomal structure but significant nucleotide divergence.
Abstract: Clavispora lusitaniae, an environmental saprophytic yeast belonging to the CTG clade of Candida, can behave occasionally as an opportunistic pathogen in humans. We report here the genome sequence of the type strain CBS 6936. Comparison with sequences of strain ATCC 42720 indicates conservation of chromosomal structure but significant nucleotide divergence.

Journal Article
TL;DR: This work reports a large-scale analysis of differentially transcribed genes of R. balthica exposed to oxazepam during egg development until hatching, which enriched the de novo database of potential ecotoxicological models.
Abstract: Pharmaceuticals are increasingly found in aquatic ecosystems because of the inefficiency of wastewater treatment plants. Therefore, aquatic organisms are frequently exposed to a broad diversity of pharmaceuticals. The freshwater snail Radix balthica was chosen as a model to study the effects of oxazepam (a psychotropic drug) on developmental stages ranging from trochophore to hatching. To provide a global insight into these effects, deep transcriptome sequencing was performed on exposed embryos. Eighteen libraries were sequenced, six for each of three conditions: control, exposure to the lowest oxazepam concentration with a phenotypic effect (delayed hatching) (TA), and exposure to the oxazepam concentration found in freshwater (TB). A total of 39,759,772 filtered raw reads were assembled into 56,435 contigs with a mean length of 1579.68 bp and a mean depth of 378.96 reads. 44.91% of the contigs have at least one annotation. The differential expression analysis between the control condition and the two exposure conditions revealed 146 differentially expressed contigs, of which 144 were for TA and two for TB. 34.0% were annotated with a biological function. Four processes were mainly impacted: two cellular signalling systems (Notch and JNK) and two biosynthesis pathways (polyamine and catecholamine pathways). This work reports a large-scale analysis of differentially transcribed genes of R. balthica exposed to oxazepam during egg development until hatching. In addition, these results enriched the de novo database of potential ecotoxicological models.

Journal Article
TL;DR: The genome sequence, annotation, and features of the commensal E. coli BG1, isolated from the gastro-intestinal tract of cattle, can be used as a reference (control strain) for subsequent evolution and comparative studies.
Abstract: Escherichia coli is the most abundant facultative anaerobic bacterium in the gastro-intestinal tract of mammals but can be responsible for intestinal infection due to acquisition of virulence factors. Genomes of pathogenic E. coli strains are widely described whereas those of bovine commensal E. coli strains are very scarce. Here, we report the genome sequence, annotation, and features of the commensal E. coli BG1 isolated from the gastro-intestinal tract of cattle. Whole genome sequencing analysis showed that BG1 has a chromosome of 4,782,107 bp coding for 4465 proteins and 97 RNAs. E. coli BG1 belonged to the serotype O159:H21, was classified in the phylogroup B1 and possessed the genetic information encoding “virulence factors” such as adherence systems, iron acquisition and flagella synthesis. A total of 12 adherence systems were detected, reflecting the potential ability of BG1 to colonize different segments of the bovine gastro-intestinal tract. E. coli BG1 is unable to assimilate ethanolamine, which confers a nutritional advantage to some pathogenic E. coli in the bovine gastro-intestinal tract. Genome analysis revealed the presence of i) 34 amino acid changes due to non-synonymous SNPs among the genes encoding ethanolamine transport and assimilation, and ii) an additional predicted alpha helix inserted in cobalamin adenosyltransferase, a key enzyme required for ethanolamine assimilation. These modifications could explain the incapacity of BG1 to use ethanolamine. The BG1 genome can now be used as a reference (control strain) for subsequent evolution and comparative studies.

Posted ContentDOI
19 Jun 2017-bioRxiv
TL;DR: It is shown that most TGD duplicates gained their current status relatively rapidly after TGD, and novel cases of TGD ohnolog subfunctionalization and neofunctionalization are reported that further illustrate the importance of these processes.
Abstract: Whole genome duplications (WGD) are important evolutionary events. Our understanding of the underlying mechanisms, including the evolution of duplicated genes after WGD, remains incomplete, however. Teleost fish experienced a common WGD (teleost-specific genome duplication, or TGD) followed by a dramatic adaptive radiation leading to more than half of all vertebrate species. The analysis of gene expression patterns following TGD at the genome level has been limited by the lack of suitable genomic resources. The recent concomitant release of the genome sequence of spotted gar (a representative of holosteans, the closest lineage of teleosts that lacks the TGD) and the tissue-specific gene expression repertoires of over 20 holostean and teleostean fish species, including spotted gar, zebrafish and medaka (the PhyloFish project), offered a unique opportunity to study the evolution of gene expression following TGD in teleosts. We show that most TGD duplicates gained their current status (loss of one duplicate gene or retention of both duplicates) relatively rapidly after TGD (i.e. prior to the divergence of the medaka and zebrafish lineages). The loss of one duplicate is the most common fate after TGD, with a probability of approximately 80%. In addition, the fate of duplicate genes after TGD, including subfunctionalization, neofunctionalization, or retention of two similar copies, was determined not only before but also after the radiation of the species tested, consistent with a role of the TGD in speciation and/or evolution of gene function. Finally, we report novel cases of TGD ohnolog subfunctionalization and neofunctionalization that further illustrate the importance of these processes.

Journal ArticleDOI
TL;DR: This study conducted an investigation to first identify bidirectional promoters sharing genes expressed in bovine Longissimus thoracis and then to find genetic variants affecting the activity of some of these bidirectional promoters.
Abstract: Bidirectional promoters are regulatory regions co-regulating the expression of two neighbouring genes organized in a head-to-head orientation. In recent years, these regulatory regions have been studied in many organisms; however, no investigation to date has analysed the genetic variation of the activity of this type of promoter region. In our study, we conducted an investigation to first identify bidirectional promoters sharing genes expressed in bovine Longissimus thoracis and then to find genetic variants affecting the activity of some of these bidirectional promoters. Combining bovine gene information and expression data obtained using RNA-Seq, we identified 120 putative bidirectional promoters active in bovine muscle. We experimentally validated 16 of these bidirectional promoters in vitro. Finally, using gene expression and whole-genome genotyping data, we explored the variability of the activity in muscle of the identified bidirectional promoters and discovered genetic variants affecting their activity. We found that the expression level of 77 genes is correlated with the activity of 12 bidirectional promoters. We also identified 57 single nucleotide polymorphisms associated with the activity of 5 bidirectional promoters. To our knowledge, our study is the first analysis in any species of the genetic variability of the activity of bidirectional promoters.

Journal ArticleDOI
TL;DR: The draft genome sequence of EHEC O157:H7 strain MC2 isolated from cattle in France contains 5,400,376 bp encoding 5,914 predicted genes (5,805 protein-encoding genes and 109 RNA genes).
Abstract: Enterohemorrhagic Escherichia coli (EHEC) with serotype O157:H7 is a major foodborne pathogen. Here, we report the draft genome sequence of EHEC O157:H7 strain MC2 isolated from cattle in France. The assembly contains 5,400,376 bp encoding 5,914 predicted genes (5,805 protein-encoding genes and 109 RNA genes).

Journal ArticleDOI
TL;DR: The authors' de novo tench gut transcriptome assembly (online NGS Pipeline) provides a guide for further studies, because understanding feed-gut interactions and intestinal homeostasis in farmed tench is important to maximise the performance of this species and to ensure that freshwater aquaculture continues to be a sustainable source of food for a growing world population.

Journal ArticleDOI
TL;DR: This work has identified the hgcA and hgcB genes involved in mercury methylation, but not those responsible for mercury demethylation.
Abstract: Desulfovibrio BerOc1 is a sulfate-reducing bacterium isolated from the Berre lagoon (French Mediterranean coast). BerOc1 is able to methylate and demethylate mercury. The genome is 4,081,579 bp in size and is assembled into five contigs. We identified the hgcA and hgcB genes involved in mercury methylation, but not those responsible for mercury demethylation.