Author

Chen Xie

Other affiliations: Peking University
Bio: Chen Xie is an academic researcher from Max Planck Society. The author has contributed to research in topics: Gene & Transcriptome. The author has an h-index of 7, co-authored 13 publications receiving 2544 citations. Previous affiliations of Chen Xie include Peking University.

Papers
Journal Article
TL;DR: A web server, KOBAS 2.0, is reported, which annotates an input set of genes with putative pathways and disease relationships based on mapping to genes with known annotations, which allows for both ID mapping and cross-species sequence similarity mapping.
Abstract: High-throughput experimental technologies often identify dozens to hundreds of genes related to, or changed in, a biological or pathological process. From these genes one wants to identify biological pathways that may be involved and diseases that may be implicated. Here, we report a web server, KOBAS 2.0, which annotates an input set of genes with putative pathways and disease relationships based on mapping to genes with known annotations. It allows for both ID mapping and cross-species sequence similarity mapping. It then performs statistical tests to identify statistically significantly enriched pathways and diseases. KOBAS 2.0 incorporates knowledge across 1327 species from 5 pathway databases (KEGG PATHWAY, PID, BioCyc, Reactome and Panther) and 5 human disease databases (OMIM, KEGG DISEASE, FunDO, GAD and NHGRI GWAS Catalog). KOBAS 2.0 can be accessed at http://kobas.cbi.pku.edu.cn.

3,293 citations
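
The abstract above says that KOBAS 2.0 "performs statistical tests to identify statistically significantly enriched pathways" but does not name the tests. As a minimal sketch of what such a test can look like, the Python below assumes a one-sided hypergeometric test, a common choice for gene-set enrichment; it is not taken from KOBAS. The function names (`hypergeom_upper_tail`, `enrich`), the gene IDs and the pathway names are all hypothetical, chosen only for illustration.

```python
from math import comb


def hypergeom_upper_tail(k, population, annotated, drawn):
    """P(X >= k) when `drawn` genes are sampled without replacement from
    `population` genes, of which `annotated` belong to the pathway."""
    total = comb(population, drawn)
    upper = min(annotated, drawn)
    hits = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(k, upper + 1)
    )
    return hits / total


def enrich(input_genes, background, pathways):
    """Return (pathway, overlap, p_value) tuples sorted by p-value.

    Assumption: genes outside the background are ignored, and no
    multiple-testing correction is applied in this sketch.
    """
    background = set(background)
    input_genes = set(input_genes) & background
    results = []
    for name, members in pathways.items():
        members = set(members) & background
        overlap = len(input_genes & members)
        p = hypergeom_upper_tail(
            overlap, len(background), len(members), len(input_genes)
        )
        results.append((name, overlap, p))
    return sorted(results, key=lambda r: r[2])


# Toy data (hypothetical): 10 background genes and two made-up pathways.
background = {f"g{i}" for i in range(10)}
pathways = {
    "toy_pathway_A": {"g0", "g1", "g2", "g3", "g4"},
    "toy_pathway_B": {"g5", "g6"},
}
hits = enrich({"g0", "g1", "g2", "g3", "g4"}, background, pathways)
for name, overlap, p in hits:
    print(name, overlap, p)
```

In practice the p-values would be adjusted for the number of pathways tested (for example with a false discovery rate procedure) before anything is called enriched.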

Journal Article
TL;DR: It is suggested that at least a portion of long non-coding RNAs, especially those with active and regulated transcription, may serve as a birth pool for protein-coding genes, which are then further optimized at the transcriptional level.
Abstract: Tinkering with pre-existing genes has long been known as a major way to create new genes. Recently, however, motherless protein-coding genes have been found to have emerged de novo from ancestral non-coding DNAs. How these genes originated is not well addressed to date. Here we identified 24 hominoid-specific de novo protein-coding genes with precise origination timing in vertebrate phylogeny. Strand-specific RNA–Seq analyses were performed in five rhesus macaque tissues (liver, prefrontal cortex, skeletal muscle, adipose, and testis), which were then integrated with public transcriptome data from human, chimpanzee, and rhesus macaque. On the basis of comparing the RNA expression profiles in the three species, we found that most of the hominoid-specific de novo protein-coding genes encoded polyadenylated non-coding RNAs in rhesus macaque or chimpanzee with a similar transcript structure and correlated tissue expression profile. According to the rule of parsimony, the majority of these hominoid-specific de novo protein-coding genes appear to have acquired a regulated transcript structure and expression profile before acquiring coding potential. Interestingly, although the expression profile was largely correlated, the coding genes in human often showed higher transcriptional abundance than their non-coding counterparts in rhesus macaque. The major findings we report in this manuscript are robust and insensitive to the parameters used in the identification and analysis of de novo genes. Our results suggest that at least a portion of long non-coding RNAs, especially those with active and regulated transcription, may serve as a birth pool for protein-coding genes, which are then further optimized at the transcriptional level.

136 citations

Journal Article
TL;DR: A combined proteomics, bioinformatics and qPCR analysis revealed that Al³⁺ invasion caused complex proteomic changes in rice roots involving energy, stress and defense, protein turnover, metabolism, signal transduction, transport and intracellular traffic, cell structure, cell growth/division, and transcription.

111 citations

Journal Article
22 Aug 2019-eLife
TL;DR: A functional analysis of Gm13030, a house mouse-specific protein-coding gene that is specifically expressed in females in the oviduct, supports the hypothesis that a de novo evolved gene can directly adopt a function without much sequence adaptation.
Abstract: The de novo emergence of new genes has been well documented through genomic analyses. However, a functional analysis, especially of very young protein-coding genes, is still largely lacking. Here, we identify a set of house mouse-specific protein-coding genes and assess their translation by ribosome profiling and mass spectrometry data. We functionally analyze one of them, Gm13030, which is specifically expressed in females in the oviduct. The interruption of the reading frame affects the transcriptional network in the oviducts at a specific stage of the estrous cycle. This includes the upregulation of Dcpp genes, which are known to stimulate the growth of preimplantation embryos. As a consequence, knockout females have their second litters after shorter times and have a higher infanticide rate. Given that Gm13030 shows no signs of positive selection, our findings support the hypothesis that a de novo evolved gene can directly adopt a function without much sequence adaptation.

34 citations

Journal Article
Yue Huang1, Chen Xie1, Adam Yongxin Ye1, Chuan-Yun Li1, Ge Gao1, Liping Wei1 
09 Apr 2013-PLOS ONE
TL;DR: It is shown that genes that are highly expressed in the central nervous system are enriched in recent positive selection events in human history identified by intra-species genomic scan, especially in brain regions related to cognitive functions.
Abstract: BACKGROUND AND OBJECTIVES: Analysis of positively-selected genes can help us understand how humans evolved, especially the evolution of highly developed cognitive functions. However, previous studies have reached conflicting conclusions regarding whether human neuronal genes are over-represented among genes under positive selection. METHODS AND RESULTS: We divided positively-selected genes into four groups according to the identification approaches, compiling a comprehensive list from 27 previous studies. We showed that genes that are highly expressed in the central nervous system are enriched in recent positive selection events in human history identified by intra-species genomic scan, especially in brain regions related to cognitive functions. This pattern holds when different datasets, parameters and analysis pipelines are used. Functional category enrichment analysis supported these findings, showing that synapse-related functions are enriched in genes under recent positive selection. In contrast, immune-related functions, for instance, are enriched in genes under ancient positive selection revealed by inter-species coding region comparison. We further demonstrated that most of these patterns still hold even after controlling for genomic characteristics that might bias genome-wide identification of positively-selected genes, including gene length, gene density, GC composition, and intensity of negative selection. CONCLUSION: Our rigorous analysis resolved previous conflicting conclusions and revealed recent adaptation of human brain functions.

16 citations


Cited by
Journal Article
TL;DR: g:Profiler is now capable of analysing data from any organism, including vertebrates, plants, fungi, insects and parasites, and the 2019 update introduces an extensive technical rewrite making the services faster and more flexible.
Abstract: Biological data analysis often deals with lists of genes arising from various studies. The g:Profiler toolset is widely used for finding biological categories enriched in gene lists, conversions between gene identifiers and mappings to their orthologs. The mission of g:Profiler is to provide a reliable service based on up-to-date high quality data in a convenient manner across many evidence types, identifier spaces and organisms. g:Profiler relies on Ensembl as a primary data source and follows their quarterly release cycle while updating the other data sources simultaneously. The current update provides a better user experience due to a modern responsive web interface, standardised API and libraries. The results are delivered through an interactive and configurable web design. Results can be downloaded as publication ready visualisations or delimited text files. In the current update we have extended the support to 467 species and strains, including vertebrates, plants, fungi, insects and parasites. By supporting user uploaded custom GMT files, g:Profiler is now capable of analysing data from any organism. All past releases are maintained for reproducibility and transparency. The 2019 update introduces an extensive technical rewrite making the services faster and more flexible. g:Profiler is freely available at https://biit.cs.ut.ee/gprofiler.

2,959 citations

Journal Article
03 Jul 2013-Cell
TL;DR: This Review outlines the emerging understanding of lincRNAs in vertebrate animals, with emphases on how they are being identified and current conclusions and questions regarding their genomics, evolution and mechanisms of action.

2,213 citations

01 Jan 2011
TL;DR: The sheer volume and scope of diverse genome-wide data pose a significant challenge to the development of efficient and intuitive visualization tools able to scale to very large data sets and to flexibly integrate multiple data types, including clinical data.
Abstract: Rapid improvements in sequencing and array-based platforms are resulting in a flood of diverse genome-wide data, including data from exome and whole-genome sequencing, epigenetic surveys, expression profiling of coding and noncoding RNAs, single nucleotide polymorphism (SNP) and copy number profiling, and functional assays. Analysis of these large, diverse data sets holds the promise of a more comprehensive understanding of the genome and its relation to human disease. Experienced and knowledgeable human review is an essential component of this process, complementing computational approaches. This calls for efficient and intuitive visualization tools able to scale to very large data sets and to flexibly integrate multiple data types, including clinical data. However, the sheer volume and scope of data pose a significant challenge to the development of such tools.

2,187 citations

Journal Article
TL;DR: This study represents the first characterization of the proteome of F. Ginseng during development and provides new insights into the metabolism and accumulation of ginsenosides.
Abstract: F. Ginseng (Panax ginseng) is planted in the forest to enhance the natural ginseng resources, which have an immense medicinal and economic value. The morphology of the cultivated plants becomes similar to that of wild growing ginseng (W. Ginseng) over the years. So far, there have been no studies highlighting the physiological or functional changes in F. Ginseng and its wild counterparts. In the present study, we used proteomic technologies (2DE and iTRAQ) coupled to mass spectrometry to compare W. Ginseng and F. Ginseng at various growth stages. Hierarchical cluster analysis based on protein abundance revealed that the protein expression profile of 25-year-old F. Ginseng was more like that of W. Ginseng than that of F. Ginseng less than 20 years old. We identified 192 differentially expressed protein spots in F. Ginseng. These protein spots increased with the number of growth years of F. Ginseng and were associated with proteins involved in energy metabolism, ginsenoside biosynthesis, and stress response. The mRNA, physiological, and metabolic analyses showed that the external morphology, protein expression profile, and ginsenoside synthesis ability of F. Ginseng came to resemble those of W. Ginseng with increasing age. Our study represents the first characterization of the proteome of F. Ginseng during development and provides new insights into the metabolism and accumulation of ginsenosides.

1,505 citations

Journal Article
03 Apr 2019-Nature
TL;DR: Transcriptional adaptation, a genetic compensation process by which organisms respond to mutations by upregulating related genes, is triggered by mRNA decay and involves a sequence-dependent mechanism.
Abstract: Genetic robustness, or the ability of an organism to maintain fitness in the presence of harmful mutations, can be achieved via protein feedback loops. Previous work has suggested that organisms may also respond to mutations by transcriptional adaptation, a process by which related gene(s) are upregulated independently of protein feedback loops. However, the prevalence of transcriptional adaptation and its underlying molecular mechanisms are unknown. Here, by analysing several models of transcriptional adaptation in zebrafish and mouse, we uncover a requirement for mutant mRNA degradation. Alleles that fail to transcribe the mutated gene do not exhibit transcriptional adaptation, and these alleles give rise to more severe phenotypes than alleles displaying mutant mRNA decay. Transcriptome analysis in alleles displaying mutant mRNA decay reveals the upregulation of a substantial proportion of the genes that exhibit sequence similarity with the mutated gene's mRNA, suggesting a sequence-dependent mechanism. These findings have implications for our understanding of disease-causing mutations, and will help in the design of mutant alleles with minimal transcriptional adaptation-derived compensation.

679 citations