Journal Article

Large scale comparison of global gene expression patterns in human and mouse

23 Dec 2010-Genome Biology (BioMed Central)-Vol. 11, Iss: 12, pp 1-11
TL;DR: The results indicate that the global patterns of tissue-specific expression of orthologous genes are conserved in human and mouse.
Abstract: It is widely accepted that orthologous genes between species are conserved at the sequence level and perform similar functions in different organisms. However, the level of conservation of gene expression patterns of the orthologous genes in different species has been unclear. To address the issue, we compared gene expression of orthologous genes based on 2,557 human and 1,267 mouse samples with high quality gene expression data, selected from experiments stored in the public microarray repository ArrayExpress. In a principal component analysis (PCA) of combined data from human and mouse samples merged on orthologous probesets, samples largely form distinctive clusters based on their tissue sources when projected onto the top principal components. The most prominent groups are the nervous system, muscle/heart tissues, liver and cell lines. Despite the great differences in sample characteristics and experimental conditions, the overall patterns of these prominent clusters are strikingly similar for human and mouse. We further analyzed data for each tissue separately and found that the most variable genes in each tissue are highly enriched with human-mouse tissue-specific orthologs and the least variable genes in each tissue are enriched with human-mouse housekeeping orthologs. The results indicate that the global patterns of tissue-specific expression of orthologous genes are conserved in human and mouse. The expression of groups of orthologous genes co-varies in the two species, both for the most variable genes and the most ubiquitously expressed genes.
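
The PCA step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the six "orthologous genes", the tissue profiles, the species offset, and the noise level are all invented toy values, and centering each gene across the merged samples is an assumed normalization. The sketch only shows how samples from two species, merged on a shared gene axis, can separate by tissue on the first principal component.

```python
import numpy as np

def pca_scores(matrix, n_components=2):
    """Project samples (rows) onto the top principal components."""
    centered = matrix - matrix.mean(axis=0)  # center each gene across samples
    # Rows of vt are the principal axes of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Toy data (assumption): 6 orthologous genes, 2 tissues, 2 species.
rng = np.random.default_rng(0)
tissue_profile = {
    "brain": np.array([3.0, 3.0, 3.0, 0.0, 0.0, 0.0]),
    "liver": np.array([0.0, 0.0, 0.0, 3.0, 3.0, 3.0]),
}
species_offset = {"human": 0.0, "mouse": 0.5}  # hypothetical platform/species shift

labels, rows = [], []
for species in ("human", "mouse"):
    for tissue in ("brain", "liver"):
        for _ in range(2):  # two replicate samples per group
            noise = rng.normal(0.0, 0.1, size=6)
            rows.append(tissue_profile[tissue] + species_offset[species] + noise)
            labels.append((species, tissue))

merged = np.vstack(rows)      # samples x genes, merged on the shared gene axis
scores = pca_scores(merged)   # samples x 2

for (species, tissue), (pc1, pc2) in zip(labels, scores):
    print(f"{species:5s} {tissue:5s} PC1={pc1:+.2f} PC2={pc2:+.2f}")
```

In this toy setup the tissue signal is much larger than the species offset, so brain and liver samples fall on opposite sides of PC1 regardless of species. The sign of a principal component is arbitrary, so only the separation is meaningful, not which tissue is positive.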

Citations
Journal Article
TL;DR: This study demonstrated that exploration of genes showing ASE and ASS in hybrids of closely related species is feasible for species evolution research.
Abstract: Divergence of gene expression and alternative splicing is a crucial driving force in the evolution of species; to date, however, the molecular mechanism remains unclear. Hybrids of closely related species provide a suitable model to analyze allele-specific expression (ASE) and allele-specific alternative splicing (ASS). Analysis of ASE and ASS can uncover the differences in cis-regulatory elements between closely related species, while eliminating interference of trans-regulatory elements. Here, we provide a detailed characterization of ASE and ASS from 19 and 10 transcriptome datasets across five tissues from reciprocal-cross hybrids of horse×donkey (mule/hinny) and cattle×yak (dzo), respectively. Results showed that 4.8%–8.7% and 10.8%–16.7% of genes exhibited ASE and ASS, respectively. Notably, lncRNAs and pseudogenes were more likely to show ASE than protein-coding genes. In addition, genes showing ASE and ASS in mule/hinny were found to be involved in the regulation of muscle strength, whereas those of dzo were involved in high-altitude adaptation. In conclusion, our study demonstrated that exploration of genes showing ASE and ASS in hybrids of closely related species is feasible for species evolution research.

17 citations


Cites background from "Large scale comparison of global ge..."

  • ...This is consistent with the view that gene expression evolution is conserved and strongly shaped by purifying selection (Jordan, 2004; Liao & Zhang, 2006; Zheng-Bradley et al., 2010)....

Journal Article
TL;DR: Differences between the acetylation level and the transcriptional level of cell-type-specific markers suggest additional mechanism(s) between acetylome and transcriptome, and provide promising TF-encoding genes that could serve as master regulators of myocardial remodeling.
Abstract: H3K27ac histone acetylome changes contribute to the phenotypic response in heart diseases, particularly in end-stage heart failure. However, such epigenetic alterations have not been systematically investigated in remodeled non-failing human hearts. Therefore, valuable insight into cardiac dysfunction in early remodeling is lacking. This study aimed to reveal the acetylation changes of chromatin regions in response to myocardial remodeling and their correlations to transcriptional changes of neighboring genes. We detected chromatin regions with differential acetylation activity (DARs; Padj. < 0.05) between remodeled non-failing patient hearts and healthy donor hearts. The acetylation level of the chromatin region correlated with its RNA polymerase II occupancy level and the mRNA expression level of its adjacent gene per sample. Annotated genes from DARs were enriched in disease-related pathways, including fibrosis and cell metabolism regulation. DARs that change in the same direction have a tendency to cluster together, suggesting the well-reorganized chromatin architecture that facilitates the interactions of regulatory domains in response to myocardial remodeling. We further show the differences between the acetylation level and the mRNA expression level of cell-type-specific markers for cardiomyocytes and 11 non-myocyte cell types. Notably, we identified transcription factor (TF) binding motifs that were enriched in DARs and defined TFs that were predicted to bind to these motifs. We further showed 64 genes coding for these TFs that were differentially expressed in remodeled myocardium when compared with controls. Our study reveals extensive novel insight into myocardial remodeling at the DNA regulatory level. Differences between the acetylation level and the transcriptional level of cell-type-specific markers suggest additional mechanism(s) between acetylome and transcriptome. By integrating these two layers of epigenetic profiles, we further provide promising TF-encoding genes that could serve as master regulators of myocardial remodeling. Combined, our findings highlight the important role of chromatin regulatory signatures in understanding disease etiology.

17 citations

Book Chapter
TL;DR: This review reports new developments in mouse models of Down syndrome and describes how genomic data and new genetic tools have changed the way models are made, extended the panel of animal models, and increased understanding of the neurobiology of the disease.
Abstract: The genotype–phenotype relationship and the physiopathology of Down Syndrome (DS) have been explored in the last 20 years with more and more relevant mouse models. From the early age of transgenesis to the new CRISPR/Cas9-derived chromosomal engineering and the transchromosomic technologies, mouse models have been key to identifying homologous genes or entire regions homologous to the human chromosome 21 that are necessary or sufficient to induce DS features, to investigating the complexity of the genetic interactions that are involved in DS, and to exploring therapeutic strategies. In this review we report the new developments made, how genomic data and new genetic tools have deeply changed our way of making models, extended our panel of animal models, and increased our understanding of the neurobiology of the disease. But even if we have made incredible progress, which promises to make DS a curable condition, we are facing new research challenges to nurture our knowledge of DS pathophysiology as a neurodevelopmental disorder with many comorbidities during ageing.

16 citations

Journal Article
TL;DR: Using 11 large organ-specific datasets, the authors show that IQRray, their new quality metric, correlates most strongly, among 14 metrics tested, with a reference metric based on evolutionary conservation of expression profiles.
Abstract: Motivation: Microarray results accumulated in public repositories are widely reused in meta-analytical studies and secondary databases. The quality of the data obtained with this technology varies from experiment to experiment, and an efficient method for quality assessment is necessary to ensure their reliability. Results: The lack of a good benchmark has hampered evaluation of existing methods for quality control. In this study, we propose a new independent quality metric that is based on evolutionary conservation of expression profiles. We show, using 11 large organ-specific datasets, that IQRray, a new quality metric developed by us, exhibits the highest correlation with this reference metric, among 14 metrics tested. IQRray outperforms other methods in identification of poor quality arrays in datasets composed of arrays from many independent experiments. In contrast, the performance of methods designed for detecting outliers in a single experiment, like Normalized Unscaled Standard Error and Relative Log Expression, was low because of the inability of these methods to detect datasets containing only low-quality arrays and because the scores cannot be directly compared between experiments. Availability and implementation: The R implementation of IQRray is available at: ftp://lausanne.isb-sib.ch/pub/databases/Bgee/general/IQRray.R. Contact: Marta.Rosikiewicz@unil.ch Supplementary information: Supplementary data are available at Bioinformatics online.

16 citations

Journal Article
TL;DR: Pounds et al. developed the agreement of differential expression (AGDEX) procedure, which measures the similarity of the results of two experiments that each measure differential expression across two groups and determines the statistical significance of that similarity.
Abstract: Motivation: Animal models play a pivotal role in translational biomedical research. The scientific value of an animal model depends on how accurately it mimics the human disease. In principle, microarrays collect the necessary data to evaluate the transcriptomic fidelity of an animal model in terms of the similarity of expression with the human disease. However, statistical methods for this purpose are lacking. Results: We develop the agreement of differential expression (AGDEX) procedure to measure and determine the statistical significance of the similarity of the results of two experiments that measure differential expression across two groups. AGDEX defines a metric of agreement and determines statistical significance by permutation of each experiment's group labels. Additionally, AGDEX performs a comprehensive permutation-based analysis of differential expression for each experiment, including gene-set analyses and meta-analytic integration of results across studies. As an example, we show how AGDEX was recently used to evaluate the similarity of the transcriptome of a novel model of the brain tumor ependymoma in mice to that of a subtype of the human disease. This result, combined with other observations, helped us to infer the cell of origin of this devastating human cancer. Availability: An R package is currently available from www.stjuderesearch.org/site/depts/biostats/agdex and will shortly be available from www.bioconductor.org. Contact: stanley.pounds@stjude.org Supplementary information: Supplementary data are available at Bioinformatics online.
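
The idea of scoring agreement between two differential-expression experiments and assessing it by permuting group labels can be sketched as follows. This is not the AGDEX R package or its actual statistic: the cosine similarity between per-gene mean differences is an assumed stand-in for the agreement metric, and the function names and toy expression values are hypothetical.

```python
import math
import random

def diff_expression(data, labels):
    """Per-gene mean difference (group 1 minus group 0); data is samples x genes."""
    g1 = [row for row, lab in zip(data, labels) if lab == 1]
    g0 = [row for row, lab in zip(data, labels) if lab == 0]
    return [
        sum(r[j] for r in g1) / len(g1) - sum(r[j] for r in g0) / len(g0)
        for j in range(len(data[0]))
    ]

def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def agreement_p_value(data_a, labels_a, data_b, labels_b, n_perm=500, seed=0):
    """Observed agreement and a permutation p-value (labels shuffled in both experiments)."""
    rng = random.Random(seed)
    observed = cosine(diff_expression(data_a, labels_a),
                      diff_expression(data_b, labels_b))
    hits = 0
    for _ in range(n_perm):
        perm_a, perm_b = labels_a[:], labels_b[:]
        rng.shuffle(perm_a)
        rng.shuffle(perm_b)
        stat = cosine(diff_expression(data_a, perm_a),
                      diff_expression(data_b, perm_b))
        if abs(stat) >= abs(observed):
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)

# Toy data (assumption): 4 genes, 3 normal (0) and 3 tumor (1) samples per species.
group_labels = [0, 0, 0, 1, 1, 1]
mouse = [[1.0, 1.1, 0.9, 1.0], [1.1, 0.9, 1.0, 1.1], [0.9, 1.0, 1.1, 0.9],
         [3.0, 0.2, 1.0, 2.1], [3.2, 0.1, 1.1, 1.9], [2.8, 0.3, 0.9, 2.0]]
human = [[2.0, 2.1, 1.9, 2.0], [2.1, 1.9, 2.0, 2.1], [1.9, 2.0, 2.1, 1.9],
         [3.9, 1.2, 2.1, 3.0], [4.1, 1.0, 1.9, 3.2], [4.0, 1.1, 2.0, 2.8]]

observed, p_value = agreement_p_value(mouse, group_labels, human, group_labels)
print(f"agreement = {observed:.3f}, permutation p = {p_value:.3f}")
```

With only three samples per group there are few distinct labelings, so the attainable p-values are coarse; real analyses would use more samples and, presumably, the published package.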

16 citations

References
Journal Article
TL;DR: The Gene Set Enrichment Analysis (GSEA) method interprets gene expression data by focusing on gene sets, that is, groups of genes that share common biological function, chromosomal location, or regulation.
Abstract: Although genomewide RNA expression analysis has become a routine tool in biomedical research, extracting biological insight from such information remains a major challenge. Here, we describe a powerful analytical method called Gene Set Enrichment Analysis (GSEA) for interpreting gene expression data. The method derives its power by focusing on gene sets, that is, groups of genes that share common biological function, chromosomal location, or regulation. We demonstrate how GSEA yields insights into several cancer-related data sets, including leukemia and lung cancer. Notably, where single-gene analysis finds little similarity between two independent studies of patient survival in lung cancer, GSEA reveals many biological pathways in common. The GSEA method is embodied in a freely available software package, together with an initial database of 1,325 biologically defined gene sets.

34,830 citations

Journal Article
TL;DR: Exploratory analyses of probe-level data motivate a new summary measure, the robust multi-array average (RMA) of background-adjusted, normalized, and log-transformed PM values; the authors find no obvious downside to using RMA and attaching a standard error (SE) to it using a linear model that removes probe-specific affinities.
Abstract: In this paper we report exploratory analyses of high-density oligonucleotide array data from the Affymetrix GeneChip® system with the objective of improving upon currently used measures of gene expression. Our analyses make use of three data sets: a small experimental study consisting of five MGU74A mouse GeneChip® arrays; part of the data from an extensive spike-in study conducted by Gene Logic and Wyeth’s Genetics Institute involving 95 HG-U95A human GeneChip® arrays; and part of a dilution study conducted by Gene Logic involving 75 HG-U95A GeneChip® arrays. We display some familiar features of the perfect match and mismatch probe (PM and MM) values of these data, and examine the variance–mean relationship with probe-level data from probes believed to be defective, and so delivering noise only. We explain why we need to normalize the arrays to one another using probe level intensities. We then examine the behavior of the PM and MM using spike-in data and assess three commonly used summary measures: Affymetrix’s (i) average difference (AvDiff) and (ii) MAS 5.0 signal, and (iii) the Li and Wong multiplicative model-based expression index (MBEI). The exploratory data analyses of the probe level data motivate a new summary measure that is a robust multiarray average (RMA) of background-adjusted, normalized, and log-transformed PM values. We evaluate the four expression summary measures using the dilution study data, assessing their behavior in terms of bias, variance and (for MBEI and RMA) model fit. Finally, we evaluate the algorithms in terms of their ability to detect known levels of differential expression using the spike-in data. We conclude that there is no obvious downside to using RMA and attaching a standard error (SE) to this quantity using a linear model which removes probe-specific affinities.

10,711 citations


"Large scale comparison of global ge..." refers methods in this paper

  • ...The resulting 1,323 CEL files were pre-processed using Bioconductor’s RMA package [32] to create an integrated, normalized data matrix....

Journal Article
TL;DR: The authors designed custom high-density oligonucleotide arrays that interrogate the expression of the vast majority of protein-encoding human and mouse genes and used them to profile a panel of 79 human and 61 mouse tissues.
Abstract: The tissue-specific pattern of mRNA expression can indicate important clues about gene function. High-density oligonucleotide arrays offer the opportunity to examine patterns of gene expression on a genome scale. Toward this end, we have designed custom arrays that interrogate the expression of the vast majority of protein-encoding human and mouse genes and have used them to profile a panel of 79 human and 61 mouse tissues. The resulting data set provides the expression patterns for thousands of predicted genes, as well as known and poorly characterized genes, from mice and humans. We have explored this data set for global trends in gene expression, evaluated commonly used lines of evidence in gene prediction methodologies, and investigated patterns indicative of chromosomal organization of transcription. We describe hundreds of regions of correlated transcription and show that some are subject to both tissue and parental allele-specific expression, suggesting a link between spatial expression and imprinting.

3,513 citations


"Large scale comparison of global ge..." refers background or result in this paper

  • ...While studies suggested that orthologous genes do not share similar expression patterns [1-5], other groups reported the opposite observations [6-9]....

  • ...Alternatively, many other studies made use of species-specific arrays to identify coexpressed groups of orthologous genes [4-6,16,17]....

Journal Article
TL;DR: The ability of the trained ANN models to recognize SRBCTs is demonstrated, and the potential applications of these methods for tumor diagnosis and the identification of candidate targets for therapy are demonstrated.
Abstract: The purpose of this study was to develop a method of classifying cancers to specific diagnostic categories based on their gene expression signatures using artificial neural networks (ANNs). We trained the ANNs using the small, round blue-cell tumors (SRBCTs) as a model. These cancers belong to four distinct diagnostic categories and often present diagnostic dilemmas in clinical practice. The ANNs correctly classified all samples and identified the genes most relevant to the classification. Expression of several of these genes has been reported in SRBCTs, but most have not been associated with these cancers. To test the ability of the trained ANN models to recognize SRBCTs, we analyzed additional blinded samples that were not previously used for the training procedure, and correctly classified them in all cases. This study demonstrates the potential applications of these methods for tumor diagnosis and the identification of candidate targets for therapy.

2,683 citations


"Large scale comparison of global ge..." refers methods in this paper

  • ...PCA has been often used to study high-dimensional data generated by genome-wide gene expression studies [22-25]....

Book
27 Jan 2006
TL;DR: All methods in this book are illustrated with publicly available data, a major section is devoted to fully worked case studies, and a companion website provides the underlying code so that readers can reproduce every number, figure, and table on their own computers.
Abstract: Full four-color book. Some of the editors created the Bioconductor project and Robert Gentleman is one of the two originators of R. All methods are illustrated with publicly available data, and a major section of the book is devoted to fully worked case studies. Code underlying all of the computations that are shown is made available on a companion website, and readers can reproduce every number, figure, and table on their own computers.

2,625 citations
