Showing papers by "Rainer Breitling" published in 2007


Journal ArticleDOI
18 Jul 2007-PLOS ONE
TL;DR: It is shown that many mapped local eQTLs in genetical genomics experiments do not reflect actual expression differences caused by sequence polymorphisms in cis-acting factors changing mRNA levels, but instead indicate hybridization differences caused by sequence polymorphisms in the mRNA region that is targeted by the microarray probes.
Abstract: Many investigations have reported the successful mapping of quantitative trait loci (QTLs) for gene expression phenotypes (eQTLs). Local eQTLs, where expression phenotypes map to the genes themselves, are of especially great interest, because they are direct candidates for previously mapped physiological QTLs. Here we show that many mapped local eQTLs in genetical genomics experiments do not reflect actual expression differences caused by sequence polymorphisms in cis-acting factors changing mRNA levels. Instead they indicate hybridization differences caused by sequence polymorphisms in the mRNA region that is targeted by the microarray probes. Many such polymorphisms can be detected by a sensitive and novel statistical approach that takes the individual probe signals into account. Applying this approach to recent mouse and human eQTL data, we demonstrate that indeed many local eQTLs are falsely reported as "cis-acting" or "cis" and can be successfully detected and eliminated with this approach.

140 citations


Book ChapterDOI
16 Dec 2007
TL;DR: The analysis suggests that correlations between the perfect match intensity of a particular probe and its neighbors are highly relevant for successful exon identification.
Abstract: We apply learning vector quantization to the analysis of tiling microarray data. As an example we consider the classification of C. elegans genomic probes as intronic or exonic. Training is based on the current annotation of the genome. Relevance learning techniques are used to weight and select features according to their importance for the classification. Among other findings, the analysis suggests that correlations between the perfect match intensity of a particular probe and its neighbors are highly relevant for successful exon identification.

96 citations
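As an illustration of the technique named in the entry above, the following is a minimal sketch of plain LVQ1 (learning vector quantization without the relevance-weighting step the abstract mentions). The two features (a perfect-match intensity and a neighbor-correlation value), the toy data, and all function names are assumptions made for this example and are not taken from the chapter.

```python
import random


def nearest(protos, x):
    """Index of the prototype closest to x (squared Euclidean distance)."""
    return min(
        range(len(protos)),
        key=lambda k: sum((w - xi) ** 2 for w, xi in zip(protos[k], x)),
    )


def train_lvq1(samples, labels, prototypes, proto_labels, lr=0.1, epochs=20, seed=0):
    """LVQ1: pull the winning prototype toward same-class samples, push it away otherwise."""
    rng = random.Random(seed)
    protos = [list(p) for p in prototypes]
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = samples[i], labels[i]
            k = nearest(protos, x)
            sign = 1.0 if proto_labels[k] == y else -1.0
            protos[k] = [w + sign * lr * (xi - w) for w, xi in zip(protos[k], x)]
    return protos


def predict(protos, proto_labels, x):
    """Label of the nearest prototype."""
    return proto_labels[nearest(protos, x)]


# Hypothetical toy data: (intensity, correlation with neighboring probes).
samples = [(2.0, 0.9), (2.2, 0.8), (1.8, 0.85), (0.5, 0.1), (0.4, 0.2), (0.6, 0.15)]
labels = ["exon", "exon", "exon", "intron", "intron", "intron"]
proto_labels = ["exon", "intron"]
protos = train_lvq1(samples, labels, [samples[0], samples[3]], proto_labels)
```

Calling `predict(protos, proto_labels, x)` on a new feature vector returns the class of the closest trained prototype. A relevance-learning variant would additionally learn a weight per feature inside the distance; that step is omitted here.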


Journal ArticleDOI
TL;DR: It is shown that experimental design and environmental confounders greatly influence the identification of candidate genes in ecological microarray studies, and that following several simple recommendations could facilitate the analysis of microarray data in ecological settings.
Abstract: Microarrays are used to measure simultaneously the amount of mRNAs transcribed from many genes. They were originally designed for gene expression profiling in relatively simple biological systems, such as cell lines and model systems under constant laboratory conditions. This poses a challenge to ecologists who increasingly want to use microarrays to unravel the genetic mechanisms underlying complex interactions among organisms and between organisms and their environment. Here, we discuss typical experimental and statistical problems that arise when analyzing genome-wide expression profiles in an ecological context. We show that experimental design and environmental confounders greatly influence the identification of candidate genes in ecological microarray studies, and that following several simple recommendations could facilitate the analysis of microarray data in ecological settings.

79 citations


Journal ArticleDOI
TL;DR: FIVA (Function Information Viewer and Analyzer) aids researchers in the prokaryotic community to quickly identify relevant biological processes following transcriptome analysis.
Abstract: FIVA (Function Information Viewer and Analyzer) aids researchers in the prokaryotic community to quickly identify relevant biological processes following transcriptome analysis. Our software assists in functional profiling of large sets of genes and generates a comprehensive overview of affected biological processes.

36 citations


Journal ArticleDOI
TL;DR: There is very good prospect of successfully predicting the function of yet uncharacterized proteins using machine learning classifiers trained on proteins of known function, and in the case of highly specialized proteomes classifiers from a different, but more conventional, species may in fact outperform the endogenous species-specific classifier.
Abstract: Predicting the function of newly discovered proteins by simply inspecting their amino acid sequence is one of the major challenges of post-genomic computational biology, especially when done without recourse to experimentation or homology information. Machine learning classifiers are able to discriminate between proteins belonging to different functional classes. Until now, however, it has been unclear if this ability would be transferable to proteins of unknown function, which may show distinct biases compared to experimentally more tractable proteins. Here we show that proteins with known and unknown function do indeed differ significantly. We then show that proteins from different bacterial species also differ to an even larger and very surprising extent, but that functional classifiers nonetheless generalize successfully across species boundaries. We also show that in the case of highly specialized proteomes classifiers from a different, but more conventional, species may in fact outperform the endogenous species-specific classifier. We conclude that there is very good prospect of successfully predicting the function of yet uncharacterized proteins using machine learning classifiers trained on proteins of known function.

32 citations


Journal ArticleDOI
TL;DR: The accuracy of Affymetrix probe sequences is higher than previously reported, particularly on newer arrays, and refined probe set definitions have clear effects on the detection of differentially expressed genes.
Abstract: The Affymetrix GeneChip technology uses multiple probes per gene to measure its expression level. Individual probe signals can vary widely, which hampers proper interpretation. This variation can be caused by probes that do not properly match their target gene or that match multiple genes. To determine the accuracy of Affymetrix arrays, we developed an extensive verification protocol for mouse arrays, incorporating the NCBI RefSeq, NCBI UniGene Unique, NIA Mouse Gene Index, and UCSC mouse genome databases. Applying this protocol to Affymetrix Mouse Genome arrays (the earlier U74Av2 and the newer 430 2.0 array), the number of sequence-verified probes with perfect matches was no less than 85% and 95%, respectively; and for 74% and 85% of the probe sets all probes were sequence verified. The latter percentages increased to 80% and 94% after discarding one or two unverifiable probes per probe set, and even further to 84% and 97% when, in addition, allowing for one or two mismatches between probe and target gene. Similar results were obtained for other mouse arrays, as well as for human and rat arrays. Based on these data, refined chip definition files for all arrays are provided online. Researchers can choose the version appropriate for their study to (re)analyze expression data. The accuracy of Affymetrix probe sequences is higher than previously reported, particularly on newer arrays. Yet, refined probe set definitions have clear effects on the detection of differentially expressed genes. We demonstrate that the interpretation of the results of Affymetrix arrays is improved when the new chip definition files are used.

23 citations
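The idea of sequence-verifying probes with or without a small number of allowed mismatches, as described in the entry above, can be sketched roughly as follows. This is not the authors' protocol: it assumes ungapped, same-strand matching of each probe against a single target sequence, and the function names and toy sequences are hypothetical.

```python
def min_mismatches(probe, target):
    """Fewest mismatching bases over all ungapped placements of probe in target.

    Returns None when the probe is empty or longer than the target.
    """
    n = len(probe)
    if n == 0 or len(target) < n:
        return None
    return min(
        sum(a != b for a, b in zip(probe, target[i:i + n]))
        for i in range(len(target) - n + 1)
    )


def verified_fraction(probes, target, max_mismatches=0):
    """Fraction of probes matching the target within max_mismatches (assumed criterion)."""
    if not probes:
        return 0.0
    hits = 0
    for probe in probes:
        m = min_mismatches(probe, target)
        if m is not None and m <= max_mismatches:
            hits += 1
    return hits / len(probes)


# Hypothetical toy probe set and target; real probes would be much longer.
target = "TTACGTTT"
probe_set = ["ACGT", "ACGA", "GGGG"]
strict = verified_fraction(probe_set, target, max_mismatches=0)
relaxed = verified_fraction(probe_set, target, max_mismatches=1)
```

With this toy input, relaxing the criterion from zero to one allowed mismatch raises the verified fraction, which mirrors the direction of the percentages reported in the abstract.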


Journal ArticleDOI
TL;DR: It is predicted that a robust functioning of the Hh pathway will require the involvement of more sterol metabolites, and these should be the subject of future research.
Abstract: The close link between signaling by the developmental regulators of the Hedgehog family and cholesterol biochemistry has been known for some time. The morphogen is covalently attached to cholesterol in a peculiar autocatalytic reaction and embryonal disruption of cholesterol synthesis leads to malformations that mimic Hh signaling defects. Recently, it was furthermore shown that secreted Hh could hitchhike on lipoprotein particles to establish its morphogenic gradient in the developing embryo. Additionally, there is new evidence that the Hh-receptor Patched transmits the Hh signal by modulating the secretion of an inhibitory sterol molecule from the receiving cells. Here we present some of the most recent discoveries on the Hh-sterol link and discuss their implications from a systems design perspective. We predict that a robust functioning of the Hh pathway will require the involvement of more sterol metabolites, and these should be the subject of future research.

21 citations


Journal Article
TL;DR: This work presents an approach to analyze temporal variation in the proteome and suggests this approach may be useful to evaluate surgical, nutritional, and pharmacological interventions.
Abstract: Monitoring changes in serum protein expression in response to acute events such as trauma, infection or drug intervention may reveal key proteins of great value in predicting recovery or treatment response. Concerted actions of many proteins are expected. Proteins sharing similar expression changes may function in the same physiological process. As a model we analyzed expression changes in serum of colon cancer patients, before, during, and after laparoscopic colon resection. Eight samples were taken from each of four patients before, during, and up to 5 days after surgery. Total serum and a low molecular weight fraction were analyzed by SELDI‐TOF‐MS. In total 146 masses were detected. A principal components analysis (PCA) illustrates the temporal variation in the postsurgery proteome. Time series for each mass could be clustered into four distinct groups based on similarity in expression pattern. Two masses of 11.4 and 11.6 kDa, part of a slow response cluster, were identified as forms of the acute phase protein serum amyloid A (SAA). Fourteen more proteins belong to this cluster and may also function in acute phase response. We present an approach to analyze temporal variation in the proteome. This approach may be useful to evaluate surgical, nutritional, and pharmacological interventions.

16 citations


Journal ArticleDOI
TL;DR: Serum protein expression in colon cancer patients was analyzed before, during, and after laparoscopic colon resection; time series for the 146 detected masses clustered into four groups, and two masses in a slow response cluster were identified as forms of the acute phase protein serum amyloid A.
Abstract: Monitoring changes in serum protein expression in response to acute events such as trauma, infection or drug intervention may reveal key proteins of great value in predicting recovery or treatment response. Concerted actions of many proteins are expected. Proteins sharing similar expression changes may function in the same physiological process. As a model we analyzed expression changes in serum of colon cancer patients, before, during, and after laparoscopic colon resection. Eight samples were taken from each of four patients before, during, and up to 5 days after surgery. Total serum and a low molecular weight fraction were analyzed by SELDI-TOF-MS. In total 146 masses were detected. A principal components analysis (PCA) illustrates the temporal variation in the postsurgery proteome. Time series for each mass could be clustered into four distinct groups based on similarity in expression pattern. Two masses of 11.4 and 11.6 kDa, part of a slow response cluster, were identified as forms of the acute phase protein serum amyloid A (SAA). Fourteen more proteins belong to this cluster and may also function in acute phase response. We present an approach to analyze temporal variation in the proteome. This approach may be useful to evaluate surgical, nutritional, and pharmacological interventions.

15 citations


Book ChapterDOI
16 Dec 2007
TL;DR: It is shown that by training on selected protein sequence properties, SVMs can successfully discriminate between proteins of different species and takes us a step closer to inferring the functional characteristics of these proteins.
Abstract: Much work has been done to identify species-specific proteins in sequenced genomes and hence to determine their function. We assumed that such proteins have specific physico-chemical properties that will discriminate them from proteins in other species. In this paper, we examine the validity of this assumption by comparing proteins and their properties from different bacterial species using Support Vector Machines (SVM). We show that by training on selected protein sequence properties, SVMs can successfully discriminate between proteins of different species. This finding takes us a step closer to inferring the functional characteristics of these proteins.