Showing papers in "BMC Genomics in 2009"


Journal ArticleDOI
TL;DR: All aspects of BioMart are described from a user's perspective and it is demonstrated how it can be used to solve real biological use cases such as SNP selection for candidate gene screening or annotation of microarray results.
Abstract: Biologists need to perform complex queries, often across a variety of databases. Typically, each data resource provides an advanced query interface, each of which must be learnt by the biologist before they can begin to query them. Frequently, more than one data source is required and for high-throughput analysis, cutting and pasting results between websites is certainly very time consuming. Therefore, many groups rely on local bioinformatics support to process queries by accessing the resource's programmatic interfaces if they exist. This is not an efficient solution in terms of cost and time. Instead, it would be better if the biologist only had to learn one generic interface. BioMart provides such a solution.

791 citations


Journal ArticleDOI
TL;DR: A high-density consensus genetic map of barley based only on complete and error-free datasets and genic markers, represented accurately by graphs and approximately by a best-fit linear order, and supported by a readily available SNP genotyping resource is presented in this paper.
Abstract: High density genetic maps of plants have, nearly without exception, made use of marker datasets containing missing or questionable genotype calls derived from a variety of genic and non-genic or anonymous markers, and been presented as a single linear order of genetic loci for each linkage group. The consequences of missing or erroneous data include falsely separated markers, expansion of cM distances and incorrect marker order. These imperfections are amplified in consensus maps and problematic when fine resolution is critical including comparative genome analyses and map-based cloning. Here we provide a new paradigm, a high-density consensus genetic map of barley based only on complete and error-free datasets and genic markers, represented accurately by graphs and approximately by a best-fit linear order, and supported by a readily available SNP genotyping resource. Approximately 22,000 SNPs were identified from barley ESTs and sequenced amplicons; 4,596 of them were tested for performance in three pilot phase Illumina GoldenGate assays. Data from three barley doubled haploid mapping populations supported the production of an initial consensus map. Over 200 germplasm selections, principally European and US breeding material, were used to estimate minor allele frequency (MAF) for each SNP. We selected 3,072 of these tested SNPs based on technical performance, map location, MAF and biological interest to fill two 1536-SNP "production" assays (BOPA1 and BOPA2), which were made available to the barley genetics community. Data were added using BOPA1 from a fourth mapping population to yield a consensus map containing 2,943 SNP loci in 975 marker bins covering a genetic distance of 1099 cM. The unprecedented density of genic markers and marker bins enabled a high resolution comparison of the genomes of barley and rice. 
Low recombination in pericentric regions is evident from bins containing many more than the average number of markers, meaning that a large number of genes are recombinationally locked into the genetic centromeric regions of several barley chromosomes. Examination of US breeding germplasm illustrated the usefulness of BOPA1 and BOPA2 in that they provide excellent marker density and sensitivity for detection of minor alleles in this genetically narrow material.

564 citations


Journal ArticleDOI
TL;DR: The methods described here for deep sequencing of the transcriptome should be widely applicable to generate catalogs of genes and genetic markers in emerging model organisms to facilitate genomics studies in corals and other non-model systems.
Abstract: New methods are needed for genomic-scale analysis of emerging model organisms that exemplify important biological questions but lack fully sequenced genomes. For example, there is an urgent need to understand the potential for corals to adapt to climate change, but few molecular resources are available for studying these processes in reef-building corals. To facilitate genomics studies in corals and other non-model systems, we describe methods for transcriptome sequencing using 454, as well as strategies for assembling a useful catalog of genes from the output. We have applied these methods to sequence the transcriptome of planulae larvae from the coral Acropora millepora. More than 600,000 reads produced in a single 454 sequencing run were assembled into ~40,000 contigs with five-fold average sequencing coverage. Based on sequence similarity with known proteins, these analyses identified ~11,000 different genes expressed in a range of conditions including thermal stress and settlement induction. Assembled sequences were annotated with gene names, conserved domains, and Gene Ontology terms. Targeted searches using these annotations identified the majority of genes associated with essential metabolic pathways and conserved signaling pathways, as well as novel candidate genes for stress-related processes. Comparisons with the genome of the anemone Nematostella vectensis revealed ~8,500 pairs of orthologs and ~100 candidate coral-specific genes. More than 30,000 SNPs were detected in the coral sequences, and a subset of these validated by re-sequencing. The methods described here for deep sequencing of the transcriptome should be widely applicable to generate catalogs of genes and genetic markers in emerging model organisms. Our data provide the most comprehensive sequence resource currently available for reef-building corals, and include an extensive collection of potential genetic markers for association and population connectivity studies. 
The characterization of the larval transcriptome for this widely-studied coral will enable research into the biological processes underlying stress responses in corals and evolutionary adaptation to global climate change.

491 citations


Journal ArticleDOI
TL;DR: The core of T6SS is composed of 13 proteins, conserved in both pathogenic and non-pathogenic bacteria, suggesting that T6SS has evolved to adapt to various microenvironments and specialized functions.
Abstract: The availability of hundreds of bacterial genomes allowed a comparative genomic study of the Type VI Secretion System (T6SS), recently discovered as being involved in pathogenesis. By combining comparative and phylogenetic approaches using more than 500 prokaryotic genomes, we characterized the global T6SS genetic structure in terms of conservation, evolution and genomic organization. This genome-wide analysis allowed the identification of a set of 13 proteins constituting the T6SS protein core and a set of conserved accessory proteins. 176 T6SS loci (encompassing 92 different bacteria) were identified and their comparison revealed that T6SS-encoded genes have a specific conserved genetic organization. Phylogenetic reconstruction based on the core genes showed that lateral transfer of the T6SS is probably its major way of dissemination among pathogenic and non-pathogenic bacteria. Furthermore, the sequence analysis of the VgrG proteins, proposed to be exported in a T6SS-dependent way, confirmed that some C-terminal regions possess domains showing similarities with adhesins or proteins with enzymatic functions. The core of T6SS is composed of 13 proteins, conserved in both pathogenic and non-pathogenic bacteria. Subclasses of T6SS differ in regulatory and accessory protein content, suggesting that T6SS has evolved to adapt to various microenvironments and specialized functions. Based on these results, new functional hypotheses concerning the assembly and function of T6SS proteins are proposed.

487 citations


Journal ArticleDOI
TL;DR: The results suggest that RNA profiling might provide indirect support for antibody specificity: whenever an evident correlation between the RNA and protein profiles exists, it supports the conclusion that the antibodies used in the immunoassay recognized their cognate antigens.
Abstract: The Central Dogma of biology holds, in famously simplified terms, that DNA makes RNA makes proteins, but there is considerable uncertainty regarding the general, genome-wide correlation between levels of RNA and corresponding proteins. Therefore, to assess degrees of this correlation we compared the RNA profiles (determined using both cDNA- and oligo-based microarrays) and protein profiles (determined immunohistochemically in tissue microarrays) of 1066 gene products in 23 human cell lines. A high mean correlation coefficient (0.52) was obtained from the pairwise comparison of RNA levels determined by the two platforms. Significant correlations, with correlation coefficients exceeding 0.445, between protein and RNA levels were also obtained for a third of the specific gene products. However, the correlation coefficients between levels of RNA and protein products of specific genes varied widely, and the mean correlations between the protein and corresponding RNA levels determined using the cDNA- and oligo-based microarrays were 0.25 and 0.20, respectively. Significant correlations were found in one third of the examined RNA species and corresponding proteins. These results suggest that RNA profiling might provide indirect support for antibody specificity: whenever an evident correlation between the RNA and protein profiles exists, it supports the conclusion that the antibodies used in the immunoassay recognized their cognate antigens.
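The per-gene comparison described in this abstract can be illustrated with a minimal sketch. The abstract does not say which correlation coefficient was used, so Pearson's r is an assumption here, and the gene names and values are hypothetical toy data, not results from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: gene -> (RNA levels, protein levels) across 5 cell lines.
profiles = {
    "GENE_A": ([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.1, 5.9, 8.2, 9.8]),
    "GENE_B": ([1.0, 2.0, 3.0, 4.0, 5.0], [3.0, 1.0, 4.0, 1.0, 5.0]),
}

# One coefficient per gene, then the mean across genes.
per_gene_r = {gene: pearson(rna, prot) for gene, (rna, prot) in profiles.items()}
mean_r = sum(per_gene_r.values()) / len(per_gene_r)
```

Computing one coefficient per gene and then averaging mirrors how the abstract reports both a spread of gene-level correlations and mean values per platform.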

470 citations


Journal ArticleDOI
TL;DR: The metabolic responses of grapes to water deficit varied with the cultivar and fruit pigmentation, and changes in metabolism have important impacts on berry flavor and quality characteristics.
Abstract: Water deficit has significant effects on grape berry composition resulting in improved wine quality by the enhancement of color, flavors, or aromas. While some pathways or enzymes affected by water deficit have been identified, little is known about the global effects of water deficit on grape berry metabolism. The effects of long-term, seasonal water deficit on berries of Cabernet Sauvignon, a red-wine grape, and Chardonnay, a white-wine grape were analyzed by integrated transcript and metabolite profiling. Over the course of berry development, the steady-state transcript abundance of approximately 6,000 Unigenes differed significantly between the cultivars and the irrigation treatments. Water deficit most affected the phenylpropanoid, ABA, isoprenoid, carotenoid, amino acid and fatty acid metabolic pathways. Targeted metabolites were profiled to confirm putative changes in specific metabolic pathways. Water deficit activated the expression of numerous transcripts associated with glutamate and proline biosynthesis and some committed steps of the phenylpropanoid pathway that increased anthocyanin concentrations in Cabernet Sauvignon. In Chardonnay, water deficit activated parts of the phenylpropanoid, energy, carotenoid and isoprenoid metabolic pathways that contribute to increased concentrations of antheraxanthin, flavonols and aroma volatiles. Water deficit affected the ABA metabolic pathway in both cultivars. Berry ABA concentrations were highly correlated with 9-cis-epoxycarotenoid dioxygenase (NCED1) transcript abundance, whereas the mRNA expression of other NCED genes and ABA catabolic and glycosylation processes were largely unaffected. Water deficit nearly doubled ABA concentrations within berries of Cabernet Sauvignon, whereas it decreased ABA in Chardonnay at veraison and shortly thereafter. The metabolic responses of grapes to water deficit varied with the cultivar and fruit pigmentation. 
Chardonnay berries, which lack any significant anthocyanin content, exhibited increased photoprotection mechanisms under water deficit conditions. Water deficit increased ABA, proline, sugar and anthocyanin concentrations in Cabernet Sauvignon, but not Chardonnay berries, consistent with the hypothesis that ABA enhanced accumulation of these compounds. Water deficit increased the transcript abundance of lipoxygenase and hydroperoxide lyase in fatty acid metabolism, a pathway known to affect berry and wine aromas. These changes in metabolism have important impacts on berry flavor and quality characteristics. Several of these metabolites are known to contribute to increased human-health benefits.

450 citations


Journal ArticleDOI
TL;DR: This study demonstrates that CRC cell-derived microvesicles are enriched in cell cycle-related mRNAs that promote proliferation of endothelial cells, suggesting that microvesicles of cancer cells can be involved in tumor growth and metastasis by facilitating angiogenesis-related processes.
Abstract: Various cancer cells, including those of colorectal cancer (CRC), release microvesicles (exosomes) into surrounding tissues and peripheral circulation. These microvesicles can mediate communication between cells and affect various tumor-related processes in their target cells. We present potential roles of CRC cell-derived microvesicles in tumor progression via a global comparative microvesicular and cellular transcriptomic analysis of human SW480 CRC cells. We first identified 11,327 microvesicular mRNAs involved in tumorigenesis-related processes that reflect the physiology of donor CRC cells. We then found 241 mRNAs enriched in the microvesicles above donor cell levels, of which 27 were involved in cell cycle-related processes. Network analysis revealed that most of the cell cycle-related microvesicle-enriched mRNAs were associated with M-phase activities. The integration of two mRNA datasets showed that these M-phase-related mRNAs were differentially regulated across CRC patients, suggesting their potential roles in tumor progression. Finally, we experimentally verified the network-driven hypothesis by showing a significant increase in proliferation of endothelial cells treated with the microvesicles. Our study demonstrates that CRC cell-derived microvesicles are enriched in cell cycle-related mRNAs that promote proliferation of endothelial cells, suggesting that microvesicles of cancer cells can be involved in tumor growth and metastasis by facilitating angiogenesis-related processes. This information will help elucidate the pathophysiological functions of tumor-derived microvesicles, and aid in the development of cancer diagnostics, including colorectal cancer.

403 citations


Journal ArticleDOI
TL;DR: This study evaluates the relative accuracy of microarrays and transcriptome sequencing (RNA-Seq) using a third methodology, proteomics, and finds that RNA-Seq provides a better estimate of absolute expression levels.
Abstract: Microarrays revolutionized biological research by enabling gene expression comparisons on a transcriptome-wide scale. Microarrays, however, do not estimate absolute expression level accurately. At present, high throughput sequencing is emerging as an alternative methodology for transcriptome studies. Although free of many limitations imposed by microarray design, its potential to estimate absolute transcript levels is unknown. In this study, we evaluate the relative accuracy of microarrays and transcriptome sequencing (RNA-Seq) using a third methodology: proteomics. We find that RNA-Seq provides a better estimate of absolute expression levels. Our result shows that in terms of overall technical performance, RNA-Seq is the technique of choice for studies that require accurate estimation of absolute transcript levels.

317 citations


Journal ArticleDOI
TL;DR: This study critically evaluated the performance of a novel miRNA expression profiling approach, quantitative RT-PCR array (qPCR-array), compared to miRNA detection with oligonucleotide microchip (microarray), and demonstrated high reproducibility of TaqMan qPCR-array.
Abstract: MicroRNAs (miRNAs) have critical functions in various biological processes. MiRNA profiling is an important tool for the identification of differentially expressed miRNAs in normal cellular and disease processes. A technical challenge remains for high-throughput miRNA expression analysis as the number of miRNAs continues to increase with in silico prediction and experimental verification. Our study critically evaluated the performance of a novel miRNA expression profiling approach, quantitative RT-PCR array (qPCR-array), compared to miRNA detection with oligonucleotide microchip (microarray). High reproducibility with qPCR-array was demonstrated by comparing replicate results from the same RNA sample. Pre-amplification of the miRNA cDNA improved sensitivity of the qPCR-array and increased the number of detectable miRNAs. Furthermore, the relative expression levels of miRNAs were maintained after pre-amplification. When the performance of qPCR-array and microarrays were compared using different aliquots of the same RNA, a low correlation between the two methods (r = -0.443) indicated considerable variability between the two assay platforms. Higher variation between replicates was observed in miRNAs with low expression in both assays. Finally, a higher false positive rate of differential miRNA expression was observed using the microarray compared to the qPCR-array. Our studies demonstrated high reproducibility of TaqMan qPCR-array. Comparison between different reverse transcription reactions and qPCR-arrays performed on different days indicated that reverse transcription reactions did not introduce significant variation in the results. The use of cDNA pre-amplification increased the sensitivity of miRNA detection. Although there was variability associated with pre-amplification in low abundance miRNAs, the latter did not involve any systemic bias in the estimation of miRNA expression. 
Comparison between microarray and qPCR-array indicated superior sensitivity and specificity of qPCR-array.

306 citations


Journal ArticleDOI
TL;DR: The QTL confirmed in this study represents a case of a major gene explaining the bulk of genetic variation for a presumed complex trait, providing a solid framework for linkage-based MAS within the whole population in subsequent generations.
Abstract: Infectious pancreatic necrosis (IPN) is one of the most prevalent and economically devastating diseases in Atlantic salmon (Salmo salar) farming worldwide. The disease causes large mortalities at both the fry- and post-smolt stages. Family selection for increased IPN resistance is performed through the use of controlled challenge tests, where survival rates of sib-groups are recorded. However, since challenge-tested animals cannot be used as breeding candidates, within-family selection is not performed and only half of the genetic variation for IPN resistance is being exploited. DNA markers linked to quantitative trait loci (QTL) affecting IPN resistance would therefore be a powerful selection tool. The aim of this study was to identify and fine-map QTL for IPN-resistance in Atlantic salmon, for use in marker-assisted selection to increase the rate of genetic improvement for this trait. A genome scan was carried out using 10 large full-sib families of challenge-tested Atlantic salmon post-smolts and microsatellite markers distributed across the genome. One major QTL for IPN-resistance was detected, explaining 29% and 83% of the phenotypic and genetic variances, respectively. This QTL mapped to the same location as a QTL recently detected in a Scottish Atlantic salmon population. The QTL was found to be segregating in 10 out of 20 mapping parents, and subsequent fine-mapping with additional markers narrowed the QTL peak to a 4 cM region on linkage group 21. Challenge-tested fry were used to show that the QTL had the same effect on fry as on post-smolt, with the confidence interval for QTL position in fry overlapping the confidence interval found in post-smolts. A total of 178 parents were tested for segregation of the QTL, identifying 72 QTL-heterozygous parents. Genotypes at QTL-heterozygous parents were used to determine linkage phases between alleles at the underlying DNA polymorphism and alleles at single markers or multi-marker haplotypes. 
One four-marker haplotype was found to be the best predictor of QTL alleles, and was successfully used to deduce genotypes of the underlying polymorphism in 72% of the parents of the next generation within a breeding nucleus. A highly significant population-level correlation was found between deduced alleles at the underlying polymorphism and survival of offspring groups in the fry challenge test, parents with the three deduced genotypes (QQ, Qq, qq) having mean offspring mortality rates of 0.13, 0.32, and 0.49, respectively. The frequency of the high-resistance allele (Q) in the population was estimated to be 0.30. Apart from this major QTL, one other experiment-wise significant QTL for IPN-resistance was detected, located on linkage group 4. The QTL confirmed in this study represents a case of a major gene explaining the bulk of genetic variation for a presumed complex trait. QTL genotypes were deduced within most parents of the 2005 generation of a major breeding company, providing a solid framework for linkage-based MAS within the whole population in subsequent generations. Since haplotype-trait associations valid at the population level were found, there is also a potential for MAS based on linkage disequilibrium (LD). However, in order to use MAS across many generations without reassessment of linkage phases between markers and the underlying polymorphism, the QTL needs to be positioned with even greater accuracy. This will require higher marker densities than are currently available.

305 citations


Journal ArticleDOI
TL;DR: Bioinformatic analysis coupled with gene transcript profiling extends the understanding of the iron and reduced inorganic sulfur compounds oxidation pathways in A. ferrooxidans and suggests mechanisms for their regulation.
Abstract: Acidithiobacillus ferrooxidans gains energy from the oxidation of ferrous iron and various reduced inorganic sulfur compounds at very acidic pH. Although an initial model for the electron pathways involved in iron oxidation has been developed, much less is known about the sulfur oxidation in this microorganism. In addition, what has been reported for both iron and sulfur oxidation has been derived from different A. ferrooxidans strains, some of which have not been phylogenetically characterized and some have been shown to be mixed cultures. It is necessary to provide models of iron and sulfur oxidation pathways within one strain of A. ferrooxidans in order to comprehend the full metabolic potential of the pangenome of the genus. Bioinformatic-based metabolic reconstruction supported by microarray transcript profiling and quantitative RT-PCR analysis predicts the involvement of a number of novel genes involved in iron and sulfur oxidation in A. ferrooxidans ATCC23270. These include for iron oxidation: cup (copper oxidase-like), ctaABT (heme biogenesis and insertion), nuoI and nuoK (NADH complex subunits), sdrA1 (a NADH complex accessory protein) and atpB and atpE (ATP synthetase F0 subunits). The following new genes are predicted to be involved in reduced inorganic sulfur compounds oxidation: a gene cluster (rhd, tusA, dsrE, hdrC, hdrB, hdrA, orf2, hdrC, hdrB) encoding three sulfurtransferases and a heterodisulfide reductase complex, sat potentially encoding an ATP sulfurylase and sdrA2 (an accessory NADH complex subunit). Two different regulatory components are predicted to be involved in the regulation of alternate electron transfer pathways: 1) a gene cluster (ctaRUS) that contains a predicted iron responsive regulator of the Rrf2 family that is hypothesized to regulate cytochrome aa3 oxidase biogenesis and 2) a two component sensor-regulator of the RegB-RegA family that may respond to the redox state of the quinone pool. 
Bioinformatic analysis coupled with gene transcript profiling extends our understanding of the iron and reduced inorganic sulfur compounds oxidation pathways in A. ferrooxidans and suggests mechanisms for their regulation. The models provide unified and coherent descriptions of these processes within the type strain, eliminating previous ambiguity caused by models built from analyses of multiple and divergent strains of this microorganism.

Journal ArticleDOI
TL;DR: MicroRNAs are known to be involved in plant cold stress responses, but little is known about them in winter-habit monocots; Brachypodium distachyon, a recently emerged model plant closely related to cool-season cereals, has few reported miRNAs.
Abstract: Background: MicroRNAs (miRNAs) are endogenous small RNAs having large-scale regulatory effects on plant development and stress responses. Extensive studies of miRNAs have only been performed in a few model plants. Although miRNAs have been shown to be involved in plant cold stress responses, little is known about them in winter-habit monocots. Brachypodium distachyon, with a close evolutionary relationship to cool-season cereals, has recently emerged as a novel model plant. There are few reports of Brachypodium miRNAs.

Journal ArticleDOI
TL;DR: A robust microarray and PCR-based method, Transposon-Mediated Differential Hybridisation (TMDH), that uses novel bioinformatics to identify transposon inserts in genome-wide libraries is developed and determined the first comprehensive list of S. aureus essential genes.
Abstract: Background: In recent years there has been an increasing problem with Staphylococcus aureus strains that are resistant to treatment with existing antibiotics. An important starting point for the development of new antimicrobial drugs is the identification of "essential" genes that are important for bacterial survival and growth. Results: We have developed a robust microarray and PCR-based method, Transposon-Mediated Differential Hybridisation (TMDH), that uses novel bioinformatics to identify transposon inserts in genome-wide libraries. Following a microarray-based screen, genes lacking transposon inserts are re-tested using a PCR and sequencing-based approach. We carried out a TMDH analysis of the S. aureus genome using a large random mariner transposon library of around a million mutants, and identified a total of 351 S. aureus genes important for survival and growth in culture. A comparison with the essential gene list experimentally derived for Bacillus subtilis highlighted interesting differences in both pathways and individual genes. Conclusion: We have determined the first comprehensive list of S. aureus essential genes. This should act as a useful starting point for the identification of potential targets for novel antimicrobial compounds. The TMDH methodology we have developed is generic and could be applied to identify essential genes in other bacterial pathogens.

Journal ArticleDOI
TL;DR: A thesaurus-based approach allows comparisons between disease-containing databases and increases accuracy in disease identification through synonym matching; annotating the human genome with Disease Ontology and GeneRIF dramatically increases the coverage of its disease annotation.
Abstract: The human genome has been extensively annotated with Gene Ontology for biological functions, but minimally computationally annotated for diseases. We used the Unified Medical Language System (UMLS) MetaMap Transfer tool (MMTx) to discover gene-disease relationships from the GeneRIF database. We utilized a comprehensive subset of UMLS, which is disease-focused and structured as a directed acyclic graph (the Disease Ontology), to filter and interpret results from MMTx. The results were validated against the Homayouni gene collection using recall and precision measurements. We compared our results with the widely used Online Mendelian Inheritance in Man (OMIM) annotations. The validation data set suggests a 91% recall rate and 97% precision rate of disease annotation using GeneRIF, in contrast with a 22% recall and 98% precision using OMIM. Our thesaurus-based approach allows for comparisons to be made between disease-containing databases and allows for increased accuracy in disease identification through synonym matching. The much higher recall rate of our approach demonstrates that annotating the human genome with Disease Ontology and GeneRIF for diseases dramatically increases the coverage of the disease annotation of the human genome.
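The recall and precision measurements mentioned in this abstract follow the standard definitions, which can be sketched as below. The gene-disease pairs are hypothetical placeholders and the function name is an illustrative choice; this is not the authors' validation code.

```python
def precision_recall(predicted, reference):
    """Return (precision, recall) of predicted items against a reference set.

    precision = true positives / all predicted items
    recall    = true positives / all reference items
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical curated gene-disease pairs (the validation set).
reference_pairs = {
    ("GENE1", "disease A"), ("GENE2", "disease A"), ("GENE3", "disease B"),
    ("GENE4", "disease B"), ("GENE5", "disease C"), ("GENE6", "disease C"),
}
# Hypothetical pairs produced by an annotation pipeline.
predicted_pairs = {
    ("GENE1", "disease A"), ("GENE2", "disease A"), ("GENE3", "disease B"),
    ("GENE9", "disease D"),
}

precision, recall = precision_recall(predicted_pairs, reference_pairs)
```

In this toy case three of four predictions are correct (precision 0.75) while only three of six reference pairs are recovered (recall 0.5), the same kind of high-precision, lower-recall pattern the abstract reports for OMIM.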

Journal ArticleDOI
TL;DR: Unfoldomics of human diseases utilizes unrivaled bioinformatics and experimental techniques, paves the road for better understanding of human diseases, their pathogenesis and molecular mechanisms, and helps develop new strategies for the analysis of disease-related proteins.
Abstract: Background: Intrinsically disordered proteins (IDPs) and intrinsically disordered regions (IDRs) lack stable tertiary and/or secondary structure yet fulfill key biological functions. The recent recognition of IDPs and IDRs is leading to an entire field aimed at their systematic structural characterization and at determination of their mechanisms of action. Bioinformatics studies showed that IDPs and IDRs are highly abundant in different proteomes and carry out mostly regulatory functions related to molecular recognition and signal transduction. These activities complement the functions of structured proteins. IDPs and IDRs were shown to participate in both one-to-many and many-to-one signaling. Alternative splicing and posttranslational modifications are frequently used to tune the IDP functionality. Several individual IDPs were shown to be associated with human diseases, such as cancer, cardiovascular disease, amyloidoses, diabetes, neurodegenerative diseases, and others. This raises questions regarding the involvement of IDPs and IDRs in various diseases.

Journal ArticleDOI
TL;DR: In this article, the authors examined the developmental relationship and functions of CD16+ and CD16- Mo subsets in human peripheral blood from healthy individuals and found that CD16+ Mo have a more MΦ- and DC-like transcription program, suggesting a more advanced stage of differentiation.
Abstract: Human peripheral blood monocytes (Mo) consist of subsets distinguished by expression of CD16 (FCγRIII) and chemokine receptors. Classical CD16- Mo express CCR2 and migrate in response to CCL2, while a minor CD16+ Mo subset expresses CD16 and CX3CR1 and migrates into tissues expressing CX3CL1. CD16+ Mo produce pro-inflammatory cytokines and are expanded in certain inflammatory conditions including sepsis and HIV infection. To gain insight into the developmental relationship and functions of CD16+ and CD16- Mo, we examined transcriptional profiles of these Mo subsets in peripheral blood from healthy individuals. Of 16,328 expressed genes, 2,759 genes were differentially expressed and 228 and 250 were >2-fold upregulated and downregulated, respectively, in CD16+ compared to CD16- Mo. CD16+ Mo were distinguished by upregulation of transcripts for dendritic cell (DC) (SIGLEC10, CD43, RARA) and macrophage (MΦ) (CSF1R/CD115, MafB, CD97, C3aR) markers together with transcripts relevant for DC-T cell interaction (CXCL16, ICAM-2, LFA-1), cell activation (LTB, TNFRSF8, LST1, IFITM1-3, HMOX1, SOD-1, WARS, MGLL), and negative regulation of the cell cycle (CDKN1C, MTSS1), whereas CD16- Mo were distinguished by upregulation of transcripts for myeloid (CD14, MNDA, TREM1, CD1d, C1qR/CD93) and granulocyte markers (FPR1, GCSFR/CD114, S100A8-9/12). Differential expression of CSF1R, CSF3R, C1QR1, C3AR1, CD1d, CD43, CXCL16, and CX3CR1 was confirmed by flow cytometry. Furthermore, increased expression of RARA and KLF2 transcripts in CD16+ Mo coincided with absence of cell surface cutaneous lymphocyte associated antigen (CLA) expression, indicating potential imprinting for non-skin homing. These results suggest that CD16+ and CD16- Mo originate from a common myeloid precursor, with CD16+ Mo having a more MΦ- and DC-like transcription program suggesting a more advanced stage of differentiation. 
Distinct transcriptional programs, together with their recruitment into tissues via different mechanisms, also suggest that CD16+ and CD16- Mo give rise to functionally distinct DC and MΦ in vivo.
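The ">2-fold upregulated and downregulated" split described in the abstract above can be illustrated with a minimal sketch. It assumes linear-scale expression values per gene for each subset; the function name `classify_fold_change`, the 2-fold default and the toy numbers are hypothetical and are not taken from the paper, which also applied statistical testing that is not reproduced here.

```python
def classify_fold_change(expr_a, expr_b, threshold=2.0):
    """Split genes into up- and downregulated sets in A relative to B.

    expr_a, expr_b: dicts mapping gene symbol -> linear-scale expression.
    A gene is "up" when A/B > threshold and "down" when A/B < 1/threshold.
    Genes missing from either dict, or with non-positive values, are skipped.
    """
    up, down = [], []
    for gene in sorted(expr_a.keys() & expr_b.keys()):
        a, b = expr_a[gene], expr_b[gene]
        if a <= 0 or b <= 0:
            continue  # a ratio is undefined or meaningless here
        ratio = a / b
        if ratio > threshold:
            up.append(gene)
        elif ratio < 1.0 / threshold:
            down.append(gene)
    return up, down


# Toy values invented for illustration only (not measurements from the study).
cd16_pos = {"CX3CR1": 8.0, "CD14": 1.0, "ACTB": 5.0}
cd16_neg = {"CX3CR1": 2.0, "CD14": 6.0, "ACTB": 5.0}
up, down = classify_fold_change(cd16_pos, cd16_neg)
print(up, down)
```

A fold-change cutoff alone does not establish differential expression; in practice it is combined with a significance test across replicates.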

Journal ArticleDOI
TL;DR: CellMiner is a relational database tool for storing, querying, integrating, and downloading molecular profile data on the NCI-60 and other cancer cell types, and it provides a template for offering such functionality for other molecular profile data generated by academic institutions, public projects, or the private sector.
Abstract: Advances in the high-throughput omic technologies have made it possible to profile cells in a large number of ways at the DNA, RNA, protein, chromosomal, functional, and pharmacological levels. A persistent problem is that some classes of molecular data are labeled with gene identifiers, others with transcript or protein identifiers, and still others with chromosomal locations. What has lagged behind is the ability to integrate the resulting data to uncover complex relationships and patterns. Those issues are reflected in full form by molecular profile data on the panel of 60 diverse human cancer cell lines (the NCI-60) used since 1990 by the U.S. National Cancer Institute to screen compounds for anticancer activity. To our knowledge, CellMiner is the first online database resource for integration of the diverse molecular types of NCI-60 and related meta data. CellMiner enables scientists to perform advanced querying of molecular information on NCI-60 (and additional types) through a single web interface. CellMiner is a freely available tool that organizes and stores raw and normalized data that represent multiple types of molecular characterizations at the DNA, RNA, protein, and pharmacological levels. Annotations for each project, along with associated metadata on the samples and datasets, are stored in a MySQL database and linked to the molecular profile data. Data can be queried and downloaded along with comprehensive information on experimental and analytic methods for each data set. A Data Intersection tool allows selection of a list of genes (proteins) in common between two or more data sets and outputs the data for those genes (proteins) in the respective sets. In addition to its role as an integrative resource for the NCI-60, the CellMiner package also serves as a shell for incorporation of molecular profile data on other cell or tissue sample types. 
CellMiner is a relational database tool for storing, querying, integrating, and downloading molecular profile data on the NCI-60 and other cancer cell types. More broadly, it provides a template to use in providing such functionality for other molecular profile data generated by academic institutions, public projects, or the private sector. CellMiner is available online at http://discover.nci.nih.gov/cellminer/ .
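The idea behind the Data Intersection tool described above (select the genes common to two or more data sets and output the data for those genes from each set) can be sketched as follows. This is an illustration only, not CellMiner's implementation or API: the function name `intersect_datasets`, the in-memory dict layout and the example values are all assumptions.

```python
def intersect_datasets(datasets):
    """Return, for each data set, only the entries for genes shared by all.

    datasets: dict mapping data set name -> dict of gene symbol -> value.
    """
    if not datasets:
        return {}
    # Genes present in every data set.
    common = set.intersection(*(set(data) for data in datasets.values()))
    return {
        name: {gene: data[gene] for gene in sorted(common)}
        for name, data in datasets.items()
    }


# Hypothetical values, invented for illustration.
example = {
    "mrna": {"TP53": 1.2, "EGFR": 0.4, "MYC": 2.0},
    "protein": {"TP53": 0.9, "MYC": 1.1, "KRAS": 0.3},
}
shared = intersect_datasets(example)
print(shared)
```

A real tool would also have to reconcile gene, transcript and protein identifiers before intersecting, which is the identifier-mismatch problem the abstract raises.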

Journal ArticleDOI
TL;DR: Apart from providing novel insights into sex-specific recombination rates and patterns, the described maps – from a previously genomically uncharacterized superfamily (Corvidae) of passerine birds – provide new insights into avian genome evolution.
Abstract: Background: Genomic resources for the majority of free-living vertebrates of ecological and evolutionary importance are scarce. Therefore, linkage maps with high-density genome coverage are needed for progress in genomics of wild species. The Siberian jay (Perisoreus infaustus; Corvidae) is a passerine bird which has been the subject of extensive research in ecology and evolutionary biology. Knowledge of its genome structure and organization is required to advance our understanding of the genetic basis of ecologically important traits in this species, as well as to provide insights into avian genome evolution.

Journal ArticleDOI
TL;DR: Investigation of environmental and other conditions and identity of organisms that show dependence on Ni or Co revealed that host-associated organisms (particularly obligate intracellular parasites and endosymbionts) have a tendency for loss of Ni/Co utilization.
Abstract: Nickel (Ni) and cobalt (Co) are trace elements required for a variety of biological processes. Ni is directly coordinated by proteins, whereas Co is mainly used as a component of vitamin B12. Although a number of Ni and Co-dependent enzymes have been characterized, systematic evolutionary analyses of utilization of these metals are limited. We carried out comparative genomic analyses to examine occurrence and evolutionary dynamics of the use of Ni and Co at the level of (i) transport systems, and (ii) metalloproteomes. Our data show that both metals are widely used in bacteria and archaea. Cbi/NikMNQO is the most common prokaryotic Ni/Co transporter, while Ni-dependent urease and Ni-Fe hydrogenase, and B12-dependent methionine synthase (MetH), ribonucleotide reductase and methylmalonyl-CoA mutase are the most widespread metalloproteins for Ni and Co, respectively. Occurrence of other metalloenzymes showed a mosaic distribution and a new B12-dependent protein family was predicted. Deltaproteobacteria and Methanosarcina generally have larger Ni- and Co-dependent proteomes. On the other hand, utilization of these two metals is limited in eukaryotes, and very few of these organisms utilize both of them. The Ni-utilizing eukaryotes are mostly fungi (except saccharomycotina) and plants, whereas most B12-utilizing organisms are animals. The NiCoT transporter family is the most widespread eukaryotic Ni transporter, and eukaryotic urease and MetH are the most common Ni- and B12-dependent enzymes, respectively. Finally, investigation of environmental and other conditions and identity of organisms that show dependence on Ni or Co revealed that host-associated organisms (particularly obligate intracellular parasites and endosymbionts) have a tendency for loss of Ni/Co utilization. 
Our data provide information on the evolutionary dynamics of Ni and Co utilization and highlight widespread use of these metals in the three domains of life, yet only a limited number of user proteins.

Journal ArticleDOI
TL;DR: Comparative transcript profiling allowed the identification of differentially expressed genes with potential relevance in regulating the fruit metabolism and phenolic content during ripening and provided large scale information about the structure and putative function of gene transcripts accumulated during fruit development.
Abstract: Despite its primary economic importance, genomic information on olive tree is still lacking. 454 pyrosequencing was used to enrich the very few sequence data currently available for the Olea europaea species and to identify genes involved in expression of fruit quality traits. Fruits of Coratina, a widely cultivated variety characterized by a very high phenolic content, and Tendellone, an oleuropein-lacking natural variant, were used as starting material for monitoring the transcriptome. Four different cDNA libraries were sequenced, respectively at the beginning and at the end of drupe development. A total of 261,485 reads were obtained, for an output of about 58 Mb. Raw sequence data were processed using a four step pipeline procedure and data were stored in a relational database with a web interface. Massively parallel sequencing of different fruit cDNA collections has provided large scale information about the structure and putative function of gene transcripts accumulated during fruit development. Comparative transcript profiling allowed the identification of differentially expressed genes with potential relevance in regulating the fruit metabolism and phenolic content during ripening.

Journal ArticleDOI
TL;DR: Data support the strong similarities between human and canine osteosarcoma and underline the opportunities provided by a comparative oncology approach as a means to improve the understanding of cancer biology and therapies.
Abstract: Background: Pulmonary metastasis continues to be the most common cause of death in osteosarcoma. Indeed, the 5-year survival for newly diagnosed osteosarcoma patients has not significantly changed in over 20 years. Further understanding of the mechanisms of metastasis and resistance for this aggressive pediatric cancer is necessary. Pet dogs naturally develop osteosarcoma providing a novel opportunity to model metastasis development and progression. Given the accelerated biology of canine osteosarcoma, we hypothesized that a direct comparison of canine and pediatric osteosarcoma expression profiles may help identify novel metastasis-associated tumor targets that have been missed through the study of the human cancer alone.

Journal ArticleDOI
TL;DR: The nucleo-cytoplasmic class of sHsps, with 9 subfamilies, is more complex in rice than in Arabidopsis, and various sHsp genes were differentially upregulated at different developmental stages of the rice plant.
Abstract: Heat shock proteins (Hsps) constitute an important component in the heat shock response of all living systems. Among the various plant Hsps (i.e. Hsp100, Hsp90, Hsp70 and Hsp20), Hsp20 or small Hsps (sHsps) are expressed in maximal amounts under high temperature stress. The characteristic feature of the sHsps is the presence of α-crystallin domain (ACD) at the C-terminus. sHsps cooperate with Hsp100/Hsp70 and co-chaperones in ATP-dependent manner in preventing aggregation of cellular proteins and in their subsequent refolding. Database search was performed to investigate the sHsp gene family across rice genome sequence followed by comprehensive expression analysis of these genes. We identified 40 α-crystallin domain containing genes in rice. Phylogenetic analysis showed that 23 out of these 40 genes constitute sHsps. The additional 17 genes containing ACD clustered with Acd proteins of Arabidopsis. Detailed scrutiny of 23 sHsp sequences enabled us to categorize these proteins in a revised scheme of classification constituting of 16 cytoplasmic/nuclear, 2 ER, 3 mitochondrial, 1 plastid and 1 peroxisomal genes. In the new classification proposed herein nucleo-cytoplasmic class of sHsps with 9 subfamilies is more complex in rice than in Arabidopsis. Strikingly, 17 of 23 rice sHsp genes were noted to be intronless. Expression analysis based on microarray and RT-PCR showed that 19 sHsp genes were upregulated by high temperature stress. Besides heat stress, expression of sHsp genes was up or downregulated by other abiotic and biotic stresses. In addition to stress regulation, various sHsp genes were differentially upregulated at different developmental stages of the rice plant. Majority of sHsp genes were expressed in seed. We identified twenty three sHsp genes and seventeen Acd genes in rice. Three nucleocytoplasmic sHsp genes were found only in monocots. 
Analysis of expression profiling of sHsp genes revealed that these genes are differentially expressed under stress and at different stages in the life cycle of rice plant.

Journal ArticleDOI
TL;DR: This work has extended the analysis of histone modifications to gene deserts, pericentromeres and subtelomeres and found that each of these non-genic regions has a particular profile of histone modifications that distinguishes it from the other non-coding regions.
Abstract: Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) has recently been used to identify the modification patterns for the methylation and acetylation of many different histone tails in genes and enhancers. We have extended the analysis of histone modifications to gene deserts, pericentromeres and subtelomeres. Using data from human CD4+ T cells, we have found that each of these non-genic regions has a particular profile of histone modifications that distinguish it from the other non-coding regions. Different methylation states of H4K20, H3K9 and H3K27 were found to be enriched in each region relative to the other regions. These findings indicate that non-genic regions of the genome are variable with respect to histone modification patterns, rather than being monolithic. We furthermore used consensus sequences for unassembled centromeres and telomeres to identify the significant histone modifications in these regions. Finally, we compared the modification patterns in non-genic regions to those at silent genes and genes with higher levels of expression. For all tested methylations with the exception of H3K27me3, the enrichment level of each modification state for silent genes is between that of non-genic regions and expressed genes. For H3K27me3, the highest levels are found in silent genes. In addition to the histone modification pattern difference between euchromatin and heterochromatin regions, as is illustrated by the enrichment of H3K9me2/3 in non-genic regions while H3K9me1 is enriched at active genes; the chromatin modifications within non-genic (heterochromatin-like) regions (e.g. subtelomeres, pericentromeres and gene deserts) are also quite different.

Journal ArticleDOI
TL;DR: The generated set of chickpea ESTs serves as a resource of high-quality transcripts for gene discovery and for the development of functional markers associated with abiotic stress tolerance, which will help to facilitate chickpea breeding.
Abstract: Chickpea (Cicer arietinum L.), an important grain legume crop of the world is seriously challenged by terminal drought and salinity stresses. However, very limited number of molecular markers and candidate genes are available for undertaking molecular breeding in chickpea to tackle these stresses. This study reports generation and analysis of comprehensive resource of drought- and salinity-responsive expressed sequence tags (ESTs) and gene-based markers. A total of 20,162 (18,435 high quality) drought- and salinity- responsive ESTs were generated from ten different root tissue cDNA libraries of chickpea. Sequence editing, clustering and assembly analysis resulted in 6,404 unigenes (1,590 contigs and 4,814 singletons). Functional annotation of unigenes based on BLASTX analysis showed that 46.3% (2,965) had significant similarity (≤1E-05) to sequences in the non-redundant UniProt database. BLASTN analysis of unique sequences with ESTs of four legume species (Medicago, Lotus, soybean and groundnut) and three model plant species (rice, Arabidopsis and poplar) provided insights on conserved genes across legumes as well as novel transcripts for chickpea. Of 2,965 (46.3%) significant unigenes, only 2,071 (32.3%) unigenes could be functionally categorised according to Gene Ontology (GO) descriptions. A total of 2,029 sequences containing 3,728 simple sequence repeats (SSRs) were identified and 177 new EST-SSR markers were developed. Experimental validation of a set of 77 SSR markers on 24 genotypes revealed 230 alleles with an average of 4.6 alleles per marker and average polymorphism information content (PIC) value of 0.43. Besides SSR markers, 21,405 high confidence single nucleotide polymorphisms (SNPs) in 742 contigs (with ≥ 5 ESTs) were also identified. Recognition sites for restriction enzymes were identified for 7,884 SNPs in 240 contigs. 
Hierarchical clustering of 105 selected contigs provided clues about stress- responsive candidate genes and their expression profile showed predominance in specific stress-challenged libraries. Generated set of chickpea ESTs serves as a resource of high quality transcripts for gene discovery and development of functional markers associated with abiotic stress tolerance that will be helpful to facilitate chickpea breeding. Mapping of gene-based markers in chickpea will also add more anchoring points to align genomes of chickpea and other legume species.
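The abstract above reports an average polymorphism information content (PIC) for the validated SSR markers without stating how it was computed. A minimal sketch, assuming the commonly used Botstein et al. (1980) definition, PIC = 1 − Σ pᵢ² − Σ_{i<j} 2pᵢ²pⱼ², where pᵢ is the frequency of allele i at a marker; the function name `pic` and the example inputs are hypothetical.

```python
from itertools import combinations


def pic(allele_counts):
    """Polymorphism information content for one marker.

    allele_counts: allele counts or frequencies (any non-negative numbers);
    they are normalised to sum to 1 before use.
    """
    total = sum(allele_counts)
    if total <= 0:
        raise ValueError("allele counts must sum to a positive number")
    p = [c / total for c in allele_counts]
    sum_sq = sum(x * x for x in p)
    # Correction term summed over every unordered pair of alleles.
    cross = sum(2 * (a * a) * (b * b) for a, b in combinations(p, 2))
    return 1.0 - sum_sq - cross


# Two equally frequent alleles: 1 - 0.5 - 0.125
print(pic([0.5, 0.5]))
```

Averaging `pic` over all polymorphic markers would give a summary figure of the kind quoted in the abstract, assuming the authors used this definition.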

Journal ArticleDOI
TL;DR: This method utilizes a cDNA library normalization step to diminish the representation of highly expressed transcripts and biology-oriented bioinformatic analyses to facilitate detection of rare and novel transcripts, revealing many hitherto unknown transcripts, splice isoforms, gene fusion events and ncRNAs.
Abstract: The cancer transcriptome is difficult to explore due to the heterogeneity of quantitative and qualitative changes in gene expression linked to the disease status. An increasing number of "unconventional" transcripts, such as novel isoforms, non-coding RNAs, somatic gene fusions and deletions have been associated with the tumoral state. Massively parallel sequencing techniques provide a framework for exploring the transcriptional complexity inherent to cancer with a limited laboratory and financial effort. We developed a deep sequencing and bioinformatics analysis protocol to investigate the molecular composition of a breast cancer poly(A)+ transcriptome. This method utilizes a cDNA library normalization step to diminish the representation of highly expressed transcripts and biology-oriented bioinformatic analyses to facilitate detection of rare and novel transcripts. We analyzed over 132,000 Roche 454 high-confidence deep sequencing reads from a primary human lobular breast cancer tissue specimen, and detected a range of unusual transcriptional events that were subsequently validated by RT-PCR in additional eight primary human breast cancer samples. We identified and validated one deletion, two novel ncRNAs (one intergenic and one intragenic), ten previously unknown or rare transcript isoforms and a novel gene fusion specific to a single primary tissue sample. We also explored the non-protein-coding portion of the breast cancer transcriptome, identifying thousands of novel non-coding transcripts and more than three hundred reads corresponding to the non-coding RNA MALAT1, which is highly expressed in many human carcinomas. 
Our results demonstrate that combining 454 deep sequencing with a normalization step and careful bioinformatic analysis facilitates the discovery and quantification of rare transcripts or ncRNAs, and can be used as a qualitative tool to characterize transcriptome complexity, revealing many hitherto unknown transcripts, splice isoforms, gene fusion events and ncRNAs, even at a relatively low sequence sampling.

Journal ArticleDOI
TL;DR: Silkworms possess a number of OBP genes similar to that of other insects, and their expression profiles suggest that many OBPs may be involved in olfaction and gustation as well as acting as general carriers of hydrophobic molecules.
Abstract: Chemosensory systems play key roles in the survival and reproductive success of insects. Insect chemoreception is mediated by two large and diverse gene superfamilies, chemoreceptors and odorant binding proteins (OBPs). OBPs are believed to transport hydrophobic odorants from the environment to the olfactory receptors. We identified a family of OBP-like genes in the silkworm genome and characterized their expression using oligonucleotide microarrays. A total of forty-four OBP genes were annotated, a number comparable to the 57 OBPs known from Anopheles gambiae and 51 from Drosophila melanogaster. As seen in other fully sequenced insect genomes, most silkworm OBP genes are present in large clusters. We defined six subfamilies of OBPs, each of which shows lineage-specific expansion and diversification. EST data and OBP expression profiles from multiple larvae tissues of day three fifth instars demonstrated that many OBPs are expressed in chemosensory-specific tissues although some OBPs are expressed ubiquitously and others exclusively in non-chemosensory tissues. Some atypical OBPs are expressed throughout development. These results reveal that, although many OBPs are chemosensory-specific, others may have more general physiological roles. Silkworms possess a number of OBPs genes similar to other insects. Their expression profiles suggest that many OBPs may be involved in olfaction and gustation as well as general carriers of hydrophobic molecules. The expansion of OBP gene subfamilies and sequence divergence indicate that the silkworm OBP family acquired functional diversity concurrently with functional constraints. Further investigation of the OBPs of the silkworm could give insights in the roles of OBPs in chemoreception.

Journal ArticleDOI
TL;DR: NG sequencing technologies are a highly flexible set of platforms that can be scaled to suit different project goals and suggest a combination of FLX and Solexa sequencing for optimal transcriptome coverage at modest cost.
Abstract: Background: We have developed a simulation approach to help determine the optimal mixture of sequencing methods for most complete and cost effective transcriptome sequencing. We compared simulation results for traditional capillary sequencing with "Next Generation" (NG) ultra high-throughput technologies. The simulation model was parameterized using mappings of 130,000 cDNA sequence reads to the Arabidopsis genome (NCBI Accession SRA008180.19). We also generated 454-GS20 sequences and de novo assemblies for the basal eudicot California poppy (Eschscholzia californica) and the magnoliid avocado (Persea americana) using a variety of methods for cDNA synthesis.

Journal ArticleDOI
TL;DR: It is suggested for the first time, that CKD/HD patients may have an impaired mitochondrial respiratory system and this condition may be both the consequence and the cause of an enhanced oxidative stress.
Abstract: Chronic renal disease (CKD) is characterized by complex changes in cell metabolism leading to an increased production of oxygen radicals, which, in turn, has been suggested to play a key role in numerous clinical complications of this pathological condition. Several reports have focused on the identification of biological elements involved in the development of systemic biochemical alterations in CKD, but this abundant literature remains fragmented and incomplete. To better define the cellular machinery associated with this condition, we employed a high-throughput genomic approach based on a whole transcriptomic analysis associated with classical molecular methodologies. The genomic screening of peripheral blood mononuclear cells revealed that 44 genes were up-regulated in both CKD patients in conservative treatment (CKD, n = 9) and hemodialysis (HD, n = 17) compared to healthy subjects (HS, n = 8) (p < 0.001, FDR = 1%). Functional analysis demonstrated that 11/44 genes were involved in the oxidative phosphorylation system. Western blotting for COXI and COXIV, key constituents of the complex IV of oxidative phosphorylation system, performed on an independent testing-group (12 healthy subjects, 10 CKD and 14 HD) confirmed a higher synthesis of these subunits in CKD/HD patients compared to the control group. Only for COXI did the comparison between CKD and healthy subjects reach statistical significance. However, complex IV activity was significantly reduced in CKD/HD patients compared to healthy subjects (p < 0.01). Finally, CKD/HD patients presented higher reactive oxygen species and 8-hydroxydeoxyguanosine levels compared to controls. Taken together, these results suggest, for the first time, that CKD/HD patients may have an impaired mitochondrial respiratory system and this condition may be both the consequence and the cause of an enhanced oxidative stress.

Journal ArticleDOI
TL;DR: Maternal HF feeding during pregnancy and lactation induced co-ordinated and long-lasting changes in expression of Igf2, fat metabolic genes and several important miRNAs in the offspring.
Abstract: miRNAs play important roles in the regulation of gene functions. Maternal dietary modifications during pregnancy and gestation have long-term effects on the offspring, but it is not known whether a maternal high fat (HF) diet during pregnancy and lactation alters expression of key miRNAs in the offspring. We studied the effects of maternal HF diet on the adult offspring by feeding mice with either a HF or a chow diet prior to conception, during pregnancy and lactation, and all offspring were weaned onto the same chow diet until adulthood. Maternal HF fed offspring had markedly increased hepatic mRNA levels of peroxisome proliferator activated receptor-alpha (ppar-alpha) and carnitine palmitoyl transferase-1a (cpt-1a) as well as insulin like growth factor-2 (Igf2). A HF diet induced up-regulation of ppar-alpha and cpt-1a expression in the wild type but not in Igf2 knock out mice. Furthermore, hepatic expression of let-7c was also reduced in maternal HF fed offspring. Among 579 miRNAs measured with microarray, ~23 miRNA levels were reduced by ~1.5-4.9-fold. Reduced expression of miR-709 (a highly expressed miRNA), miR-122, miR-192, miR-194, miR-26a, let-7a, let7b and let-7c, miR-494 and miR-483* (reduced by ~4.9 fold) was validated by qPCR. We found that methyl-CpG binding protein 2 was the common predicted target for miR-709, miR-let7s, miR-122, miR-194 and miR-26a using our own purpose-built computer program. Maternal HF feeding during pregnancy and lactation induced co-ordinated and long-lasting changes in expression of Igf2, fat metabolic genes and several important miRNAs in the offspring.

Journal ArticleDOI
TL;DR: A comprehensive survey of genes expressed in glandular trichomes will facilitate new gene discovery and shed light on the regulatory mechanism of artemisinin metabolism and trichome function in A. annua.
Abstract: Glandular trichomes produce a wide variety of commercially important secondary metabolites in many plant species. The most prominent anti-malarial drug artemisinin, a sesquiterpene lactone, is produced in glandular trichomes of Artemisia annua. However, only limited genomic information is currently available in this non-model plant species. We present a global characterization of A. annua glandular trichome transcriptome using 454 pyrosequencing. Sequencing runs using two normalized cDNA collections from glandular trichomes yielded 406,044 expressed sequence tags (average length = 210 nucleotides), which assembled into 42,678 contigs and 147,699 singletons. Performing a second sequencing run only increased the number of genes identified by ~30%, indicating that massively parallel pyrosequencing provides deep coverage of the A. annua trichome transcriptome. By BLAST search against the NCBI non-redundant protein database, putative functions were assigned to over 28,573 unigenes, including previously undescribed enzymes likely involved in sesquiterpene biosynthesis. Comparison with ESTs derived from trichome collections of other plant species revealed expressed genes in common functional categories across different plant species. RT-PCR analysis confirmed the expression of selected unigenes and novel transcripts in A. annua glandular trichomes. The presence of contigs corresponding to enzymes for terpenoids and flavonoids biosynthesis suggests important metabolic activity in A. annua glandular trichomes. Our comprehensive survey of genes expressed in glandular trichome will facilitate new gene discovery and shed light on the regulatory mechanism of artemisinin metabolism and trichome function in A. annua.