
Showing papers on "Gene" published in 2007


Journal ArticleDOI
TL;DR: It is concluded that ANI can accurately replace DDH values for strains for which genome sequences are available and reveal extensive gene diversity within the current concept of "species".
Abstract: DNA-DNA hybridization (DDH) values have been used by bacterial taxonomists since the 1960s to determine relatedness between strains and are still the most important criterion in the delineation of bacterial species. Since the extent of hybridization between a pair of strains is ultimately governed by their respective genomic sequences, we examined the quantitative relationship between DDH values and genome sequence-derived parameters, such as the average nucleotide identity (ANI) of common genes and the percentage of conserved DNA. A total of 124 DDH values were determined for 28 strains for which genome sequences were available. The strains belong to six important and diverse groups of bacteria for which the intra-group 16S rRNA gene sequence identity was greater than 94 %. The results revealed a close relationship between DDH values and ANI and between DNA-DNA hybridization and the percentage of conserved DNA for each pair of strains. The recommended cut-off point of 70 % DDH for species delineation corresponded to 95 % ANI and 69 % conserved DNA. When the analysis was restricted to the protein-coding portion of the genome, 70 % DDH corresponded to 85 % conserved genes for a pair of strains. These results reveal extensive gene diversity within the current concept of "species". Examination of reciprocal values indicated that the level of experimental error associated with the DDH method is too high to reveal the subtle differences in genome size among the strains sampled. It is concluded that ANI can accurately replace DDH values for strains for which genome sequences are available.

3,471 citations


Journal ArticleDOI
TL;DR: This paper found that essential human genes are likely to encode hub proteins and are expressed widely in most tissues, while the vast majority of disease genes are non-essential and show no tendency to encode hub proteins, and their expression pattern indicates that they are localized in the functional periphery of the network.
Abstract: A network of disorders and disease genes linked by known disorder-gene associations offers a platform to explore in a single graph-theoretic framework all known phenotype and disease gene associations, indicating the common genetic origin of many diseases. Genes associated with similar disorders show both higher likelihood of physical interactions between their products and higher expression profiling similarity for their transcripts, supporting the existence of distinct disease-specific functional modules. We find that essential human genes are likely to encode hub proteins and are expressed widely in most tissues. This suggests that disease genes also would play a central role in the human interactome. In contrast, we find that the vast majority of disease genes are nonessential and show no tendency to encode hub proteins, and their expression pattern indicates that they are localized in the functional periphery of the network. A selection-based model explains the observed difference between essential and disease genes and also suggests that diseases caused by somatic mutations should not be peripheral, a prediction we confirm for cancer genes.

2,793 citations


Journal Article
TL;DR: In this paper, the coding exons of the family of 518 protein kinases were sequenced in 210 cancers of diverse histological types to explore the nature of the information that will be derived from cancer genome sequencing.
Abstract: AACR Centennial Conference: Translational Cancer Medicine, Nov 4-8, 2007; Singapore. PL02-05. All cancers are due to abnormalities in DNA. The availability of the human genome sequence has led to the proposal that resequencing of cancer genomes will reveal the full complement of somatic mutations and hence all the cancer genes. To explore the nature of the information that will be derived from cancer genome sequencing we have sequenced the coding exons of the family of 518 protein kinases, ~1.3 Mb DNA per cancer sample, in 210 cancers of diverse histological types. Despite the screen being directed toward the coding regions of a gene family that has previously been strongly implicated in oncogenesis, the results indicate that the majority of somatic mutations detected are “passengers”. There is considerable variation in the number and pattern of these mutations between individual cancers, indicating substantial diversity of processes of molecular evolution between cancers. The imprints of exogenous mutagenic exposures, mutagenic treatment regimes and DNA repair defects can all be seen in the distinctive mutational signatures of individual cancers. This systematic mutation screen and others have previously yielded a number of cancer genes that are frequently mutated in one or more cancer types and which are now anticancer drug targets (for example BRAF, PIK3CA, and EGFR). However, detailed analyses of the data from our screen additionally suggest that there exist a large number of additional “driver” mutations which are distributed across a substantial number of genes. It therefore appears that cells may be able to utilise mutations in a large repertoire of potential cancer genes to acquire the neoplastic phenotype. However, many of these genes are employed only infrequently. These findings may have implications for future anticancer drug development.

2,737 citations


Journal ArticleDOI
12 Jul 2007-Nature
TL;DR: The generation and validation of a genome-wide library of Drosophila melanogaster RNAi transgenes is reported, enabling the conditional inactivation of gene function in specific tissues of the intact organism and opening up the prospect of systematically analysing gene functions in any tissue and at any stage of the Drosophila lifespan.
Abstract: Forward genetic screens in model organisms have provided important insights into numerous aspects of development, physiology and pathology. With the availability of complete genome sequences and the introduction of RNA-mediated gene interference (RNAi), systematic reverse genetic screens are now also possible. Until now, such genome-wide RNAi screens have mostly been restricted to cultured cells and ubiquitous gene inactivation in Caenorhabditis elegans. This powerful approach has not yet been applied in a tissue-specific manner. Here we report the generation and validation of a genome-wide library of Drosophila melanogaster RNAi transgenes, enabling the conditional inactivation of gene function in specific tissues of the intact organism. Our RNAi transgenes consist of short gene fragments cloned as inverted repeats and expressed using the binary GAL4/UAS system. We generated 22,270 transgenic lines, covering 88% of the predicted protein-coding genes in the Drosophila genome. Molecular and phenotypic assays indicate that the majority of these transgenes are functional. Our transgenic RNAi library thus opens up the prospect of systematically analysing gene functions in any tissue and at any stage of the Drosophila lifespan.

2,721 citations


Journal ArticleDOI
Sabeeha S. Merchant1, Simon E. Prochnik2, Olivier Vallon3, Elizabeth H. Harris4, Steven J. Karpowicz1, George B. Witman5, Astrid Terry2, Asaf Salamov2, Lillian K. Fritz-Laylin6, Laurence Maréchal-Drouard7, Wallace F. Marshall8, Liang-Hu Qu9, David R. Nelson10, Anton A. Sanderfoot11, Martin H. Spalding12, Vladimir V. Kapitonov13, Qinghu Ren, Patrick J. Ferris14, Erika Lindquist2, Harris Shapiro2, Susan Lucas2, Jane Grimwood15, Jeremy Schmutz15, Pierre Cardol16, Pierre Cardol3, Heriberto Cerutti17, Guillaume Chanfreau1, Chun-Long Chen9, Valérie Cognat7, Martin T. Croft18, Rachel M. Dent6, Susan K. Dutcher19, Emilio Fernández20, Hideya Fukuzawa21, David González-Ballester22, Diego González-Halphen23, Armin Hallmann, Marc Hanikenne16, Michael Hippler24, William Inwood6, Kamel Jabbari25, Ming Kalanon26, Richard Kuras3, Paul A. Lefebvre11, Stéphane D. Lemaire27, Alexey V. Lobanov17, Martin Lohr28, Andrea L Manuell29, Iris Meier30, Laurens Mets31, Maria Mittag32, Telsa M. Mittelmeier33, James V. Moroney34, Jeffrey L. Moseley22, Carolyn A. Napoli33, Aurora M. Nedelcu35, Krishna K. Niyogi6, Sergey V. Novoselov17, Ian T. Paulsen, Greg Pazour5, Saul Purton36, Jean-Philippe Ral7, Diego Mauricio Riaño-Pachón37, Wayne R. Riekhof, Linda A. Rymarquis38, Michael Schroda, David B. Stern39, James G. Umen14, Robert D. Willows40, Nedra F. Wilson41, Sara L. Zimmer39, Jens Allmer42, Janneke Balk18, Katerina Bisova43, Chong-Jian Chen9, Marek Eliáš44, Karla C Gendler33, Charles R. Hauser45, Mary Rose Lamb46, Heidi K. Ledford6, Joanne C. Long1, Jun Minagawa47, M. Dudley Page1, Junmin Pan48, Wirulda Pootakham22, Sanja Roje49, Annkatrin Rose50, Eric Stahlberg30, Aimee M. Terauchi1, Pinfen Yang51, Steven G. Ball7, Chris Bowler25, Carol L. Dieckmann33, Vadim N. Gladyshev17, Pamela J. Green38, Richard A. Jorgensen33, Stephen P. Mayfield29, Bernd Mueller-Roeber37, Sathish Rajamani30, Richard T. Sayre30, Peter Brokstein2, Inna Dubchak2, David Goodstein2, Leila Hornick2, Y. Wayne Huang2, Jinal Jhaveri2, Yigong Luo2, Diego Martinez2, Wing Chi Abby Ngau2, Bobby Otillar2, Alexander Poliakov2, Aaron Porter2, Lukasz Szajkowski2, Gregory Werner2, Kemin Zhou2, Igor V. Grigoriev2, Daniel S. Rokhsar2, Daniel S. Rokhsar6, Arthur R. Grossman22
University of California, Los Angeles1, United States Department of Energy2, University of Paris3, Duke University4, University of Massachusetts Medical School5, University of California, Berkeley6, Centre national de la recherche scientifique7, University of California, San Francisco8, Sun Yat-sen University9, University of Tennessee Health Science Center10, University of Minnesota11, Iowa State University12, Genetic Information Research Institute13, Salk Institute for Biological Studies14, Stanford University15, University of Liège16, University of Nebraska–Lincoln17, University of Cambridge18, Washington University in St. Louis19, University of Córdoba (Spain)20, Kyoto University21, Carnegie Institution for Science22, National Autonomous University of Mexico23, University of Münster24, École Normale Supérieure25, University of Melbourne26, University of Paris-Sud27, University of Mainz28, Scripps Research Institute29, Ohio State University30, University of Chicago31, University of Jena32, University of Arizona33, Louisiana State University34, University of New Brunswick35, University College London36, University of Potsdam37, Delaware Biotechnology Institute38, Boyce Thompson Institute for Plant Research39, Macquarie University40, Oklahoma State University Center for Health Sciences41, İzmir University of Economics42, Academy of Sciences of the Czech Republic43, Charles University in Prague44, St. Edward's University45, University of Puget Sound46, Hokkaido University47, Tsinghua University48, Washington State University49, Appalachian State University50, Marquette University51
12 Oct 2007-Science
TL;DR: Analyses of the Chlamydomonas genome advance the understanding of the ancestral eukaryotic cell, reveal previously unknown genes associated with photosynthetic and flagellar functions, and establish links between ciliopathy and the composition and function of flagella.
Abstract: Chlamydomonas reinhardtii is a unicellular green alga whose lineage diverged from land plants over 1 billion years ago. It is a model system for studying chloroplast-based photosynthesis, as well as the structure, assembly, and function of eukaryotic flagella (cilia), which were inherited from the common ancestor of plants and animals, but lost in land plants. We sequenced the approximately 120-megabase nuclear genome of Chlamydomonas and performed comparative phylogenomic analyses, identifying genes encoding uncharacterized proteins that are likely associated with the function and biogenesis of chloroplasts or eukaryotic flagella. Analyses of the Chlamydomonas genome advance our understanding of the ancestral eukaryotic cell, reveal previously unknown genes associated with photosynthetic and flagellar functions, and establish links between ciliopathy and the composition and function of flagella.

2,554 citations


Journal ArticleDOI
26 Jan 2007-Science
TL;DR: It is hypothesized that the presence of a rare codon, marked by the synonymous polymorphism, affects the timing of cotranslational folding and insertion of P-gp into the membrane, thereby altering the structure of substrate and inhibitor interaction sites.
Abstract: Synonymous single-nucleotide polymorphisms (SNPs) do not produce altered coding sequences, and therefore they are not expected to change the function of the protein in which they occur. We report that a synonymous SNP in the Multidrug Resistance 1 (MDR1) gene, part of a haplotype previously linked to altered function of the MDR1 gene product P-glycoprotein (P-gp), nonetheless results in P-gp with altered drug and inhibitor interactions. Similar mRNA and protein levels, but altered conformations, were found for wild-type and polymorphic P-gp. We hypothesize that the presence of a rare codon, marked by the synonymous polymorphism, affects the timing of cotranslational folding and insertion of P-gp into the membrane, thereby altering the structure of substrate and inhibitor interaction sites.

2,480 citations


Journal ArticleDOI
TL;DR: The DAVID Gene Functional Classification Tool uses a novel agglomeration algorithm to condense a list of genes or associated biological terms into organized classes of related genes or biology, called biological modules, for efficient interpretation of gene lists in a network context.
Abstract: The DAVID Gene Functional Classification Tool (http://david.abcc.ncifcrf.gov) uses a novel agglomeration algorithm to condense a list of genes or associated biological terms into organized classes of related genes or biology, called biological modules. This organization is accomplished by mining the complex biological co-occurrences found in multiple sources of functional annotation. It is a powerful method to group functionally related genes and terms into a manageable number of biological modules for efficient interpretation of gene lists in a network context.

2,067 citations


Journal ArticleDOI
TL;DR: MicroRNA sponges are competitive inhibitors: transcripts expressed from strong promoters that contain multiple, tandem binding sites to a microRNA of interest. They specifically inhibit microRNAs with a complementary heptameric seed, such that a single sponge can be used to block an entire microRNA seed family.
Abstract: MicroRNAs are predicted to regulate thousands of mammalian genes, but relatively few targets have been experimentally validated and few microRNA loss-of-function phenotypes have been assigned. As an alternative to chemically modified antisense oligonucleotides, we developed microRNA inhibitors that can be expressed in cells, as RNAs produced from transgenes. Termed 'microRNA sponges', these competitive inhibitors are transcripts expressed from strong promoters, containing multiple, tandem binding sites to a microRNA of interest. When vectors encoding these sponges are transiently transfected into cultured cells, sponges derepress microRNA targets at least as strongly as chemically modified antisense oligonucleotides. They specifically inhibit microRNAs with a complementary heptameric seed, such that a single sponge can be used to block an entire microRNA seed family. RNA polymerase II promoter (Pol II)-driven sponges contain a fluorescence reporter gene for identification and sorting of sponge-treated cells. We envision the use of stably expressed sponges in animal models of disease and development.

2,054 citations


Journal ArticleDOI
TL;DR: The capabilities of GOstats, a Bioconductor package written in R, that allows users to test GO terms for over or under-representation using either a classical hypergeometric test or a conditional hypergeometric that uses the relationships among GO terms to decorrelate the results are discussed.
Abstract: Motivation: Functional analyses based on the association of Gene Ontology (GO) terms to genes in a selected gene list are useful bioinformatic tools and the GOstats package has been widely used to perform such computations. In this paper we report significant improvements and extensions such as support for conditional testing. Results: We discuss the capabilities of GOstats, a Bioconductor package written in R, that allows users to test GO terms for over or under-representation using either a classical hypergeometric test or a conditional hypergeometric that uses the relationships among GO terms to decorrelate the results. Availability: GOstats is available as an R package from the Bioconductor project: http://bioconductor.org Contact: [email protected]

1,890 citations
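
A note on the GOstats entry above: the classical over- or under-representation test it mentions reduces to a hypergeometric tail probability. The Python sketch below is illustrative only. It is not GOstats code (GOstats is an R package), it does not implement the conditional variant, and the function names and toy numbers are assumptions made for this example.

```python
from math import comb


def _ways(universe, term, selected, k):
    """Number of ways to pick `selected` genes so that exactly k carry the term."""
    return comb(term, k) * comb(universe - term, selected - k)


def over_representation_p(universe, term, selected, overlap):
    """P(X >= overlap): chance of seeing at least `overlap` annotated genes.

    universe: number of genes that could have been selected
    term:     number of genes in the universe annotated with the GO term
    selected: size of the selected gene list
    overlap:  number of selected genes annotated with the term
    """
    upper = min(term, selected)
    tail = sum(_ways(universe, term, selected, k) for k in range(overlap, upper + 1))
    return tail / comb(universe, selected)


def under_representation_p(universe, term, selected, overlap):
    """P(X <= overlap): chance of seeing at most `overlap` annotated genes."""
    upper = min(overlap, term, selected)
    tail = sum(_ways(universe, term, selected, k) for k in range(0, upper + 1))
    return tail / comb(universe, selected)


if __name__ == "__main__":
    # Toy numbers (hypothetical): 10 genes in the universe, 4 annotated,
    # 3 selected, 2 of the selected genes annotated.
    print(over_representation_p(10, 4, 3, 2))
    print(under_representation_p(10, 4, 3, 2))
```

Exact integer arithmetic with `math.comb` keeps the sketch dependency-free; for gene universes of realistic size a statistics library would usually be the more practical choice.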


Journal ArticleDOI
TL;DR: The expanded DAVID Knowledgebase now integrates almost all major and well-known public bioinformatics resources centralized by the DAVID Gene Concept, a single-linkage method to agglomerate tens of millions of diverse gene/protein identifiers and annotation terms from a variety of public bioinformatics databases.
Abstract: All tools in the DAVID Bioinformatics Resources aim to provide functional interpretation of large lists of genes derived from genomic studies. The newly updated DAVID Bioinformatics Resources consists of the DAVID Knowledgebase and five integrated, web-based functional annotation tool suites: the DAVID Gene Functional Classification Tool, the DAVID Functional Annotation Tool, the DAVID Gene ID Conversion Tool, the DAVID Gene Name Viewer and the DAVID NIAID Pathogen Genome Browser. The expanded DAVID Knowledgebase now integrates almost all major and well-known public bioinformatics resources centralized by the DAVID Gene Concept, a single-linkage method to agglomerate tens of millions of diverse gene/protein identifiers and annotation terms from a variety of public bioinformatics databases. For any uploaded gene list, the DAVID Resources now provides not only the typical gene-term enrichment analysis, but also new tools and functions that allow users to condense large gene lists into gene functional groups, convert between gene/protein identifiers, visualize many-genes-to-many-terms relationships, cluster redundant and heterogeneous terms into groups, search for interesting and related genes or terms, dynamically view genes from their lists on bio-pathways and more. With DAVID (http://david.niaid.nih.gov), investigators gain more power to interpret the biological mechanisms associated with large gene lists.

1,842 citations
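
A note on the DAVID entry above: the abstract names a single-linkage method for agglomerating identifiers but gives no implementation details. One common way to realise single-linkage grouping over pairwise cross-references is a union-find structure, sketched below in Python. This is an assumption about the general technique, not DAVID's actual code, and the identifiers in the example are made up.

```python
from collections import defaultdict


def single_linkage_groups(links):
    """Group identifiers that are connected by any chain of pairwise links.

    links: iterable of (identifier_a, identifier_b) cross-references.
    Returns a sorted list of sorted identifier lists, one per group.
    """
    parent = {}

    def find(x):
        # Register unseen identifiers as their own root, then walk to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups = defaultdict(set)
    for identifier in list(parent):
        groups[find(identifier)].add(identifier)
    return sorted(sorted(group) for group in groups.values())


if __name__ == "__main__":
    # Hypothetical cross-references between database identifiers.
    example_links = [
        ("entrez:G1", "uniprot:P1"),
        ("uniprot:P1", "refseq:R1"),
        ("entrez:G2", "uniprot:P2"),
    ]
    for group in single_linkage_groups(example_links):
        print(group)
```

Under single linkage one shared identifier is enough to merge two groups, so a single erroneous cross-reference can fuse unrelated genes; that trade-off is inherent to the approach rather than to this sketch.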


Journal ArticleDOI
TL;DR: It is demonstrated that ATG16L1 is expressed in intestinal epithelial cell lines and that functional knockdown of this gene abrogates autophagy of Salmonella typhimurium, and these findings suggest that autophagy and host cell responses to intracellular microbes are involved in the pathogenesis of Crohn disease.
Abstract: We present a genome-wide association study of ileal Crohn disease and two independent replication studies that identify several new regions of association to Crohn disease. Specifically, in addition to the previously established CARD15 and IL23R associations, we identified strong and significantly replicated associations (combined P < 10⁻¹⁰) with an intergenic region on 10q21.1 and a coding variant in ATG16L1, the latter of which was also recently reported by another group. We also report strong associations with independent replication to variation in the genomic regions encoding PHOX2B, NCF4 and a predicted gene on 16q24.1 (FAM92B). Finally, we demonstrate that ATG16L1 is expressed in intestinal epithelial cell lines and that functional knockdown of this gene abrogates autophagy of Salmonella typhimurium. Together, these findings suggest that autophagy and host cell responses to intracellular microbes are involved in the pathogenesis of Crohn disease.

Journal ArticleDOI
12 Apr 2007-Nature
TL;DR: It is suggested that direct disruption of pathways controlling B-cell development and differentiation contributes to B-progenitor ALL pathogenesis, and the data demonstrate the power of high-resolution, genome-wide approaches to identify new molecular lesions in cancer.
Abstract: Chromosomal aberrations are a hallmark of acute lymphoblastic leukaemia (ALL) but alone fail to induce leukaemia. To identify cooperating oncogenic lesions, we performed a genome-wide analysis of leukaemic cells from 242 paediatric ALL patients using high-resolution, single-nucleotide polymorphism arrays and genomic DNA sequencing. Our analyses revealed deletion, amplification, point mutation and structural rearrangement in genes encoding principal regulators of B lymphocyte development and differentiation in 40% of B-progenitor ALL cases. The PAX5 gene was the most frequent target of somatic mutation, being altered in 31.7% of cases. The identified PAX5 mutations resulted in reduced levels of PAX5 protein or the generation of hypomorphic alleles. Deletions were also detected in TCF3 (also known as E2A), EBF1, LEF1, IKZF1 (IKAROS) and IKZF3 (AIOLOS). These findings suggest that direct disruption of pathways controlling B-cell development and differentiation contributes to B-progenitor ALL pathogenesis. Moreover, these data demonstrate the power of high-resolution, genome-wide approaches to identify new molecular lesions in cancer.

Journal ArticleDOI
26 Jul 2007-Nature
TL;DR: The results indicate that genetic variants regulating ORMDL3 expression are determinants of susceptibility to childhood asthma.
Abstract: Rates of childhood asthma diagnosis are rising: 6% of children in the United States are sufferers. Both genetic and environmental factors are clearly important. To discover more about the genetic element, Moffatt et al. looked for genes linked to asthma in a genome-wide association scan. More than a third of children with asthma of onset below the age of seven showed variations in expression of the ORMDL3 gene on chromosome 17. Similar genes are found in yeast and other primitive organisms, suggesting that they may be components of an ancient and conserved immune mechanism. Variations in expression of the gene ORMDL3 were found to be associated with development of childhood asthma, suggesting this gene should be examined in more patient groups. Asthma is caused by a combination of poorly understood genetic and environmental factors [1,2]. We have systematically mapped the effects of single nucleotide polymorphisms (SNPs) on the presence of childhood onset asthma by genome-wide association. We characterized more than 317,000 SNPs in DNA from 994 patients with childhood onset asthma and 1,243 non-asthmatics, using family and case-referent panels. Here we show multiple markers on chromosome 17q21 to be strongly and reproducibly associated with childhood onset asthma in family and case-referent panels with a combined P value of P < 10⁻¹². In independent replication studies the 17q21 locus showed strong association with diagnosis of childhood asthma in 2,320 subjects from a cohort of German children (P = 0.0003) and in 3,301 subjects from the British 1958 Birth Cohort (P = 0.0005). We systematically evaluated the relationships between markers of the 17q21 locus and transcript levels of genes in Epstein–Barr virus (EBV)-transformed lymphoblastoid cell lines from children in the asthma family panel used in our association study. The SNPs associated with childhood asthma were consistently and strongly associated (P < 10⁻²²) in cis with transcript levels of ORMDL3, a member of a gene family that encodes transmembrane proteins anchored in the endoplasmic reticulum [3]. The results indicate that genetic variants regulating ORMDL3 expression are determinants of susceptibility to childhood asthma.

Journal ArticleDOI
TL;DR: FlyAtlas provides the most comprehensive view yet of expression in multiple tissues of Drosophila melanogaster, demonstrating the limitations of whole-organism approaches to functional genomics and allowing modeling of a simple tissue fractionation procedure that should improve detection of weak or tissue-specific signals.
Abstract: FlyAtlas, a new online resource, provides the most comprehensive view yet of expression in multiple tissues of Drosophila melanogaster. Meta-analysis of the data shows that a significant fraction of the genome is expressed with great tissue specificity in the adult, demonstrating the need for the functional genomic community to embrace a wide range of functional phenotypes. Well-known developmental genes are often reused in surprising tissues in the adult, suggesting new functions. The homologs of many human genetic disease loci show selective expression in the Drosophila tissues analogous to the affected human tissues, providing a useful filter for potential candidate genes. Additionally, the contributions of each tissue to the whole-fly array signal can be calculated, demonstrating the limitations of whole-organism approaches to functional genomics and allowing modeling of a simple tissue fractionation procedure that should improve detection of weak or tissue-specific signals.

Journal ArticleDOI
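TL;DR: This work compares the evolution of transcriptional regulation with that of post-transcriptional regulation mediated by microRNAs, a large class of small, non-coding RNAs in plants and animals, focusing on the evolution of the individual regulators and their binding sites, and proposes a simple model that describes the transcriptional regulation of new microRNA genes.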
Abstract: Changes in the patterns of gene expression are widely believed to underlie many of the phenotypic differences within and between species. Although much emphasis has been placed on changes in transcriptional regulation, gene expression is regulated at many levels, all of which must ultimately be studied together to obtain a complete picture of the evolution of gene expression. Here we compare the evolution of transcriptional regulation and post-transcriptional regulation that is mediated by microRNAs, a large class of small, non-coding RNAs in plants and animals, focusing on the evolution of the individual regulators and their binding sites. As an initial step towards integrating these mechanisms into a unified framework, we propose a simple model that describes the transcriptional regulation of new microRNA genes.

Journal ArticleDOI
23 Feb 2007-Cell
TL;DR: Polycomb group (PcG) and trithorax group (trxG) proteins are critical regulators of numerous developmental genes and recent work suggests that PcG-mediated gene silencing involves noncoding RNAs and the RNAi machinery.

Journal ArticleDOI
TL;DR: Gene expression and genetic profiles were determined for cells purified from cancerous and normal breast tissue using markers previously associated with stem-cell-like properties; the analysis pointed to the TGF-β pathway, whose inhibition induced a more epithelial phenotype.

Journal ArticleDOI
TL;DR: An analysis of available data shows that gene fusions occur in all malignancies and account for 20% of human cancer morbidity, a proportion likely to increase substantially with the advent of new and powerful investigative tools that enable the detection of cytogenetically cryptic rearrangements.
Abstract: Chromosome aberrations, in particular translocations and their corresponding gene fusions, have an important role in the initial steps of tumorigenesis; at present, 358 gene fusions involving 337 different genes have been identified. An increasing number of gene fusions are being recognized as important diagnostic and prognostic parameters in malignant haematological disorders and childhood sarcomas. The biological and clinical impact of gene fusions in the more common solid tumour types has been less appreciated. However, an analysis of available data shows that gene fusions occur in all malignancies, and that they account for 20% of human cancer morbidity. With the advent of new and powerful investigative tools that enable the detection of cytogenetically cryptic rearrangements, this proportion is likely to increase substantially.

Journal ArticleDOI
TL;DR: A ‘Phylogenetic Conservation’ analysis tool was implemented that analyses the potential occurrence of orthologous protein complex subunits in mammals and other selected groups of organisms and allows one to predict the occurrence of protein complexes in different phylogenetic groups.
Abstract: Protein complexes are key molecular entities that integrate multiple gene products to perform cellular functions. The CORUM (http://mips.gsf.de/genre/proj/corum/index.html) database is a collection of experimentally verified mammalian protein complexes. Information is manually derived by expert annotators through critical reading of the scientific literature. Information about protein complexes includes protein complex names, subunits, literature references as well as the function of the complexes. For functional annotation, we use the FunCat catalogue, which makes it possible to organize the protein complex space into biologically meaningful subsets. The database contains more than 1750 protein complexes that are built from 2400 different genes, thus representing 12% of the protein-coding genes in human. A web-based system is available to query, view and download the data. CORUM provides a comprehensive dataset of protein complexes for discoveries in systems biology, analyses of protein networks and protein complex-associated diseases. Comparable to the MIPS reference dataset of protein complexes from yeast, CORUM intends to serve as a reference for mammalian protein complexes.

Journal ArticleDOI
TL;DR: It is found that copy number of the salivary amylase gene (AMY1) is correlated positively with salivary amylase protein level and that individuals from populations with high-starch diets have, on average, more AMY1 copies than those with traditionally low-starch diets.
Abstract: Starch consumption is a prominent characteristic of agricultural societies and hunter-gatherers in arid environments. In contrast, rainforest and circum-arctic hunter-gatherers and some pastoralists consume much less starch. This behavioral variation raises the possibility that different selective pressures have acted on amylase, the enzyme responsible for starch hydrolysis. We found that copy number of the salivary amylase gene (AMY1) is correlated positively with salivary amylase protein level and that individuals from populations with high-starch diets have, on average, more AMY1 copies than those with traditionally low-starch diets. Comparisons with other loci in a subset of these populations suggest that the extent of AMY1 copy number differentiation is highly unusual. This example of positive selection on a copy number-variable gene is, to our knowledge, one of the first discovered in the human genome. Higher AMY1 copy numbers and protein levels probably improve the digestion of starchy foods and may buffer against the fitness-reducing effects of intestinal disease.

Journal ArticleDOI
TL;DR: A robust and generalized method for culturing various human breast cell lines in three dimensions is described, along with the preparation of cellular extracts from these cultures for molecular analyses.
Abstract: Extracellular matrix is a key regulator of normal homeostasis and tissue phenotype. Important signals are lost when cells are cultured ex vivo on two-dimensional plastic substrata. Many of these crucial microenvironmental cues may be restored using three-dimensional (3D) cultures of laminin-rich extracellular matrix (lrECM). These 3D culture assays allow phenotypic discrimination between nonmalignant and malignant mammary cells, as the former grown in a 3D context form polarized, growth-arrested acinus-like colonies whereas the latter form disorganized, proliferative and nonpolar colonies. Signaling pathways that function in parallel in cells cultured on plastic become reciprocally integrated when the cells are exposed to basement membrane-like gels. Appropriate 3D culture thus provides a more physiologically relevant approach to the analysis of gene function and cell phenotype ex vivo. We describe here a robust and generalized method for the culturing of various human breast cell lines in three dimensions and describe the preparation of cellular extracts from these cultures for molecular analyses. The procedure below describes the 3D 'embedded' assay, in which cells are cultured embedded in an lrECM gel (Fig. 1). By lrECM, we refer to the solubilized extract derived from the Engelbreth-Holm-Swarm mouse sarcoma cells. For a discussion of user options regarding 3D matrices, see Box 1. Alternatively, the 3D 'on-top' assay, in which cells are cultured on top of a thin lrECM gel overlaid with a dilute solution of lrECM, may be used as described in Box 2 (Fig. 1 and Fig. 2).

Journal ArticleDOI
TL;DR: The results show that M13-tailed primer cocktails are more effective than conventional degenerate primers, allowing barcode work on taxonomically diverse samples to be carried out in a high-throughput fashion.
Abstract: Reliable recovery of the 5′ region of the cytochrome c oxidase 1 (COI) gene is critical for the ongoing effort to gather DNA barcodes for all fish species. In this study, we develop and test primer cocktails with a view towards increasing the efficiency of barcode recovery. Specifically, we evaluate the success of polymerase chain reaction amplification and the quality of resultant sequences using three primer cocktails on DNA extracts from representatives of 94 fish families. Our results show that M13-tailed primer cocktails are more effective than conventional degenerate primers, allowing barcode work on taxonomically diverse samples to be carried out in a high-throughput fashion.

Journal ArticleDOI
TL;DR: It is shown that the promoter regions (1 kb upstream of the transcription start site, TSS) of genes are significantly enriched in quadruplex motifs relative to the rest of the genome, with >40% of human gene promoters containing one or more quadruplex motifs.
Abstract: Certain G-rich DNA sequences readily form four-stranded structures called G-quadruplexes. These sequence motifs are located in telomeres as a repeated unit, and elsewhere in the genome, where their function is currently unknown. It has been proposed that G-quadruplexes may be directly involved in gene regulation at the level of transcription. In support of this hypothesis, we show that the promoter regions (1 kb upstream of the transcription start site, TSS) of genes are significantly enriched in quadruplex motifs relative to the rest of the genome, with >40% of human gene promoters containing one or more quadruplex motifs. Furthermore, these promoter quadruplexes strongly associate with nuclease hypersensitive sites identified throughout the genome via biochemical measurement. Regions of the human genome that are both nuclease hypersensitive and within promoters show a remarkable (230-fold) enrichment of quadruplex elements, compared to the rest of the genome. These quadruplex motifs identified in promoter regions also show an interesting structural bias towards more stable forms. These observations support the proposal that promoter G-quadruplexes are directly involved in the regulation of gene expression.
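The abstract above does not state how quadruplex motifs were located. As a rough illustration only, the sketch below scans a sequence on both strands for a commonly used putative-quadruplex pattern: four runs of three or more guanines separated by loops of 1 to 7 bases. The pattern, the function names, and the idea of passing in a 1 kb upstream sequence are assumptions made for this example, not the authors' method.

```python
import re

# Assumed motif definition (not taken from the abstract): four G-runs of
# length >= 3 separated by three loops of 1-7 bases each.
G4_PATTERN = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_quadruplex_motifs(seq):
    """Return sorted (start, end, strand) tuples for non-overlapping matches.

    Coordinates are 0-based, end-exclusive, and always refer to the
    forward strand, including for matches found on the reverse strand.
    """
    seq = seq.upper()
    n = len(seq)
    hits = [(m.start(), m.end(), "+") for m in G4_PATTERN.finditer(seq)]
    # A C-rich stretch on the forward strand is a G-rich motif on the reverse.
    for m in G4_PATTERN.finditer(reverse_complement(seq)):
        hits.append((n - m.end(), n - m.start(), "-"))
    return sorted(hits)


def promoter_has_motif(upstream_seq):
    """True if a (hypothetical) upstream promoter sequence has any match."""
    return bool(find_quadruplex_motifs(upstream_seq))
```

Because `finditer` reports non-overlapping matches, this counts a long G-rich stretch once rather than enumerating every possible fold, which is one of several reasonable counting conventions.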

Journal ArticleDOI
14 Feb 2007-PLOS ONE
TL;DR: It is suggested that MIRNA genes are undergoing relatively frequent birth and death, with only a subset being stabilized by integration into regulatory networks.
Abstract: In plants, microRNAs (miRNAs) comprise one of two classes of small RNAs that function primarily as negative regulators at the posttranscriptional level. Several MIRNA genes in the plant kingdom are ancient, with conservation extending between angiosperms and the mosses, whereas many others are more recently evolved. Here, we use deep sequencing and computational methods to identify, profile and analyze non-conserved MIRNA genes in Arabidopsis thaliana. 48 non-conserved MIRNA families, nearly all of which were represented by single genes, were identified. Sequence similarity analyses of miRNA precursor foldback arms revealed evidence for recent evolutionary origin of 16 MIRNA loci through inverted duplication events from protein-coding gene sequences. Interestingly, these recently evolved MIRNA genes have taken distinct paths. Whereas some non-conserved miRNAs interact with and regulate target transcripts from gene families that donated parental sequences, others have drifted to the point of non-interaction with parental gene family transcripts. Some young MIRNA loci clearly originated from one gene family but form miRNAs that target transcripts in another family. We suggest that MIRNA genes are undergoing relatively frequent birth and death, with only a subset being stabilized by integration into regulatory networks.

Journal ArticleDOI
TL;DR: The data suggest the miRNA34s might be key effectors of p53 tumor-suppressor function, and their inactivation might contribute to certain cancers.

Journal ArticleDOI
TL;DR: The highly efficient generation of mutant or chimeric genes by this method can easily be accomplished with standard laboratory reagents in approximately 1 week.
Abstract: Extension of overlapping gene segments by PCR is a simple, versatile technique for site-directed mutagenesis and gene splicing. Initial PCRs generate overlapping gene segments that are then used as template DNA for another PCR to create a full-length product. Internal primers generate overlapping, complementary 3' ends on the intermediate segments and introduce nucleotide substitutions, insertions or deletions for site-directed mutagenesis, or for gene splicing, encode the nucleotides found at the junction of adjoining gene segments. Overlapping strands of these intermediate products hybridize at this 3' region in a subsequent PCR and are extended to generate the full-length product amplified by flanking primers that can include restriction enzyme sites for inserting the product into an expression vector for cloning purposes. The highly efficient generation of mutant or chimeric genes by this method can easily be accomplished with standard laboratory reagents in approximately 1 week.
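The splicing step described above can be mimicked in silico. The sketch below is a minimal illustration, assuming both intermediate segments are given as same-strand 5'-to-3' strings and that the end of the first segment exactly matches the start of the second; the function name and the `min_overlap` threshold are hypothetical and are not part of the published protocol.

```python
def splice_by_overlap(left, right, min_overlap=15):
    """Join two segments that share an identical overlap region.

    The longest suffix of `left` that equals a prefix of `right` is taken
    as the overlap, and the overlap is written only once in the product.
    Raises ValueError if no overlap of at least `min_overlap` bases exists.
    """
    left, right = left.upper(), right.upper()
    longest = min(len(left), len(right))
    # Try the longest candidate overlap first, then shorten.
    for k in range(longest, min_overlap - 1, -1):
        if left[-k:] == right[:k]:
            return left + right[k:]
    raise ValueError(f"no overlap of at least {min_overlap} bases found")
```

A mutation designed into the overlap region would appear in both intermediate segments, so it is carried into the joined product unchanged. Real primer design also has to consider melting temperature and secondary structure, which this sketch ignores.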

Journal ArticleDOI
TL;DR: It is found that gene expression is heritable and that differentiation between populations is in agreement with earlier small-scale studies, and the results strongly support an abundance of cis-regulatory variation in the human genome.
Abstract: Genetic variation influences gene expression, and this variation in gene expression can be efficiently mapped to specific genomic regions and variants. Here we have used gene expression profiling of Epstein-Barr virus‐transformed lymphoblastoid cell lines of all 270 individuals genotyped in the HapMap Consortium to elucidate the detailed features of genetic variation underlying gene expression variation. We find that gene expression is heritable and that differentiation between populations is in agreement with earlier small-scale studies. A detailed association analysis of over 2.2 million common SNPs per population (5% frequency in HapMap) with gene expression identified at least 1,348 genes with association signals in cis and at least 180 in trans. Replication in at least one independent population was achieved for 37% of cis signals and 15% of trans signals, respectively. Our results strongly support an abundance of cis-regulatory variation in the human genome. Detection of trans effects is limited but suggests that regulatory variation may be the key primary effect contributing to phenotypic variation in humans. We also explore several methodologies that improve the current state of analysis of gene expression variation.

Journal ArticleDOI
07 Jun 2007-Nature
TL;DR: A high-quality draft genome sequence of a small egg-laying freshwater teleost, medaka, revealed that eight major interchromosomal rearrangements took place in a remarkably short period of ∼50 Myr after the whole-genome duplication event in the teleost ancestor and afterwards, intriguingly, the medaka genome preserved its ancestral karyotype for more than 300 Myr.
Abstract: The medaka fish (Oryzias latipes) is a popular pet in Japan and more recently a laboratory model organism for developmental genetics and evolutionary biology. Now the medaka's genome has been sequenced and analysed by a large Japanese consortium. Cichlids and stickleback, which are emerging model systems for understanding the genetic basis of vertebrate speciation, are evolutionarily closer to medaka than zebrafish, so the medaka's genome sequence will yield valuable insights into 400 million years of vertebrate genome evolution. Teleosts comprise more than half of all vertebrate species and have adapted to a variety of marine and freshwater habitats. Their genome evolution and diversification are important subjects for the understanding of vertebrate evolution. Although draft genome sequences of two pufferfishes have been published, analysis of more fish genomes is desirable. Here we report a high-quality draft genome sequence of a small egg-laying freshwater teleost, medaka (Oryzias latipes). Medaka is native to East Asia and an excellent model system for a wide range of biology, including ecotoxicology, carcinogenesis, sex determination and developmental genetics. In the assembled medaka genome (700 megabases), which is less than half of the zebrafish genome, we predicted 20,141 genes, including ∼2,900 new genes, using 5′-end serial analysis of gene expression tag information. We found single nucleotide polymorphisms (SNPs) at an average rate of 3.42% between the two inbred strains derived from two regional populations; this is the highest SNP rate seen in any vertebrate species. Analyses based on the dense SNP information show a strict genetic separation of 4 million years (Myr) between the two populations, and suggest that differential selective pressures acted on specific gene categories. Four-way comparisons with the human, pufferfish (Tetraodon), zebrafish and medaka genomes revealed that eight major interchromosomal rearrangements took place in a remarkably short period of ∼50 Myr after the whole-genome duplication event in the teleost ancestor and afterwards, intriguingly, the medaka genome preserved its ancestral karyotype for more than 300 Myr.

Journal ArticleDOI
23 Mar 2007-Cell
TL;DR: 13,804 CTCF-binding sites in potential insulators of the human genome are described, discovered experimentally in primary human fibroblasts and fit to a consensus motif highly conserved and suitable for predicting possible insulators driven by CTCF in other vertebrate genomes.

Journal ArticleDOI
TL;DR: In this paper, an integrated analysis of high-density oligonucleotide array CGH and gene expression profiling data from 155 multiple myeloma samples identified a promiscuous array of abnormalities contributing to the dysregulation of NF-kappaB in approximately 20% of patients.