
Showing papers by "Wellcome Trust Sanger Institute" published in 2004


Journal ArticleDOI
TL;DR: This work predicts target sites on the 3′ untranslated regions of human gene transcripts for all 218 currently known mammalian miRNAs to facilitate focused experiments, and suggests that miRNA genes, which are about 1% of all human genes, regulate protein production for 10% or more of all human genes.
Abstract: MicroRNAs (miRNAs) interact with target mRNAs at specific sites to induce cleavage of the message or inhibit translation. The specific function of most mammalian miRNAs is unknown. We have predicted target sites on the 3′ untranslated regions of human gene transcripts for all currently known 218 mammalian miRNAs to facilitate focused experiments. We report about 2,000 human genes with miRNA target sites conserved in mammals and about 250 human genes conserved as targets between mammals and fish. The prediction algorithm optimizes sequence complementarity using position-specific rules and relies on strict requirements of interspecies conservation. Experimental support for the validity of the method comes from known targets and from strong enrichment of predicted targets in mRNAs associated with the fragile X mental retardation protein in mammals. This is consistent with the hypothesis that miRNAs act as sequence-specific adaptors in the interaction of ribonuclear particles with translationally regulated messages. Overrepresented groups of targets include mRNAs coding for transcription factors, components of the miRNA machinery, and other proteins involved in translational regulation, as well as components of the ubiquitin machinery, representing novel feedback loops in gene regulation. Detailed information about target genes, target processes, and open-source software for target prediction (miRanda) is available at http://www.microrna.org. Our analysis suggests that miRNA genes, which are about 1% of all human genes, regulate protein production for 10% or more of all human genes.

3,654 citations


Journal ArticleDOI
Midori A. Harris, Jennifer I. Clark, Ireland A, Jane Lomax, Michael Ashburner, R. Foulger, Karen Eilbeck, Suzanna E. Lewis, B. Marshall, Christopher J. Mungall, J. Richter, Gerald M. Rubin, Judith A. Blake, Carol J. Bult, Dolan M, Drabkin H, Janan T. Eppig, Hill Dp, L. Ni, Ringwald M, Rama Balakrishnan, J. M. Cherry, Karen R. Christie, Maria C. Costanzo, Selina S. Dwight, Stacia R. Engel, Dianna G. Fisk, Jodi E. Hirschman, Eurie L. Hong, Robert S. Nash, Anand Sethuraman, Chandra L. Theesfeld, David Botstein, Kara Dolinski, Becket Feierbach, Tanya Z. Berardini, S. Mundodi, Seung Y. Rhee, Rolf Apweiler, Daniel Barrell, Camon E, E. Dimmer, Lee, Rex L. Chisholm, Pascale Gaudet, Warren A. Kibbe, Ranjana Kishore, Erich M. Schwarz, Paul W. Sternberg, M. Gwinn, Hannick L, Wortman J, Matthew Berriman, Wood, de la Cruz N, Peter J. Tonellato, Pankaj Jaiswal, Seigfried T, White R
TL;DR: The Gene Ontology (GO) project provides structured, controlled vocabularies and classifications that cover several domains of molecular and cellular biology and are freely available for community use in the annotation of genes, gene products and sequences.
Abstract: The Gene Ontology (GO) project (http://www.geneontology.org/) provides structured, controlled vocabularies and classifications that cover several domains of molecular and cellular biology and are freely available for community use in the annotation of genes, gene products and sequences. Many model organism databases and genome annotation groups use the GO and contribute their annotation sets to the GO resource. The GO database integrates the vocabularies and contributed annotations and provides full access to this information in several formats. Members of the GO Consortium continually work collectively, involving outside experts as needed, to expand and update the GO vocabularies. The GO Web resource also provides access to extensive documentation about the GO project and links to applications that use GO data for functional analyses.

3,565 citations


Journal ArticleDOI
TL;DR: A 'census' of cancer genes is conducted that indicates that mutations in more than 1% of genes contribute to human cancer.
Abstract: A central aim of cancer research has been to identify the mutated genes that are causally implicated in oncogenesis ('cancer genes'). After two decades of searching, how many have been identified and how do they compare to the complete gene set that has been revealed by the human genome sequence? We have conducted a 'census' of cancer genes that indicates that mutations in more than 1% of genes contribute to human cancer. The census illustrates striking features in the types of sequence alteration, cancer classes in which oncogenic mutations have been identified and protein domains that are encoded by cancer genes.

3,136 citations


Journal ArticleDOI
19 Mar 2004-Cell
TL;DR: The high activity mutants signal to ERK by directly phosphorylating MEK, whereas the impaired activity mutants stimulate MEK by activating endogenous C-RAF, possibly via an allosteric or transphosphorylation mechanism.

2,588 citations


Journal ArticleDOI
LaDeana W. Hillier, Webb Miller, Ewan Birney, Wesley C. Warren +171 more
09 Dec 2004-Nature
TL;DR: A draft genome sequence of the red jungle fowl, Gallus gallus, provides a new perspective on vertebrate genome evolution, while also improving the annotation of mammalian genomes.
Abstract: We present here a draft genome sequence of the red jungle fowl, Gallus gallus. Because the chicken is a modern descendant of the dinosaurs and the first non-mammalian amniote to have its genome sequenced, the draft sequence of its genome, composed of approximately one billion base pairs of sequence and an estimated 20,000-23,000 genes, provides a new perspective on vertebrate genome evolution, while also improving the annotation of mammalian genomes. For example, the evolutionary distance between chicken and human provides high specificity in detecting functional elements, both non-coding and coding. Notably, many conserved non-coding sequences are far from genes and cannot be assigned to defined functional classes. In coding regions the evolutionary dynamics of protein domains and orthologous groups illustrate processes that distinguish the lineages leading to birds and mammals. The distinctive properties of avian microchromosomes, together with the inferred patterns of conserved synteny, provide additional insights into vertebrate chromosome architecture.

2,579 citations


Journal ArticleDOI
TL;DR: The miRNA Registry provides a service for the assignment of miRNA gene names prior to publication and a comprehensive and searchable database of published miRNA sequences is accessible via a web interface.
Abstract: The miRNA Registry provides a service for the assignment of miRNA gene names prior to publication. A comprehensive and searchable database of published miRNA sequences is accessible via a web interface (http://www.sanger.ac.uk/Software/Rfam/mirna/), and all sequence and annotation data are freely available for download. Release 2.0 of the database contains 506 miRNA entries from six organisms.

2,405 citations


Journal ArticleDOI
TL;DR: The SNAP gene finder is introduced which has been designed to be easily adaptable to a variety of genomes and finds that foreign gene finders are more usefully employed to bootstrap parameter estimation and that the resulting parameters can be highly accurate.
Abstract: Background: Computational gene prediction continues to be an important problem, especially for genomes with little experimental data.

2,315 citations


Journal ArticleDOI
Elise A. Feingold, Peter J. Good, Mark S. Guyer, S. Kamholz +193 more
22 Oct 2004-Science
TL;DR: The ENCyclopedia Of DNA Elements (ENCODE) Project is organized as an international consortium of computational and laboratory-based scientists working to develop and apply high-throughput approaches for detecting all sequence elements that confer biological function.
Abstract: The ENCyclopedia Of DNA Elements (ENCODE) Project aims to identify all functional elements in the human genome sequence. The pilot phase of the Project is focused on a specified 30 megabases (∼1%) of the human genome sequence and is organized as an international consortium of computational and laboratory-based scientists working to develop and apply high-throughput approaches for detecting all sequence elements that confer biological function. The results of this pilot phase will guide future efforts to analyze the entire human genome.

2,248 citations


Journal ArticleDOI
TL;DR: It is strongly suggested that miRNAs are transcribed in parallel with their host transcripts, and that the two different transcription classes of miRNAs ('exonic' and 'intronic') identified here may require slightly different mechanisms of biogenesis.
Abstract: To derive a global perspective on the transcription of microRNAs (miRNAs) in mammals, we annotated the genomic position and context of this class of noncoding RNAs (ncRNAs) in the human and mouse genomes. Of the 232 known mammalian miRNAs, we found that 161 overlap with 123 defined transcription units (TUs). We identified miRNAs within introns of 90 protein-coding genes with a broad spectrum of molecular functions, and in both introns and exons of 66 mRNA-like noncoding RNAs (mlncRNAs). In addition, novel families of miRNAs based on host gene identity were identified. The transcription patterns of all miRNA host genes were curated from a variety of sources illustrating spatial, temporal, and physiological regulation of miRNA expression. These findings strongly suggest that miRNAs are transcribed in parallel with their host transcripts, and that the two different transcription classes of miRNAs ('exonic' and 'intronic') identified here may require slightly different mechanisms of biogenesis.

2,043 citations


Journal ArticleDOI
01 Apr 2004-Nature
TL;DR: This first comprehensive analysis of the genome sequence of the Brown Norway (BN) rat strain is reported, which is the third complete mammalian genome to be deciphered, and three-way comparisons with the human and mouse genomes resolve details of mammalian evolution.
Abstract: The laboratory rat (Rattus norvegicus) is an indispensable tool in experimental medicine and drug development, having made inestimable contributions to human health. We report here the genome sequence of the Brown Norway (BN) rat strain. The sequence represents a high-quality 'draft' covering over 90% of the genome. The BN rat sequence is the third complete mammalian genome to be deciphered, and three-way comparisons with the human and mouse genomes resolve details of mammalian evolution. This first comprehensive analysis includes genes and proteins and their relation to human disease, repeated sequences, comparative genome-wide studies of mammalian orthologous chromosomal regions and rearrangement breakpoints, reconstruction of ancestral karyotypes and the events leading to existing species, rates of variation, and lineage-specific and lineage-independent evolutionary events such as expansion of gene families, orthology relations and protein evolution.

1,964 citations


Journal ArticleDOI
23 Jan 2004-Science
TL;DR: A large fraction of the Caenorhabditis elegans interactome network is mapped, starting with a subset of metazoan-specific proteins, and more than 4000 interactions were identified from high-throughput, yeast two-hybrid screens.
Abstract: To initiate studies on how protein-protein interaction (or "interactome") networks relate to multicellular functions, we have mapped a large fraction of the Caenorhabditis elegans interactome network. Starting with a subset of metazoan-specific proteins, more than 4000 interactions were identified from high-throughput, yeast two-hybrid (HT-Y2H) screens. Independent coaffinity purification assays experimentally validated the overall quality of this Y2H data set. Together with already described Y2H interactions and interologs predicted in silico, the current version of the Worm Interactome (WI5) map contains approximately 5500 interactions. Topological and biological features of this interactome network, as well as its integration with phenome and transcriptome data sets, lead to numerous biological hypotheses.

Journal ArticleDOI
TL;DR: The Jalview Java alignment editor is presented here, which enables fast viewing and editing of large multiple sequence alignments.
Abstract: Summary: Multiple sequence alignment remains a crucial method for understanding the function of groups of related nucleic acid and protein sequences. However, it is known that automatic multiple sequence alignments can often be improved by manual editing. Therefore, tools are needed to view and edit multiple sequence alignments. Due to growth in the sequence databases, multiple sequence alignments can often be large and difficult to view efficiently. The Jalview Java alignment editor is presented here, which enables fast viewing and editing of large multiple sequence alignments. Availability: The Jar file and source code for Jalview are freely available via the World Wide Web at http://www.jalview.org. A Jalview mailing list is also available by e-mailing majordomo@sanger.ac.uk with "subscribe Jalview" in the body of the mail.

Journal ArticleDOI
TL;DR: The Rfam database aims to facilitate the identification and classification of new members of known sequence families, and distributes annotation of ncRNAs in over 200 complete genome sequences.
Abstract: Rfam is a comprehensive collection of non-coding RNA (ncRNA) families, represented by multiple sequence alignments and profile stochastic context-free grammars. Rfam aims to facilitate the identification and classification of new members of known sequence families, and distributes annotation of ncRNAs in over 200 complete genome sequences. The data provide the first glimpses of conservation of multiple ncRNA families across a wide taxonomic range. A small number of large families are essential in all three kingdoms of life, with large numbers of smaller families specific to certain taxa. Recent improvements in the database are discussed, together with challenges for the future. Rfam is available on the Web at http://www.sanger.ac.uk/Software/Rfam/ and http://rfam.wustl.edu/.

Journal ArticleDOI
TL;DR: The COSMIC (Catalogue of Somatic Mutations in Cancer) database and website have been developed to store somatic mutation data in a single location and display the data and other information related to human cancer.
Abstract: The discovery of mutations in cancer genes has advanced our understanding of cancer. These results are dispersed across the scientific literature and with the availability of the human genome sequence will continue to accrue. The COSMIC (Catalogue of Somatic Mutations in Cancer) database and website have been developed to store somatic mutation data in a single location and display the data and other information related to human cancer. To populate this resource, data has currently been extracted from reports in the scientific literature for somatic mutations in four genes, BRAF, HRAS, KRAS2 and NRAS. At present, the database holds information on 66 634 samples and reports a total of 10 647 mutations. Through the web pages, these data can be queried, displayed as figures or tables and exported in a number of formats. COSMIC is an ongoing project that will continue to curate somatic mutation data and release it through the website.

Journal ArticleDOI
TL;DR: A gene map of the xMHC is presented and its content in relation to paralogy, polymorphism, immune function and disease is reviewed.
Abstract: The major histocompatibility complex (MHC) is the most important region in the vertebrate genome with respect to infection and autoimmunity, and is crucial in adaptive and innate immunity. Decades of biomedical research have revealed many MHC genes that are duplicated, polymorphic and associated with more diseases than any other region of the human genome. The recent completion of several large-scale studies offers the opportunity to assimilate the latest data into an integrated gene map of the extended human MHC. Here, we present this map and review its content in relation to paralogy, polymorphism, immune function and disease.

Journal ArticleDOI
23 Apr 2004-Science
TL;DR: It is demonstrated that recombination hotspots are a ubiquitous feature of the human genome, occurring on average every 200 kilobases or less, but recombination occurs preferentially outside genes.
Abstract: The nature and scale of recombination rate variation are largely unknown for most species. In humans, pedigree analysis has documented variation at the chromosomal level, and sperm studies have identified specific hotspots in which crossing-over events cluster. To address whether this picture is representative of the genome as a whole, we have developed and validated a method for estimating recombination rates from patterns of genetic variation. From extensive single-nucleotide polymorphism surveys in European and African populations, we find evidence for extreme local rate variation spanning four orders in magnitude, in which 50% of all recombination events take place in less than 10% of the sequence. We demonstrate that recombination hotspots are a ubiquitous feature of the human genome, occurring on average every 200 kilobases or less, but recombination occurs preferentially outside genes.

Journal ArticleDOI
TL;DR: The crucial role that accessory elements play in the rapid evolution of S. aureus is clearly illustrated by comparing the MSSA476 genome with that of an extremely closely related MRSA community-acquired strain; the differential distribution of large mobile elements carrying virulence and drug-resistance determinants may be responsible for the clinically important phenotypic differences in these strains.
Abstract: Staphylococcus aureus is an important nosocomial and community-acquired pathogen. Its genetic plasticity has facilitated the evolution of many virulent and drug-resistant strains, presenting a major and constantly changing clinical challenge. We sequenced the ≈2.8-Mbp genomes of two disease-causing S. aureus strains isolated from distinct clinical settings: a recent hospital-acquired representative of the epidemic methicillin-resistant S. aureus EMRSA-16 clone (MRSA252), a clinically important and globally prevalent lineage; and a representative of an invasive community-acquired methicillin-susceptible S. aureus clone (MSSA476). A comparative-genomics approach was used to explore the mechanisms of evolution of clinically important S. aureus genomes and to identify regions affecting virulence and drug resistance. The genome sequences of MRSA252 and MSSA476 have a well conserved core region but differ markedly in their accessory genetic elements. MRSA252 is the most genetically diverse S. aureus strain sequenced to date: ≈6% of the genome is novel compared with other published genomes, and it contains several unique genetic elements. MSSA476 is methicillin-susceptible, but it contains a novel Staphylococcal chromosomal cassette (SCC) mec-like element (designated SCC476), which is integrated at the same site on the chromosome as SCCmec elements in MRSA strains but encodes a putative fusidic acid resistance protein. The crucial role that accessory elements play in the rapid evolution of S. aureus is clearly illustrated by comparing the MSSA476 genome with that of an extremely closely related MRSA community-acquired strain; the differential distribution of large mobile elements carrying virulence and drug-resistance determinants may be responsible for the clinically important phenotypic differences in these strains.

Journal ArticleDOI
30 Sep 2004-Nature
TL;DR: The protein-kinase family is the most frequently mutated gene family found in human cancer, and faulty kinase enzymes are being investigated as promising targets for the design of antitumour therapies.
Abstract: The protein-kinase family is the most frequently mutated gene family found in human cancer and faulty kinase enzymes are being investigated as promising targets for the design of antitumour therapies. We have sequenced the gene encoding the transmembrane protein tyrosine kinase ERBB2 (also known as HER2 or Neu) from 120 primary lung tumours and identified 4% that have mutations within the kinase domain; in the adenocarcinoma subtype of lung cancer, 10% of cases had mutations. ERBB2 inhibitors, which have so far proved to be ineffective in treating lung cancer, should now be clinically re-evaluated in the specific subset of patients with lung cancer whose tumours carry ERBB2 mutations.

Journal ArticleDOI
TL;DR: It is proposed that variable horizontal gene acquisition by B. pseudomallei is an important feature of recent genetic evolution and that this has resulted in a genetically diverse pathogenic species.
Abstract: Burkholderia pseudomallei is a recognized biothreat agent and the causative agent of melioidosis. This Gram-negative bacterium exists as a soil saprophyte in melioidosis-endemic areas of the world and accounts for 20% of community-acquired septicaemias in northeastern Thailand where half of those affected die. Here we report the complete genome of B. pseudomallei, which is composed of two chromosomes of 4.07 megabase pairs and 3.17 megabase pairs, showing significant functional partitioning of genes between them. The large chromosome encodes many of the core functions associated with central metabolism and cell growth, whereas the small chromosome carries more accessory functions associated with adaptation and survival in different niches. Genomic comparisons with closely and more distantly related bacteria revealed a greater level of gene order conservation and a greater number of orthologous genes on the large chromosome, suggesting that the two replicons have distinct evolutionary origins. A striking feature of the genome was the presence of 16 genomic islands (GIs) that together made up 6.1% of the genome. Further analysis revealed these islands to be variably present in a collection of invasive and soil isolates but entirely absent from the clonally related organism B. mallei. We propose that variable horizontal gene acquisition by B. pseudomallei is an important feature of recent genetic evolution and that this has resulted in a genetically diverse pathogenic species.

Journal ArticleDOI
TL;DR: It is time to harness new technologies and efficiencies of production to mount a high-throughput international effort to produce and phenotype knockouts for all mouse genes, and place these resources into the public domain.
Abstract: Mouse knockout technology provides a powerful means of elucidating gene function in vivo, and a publicly available genome-wide collection of mouse knockouts would be significantly enabling for biomedical discovery. To date, published knockouts exist for only about 10% of mouse genes. Furthermore, many of these are limited in utility because they have not been made or phenotyped in standardized ways, and many are not freely available to researchers. It is time to harness new technologies and efficiencies of production to mount a high-throughput international effort to produce and phenotype knockouts for all mouse genes, and place these resources into the public domain.

Journal ArticleDOI
TL;DR: Comparative genomics is beginning to identify the functional components of the chromosome and that in turn will set the stage for the functional characterization of the sequences.
Abstract: The sequence of chromosome 21 was a turning point for the understanding of Down syndrome. Comparative genomics is beginning to identify the functional components of the chromosome and that in turn will set the stage for the functional characterization of the sequences. Animal models combined with genome-wide analytical methods have proved indispensable for unravelling the mysteries of gene dosage imbalance.

Journal ArticleDOI
TL;DR: A system is described wherein the inhibitor units of the peptidase inhibitors are assigned to 48 families on the basis of similarities detectable at the level of amino acid sequence, and a simple system of nomenclature is introduced for reference to each clan, family and inhibitor.
Abstract: The proteins that inhibit peptidases are of great importance in medicine and biotechnology, but there has never been a comprehensive system of classification for them. Some of the terminology currently in use is potentially confusing. In the hope of facilitating the exchange, storage and retrieval of information about this important group of proteins, we now describe a system wherein the inhibitor units of the peptidase inhibitors are assigned to 48 families on the basis of similarities detectable at the level of amino acid sequence. Then, on the basis of three-dimensional structures, 31 of the families are assigned to 26 clans. A simple system of nomenclature is introduced for reference to each clan, family and inhibitor. We briefly discuss the specificities and mechanisms of the interactions of the inhibitors in the various families with their target enzymes. The system of families and clans of inhibitors described has been implemented in the MEROPS peptidase database (http://merops.sanger.ac.uk/), and this will provide a mechanism for updating it as new information becomes available.

Journal ArticleDOI
24 Nov 2004-Cell
TL;DR: It is argued that H4-K20 methylation functions as a "histone mark" required for the recruitment of the checkpoint protein Crb2, a homolog of the mammalian checkpoint protein 53BP1.

Journal ArticleDOI
TL;DR: A GAL4 knock-in approach as well as the chromosome conformation capture technique are used to show that the differentially methylated regions in the imprinted genes Igf2 and H19 interact in mice and partition maternal and paternal chromatin into distinct loops.
Abstract: Imprinted genes are expressed from only one of the parental alleles and are marked epigenetically by DNA methylation and histone modifications. The paternally expressed gene insulin-like growth-factor 2 (Igf2) is separated by approximately 100 kb from the maternally expressed noncoding gene H19 on mouse distal chromosome 7. Differentially methylated regions in Igf2 and H19 contain chromatin boundaries, silencers and activators and regulate the reciprocal expression of the two genes in a methylation-sensitive manner by allowing them exclusive access to a shared set of enhancers. Various chromatin models have been proposed that separate Igf2 and H19 into active and silent domains. Here we used a GAL4 knock-in approach as well as the chromosome conformation capture technique to show that the differentially methylated regions in the imprinted genes Igf2 and H19 interact in mice. These interactions are epigenetically regulated and partition maternal and paternal chromatin into distinct loops. This generates a simple epigenetic switch for Igf2 through which it moves between an active and a silent chromatin domain.

Journal ArticleDOI
28 May 2004-Science
TL;DR: A mutation in the gene encoding the protein kinase AKT2/PKBβ in a family that shows autosomal dominant inheritance of severe insulin resistance and diabetes mellitus is described, demonstrating the central importance of AKT signaling to insulin sensitivity in humans.
Abstract: Inherited defects in signaling pathways downstream of the insulin receptor have long been suggested to contribute to human type 2 diabetes mellitus. Here we describe a mutation in the gene encoding the protein kinase AKT2/PKBβ in a family that shows autosomal dominant inheritance of severe insulin resistance and diabetes mellitus. Expression of the mutant kinase in cultured cells disrupted insulin signaling to metabolic end points and inhibited the function of coexpressed, wild-type AKT. These findings demonstrate the central importance of AKT signaling to insulin sensitivity in humans.

Journal ArticleDOI
03 Sep 2004-Cell
TL;DR: It is suggested that domains of open chromatin may create an environment that facilitates transcriptional activation and could provide an evolutionary constraint to maintain clusters of genes together along chromosomes.

Journal ArticleDOI
TL;DR: This work reports the genome-wide transcriptional program of the Schizosaccharomyces pombe cell cycle, identifying 407 periodically expressed genes, of which 136 show high-amplitude changes.
Abstract: Cell-cycle control of transcription seems to be universal, but little is known about its global conservation and biological significance. We report on the genome-wide transcriptional program of the Schizosaccharomyces pombe cell cycle, identifying 407 periodically expressed genes of which 136 show high-amplitude changes. These genes cluster in four major waves of expression. The forkhead protein Sep1p regulates mitotic genes in the first cluster, including Ace2p, which activates transcription in the second cluster during the M-G1 transition and cytokinesis. Other genes in the second cluster, which are required for G1-S progression, are regulated by the MBF complex independently of Sep1p and Ace2p. The third cluster coincides with S phase and a fourth cluster contains genes weakly regulated during G2 phase. Despite conserved cell-cycle transcription factors, differences in regulatory circuits between fission and budding yeasts are evident, revealing evolutionary plasticity of transcriptional control. Periodic transcription of most genes is not conserved between the two yeasts, except for a core set of approximately 40 genes that seem to be universally regulated during the eukaryotic cell cycle and may have key roles in cell-cycle progression.

Journal ArticleDOI
TL;DR: An evolutionary tree is proposed for these populations, rooted on Yersinia pseudotuberculosis, which invokes microevolution over millennia, during which enzootic pestoides isolates evolved and led to populations that are more frequently associated with human disease.
Abstract: The association of historical plague pandemics with Yersinia pestis remains controversial, partly because the evolutionary history of this largely monomorphic bacterium was unknown. The microevolution of Y. pestis was therefore investigated by three different multilocus molecular methods, targeting genomewide synonymous SNPs, variation in number of tandem repeats, and insertion of IS100 insertion elements. Eight populations were recognized by the three methods, and we propose an evolutionary tree for these populations, rooted on Yersinia pseudotuberculosis. The tree invokes microevolution over millennia, during which enzootic pestoides isolates evolved. This initial phase was followed by a binary split 6,500 years ago, which led to populations that are more frequently associated with human disease. These populations do not correspond directly to classical biovars that are based on phenotypic properties. Thus, we recommend that henceforth groupings should be based on molecular signatures. The age of Y. pestis inferred here is compatible with the dates of historical pandemic plague. However, it is premature to infer an association between any modern molecular grouping and a particular pandemic wave that occurred before the 20th century.

Journal ArticleDOI
TL;DR: The recent sequencing of seven strains of S. aureus provides unprecedented information about its genome diversity, and dramatic differences in the carriage and spread of accessory genes, including those involved in virulence and resistance, contribute to the emergence of new strains with healthcare implications.

Journal ArticleDOI
TL;DR: The studies demonstrate that combining the genotype and copy number analyses gives greater insight into the underlying genetic alterations in cancer cells with identification of complex events including loss and reduplication of loci.
Abstract: Genomic copy number alterations are a feature of many human diseases including cancer. We have evaluated the effectiveness of an oligonucleotide array, originally designed to detect single-nucleotide polymorphisms, to assess DNA copy number. We first showed that fluorescent signal from the oligonucleotide array varies in proportion to both decreases and increases in copy number. Subsequently we applied the system to a series of 20 cancer cell lines. All of the putative homozygous deletions (10) and high-level amplifications (12; putative copy number >4) tested were confirmed by PCR (either qPCR or normal PCR) analysis. Low-level copy number changes for two of the lines under analysis were compared with BAC array CGH; 77% (n = 44) of the autosomal chromosomes used in the comparison showed consistent patterns of LOH (loss of heterozygosity) and low-level amplification. Of the remaining 10 comparisons that were discordant, eight were caused by low SNP densities and failed in both lines. The studies demonstrate that combining the genotype and copy number analyses gives greater insight into the underlying genetic alterations in cancer cells with identification of complex events including loss and reduplication of loci.