Showing papers by "Wellcome Trust Sanger Institute" published in 2005
••
27 Oct 2005
TL;DR: A public database of common variation in the human genome: more than one million single nucleotide polymorphisms for which accurate and complete genotypes have been obtained in 269 DNA samples from four populations, including ten 500-kilobase regions in which essentially all information about common DNA variation has been extracted.
Abstract: Inherited genetic variation has a critical but as yet largely uncharacterized role in human disease. Here we report a public database of common variation in the human genome: more than one million single nucleotide polymorphisms (SNPs) for which accurate and complete genotypes have been obtained in 269 DNA samples from four populations, including ten 500-kilobase regions in which essentially all information about common DNA variation has been extracted. These data document the generality of recombination hotspots, a block-like structure of linkage disequilibrium and low haplotype diversity, leading to substantial correlations of SNPs with many of their neighbours. We show how the HapMap resource can guide the design and analysis of genetic association studies, shed light on structural variation and recombination, and identify loci that may have been subject to natural selection during human evolution.
5,479 citations
••
TL;DR: Detailed polling of transcription start and termination sites and analysis of previously unidentified full-length complementary DNAs derived from the mouse genome provide a comprehensive platform for the comparative analysis of mammalian transcriptional regulation in differentiation and development.
Abstract: This study describes comprehensive polling of transcription start and termination sites and analysis of previously unidentified full-length complementary DNAs derived from the mouse genome. We identify the 5' and 3' boundaries of 181,047 transcripts with extensive variation in transcripts arising from alternative promoter usage, splicing, and polyadenylation. There are 16,247 new mouse protein-coding transcripts, including 5154 encoding previously unidentified proteins. Genomic mapping of the transcriptome reveals transcriptional forests, with overlapping transcription on both strands, separated by deserts in which few transcripts are observed. The data provide a comprehensive platform for the comparative analysis of mammalian transcriptional regulation in differentiation and development.
3,412 citations
••
Kerstin Lindblad-Toh1, Claire M. Wade1,2, Tarjei S. Mikkelsen1 +238 more•Institutions (11)
TL;DR: A high-quality draft genome sequence of the domestic dog is reported, together with a dense map of single nucleotide polymorphisms (SNPs) across breeds, to shed light on the structure and evolution of genomes and genes.
Abstract: Here we report a high-quality draft genome sequence of the domestic dog (Canis familiaris), together with a dense map of single nucleotide polymorphisms (SNPs) across breeds. The dog is of particular interest because it provides important evolutionary information and because existing breeds show great phenotypic diversity for morphological, physiological and behavioural traits. We use sequence comparison with the primate and rodent lineages to shed light on the structure and evolution of genomes and genes. Notably, the majority of the most highly conserved non-coding sequences in mammalian genomes are clustered near a small subset of genes with important roles in development. Analysis of SNPs reveals long-range haplotypes across the entire dog genome, and defines the nature of genetic diversity within and across breeds. The current SNP map now makes it possible for genome-wide association studies to identify genes responsible for diseases and traits, with important consequences for human and companion animal health.
2,431 citations
••
Wellcome Trust Sanger Institute1, George Washington University2, J. Craig Venter Institute3, University of Glasgow4, University of Oxford5, Newcastle University6, University of Bordeaux7, University of Cambridge8, Oregon Health & Science University9, University of Dundee10, Imperial College London11, Case Western Reserve University12, Yale University13, Université catholique de Louvain14, University of Iowa15, Wellcome Trust16
TL;DR: Comparisons of the cytoskeleton and endocytic trafficking systems of Trypanosoma brucei with those of humans and other eukaryotic organisms reveal major differences.
Abstract: African trypanosomes cause human sleeping sickness and livestock trypanosomiasis in sub-Saharan Africa. We present the sequence and analysis of the 11 megabase-sized chromosomes of Trypanosoma brucei. The 26-megabase genome contains 9068 predicted genes, including ∼900 pseudogenes and ∼1700 T. brucei–specific genes. Large subtelomeric arrays contain an archive of 806 variant surface glycoprotein (VSG) genes used by the parasite to evade the mammalian immune system. Most VSG genes are pseudogenes, which may be used to generate expressed mosaic genes by ectopic recombination. Comparisons of the cytoskeleton and endocytic trafficking systems with those of humans and other eukaryotic organisms reveal major differences. A comparison of metabolic pathways encoded by the genomes of T. brucei, T. cruzi, and Leishmania major reveals the least overall metabolic capability in T. brucei and the greatest in L. major. Horizontal transfer of genes of bacterial origin has contributed to some of the metabolic differences in these parasites, and a number of novel potential drug targets have been identified.
1,631 citations
••
TL;DR: The Artemis Comparison Tool (ACT) allows interactive visualisation of comparisons between complete genome sequences and associated annotations; it uses Artemis components to display the sequences and so inherits their powerful searching and analysis tools.
Abstract: The Artemis Comparison Tool (ACT) allows an interactive visualisation of comparisons between complete genome sequences and associated annotations. The comparison data can be generated with several different programs: BLASTN, TBLASTX or Mummer comparisons between genomic DNA sequences, or orthologue tables generated by reciprocal FASTA comparison between protein sets. It is possible to identify regions of similarity, insertions and rearrangements at any level from the whole genome to base-pair differences. ACT uses Artemis components to display the sequences and so inherits powerful searching and analysis tools. ACT is part of the Artemis distribution and is similarly open source; it is written in Java and can run on any Java-enabled platform, including UNIX, Macintosh and Windows.
Availability: ACT is freely available (under a GPL licence) for download from the Sanger Institute web site, http://www.sanger.ac.uk
Contact: artemis@sanger.ac.uk
1,565 citations
••
Wellcome Trust Sanger Institute1, Seattle Biomed2, Katholieke Universiteit Leuven3, GATC Biotech4, Max Planck Society5, Washington University in St. Louis6, University of Trieste7, International Centre for Genetic Engineering and Biotechnology8, European Bioinformatics Institute9, University of São Paulo10, National Scientific and Technical Research Council11, Université catholique de Louvain12, University of London13, University of Edinburgh14, University of Glasgow15, University of Wisconsin-Madison16, University of York17, University of Cambridge18, University of Washington19
TL;DR: The organization of protein-coding genes into long, strand-specific, polycistronic clusters and lack of general transcription factors in the L. major, Trypanosoma brucei, and Trypanosoma cruzi (Tritryp) genomes suggest that the mechanisms regulating RNA polymerase II–directed transcription are distinct from those operating in other eukaryotes, although the trypanosomatids appear capable of chromatin remodeling.
Abstract: Leishmania species cause a spectrum of human diseases in tropical and subtropical regions of the world. We have sequenced the 36 chromosomes of the 32.8-megabase haploid genome of Leishmania major (Friedlin strain) and predict 911 RNA genes, 39 pseudogenes, and 8272 protein-coding genes, of which 36% can be ascribed a putative function. These include genes involved in host-pathogen interactions, such as proteolytic enzymes, and extensive machinery for synthesis of complex surface glycoconjugates. The organization of protein-coding genes into long, strand-specific, polycistronic clusters and lack of general transcription factors in the L. major, Trypanosoma brucei, and Trypanosoma cruzi (Tritryp) genomes suggest that the mechanisms regulating RNA polymerase II-directed transcription are distinct from those operating in other eukaryotes, although the trypanosomatids appear capable of chromatin remodeling. Abundant RNA-binding proteins are encoded in the Tritryp genomes, consistent with active posttranscriptional regulation of gene expression.
1,357 citations
••
Washington University in St. Louis1, J. Craig Venter Institute2, Wellcome Trust Sanger Institute3, University of Manchester4, Complutense University of Madrid5, Tohoku University6, University of Nottingham7, Tulane University8, University of Kentucky9, Max Planck Society10, Spanish National Research Council11, University of Salamanca12, University of São Paulo13, Innsbruck Medical University14, University of Wisconsin-Madison15, University of Tokyo16, Nagoya University17, National Institute of Advanced Industrial Science and Technology18, Pasteur Institute19, University of Texas MD Anderson Cancer Center20, University of Idaho21, University of Lausanne22, University of Göttingen23, Tokyo University of Agriculture and Technology24, University of Sheffield25, Broad Institute26
TL;DR: The Af293 genome sequence provides an unparalleled resource for the future understanding of this remarkable fungus and revealed temperature-dependent expression of distinct sets of genes, as well as 700 A. fumigatus genes not present or significantly diverged in the closely related sexual species Neosartorya fischeri, many of which may have roles in the pathogenicity phenotype.
Abstract: Aspergillus fumigatus is exceptional among microorganisms in being both a primary and opportunistic pathogen as well as a major allergen. Its conidia production is prolific, and so human respiratory tract exposure is almost constant. A. fumigatus is isolated from human habitats and vegetable compost heaps. In immunocompromised individuals, the incidence of invasive infection can be as high as 50% and the mortality rate is often about 50% (ref. 2). The interaction of A. fumigatus and other airborne fungi with the immune system is increasingly linked to severe asthma and sinusitis. Although the burden of invasive disease caused by A. fumigatus is substantial, the basic biology of the organism is mostly obscure. Here we show the complete 29.4-megabase genome sequence of the clinical isolate Af293, which consists of eight chromosomes containing 9,926 predicted genes. Microarray analysis revealed temperature-dependent expression of distinct sets of genes, as well as 700 A. fumigatus genes not present or significantly diverged in the closely related sexual species Neosartorya fischeri, many of which may have roles in the pathogenicity phenotype. The Af293 genome sequence provides an unparalleled resource for the future understanding of this remarkable fungus.
1,356 citations
••
Broad Institute1, J. Craig Venter Institute2, Stanford University3, Oregon Health & Science University4, University of Glasgow5, Genetic Information Research Institute6, Institut Universitaire de France7, University of Kentucky8, University of Nebraska–Lincoln9, University of Göttingen10, Pasteur Institute11, University of São Paulo12, Texas A&M University13, Wellcome Trust Sanger Institute14, John Innes Centre15, University of Wisconsin-Madison16, Max Planck Society17, University of Oregon18, University of Nottingham19, Spanish National Research Council20, Ohio State University21, University of Georgia22, Tokyo Institute of Technology23, National Institute of Advanced Industrial Science and Technology24, George Washington University25, University of Manchester26, University of Liverpool27, University of Melbourne28, Karlsruhe Institute of Technology29, University of Idaho30
TL;DR: The aspergilli comprise a diverse group of filamentous fungi spanning over 200 million years of evolution. The genome sequence of Aspergillus nidulans, together with a comparative study with Aspergillus fumigatus and Aspergillus oryzae (used in the production of sake, miso and soy sauce), provides new insight into eukaryotic genome evolution and gene regulation.
Abstract: The aspergilli comprise a diverse group of filamentous fungi spanning over 200 million years of evolution. Here we report the genome sequence of the model organism Aspergillus nidulans, and a comparative study with Aspergillus fumigatus, a serious human pathogen, and Aspergillus oryzae, used in the production of sake, miso and soy sauce. Our analysis of genome structure provided a quantitative evaluation of forces driving long-term eukaryotic genome evolution. It also led to an experimentally validated model of mating-type locus evolution, suggesting the potential for sexual reproduction in A. fumigatus and A. oryzae. Our analysis of sequence conservation revealed over 5,000 non-coding regions actively conserved across all three species. Within these regions, we identified potential functional elements including a previously uncharacterized TPP riboswitch and motifs suggesting regulation in filamentous fungi by Puf family genes. We further obtained comparative and experimental evidence indicating widespread translational regulation by upstream open reading frames. These results enhance our understanding of these widely studied fungi as well as provide new insight into eukaryotic genome evolution and gene regulation.
1,297 citations
••
TL;DR: Injection of miR-430 miRNAs rescues the brain defects in MZdicer mutants, revealing essential roles for miRNAs during morphogenesis.
Abstract: MicroRNAs (miRNAs) are small RNAs that regulate gene expression posttranscriptionally. To block all miRNA formation in zebrafish, we generated maternal-zygotic dicer (MZdicer) mutants that disrupt the Dicer ribonuclease III and double-stranded RNA-binding domains. Mutant embryos do not process precursor miRNAs into mature miRNAs, but injection of preprocessed miRNAs restores gene silencing, indicating that the disrupted domains are dispensable for later steps in silencing. MZdicer mutants undergo axis formation and differentiate multiple cell types but display abnormal morphogenesis during gastrulation, brain formation, somitogenesis, and heart development. Injection of miR-430 miRNAs rescues the brain defects in MZdicer mutants, revealing essential roles for miRNAs during morphogenesis.
1,292 citations
••
University of Cologne1, Laboratory of Molecular Biology2, Wellcome Trust Sanger Institute3, Baylor College of Medicine4, University of California, San Diego5, Northwestern University6, University of Tsukuba7, Ludwig Maximilian University of Munich8, University of Cambridge9, Hokkaido University10, Pasteur Institute11, University of York12, National Institute of Genetics13, University of Tokyo14, Princeton University15, University of Dundee16
TL;DR: A proteome-based phylogeny shows that the amoebozoa diverged from the animal–fungal lineage after the plant–animal split, but Dictyostelium seems to have retained more of the diversity of the ancestral genome than have plants, animals or fungi.
Abstract: The social amoebae are exceptional in their ability to alternate between unicellular and multicellular forms. Here we describe the genome of the best-studied member of this group, Dictyostelium discoideum. The gene-dense chromosomes of this organism encode approximately 12,500 predicted proteins, a high proportion of which have long, repetitive amino acid tracts. There are many genes for polyketide synthases and ABC transporters, suggesting an extensive secondary metabolism for producing and exporting small molecules. The genome is rich in complex repeats, one class of which is clustered and may serve as centromeres. Partial copies of the extrachromosomal ribosomal DNA (rDNA) element are found at the ends of each chromosome, suggesting a novel telomere structure and the use of a common mechanism to maintain both the rDNA and chromosomal termini. A proteome-based phylogeny shows that the amoebozoa diverged from the animal-fungal lineage after the plant-animal split, but Dictyostelium seems to have retained more of the diversity of the ancestral genome than have plants, animals or fungi.
1,289 citations
••
TL;DR: This analysis illustrates the autosomal origin of the mammalian sex chromosomes, the stepwise process that led to the progressive loss of recombination between X and Y, and the extent of subsequent degradation of the Y chromosome.
Abstract: The human X chromosome has a unique biology that was shaped by its evolution as the sex chromosome shared by males and females. We have determined 99.3% of the euchromatic sequence of the X chromosome. Our analysis illustrates the autosomal origin of the mammalian sex chromosomes, the stepwise process that led to the progressive loss of recombination between X and Y, and the extent of subsequent degradation of the Y chromosome. LINE1 repeat elements cover one-third of the X chromosome, with a distribution that is consistent with their proposed role as way stations in the process of X-chromosome inactivation. We found 1,098 genes in the sequence, of which 99 encode proteins expressed in testis and in various tumour types. A disproportionately high number of mendelian diseases are documented for the X chromosome. Of this number, 168 have been explained by mutations in 113 X-linked genes, which in many cases were characterized with the aid of the DNA sequence.
••
Université de Montréal1, Max Planck Society2, European Bioinformatics Institute3, Dresden University of Technology4, Instituto Gulbenkian de Ciência5, BC Cancer Research Centre6, Wellcome Trust Sanger Institute7, Simon Fraser University8, New York University9, ISREC10, Laboratory of Molecular Biology11
TL;DR: This work used RNA-mediated interference to target 98% of all genes predicted in the C. elegans genome in combination with differential interference contrast time-lapse microscopy, and developed a phenotypic profiling system that shows high correlation with cellular processes and biochemical pathways, thus enabling the prediction of new functions for previously uncharacterized genes.
Abstract: A key challenge of functional genomics today is to generate well-annotated data sets that can be interpreted across different platforms and technologies. Large-scale functional genomics data often fail to connect to standard experimental approaches of gene characterization in individual laboratories. Furthermore, a lack of universal annotation standards for phenotypic data sets makes it difficult to compare different screening approaches. Here we address this problem in a screen designed to identify all genes required for the first two rounds of cell division in the Caenorhabditis elegans embryo. We used RNA-mediated interference to target 98% of all genes predicted in the C. elegans genome in combination with differential interference contrast time-lapse microscopy. Through systematic annotation of the resulting movies, we developed a phenotypic profiling system, which shows high correlation with cellular processes and biochemical pathways, thus enabling us to predict new functions for previously uncharacterized genes.
••
TL;DR: Posttranscriptional gene silencing through translational repression of messenger RNA is observed during sexual development, and a 47-base 3′ untranslated region motif is implicated in this process.
Abstract: Plasmodium berghei and Plasmodium chabaudi are widely used model malaria species. Comparison of their genomes, integrated with proteomic and microarray data, with the genomes of Plasmodium falciparum and Plasmodium yoelii revealed a conserved core of 4500 Plasmodium genes in the central regions of the 14 chromosomes and highlighted genes evolving rapidly because of stage-specific selective pressures. Four strategies for gene expression are apparent during the parasites' life cycle: (i) housekeeping; (ii) host-related; (iii) strategy-specific related to invasion, asexual replication, and sexual development; and (iv) stage-specific. We observed posttranscriptional gene silencing through translational repression of messenger RNA during sexual development, and a 47-base 3' untranslated region motif is implicated in this process.
••
TL;DR: The Sequence Ontology is a structured controlled vocabulary for the parts of a genomic annotation that provides a common set of terms and definitions that will facilitate the exchange, analysis and management of genomic data.
Abstract: The Sequence Ontology (SO) is a structured controlled vocabulary for the parts of a genomic annotation. SO provides a common set of terms and definitions that will facilitate the exchange, analysis and management of genomic data. Because SO treats part-whole relationships rigorously, data described with it can become substrates for automated reasoning, and instances of sequence features described by the SO can be subjected to a group of logical operations termed extensional mereology operators.
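As a rough illustration of why rigorous part-whole relationships allow automated reasoning, the sketch below infers every whole that a feature belongs to by following part_of links transitively. The term names and the PART_OF table are hypothetical stand-ins chosen for illustration, not taken from the Sequence Ontology itself, and the code does not implement the extensional mereology operators the abstract mentions.

```python
# Hypothetical part_of relationships between annotation terms (illustrative only).
PART_OF = {
    "exon": {"transcript"},
    "intron": {"transcript"},
    "transcript": {"gene"},
}


def wholes_of(term, part_of):
    """Return every term that `term` is directly or transitively a part of."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in part_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# An exon is part of a transcript, and therefore also part of a gene.
exon_wholes = wholes_of("exon", PART_OF)
```

Because part_of is treated as transitive, the exon-to-gene relationship never has to be stated explicitly; it follows from the two direct links.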
••
15 Apr 2005
TL;DR: Pfam is a protein families database that also contains information on domain–domain interactions. This work describes the basic contents and availability of Pfam and outlines iPfam, a new resource that describes domain–domain interactions at the molecular level.
Abstract: Systematic analysis has shown that the majority of proteins can be grouped into approximately 1000 sequence families. These sequence families are often representative of domains. Pfam is a protein families database. The basic contents and availability of the Pfam database are described. Genome sequencing projects, including the human and fly, have used Pfam extensively for large-scale functional annotation of genomic data, while smaller research groups, devoted to a single protein or biochemical pathway, frequently use Pfam for their analyses. Typically, Pfam matches between 55 and 90% of proteins from complete proteome sets. Pfam also allows the domain distributions to be compared for completed genomes. In addition to sequence domain annotation, Pfam also contains information on domain–domain interactions. The new resource that describes domain–domain interactions at the molecular level is called iPfam. The contents of iPfam are briefly outlined.
Keywords: Pfam; genome annotation; HMM; Markov; protein interaction
••
George Washington University1, Seattle Biomed2, University of Washington3, J. Craig Venter Institute4, Wellcome Trust Sanger Institute5, Karolinska Institutet6, Newcastle University7, Centre national de la recherche scientifique8, Universidade Federal de Minas Gerais9, Medical Research Council10, University of Cambridge11, University of Iowa12
TL;DR: The comparison reveals a conserved core proteome of about 6200 genes in large syntenic polycistronic gene clusters, and no evidence that these species are descended from an ancestor that contained a photosynthetic endosymbiont.
Abstract: A comparison of gene content and genome architecture of Trypanosoma brucei, Trypanosoma cruzi, and Leishmania major, three related pathogens with different life cycles and disease pathology, revealed a conserved core proteome of about 6200 genes in large syntenic polycistronic gene clusters. Many species-specific genes, especially large surface antigen families, occur at nonsyntenic chromosome-internal and subtelomeric regions. Retroelements, structural RNAs, and gene family expansion are often associated with syntenic discontinuities that, along with gene divergence, acquisition and loss, and rearrangement within the syntenic regions, have shaped the genomes of each parasite. Contrary to recent reports, our analyses reveal no evidence that these species are descended from an ancestor that contained a photosynthetic endosymbiont.
••
TL;DR: The SET-domain protein methyltransferase superfamily includes all but one of the proteins known to methylate histones on lysine.
Abstract: The SET-domain protein methyltransferase superfamily includes all but one of the proteins known to methylate histones on lysine. Histone methylation is important in the regulation of chromatin and gene expression.
••
TL;DR: The results suggest that regulatory polymorphism is widespread in the human genome and show that the 5-kb (phase I) HapMap has sufficient density to enable linkage disequilibrium mapping in humans.
Abstract: The exploration of quantitative variation in human populations has become one of the major priorities for medical genetics. The successful identification of variants that contribute to complex traits is highly dependent on reliable assays and genetic maps. We have performed a genome-wide quantitative trait analysis of 630 genes in 60 unrelated Utah residents with ancestry from Northern and Western Europe using the publicly available phase I data of the International HapMap project. The genes are located in regions of the human genome with elevated functional annotation and disease interest including the ENCODE regions spanning 1% of the genome, Chromosome 21 and Chromosome 20q12–13.2. We apply three different methods of multiple test correction, including Bonferroni, false discovery rate, and permutations. For the 374 expressed genes, we find many regions with statistically significant association of single nucleotide polymorphisms (SNPs) with expression variation in lymphoblastoid cell lines after correcting for multiple tests. Based on our analyses, the signal proximal (cis-) to the genes of interest is more abundant and more stable than distal and trans across statistical methodologies. Our results suggest that regulatory polymorphism is widespread in the human genome and show that the 5-kb (phase I) HapMap has sufficient density to enable linkage disequilibrium mapping in humans. Such studies will significantly enhance our ability to annotate the non-coding part of the genome and interpret functional variation. In addition, we demonstrate that the HapMap cell lines themselves may serve as a useful resource for quantitative measurements at the cellular level.
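The abstract names Bonferroni and false discovery rate corrections without detail. The sketch below shows the two in their textbook forms, using the Benjamini-Hochberg procedure as one common way to control the false discovery rate. This is an assumption for illustration: the study's exact FDR procedure and its permutation scheme are not given here, and the p-values are made up, not data from the paper.

```python
def bonferroni(pvalues, alpha=0.05):
    """Reject a test only if its p-value is at most alpha divided by the number of tests."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def benjamini_hochberg(pvalues, alpha=0.05):
    """Benjamini-Hochberg step-up procedure for false discovery rate control."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k whose p-value satisfies p_(k) <= (k / m) * alpha.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            cutoff = rank
    # Reject every hypothesis ranked at or below that cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff:
            rejected[idx] = True
    return rejected


# Made-up p-values for eight hypothetical SNP-expression tests.
example_pvalues = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
bonferroni_calls = bonferroni(example_pvalues)
bh_calls = benjamini_hochberg(example_pvalues)
```

On these illustrative values, Bonferroni keeps only the smallest p-value while Benjamini-Hochberg keeps the two smallest, which reflects the usual trade-off: Bonferroni controls the chance of any false positive, whereas FDR methods tolerate a controlled fraction of false positives in exchange for power.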
••
TL;DR: It is shown that expression of clock genes in osteoblasts is regulated by the sympathetic nervous system and leptin, which determines the extent of bone formation by modulating, via sympathetic signaling, osteoblast proliferation through two antagonistic pathways, one of which involves the molecular clock.
••
TL;DR: It is reported here that inner ears of Lcc/Lcc mice fail to establish a prosensory domain and neither hair cells nor supporting cells differentiate, resulting in a severe inner ear malformation, whereas the sensory epithelium of Ysb/Ysb mice shows abnormal development with disorganized and fewer hair cells.
Abstract: Sensory hair cells and their associated non-sensory supporting cells in the inner ear are fundamental for hearing and balance. They arise from a common progenitor, but little is known about the molecular events specifying this cell lineage. We recently identified two allelic mouse mutants, light coat and circling (Lcc) and yellow submarine (Ysb), that show hearing and balance impairment. Lcc/Lcc mice are completely deaf, whereas Ysb/Ysb mice are severely hearing impaired. We report here that inner ears of Lcc/Lcc mice fail to establish a prosensory domain and neither hair cells nor supporting cells differentiate, resulting in a severe inner ear malformation, whereas the sensory epithelium of Ysb/Ysb mice shows abnormal development with disorganized and fewer hair cells. These phenotypes are due to the absence (in Lcc mutants) or reduced expression (in Ysb mutants) of the transcription factor SOX2, specifically within the developing inner ear. SOX2 continues to be expressed in the inner ears of mice lacking Math1 (also known as Atoh1 and HATH1), a gene essential for hair cell differentiation, whereas Math1 expression is absent in Lcc mutants, suggesting that Sox2 acts upstream of Math1.
••
TL;DR: The literature on C. rodentium, a natural mouse pathogen that provides an excellent in vivo model for A/E lesion-forming pathogens, is reviewed from its emergence in the mid-1960s to the most contemporary reports of colonization, pathogenesis, transmission and immunity.
Abstract: The major classes of enteric bacteria harbour a conserved core genomic structure, common to both commensal and pathogenic strains, that is most likely optimized to a life style involving colonization of the host intestine and transmission via the environment. In pathogenic bacteria this core genome framework is decorated with novel genetic islands that are often associated with adaptive phenotypes such as virulence. This classical genome organization is well illustrated by a group of extracellular enteric pathogens, which includes enteropathogenic Escherichia coli (EPEC), enterohaemorrhagic E. coli (EHEC) and Citrobacter rodentium, all of which use attaching and effacing (A/E) lesion formation as a major mechanism of tissue targeting and infection. Both EHEC and EPEC are poorly pathogenic in mice but infect humans and domestic animals. In contrast, C. rodentium is a natural mouse pathogen that is related to E. coli, hence providing an excellent in vivo model for A/E lesion forming pathogens. C. rodentium also provides a model of infections that are mainly restricted to the lumen of the intestine. The mechanisms by which the immune system deals with such infections have become a topic of great interest in recent years. Here we review the literature of C. rodentium from its emergence in the mid-1960s to the most contemporary reports of colonization, pathogenesis, transmission and immunity.
••
Wellcome Trust Sanger Institute1, Wellcome Trust2, Ludwig Institute for Cancer Research3, University College London4, Cambridge University Hospitals NHS Foundation Trust5, St James's University Hospital6, University of Hong Kong7, Erasmus University Rotterdam8, University of Pennsylvania9, Van Andel Institute10
TL;DR: The results suggest that several mutated protein kinases may be contributing to lung cancer development, but that mutations in each one are infrequent.
Abstract: Protein kinases are frequently mutated in human cancer and inhibitors of mutant protein kinases have proven to be effective anticancer drugs. We screened the coding sequences of 518 protein kinases (approximately 1.3 Mb of DNA per sample) for somatic mutations in 26 primary lung neoplasms and seven lung cancer cell lines. One hundred eighty-eight somatic mutations were detected in 141 genes. Of these, 35 were synonymous (silent) changes. This result indicates that most of the 188 mutations were "passenger" mutations that are not causally implicated in oncogenesis. However, an excess of approximately 40 nonsynonymous substitutions compared with that expected by chance (P = 0.07) suggests that some nonsynonymous mutations have been selected and are contributing to oncogenesis. There was considerable variation between individual lung cancers in the number of mutations observed and no mutations were found in lung carcinoids. The mutational spectra of most lung cancers were characterized by a high proportion of C:G > A:T transversions, compatible with the mutagenic effects of tobacco carcinogens. However, one neuroendocrine cancer cell line had a distinctive mutational spectrum reminiscent of UV-induced DNA damage. The results suggest that several mutated protein kinases may be contributing to lung cancer development, but that mutations in each one are infrequent.
••
TL;DR: A novel combination of factors that explains almost 60% of variable response to warfarin is reported, and genotype-based dose predictions may in future enable personalised drug treatment from the start of warfarin therapy.
Abstract: We report a novel combination of factors that explains almost 60% of variable response to warfarin. Warfarin is a widely used anticoagulant, which acts through interference with vitamin K epoxide reductase that is encoded by VKORC1. In the next step of the vitamin K cycle, gamma-glutamyl carboxylase encoded by GGCX uses reduced vitamin K to activate clotting factors. We genotyped 201 warfarin-treated patients for common polymorphisms in VKORC1 and GGCX. All the five VKORC1 single-nucleotide polymorphisms covary significantly with warfarin dose, and explain 29–30% of variance in dose. Thus, VKORC1 has a larger impact than cytochrome P450 2C9, which explains 12% of variance in dose. In addition, one GGCX SNP showed a small but significant effect on warfarin dose. Incorrect dosage, especially during the initial phase of treatment, carries a high risk of either severe bleeding or failure to prevent thromboembolism. Genotype-based dose predictions may in future enable personalised drug treatment from the start of warfarin therapy.
••
Swedish Defence Research Agency1, Defence Science and Technology Laboratory2, Lawrence Livermore National Laboratory3, Centers for Disease Control and Prevention4, Uppsala University5, SRI International6, Walter Reed Army Institute of Research7, Umeå University8, Wellcome Trust Sanger Institute9, University of London10
TL;DR: The complete genome sequence of a highly virulent isolate of F. tularensis is reported and an unexpectedly high proportion of disrupted pathways are found, explaining the fastidious nutritional requirements of the bacterium.
Abstract: Francisella tularensis is one of the most infectious human pathogens known. In the past, both the former Soviet Union and the US had programs to develop weapons containing the bacterium. We report the complete genome sequence of a highly virulent isolate of F. tularensis (1,892,819 bp). The sequence uncovers previously uncharacterized genes encoding type IV pili, a surface polysaccharide and iron-acquisition systems. Several virulence-associated genes were located in a putative pathogenicity island, which was duplicated in the genome. More than 10% of the putative coding sequences contained insertion-deletion or substitution mutations and seemed to be deteriorating. The genome is rich in IS elements, including IS630 Tc-1 mariner family transposons, which are not expected in a prokaryote. We used a computational method for predicting metabolic pathways and found an unexpectedly high proportion of disrupted pathways, explaining the fastidious nutritional requirements of the bacterium. The loss of biosynthetic pathways indicates that F. tularensis is an obligate host-dependent bacterium in its natural life cycle. Our results have implications for our understanding of how highly virulent human pathogens evolve and will expedite strategies to combat them.
••
TL;DR: Cytogenetic analysis now extends beyond the simple description of the chromosomal status of a genome and allows the study of fundamental biological questions, such as the nature of inherited syndromes, the genomic changes that are involved in tumorigenesis and the three-dimensional organization of the human genome.
Abstract: Exciting advances in fluorescence in situ hybridization and array-based techniques are changing the nature of cytogenetics, in both basic research and molecular diagnostics. Cytogenetic analysis now extends beyond the simple description of the chromosomal status of a genome and allows the study of fundamental biological questions, such as the nature of inherited syndromes, the genomic changes that are involved in tumorigenesis and the three-dimensional organization of the human genome. The high resolution that is achieved by these techniques, particularly by microarray technologies such as array comparative genomic hybridization, is blurring the traditional distinction between cytogenetics and molecular biology.
••
TL;DR: A previously unknown immunoglobulin class, immunoglobulin ζ, is identified in zebrafish and other teleosts, raising questions concerning the evolution of immunoglobulins and the regulation of the differential expression of ighz and ighm.
Abstract: The only immunoglobulin heavy-chain classes known so far in teleosts have been mu and delta. We identify here a previously unknown class, immunoglobulin zeta, expressed in zebrafish and other teleosts. In the zebrafish heavy-chain locus, variable (V) gene segments lie upstream of two tandem diversity, joining and constant (DJC) clusters, resembling the mouse T cell receptor alpha (Tcra) and delta (Tcrd) locus. V genes rearrange to (DJC)ζ or to (DJC)μ without evidence of switch rearrangement. The zebrafish immunoglobulin zeta gene (ighz) and mouse Tcrd, which are proximal to the V gene array, are expressed earlier in development. In adults, ighz was expressed only in kidney and thymus, which are primary lymphoid organs in teleosts. This additional class adds complexity to the immunoglobulin repertoire and raises questions concerning the evolution of immunoglobulins and the regulation of the differential expression of ighz and ighm.
••
TL;DR: The embryonic origin, signalling roles and ultimate fate of the notochord are discussed, with an emphasis on structural aspects of notochord biology.
Abstract: The notochord is the defining structure of the chordates, and has essential roles in vertebrate development. It serves as a source of midline signals that pattern surrounding tissues and as a major skeletal element of the developing embryo. Genetic and embryological studies over the past decade have informed us about the development and function of the notochord. In this review, I discuss the embryonic origin, signalling roles and ultimate fate of the notochord, with an emphasis on structural aspects of notochord biology.
••
TL;DR: The coding sequence of 518 protein kinases was examined in 25 breast cancers; many tumors had no detectable somatic mutations, but a few had numerous somatic mutations with distinctive patterns indicative of either a mutator phenotype or a past exposure.
Abstract: We examined the coding sequence of 518 protein kinases, approximately 1.3 Mb of DNA per sample, in 25 breast cancers. In many tumors, we detected no somatic mutations. But a few had numerous somatic mutations with distinctive patterns indicative of either a mutator phenotype or a past exposure.
••
TL;DR: The genome sequence of Theileria parva is reported, an apicomplexan pathogen causing economic losses to smallholder farmers in Africa, and its plastid-like genome represents the first example where all apicoplast genes are encoded on one DNA strand.
Abstract: We report the genome sequence of Theileria parva, an apicomplexan pathogen causing economic losses to smallholder farmers in Africa. The parasite chromosomes exhibit limited conservation of gene synteny with Plasmodium falciparum, and its plastid-like genome represents the first example where all apicoplast genes are encoded on one DNA strand. We tentatively identify proteins that facilitate parasite segregation during host cell cytokinesis and contribute to persistent infection of transformed host cells. Several biosynthetic pathways are incomplete or absent, suggesting substantial metabolic dependence on the host cell. One protein family that may generate parasite antigenic diversity is not telomere-associated.
••
University of California, San Francisco1, National Research Council2, Pasteur Institute3, Wellcome Trust Sanger Institute4, University of Texas Health Science Center at Houston5, University of Aberdeen6, University of Illinois at Urbana–Champaign7, University of Würzburg8, Université de Montréal9, McGill University10, Stanford University11, Weizmann Institute of Science12, University of Minnesota13, University of Lausanne14, Columbia University15
TL;DR: Improved annotation permitted a detailed analysis of several multigene families, and comparative genomic studies showed that C. albicans has a far greater catabolic range, encoding respiratory Complex 1, several novel oxidoreductases and ketone body degrading enzymes, malonyl-CoA and enoyl-CoA carriers, and numerous transporters to assimilate the resulting nutrients.
Abstract: Recent sequencing and assembly of the genome for the fungal pathogen Candida albicans used simple automated procedures for the identification of putative genes. We have reviewed the entire assembly, both by hand and with additional bioinformatic resources, to accurately map and describe 6,354 genes and to identify 246 genes whose original database entries contained sequencing errors (or possibly mutations) that affect their reading frame. Comparison with other fungal genomes permitted the identification of numerous fungus-specific genes that might be targeted for antifungal therapy. We also observed that, compared to other fungi, the protein-coding sequences in the C. albicans genome are especially rich in short sequence repeats. Finally, our improved annotation permitted a detailed analysis of several multigene families, and comparative genomic studies showed that C. albicans has a far greater catabolic range, encoding respiratory Complex 1, several novel oxidoreductases and ketone body degrading enzymes, malonyl-CoA and enoyl-CoA carriers, several novel amino acid degrading enzymes, a variety of secreted catabolic lipases and proteases, and numerous transporters to assimilate the resulting nutrients. The results of these efforts will ensure that the Candida research community has uniform and comprehensive genomic information for medical research as well as for future diagnostic and therapeutic applications.