Topic

Genome

About: Genome is a research topic. Over its lifetime, 74,231 publications have been published within this topic, receiving 3,819,713 citations.


Papers
Journal Article
TL;DR: Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems.
Abstract: A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies—a whole-genome assembly and a regional chromosome assembly—were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ∼12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. 
Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.
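
The coverage figures in the abstract above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the paper: it assumes fold coverage is total sequenced bases divided by genome length, and it uses the 2.91-billion bp consensus length as the denominator, which may not be the genome size the authors used. All variable and function names are illustrative.

```python
# Figures quoted in the abstract above.
TOTAL_SEQUENCED_BP = 14.8e9      # total bases from the shotgun reads
NUM_READS = 27_271_853           # high-quality sequence reads
CONSENSUS_BP = 2.91e9            # euchromatic consensus length (assumed denominator)
REPORTED_COVERAGE = 5.11         # fold coverage stated in the abstract
SHREDDED_PUBLIC_COVERAGE = 2.9   # fold coverage from shredded public data


def fold_coverage(total_bp: float, genome_bp: float) -> float:
    """Average number of times each base is sequenced."""
    return total_bp / genome_bp


def mean_read_length(total_bp: float, num_reads: int) -> float:
    """Average length of one read, in bp."""
    return total_bp / num_reads


naive_coverage = fold_coverage(TOTAL_SEQUENCED_BP, CONSENSUS_BP)
read_length = mean_read_length(TOTAL_SEQUENCED_BP, NUM_READS)
combined_coverage = REPORTED_COVERAGE + SHREDDED_PUBLIC_COVERAGE

print(f"naive coverage:    {naive_coverage:.2f}-fold")
print(f"mean read length:  {read_length:.0f} bp")
print(f"combined coverage: {combined_coverage:.2f}-fold")
```

Under these assumptions the naive ratio comes out slightly below the reported 5.11-fold, which would be consistent with rounding or a somewhat smaller genome-size denominator. The combined figure of about 8 matches the "eightfold" effective coverage stated in the abstract.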

1,674 citations

Journal Article
26 Oct 2006-Nature
TL;DR: The genome sequence of the honeybee Apis mellifera is reported; population genetics suggests a novel African origin for the species A. mellifera and offers insights into whether Africanized bees spread throughout the New World via hybridization or displacement.
Abstract: Here we report the genome sequence of the honeybee Apis mellifera, a key model for social behaviour and essential to global ecology through pollination. Compared with other sequenced insect genomes, the A. mellifera genome has high A+T and CpG contents, lacks major transposon families, evolves more slowly, and is more similar to vertebrates for circadian rhythm, RNA interference and DNA methylation genes, among others. Furthermore, A. mellifera has fewer genes for innate immunity, detoxification enzymes, cuticle-forming proteins and gustatory receptors, more genes for odorant receptors, and novel genes for nectar and pollen utilization, consistent with its ecology and social organization. Compared to Drosophila, genes in early developmental pathways differ in Apis, whereas similarities exist for functions that differ markedly, such as sex determination, brain function and behaviour. Population genetics suggests a novel African origin for the species A. mellifera and insights into whether Africanized bees spread throughout the New World via hybridization or displacement.

1,673 citations

Journal Article
Yasushi Okazaki, Masaaki Furuno, Takeya Kasukawa, Jun Adachi, Hidemasa Bono, S. Kondo, Itoshi Nikaido, Naoki Osato, Rintaro Saito, Harukazu Suzuki, Itaru Yamanaka, H. Kiyosawa, Ken Yagi, Yasuhiro Tomaru, Yuki Hasegawa, A. Nogami, Christian Schönbach, Takashi Gojobori, Richard M. Baldarelli, David P. Hill, Carol J. Bult, David A. Hume, John Quackenbush, Lynn M. Schriml, Alexander Kanapin, Hideo Matsuda, Serge Batalov, Kirk W. Beisel, Judith A. Blake, Dirck W. Bradt, Vladimir Brusic, Cyrus Chothia, Lori E. Corbani, S. Cousins, Emiliano Dalla, Tommaso A. Dragani, Colin F. Fletcher, Alistair R. R. Forrest, K. S. Frazer, Terry Gaasterland, Manuela Gariboldi, Carmela Gissi, Adam Godzik, Julian Gough, Sean M. Grimmond, Stefano Gustincich, Nobutaka Hirokawa, Ian J. Jackson, Erich D. Jarvis, Akio Kanai, Hideya Kawaji, Yuka Imamura Kawasawa, Rafal M. Kedzierski, Benjamin L. King, Akihiko Konagaya, Igor V. Kurochkin, Yong-Hwan Lee, Boris Lenhard, Paul A. Lyons, Donna Maglott, Lois J. Maltais, Luigi Marchionni, Louise M. McKenzie, Harukata Miki, Takeshi Nagashima, Koji Numata, Toshihisa Okido, William J. Pavan, Geo Pertea, Graziano Pesole, Nikolai Petrovsky, Ramesh S. Pillai, Joan Pontius, D. Qi, Sridhar Ramachandran, Timothy Ravasi, Jonathan C. Reed, Deborah J. Reed, Jeffrey G. Reid, Brian Z. Ring, M. Ringwald, Albin Sandelin, Claudio Schneider, Colin A. Semple, Mitsutoshi Setou, K. Shimada, Razvan Sultana, Yoichi Takenaka, Martin S. Taylor, Rohan D. Teasdale, Masaru Tomita, Roberto Verardo, Lukas Wagner, Claes Wahlestedt, Y. Wang, Yoshiki Watanabe, Christine A. Wells, Laurens G. Wilming, Anthony Wynshaw-Boris, Masashi Yanagisawa, Ivana V. Yang, L. Yang, Zheng Yuan, Mihaela Zavolan, Yunhui Zhu, Anne M. Zimmer, Piero Carninci, N. Hayatsu, Tomoko Hirozane-Kishikawa, Hideaki Konno, M. Nakamura, Naoko Sakazume, K. Sato, Toshiyuki Shiraki, Kazunori Waki, Jun Kawai, Katsunori Aizawa, Takahiro Arakawa, S. Fukuda, A. Hara, W. Hashizume, K. Imotani, Y. Ishii, Masayoshi Itoh, Ikuko Kagawa, A. Miyazaki, K. Sakai, D. Sasaki, K. Shibata, Akira Shinagawa, Ayako Yasunishi, Masayasu Yoshino, Robert H. Waterston, Eric S. Lander, Jane Rogers, Ewan Birney, Yoshihide Hayashizaki
05 Dec 2002-Nature
TL;DR: The present work, completely supported by physical clones, provides the most comprehensive survey of a mammalian transcriptome so far, and is a valuable resource for functional genomics.
Abstract: Only a small proportion of the mouse genome is transcribed into mature messenger RNA transcripts. There is an international collaborative effort to identify all full-length mRNA transcripts from the mouse, and to ensure that each is represented in a physical collection of clones. Here we report the manual annotation of 60,770 full-length mouse complementary DNA sequences. These are clustered into 33,409 'transcriptional units', contributing 90.1% of a newly established mouse transcriptome database. Of these transcriptional units, 4,258 are new protein-coding and 11,665 are new non-coding messages, indicating that non-coding RNA is a major component of the transcriptome. 41% of all transcriptional units showed evidence of alternative splicing. In protein-coding transcripts, 79% of splice variations altered the protein product. Whole-transcriptome analyses resulted in the identification of 2,431 sense-antisense pairs. The present work, completely supported by physical clones, provides the most comprehensive survey of a mammalian transcriptome so far, and is a valuable resource for functional genomics.

1,663 citations

Journal Article
TL;DR: The web application GeSeq combines batch processing with a fully customizable reference sequence selection of organellar genome records from NCBI and/or references uploaded by the user to support high-quality annotations of chloroplast genomes.
Abstract: We have developed the web application GeSeq (https://chlorobox.mpimp-golm.mpg.de/geseq.html) for the rapid and accurate annotation of organellar genome sequences, in particular chloroplast genomes. In contrast to existing tools, GeSeq combines batch processing with a fully customizable reference sequence selection of organellar genome records from NCBI and/or references uploaded by the user. For the annotation of chloroplast genomes, the application additionally provides an integrated database of manually curated reference sequences. GeSeq identifies genes or other feature-encoding regions by BLAT-based homology searches and, additionally, by profile HMM searches for protein and rRNA coding genes and two de novo predictors for tRNA genes. These unique features enable the user to conveniently compare the annotations of different state-of-the-art methods, thus supporting high-quality annotations. The main output of GeSeq is a GenBank file that usually requires only a little curation and is instantly visualized by OGDRAW. GeSeq also offers a variety of optional additional outputs that facilitate downstream analyses, for example comparative genomic or phylogenetic studies.

1,663 citations

Journal Article
TL;DR: The likelihood that this ancient gene superfamily has existed for more than 3.5 billion years, and that the rate of P450 gene evolution appears to be quite nonlinear, is discussed.
Abstract: We provide here a list of 221 P450 genes and 12 putative pseudogenes that have been characterized as of December 14, 1992. These genes have been described in 31 eukaryotes (including 11 mammalian and 3 plant species) and 11 prokaryotes. Of 36 gene families so far described, 12 families exist in all mammals examined to date. These 12 families comprise 22 mammalian subfamilies, of which 17 and 15 have been mapped in the human and mouse genome, respectively. To date, each subfamily appears to represent a cluster of tightly linked genes. This revision supersedes the previous updates [Nebert et al., DNA 6, 1–11, 1987; Nebert et al., DNA 8, 1–13, 1989; Nebert et al., DNA Cell Biol. 10, 1–14 (1991)] in which a nomenclature system, based on divergent evolution of the superfamily, has been described. For the gene and cDNA, we recommend that the italicized root symbol "CYP" for human ("Cyp" for mouse), representing "cytochrome P450," be followed by an Arabic number denoting the family, a letter designating...

1,660 citations


Network Information
Related Topics (5)
Gene: 211.7K papers, 10.3M citations, 96% related
Transcription (biology): 56.5K papers, 2.9M citations, 92% related
RNA: 111.6K papers, 5.4M citations, 91% related
Regulation of gene expression: 85.4K papers, 5.8M citations, 91% related
Gene expression: 113.3K papers, 5.5M citations, 90% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2024    2
2023    7,313
2022    14,209
2021    4,955
2020    5,080
2019    4,839