Author

Brian Smith-White

Bio: Brian Smith-White is an academic researcher from the National Institutes of Health. The author has contributed to research on topics including Genome and Genome project. The author has an h-index of 5 and has co-authored 5 publications receiving 3475 citations.
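For readers unfamiliar with the metric, the h-index is the largest number h such that h of an author's papers each have at least h citations. The following is a minimal sketch of that calculation, not the method this site uses; the function name `h_index` is illustrative, and the input values are the per-paper citation counts shown in the Papers list on this page.

```python
def h_index(citations):
    """Return the largest h such that at least h papers have >= h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Citation counts as listed for the five papers on this page.
listed_counts = [4104, 342, 254, 19, 16]
print(h_index(listed_counts))  # prints 5
```

With only five papers, the h-index cannot exceed 5, and it reaches 5 here because even the least-cited listed paper has at least 5 citations.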

Topics: Genome, Genome project, RefSeq, Biological database, GenBank ...

Papers

Journal Article•DOI•

Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation

Nuala A. O'Leary¹, Mathew W. Wright¹, J. Rodney Brister¹, Stacy Ciufo¹, Diana Haddad¹, Richard McVeigh¹, Bhanu Rajput¹, Barbara Robbertse¹, Brian Smith-White¹, Danso Ako-adjei¹, Alexander Astashyn¹, Azat Badretdin¹, Yiming Bao¹, Olga Blinkova¹, Vyacheslav Brover¹, Vyacheslav Chetvernin¹, Jinna Choi¹, Eric Cox¹, Olga Ermolaeva¹, Catherine M. Farrell¹, Tamara Goldfarb¹, Tripti Gupta¹, Daniel H. Haft¹, Eneida L. Hatcher¹, Wratko Hlavina¹, Vinita Joardar¹, Vamsi K. Kodali¹, Wenjun Li¹, Donna Maglott¹, Patrick Masterson¹, Kelly M. McGarvey¹, Michael R. Murphy¹, Kathleen O'Neill¹, Shashikant Pujar¹, Sanjida H. Rangwala¹, Daniel Rausch¹, Lillian D. Riddick¹, Conrad L. Schoch¹, Andrei Shkeda¹, Susan S. Storz¹, Hanzhen Sun¹, Françoise Thibaud-Nissen¹, Igor Tolstoy¹, Raymond E. Tully¹, Anjana R. Vatsan¹, Craig Wallin¹, David Webb¹, Wendy Wu¹, Melissa J. Landrum¹, Avi Kimchi¹, Tatiana Tatusova¹, Michael DiCuccio¹, Paul Kitts¹, Terence Murphy¹, Kim D. Pruitt¹ - Show less +51 more•Institutions (1)

National Institutes of Health¹

04 Jan 2016-Nucleic Acids Research

TL;DR: The approach to utilizing available RNA-Seq and other data types in the authors' manual curation process for vertebrate, plant, and other species is summarized, and a new direction for prokaryotic genomes and protein name management is described.

Abstract: The RefSeq project at the National Center for Biotechnology Information (NCBI) maintains and curates a publicly available database of annotated genomic, transcript, and protein sequence records (http://www.ncbi.nlm.nih.gov/refseq/). The RefSeq project leverages the data submitted to the International Nucleotide Sequence Database Collaboration (INSDC) against a combination of computation, manual curation, and collaboration to produce a standard set of stable, non-redundant reference sequences. The RefSeq project augments these reference sequences with current knowledge including publications, functional features and informative nomenclature. The database currently represents sequences from more than 55,000 organisms (>4800 viruses, >40,000 prokaryotes and >10,000 eukaryotes; RefSeq release 71), ranging from a single record to complete genomes. This paper summarizes the current status of the viral, prokaryotic, and eukaryotic branches of the RefSeq project, reports on improvements to data access and details efforts to further expand the taxonomic representation of the collection. We also highlight diverse functional curation initiatives that support multiple uses of RefSeq data including taxonomic validation, genome annotation, comparative genomics, and clinical testing. We summarize our approach to utilizing available RNA-Seq and other data types in our manual curation process for vertebrate, plant, and other species, and describe a new direction for prokaryotic genomes and protein name management.

4,104 citations

Journal Article•DOI•

The Rice Annotation Project Database (RAP-DB): 2008 update

Tsuyoshi Tanaka¹, Baltazar A. Antonio¹, Shoshi Kikuchi¹, Takashi Matsumoto¹, Yoshiaki Nagamura¹, Hisataka Numa¹, Hiroaki Sakai¹, Jianzhong Wu¹, Takeshi Itoh¹, Takeshi Itoh², Takuji Sasaki¹, Ryo Aono, Yasuyuki Fujii³, Takuya Habara, Erimi Harada, Masako Kanno, Yoshihiro Kawahara⁴, Hiroaki Kawashima, Hiromi Kubooka, Akihiro Matsuya, Hajime Nakaoka, Naomi Saichi, Ryoko Sanbonmatsu, Yoshiharu Sato, Yuji Shinso, Mami Suzuki, Jun-ichi Takeda, Motohiko Tanino, Fusano Todokoro, Kaori Yamaguchi, Naoyuki Yamamoto, Chisato Yamasaki, Tadashi Imanishi², Toshihisa Okido, Masahito Tada, Kazuho Ikeo, Yoshio Tateno, Takashi Gojobori, Yao-Cheng Lin⁵, Fu Jin Wei⁵, Yue-Ie C. Hsing⁵, Qiang Zhao, Bin Han, Melissa Kramer⁶, Richard W. McCombie⁶, David Lonsdale⁷, Claire O'Donovan⁷, Eleanor J. Whitfield⁷, Rolf Apweiler⁷, Kanako O. Koyanagi⁸, Jitendra P. Khurana⁹, Saurabh Raghuvanshi⁹, Nagendra K. Singh¹⁰, Akhilesh K. Tyagi⁹, Georg Haberer, Masaki Fujisawa, Satomi Hosokawa, Yukiyo Ito, Hiroshi Ikawa, Michie Shibata, Mayu Yamamoto, Richard Bruskiewich¹¹, Douglas R. Hoen¹², Thomas E. Bureau¹², Nobukazu Namiki¹³, Hajime Ohyanagi¹³, Yasumichi Sakai¹³, Satoshi Nobushima¹³, Katsumi Sakata¹³, Roberto A. Barrero¹⁴, Yutaka Sato¹⁵, Alexandre Souvorov¹⁶, Brian Smith-White¹⁶, Tatiana Tatusova¹⁶, Suyoung An¹⁷, Gynheung An¹⁷, Satoshi Oota, Galina Fuks¹⁸, Joachim Messing, Karen R. Christie¹⁹, Damien Lieberherr²⁰, Hyeran Kim²¹, Andrea Zuccolo²¹, Rod A. Wing, Kan Nobuta²², Pamela J. Green²², Cheng Lu²², Blake C. Meyers²², Cristian Chaparro²³, Benoît Piégu²³, Olivier Panaud²³, Manuel Echeverria²³ - Show less +88 more•Institutions (23)

University of Tsukuba¹, National Institute of Advanced Industrial Science and Technology², Okayama University³, Tokyo Metropolitan University⁴, Academia Sinica⁵, Cold Spring Harbor Laboratory⁶, Wellcome Trust⁷, Hokkaido University⁸, University of Delhi⁹, Indian Council of Agricultural Research¹⁰, International Rice Research Institute¹¹, McGill University¹², Mitsubishi¹³, Murdoch University¹⁴, Nagoya University¹⁵, National Institutes of Health¹⁶, Pohang University of Science and Technology¹⁷, Rutgers University¹⁸, Stanford University¹⁹, Swiss Institute of Bioinformatics²⁰, University of Arizona²¹, University of Delaware²², University of Perpignan²³

17 Dec 2007-Nucleic Acids Research

TL;DR: The latest version of the RAP-DB contains a variety of annotation data as follows: clone positions, structures and functions of 31 439 genes validated by cDNAs, RNA genes detected by massively parallel signature sequencing (MPSS) technology and sequence similarity, flanking sequences of mutant lines, transposable elements, etc.

Abstract: The Rice Annotation Project Database (RAP-DB) was created to provide the genome sequence assembly of the International Rice Genome Sequencing Project (IRGSP), manually curated annotation of the sequence, and other genomics information that could be useful for comprehensive understanding of the rice biology. Since the last publication of the RAP-DB, the IRGSP genome has been revised and reassembled. In addition, a large number of rice-expressed sequence tags have been released, and functional genomics resources have been produced worldwide. Thus, we have thoroughly updated our genome annotation by manual curation of all the functional descriptions of rice genes. The latest version of the RAP-DB contains a variety of annotation data as follows: clone positions, structures and functions of 31 439 genes validated by cDNAs, RNA genes detected by massively parallel signature sequencing (MPSS) technology and sequence similarity, flanking sequences of mutant lines, transposable elements, etc. Other annotation data such as Gnomon can be displayed along with those of RAP for comparison. We have also developed a new keyword search system to allow the user to access useful information. The RAP-DB is available at: http://rapdb.dna.affrc.go.jp/ and http://rapdb.lab.nig.ac.jp/.

342 citations

Journal Article•DOI•

Curated genome annotation of Oryza sativa ssp. japonica and comparative genome analysis with Arabidopsis thaliana

Takeshi Itoh¹, Takeshi Itoh², Tsuyoshi Tanaka², Roberto A. Barrero, Chisato Yamasaki¹, Yasuyuki Fujii¹, Phillip Hilton¹, Baltazar A. Antonio², Hideo Aono, Rolf Apweiler, Richard Bruskiewich³, Thomas E. Bureau⁴, Frances A. Burr⁵, Antonio Costa de Oliveira⁶, Galina Fuks⁷, Takuya Habara¹, Georg Haberer, Bin Han, Erimi Harada¹, Aiko T. Hiraki¹, Hirohiko Hirochika², Douglas R. Hoen⁴, Hiroki Hokari¹, Satomi Hosokawa, Yue-Ie C. Hsing⁸, Hiroshi Ikawa⁹, Kazuho Ikeo, Tadashi Imanishi¹⁰, Tadashi Imanishi¹, Yukiyo Ito, Pankaj Jaiswal¹¹, Masako Kanno¹, Yoshihiro Kawahara¹, Yoshihiro Kawahara¹², Toshiyuki Kawamura¹, Hiroaki Kawashima¹, Jitendra P. Khurana¹³, Shoshi Kikuchi², Setsuko Komatsu², Kanako O. Koyanagi¹⁰, Hiromi Kubooka¹, Damien Lieberherr¹⁴, Yao-Cheng Lin⁸, David M. Lonsdale, Takashi Matsumoto², Akihiro Matsuya¹, W. Richard McCombie¹⁵, Joachim Messing⁷, Akio Miyao², Nicola Mulder, Yoshiaki Nagamura², Jongmin Nam¹⁶, Jongmin Nam¹⁷, Nobukazu Namiki, Hisataka Numa², Shin Nurimoto¹, Claire O'Donovan, Hajime Ohyanagi⁹, Toshihisa Okido, Satoshi Oota, Naoki Osato, Lance E. Palmer¹⁸, Lance E. Palmer¹⁵, Francis Quetier¹⁹, Saurabh Raghuvanshi¹³, Naomi Saichi¹, Hiroaki Sakai¹, Hiroaki Sakai², Yasumichi Sakai⁹, Katsumi Sakata⁹, Tetsuya Sakurai, Fumihiko Sato¹, Yoshiharu Sato¹, Heiko Schoof²⁰, Heiko Schoof²¹, Motoaki Seki, Michie Shibata, Yuji Shimizu⁹, Kazuo Shinozaki, Yuji Shinso¹, Nagendra K. Singh²², Brian Smith-White²³, Jun-ichi Takeda¹, Motohiko Tanino¹, Tatiana Tatusova²³, Supat Thongjuea²⁴, Fusano Todokoro¹, Mika Tsugane, Akhilesh K. Tyagi¹³, Apichart Vanavichit²⁴, Aihui Wang²⁵, Rod A. Wing, Kaori Yamaguchi¹, Mayu Yamamoto, Naoyuki Yamamoto¹, Yeisoo Yu²⁶, Hao Zhang¹, Qiang Zhao, Kenichi Higo², Benjamin Burr⁵, Takashi Gojobori¹, Takuji Sasaki² - Show less +98 more•Institutions (26)

National Institute of Advanced Industrial Science and Technology¹, University of Tsukuba², International Rice Research Institute³, McGill University⁴, Brookhaven National Laboratory⁵, University of Georgia⁶, Rutgers University⁷, Academia Sinica⁸, Mitsubishi⁹, Hokkaido University¹⁰, Cornell University¹¹, Tokyo Metropolitan University¹², University of Delhi¹³, Swiss Institute of Bioinformatics¹⁴, Cold Spring Harbor Laboratory¹⁵, California Institute of Technology¹⁶, Pennsylvania State University¹⁷, Stony Brook University¹⁸, Université Paris-Saclay¹⁹, Max Planck Society²⁰, Technische Universität München²¹, Indian Council of Agricultural Research²², National Institutes of Health²³, Kasetsart University²⁴, J. Craig Venter Institute²⁵, University of Arizona²⁶

01 Feb 2007-Genome Research

TL;DR: The results suggest that natural selection may have played a role for duplicated genes in both species, so that duplication was suppressed or favored in a manner that depended on the function of a gene.

Abstract: We present here the annotation of the complete genome of rice Oryza sativa L. ssp. japonica cultivar Nipponbare. All functional annotations for proteins and non-protein-coding RNA (npRNA) candidates were manually curated. Functions were identified or inferred in 19,969 (70%) of the proteins, and 131 possible npRNAs (including 58 antisense transcripts) were found. Almost 5000 annotated protein-coding genes were found to be disrupted in insertional mutant lines, which will accelerate future experimental validation of the annotations. The rice loci were determined by using cDNA sequences obtained from rice and other representative cereals. Our conservative estimate based on these loci and an extrapolation suggested that the gene number of rice is ∼32,000, which is smaller than previous estimates. We conducted comparative analyses between rice and Arabidopsis thaliana and found that both genomes possessed several lineage-specific genes, which might account for the observed differences between these species, while they had similar sets of predicted functional domains among the protein sequences. A system to control translational efficiency seems to be conserved across large evolutionary distances. Moreover, the evolutionary process of protein-coding genes was examined. Our results suggest that natural selection may have played a role for duplicated genes in both species, so that duplication was suppressed or favored in a manner that depended on the function of a gene.

254 citations

Journal Article•DOI•

Plant Genome Resources at the National Center for Biotechnology Information

David L. Wheeler¹, Brian Smith-White¹, Vyacheslav Chetvernin¹, Sergei Resenchuk¹, Susan M. Dombrowski¹, Steven W. Pechous¹, Tatiana Tatusova¹, James Ostell¹•Institutions (1)

National Institutes of Health¹

01 Jul 2005-Plant Physiology

TL;DR: The National Center for Biotechnology Information (NCBI) integrates data from more than 20 biological databases through a flexible search and retrieval system called Entrez, which makes Entrez a powerful system for genomic research.

Abstract: The National Center for Biotechnology Information (NCBI) integrates data from more than 20 biological databases through a flexible search and retrieval system called Entrez. A core Entrez database, Entrez Nucleotide, includes GenBank and is tightly linked to the NCBI Taxonomy database, the Entrez Protein database, and the scientific literature in PubMed. A suite of more specialized databases for genomes, genes, gene families, gene expression, gene variation, and protein domains dovetails with the core databases to make Entrez a powerful system for genomic research. Linked to the full range of Entrez databases is the NCBI Map Viewer, which displays aligned genetic, physical, and sequence maps for eukaryotic genomes including those of many plants. A specialized plant query page allows maps from all plant genomes covered by the Map Viewer to be searched in tandem to produce a display of aligned maps from several species. PlantBLAST searches against the sequences shown in the Map Viewer allow BLAST alignments to be viewed within a genomic context. In addition, precomputed sequence similarities, such as those for proteins offered by BLAST Link, enable fluid navigation from unannotated to annotated sequences, quickening the pace of discovery. NCBI Web pages for plants, such as Plant Genome Central, complete the system by providing centralized access to NCBI's genomic resources as well as links to organism-specific Web pages beyond NCBI.

19 citations

Book Chapter•DOI•

A collection of plant-specific genomic data and resources at NCBI.

Tatiana Tatusova¹, Brian Smith-White¹, James Ostell¹•Institutions (1)

National Institutes of Health¹

01 Jan 2007-Methods in Molecular Biology

TL;DR: The National Center for Biotechnology Information provides a data-rich environment in support of genomic research by collecting the biological data for genomes, genes, gene expressions, gene variation, gene families, proteins, and protein domains and integrating the data with analytical, search, and retrieval resources through the NCBI Web site.

Abstract: The National Center for Biotechnology Information (NCBI) provides a data-rich environment in support of genomic research by collecting the biological data for genomes, genes, gene expressions, gene variation, gene families, proteins, and protein domains and integrating the data with analytical, search, and retrieval resources through the NCBI Web site. Entrez, an integrated search and retrieval system, enables text searches across various diverse biological databases maintained at NCBI. Map Viewer, the genome browser developed at NCBI, displays aligned genetic, physical, and sequence maps for eukaryotic genomes including those of many plants. A specialized plant query page allows maps from all plant genomes available in the Map Viewer to be searched to produce a display of aligned maps from several species. Customized Plant Basic Local Alignment Search Tool (PlantBLAST) allows the user to perform sequence similarity searches in a special collection of mapped plant sequence data and to view the resulting alignments within a genomic context using Map Viewer. In addition, pre-computed sequence similarities, such as those for proteins offered by BLAST Link (BLink), enable fluid navigation from un-annotated to annotated sequences, quickening the pace of discovery. Plant Genome Central (PGC) is a Web portal that provides centralized access to all NCBI plant genome resources. Also, there are links to plant-specific Web resources external to NCBI such as organism-specific databases, genome-sequencing project Web pages, and homepages of genomic bioinformatics organizations.

16 citations

Cited by

Journal Article•DOI•

KEGG: new perspectives on genomes, pathways, diseases and drugs

Minoru Kanehisa¹, Miho Furumichi¹, Mao Tanabe¹, Yoko Sato², Kanae Morishima¹•Institutions (2)

Kyoto University¹, Fujitsu²

04 Jan 2017-Nucleic Acids Research

TL;DR: The content has been expanded and the quality improved irrespective of whether or not the KOs appear in the three molecular network databases, and the newly introduced addendum category of the GENES database is a collection of individual proteins whose functions are experimentally characterized and from which an increasing number of KOs are defined.

Abstract: KEGG (http://www.kegg.jp/ or http://www.genome.jp/kegg/) is an encyclopedia of genes and genomes. Assigning functional meanings to genes and genomes both at the molecular and higher levels is the primary objective of the KEGG database project. Molecular-level functions are stored in the KO (KEGG Orthology) database, where each KO is defined as a functional ortholog of genes and proteins. Higher-level functions are represented by networks of molecular interactions, reactions and relations in the forms of KEGG pathway maps, BRITE hierarchies and KEGG modules. In the past the KO database was developed for the purpose of defining nodes of molecular networks, but now the content has been expanded and the quality improved irrespective of whether or not the KOs appear in the three molecular network databases. The newly introduced addendum category of the GENES database is a collection of individual proteins whose functions are experimentally characterized and from which an increasing number of KOs are defined. Furthermore, the DISEASE and DRUG databases have been improved by systematic analysis of drug labels for better integration of diseases and drugs with the KEGG molecular networks. KEGG is moving towards becoming a comprehensive knowledge base for both functional interpretation and practical application of genomic information.

5,741 citations

Journal Article•DOI•

The Sorghum bicolor genome and the diversification of grasses

Andrew H. Paterson¹, John E. Bowers¹, Rémy Bruggmann², Inna Dubchak³, Jane Grimwood⁴, Heidrun Gundlach, Georg Haberer, Uffe Hellsten³, Therese Mitros⁵, Alexander Poliakov³, Jeremy Schmutz⁴, Manuel Spannagl, Haibao Tang¹, Xiyin Wang¹, Xiyin Wang⁶, Thomas Wicker⁷, Arvind K. Bharti², Jarrod Chapman³, F. Alex Feltus⁸, F. Alex Feltus¹, Udo Gowik⁹, Igor V. Grigoriev³, Eric Lyons⁵, Christopher G. Maher¹⁰, Mihaela Martis, Apurva Narechania¹⁰, Robert Otillar³, Bryan W. Penning¹¹, Asaf Salamov³, Yu Wang, Lifang Zhang¹⁰, Nicholas C. Carpita¹¹, Michael Freeling⁵, Alan R. Gingle¹, C. Thomas Hash¹², Beat Keller⁷, Patricia E. Klein¹³, Stephen Kresovich¹⁴, Maureen C. McCann¹¹, Ray Ming¹⁵, Daniel G. Peterson¹⁶, Daniel G. Peterson¹, Mehboob-ur-Rahman¹⁷, Mehboob-ur-Rahman¹, Doreen Ware¹⁸, Doreen Ware¹⁰, Peter Westhoff⁹, Klaus F. X. Mayer, Joachim Messing², Daniel S. Rokhsar³, Daniel S. Rokhsar⁴ - Show less +47 more•Institutions (18)

University of Georgia¹, Rutgers University², United States Department of Energy³, Stanford University⁴, University of California, Berkeley⁵, North China University of Science and Technology⁶, University of Zurich⁷, Clemson University⁸, University of Düsseldorf⁹, Cold Spring Harbor Laboratory¹⁰, Purdue University¹¹, International Crops Research Institute for the Semi-Arid Tropics¹², Texas A&M University¹³, Cornell University¹⁴, University of Illinois at Urbana–Champaign¹⁵, Mississippi State University¹⁶, National Institute for Biotechnology and Genetic Engineering¹⁷, United States Department of Agriculture¹⁸

29 Jan 2009-Nature

TL;DR: An initial analysis of the ∼730-megabase Sorghum bicolor (L.) Moench genome is presented, placing ∼98% of genes in their chromosomal context using whole-genome shotgun sequence validated by genetic, physical and syntenic information.

Abstract: Sorghum, an African grass related to sugar cane and maize, is grown for food, feed, fibre and fuel. We present an initial analysis of the approximately 730-megabase Sorghum bicolor (L.) Moench genome, placing approximately 98% of genes in their chromosomal context using whole-genome shotgun sequence validated by genetic, physical and syntenic information. Genetic recombination is largely confined to about one-third of the sorghum genome with gene order and density similar to those of rice. Retrotransposon accumulation in recombinationally recalcitrant heterochromatin explains the approximately 75% larger genome size of sorghum compared with rice. Although gene and repetitive DNA distributions have been preserved since palaeopolyploidization approximately 70 million years ago, most duplicated gene sets lost one member before the sorghum-rice divergence. Concerted evolution makes one duplicated chromosomal segment appear to be only a few million years old. About 24% of genes are grass-specific and 7% are sorghum-specific. Recent gene and microRNA duplications may contribute to sorghum's drought tolerance.

2,809 citations

Journal Article•DOI•

InterPro: the integrative protein signature database

Sarah Hunter¹, Rolf Apweiler, Teresa K. Attwood, Amos Marc Bairoch, Alex Bateman, David Binns, Peer Bork, Ujjwal Das, Louise C. Daugherty, Lauranne Duquenne, Robert D. Finn, Julian Gough, Daniel H. Haft, Nicolas Hulo, Daniel Kahn, Elizabeth Kelly, Aurélie Laugraud, Ivica Letunic, David M. Lonsdale, Rodrigo Lopez, Martin Madera, John Maslen, Craig McAnulla, Jennifer McDowall, Jaina Mistry, Alex L. Mitchell, Nicola Mulder, Darren A. Natale, Christine A. Orengo, Antony F. Quinn, Jeremy D. Selengut, Christian J. A. Sigrist, Manjula Thimma, Paul Thomas, Franck Valentin, Derek Wilson, Cathy H. Wu, Corin Yeats - Show less +34 more•Institutions (1)

European Bioinformatics Institute¹

01 Jan 2009-Nucleic Acids Research

TL;DR: The InterPro database integrates together predictive models or ‘signatures’ representing protein domains, families and functional sites from multiple, diverse source databases: Gene3D, PANTHER, Pfam, PIRSF, PRINTS, ProDom, PROSITE, SMART, SUPERFAMILY and TIGRFAMs.

Abstract: The InterPro database (http://www.ebi.ac.uk/interpro/) integrates together predictive models or 'signatures' representing protein domains, families and functional sites from multiple, diverse source databases: Gene3D, PANTHER, Pfam, PIRSF, PRINTS, ProDom, PROSITE, SMART, SUPERFAMILY and TIGRFAMs. Integration is performed manually and approximately half of the total approximately 58,000 signatures available in the source databases belong to an InterPro entry. Recently, we have started to also display the remaining un-integrated signatures via our web interface. Other developments include the provision of non-signature data, such as structural data, in new XML files on our FTP site, as well as the inclusion of matchless UniProtKB proteins in the existing match XML files. The web interface has been extended and now links out to the ADAN predicted protein-protein interaction database and the SPICE and Dasty viewers. The latest public release (v18.0) covers 79.8% of UniProtKB (v14.1) and consists of 16 549 entries. InterPro data may be accessed either via the web address above, via web services, by downloading files by anonymous FTP or by using the InterProScan search software (http://www.ebi.ac.uk/Tools/InterProScan/).

1,834 citations

Journal Article•DOI•

A Completely Reimplemented MPI Bioinformatics Toolkit with a New HHpred Server at its Core

Lukas Zimmermann¹, Andrew Stephens¹, Seung-Zin Nam¹, David Rau¹, Jonas M. Kübler¹, Marko Lozajic¹, Felix Gabler¹, Johannes Söding¹, Andrei N. Lupas¹, Vikram Alva¹•Institutions (1)

Max Planck Society¹

01 Dec 2017-Journal of Molecular Biology

TL;DR: The new version of the MPI Bioinformatics Toolkit is introduced, focusing on improved features for the comprehensive analysis of proteins, as well as on promoting teaching.

1,757 citations

Journal Article•DOI•

Genome sequencing and analysis of the model grass Brachypodium distachyon

John P. Vogel¹, David F. Garvin², Todd C. Mockler², Jeremy Schmutz, Daniel S. Rokhsar³, Michael W. Bevan⁴, Kerrie Barry⁵, Susan Lucas⁵, Miranda Harmon-Smith⁵, Kathleen Lail⁵, Hope Tice⁵, Jane Grimwood, Neil McKenzie⁴, Naxin Huo⁶, Yong Q. Gu⁶, Gerard R. Lazo⁶, Olin D. Anderson⁶, Frank M. You⁷, Ming-Cheng Luo⁷, Jan Dvorak⁷, Jonathan M. Wright⁴, Melanie Febrer⁴, Dominika Idziak⁸, Robert Hasterok⁸, Erika Lindquist⁵, Mei Wang⁵, Samuel E. Fox², Henry D. Priest², Sergei A. Filichkin², Scott A. Givan², Douglas W. Bryant², Jeff H. Chang², Haiyan Wu⁹, Wei Wu¹⁰, An-Ping Hsia¹⁰, Patrick S. Schnable⁹, Anantharaman Kalyanaraman¹¹, Brad Barbazuk¹², Todd P. Michael, Samuel P. Hazen¹³, Jennifer N. Bragg⁶, Debbie Laudencia-Chingcuanco⁶, Yiqun Weng¹⁴, Georg Haberer, Manuel Spannagl, Klaus F. X. Mayer, Thomas Rattei¹⁵, Therese Mitros³, Sang-Jik Lee¹⁶, Jocelyn K. C. Rose¹⁶, Lukas A. Mueller¹⁶, Thomas L. York¹⁶, Thomas Wicker¹⁷, Jan P. Buchmann¹⁷, Jaakko Tanskanen¹⁸, Alan H. Schulman¹⁸, Heidrun Gundlach, Michael W. Bevan⁴, Antonio Costa de Oliveira¹⁹, Luciano da C. Maia¹⁹, William R. Belknap⁶, Ning Jiang, Jinsheng Lai⁹, Liucun Zhu²⁰, Jianxin Ma²⁰, Cheng Sun²¹, Ellen J. Pritham²¹, Jérôme Salse, Florent Murat, Michael Abrouk, Rémy Bruggmann, Joachim Messing, Noah Fahlgren², Christopher M. Sullivan², James C. Carrington², Elisabeth J. Chapman, Greg D. May²², Jixian Zhai²³, Matthias Ganssmann²³, Sai Guna Ranjan Gurazada²³, Marcelo A German²³, Blake C. Meyers²³, Pamela J. Green²³, Ludmila Tyler³, Jiajie Wu⁷, James A. Thomson⁶, Shan Chen¹³, Henrik Vibe Scheller²⁴, Jesper Harholt²⁵, Peter Ulvskov²⁵, Jeffrey A. Kimbrel², Laura E. Bartley²⁴, Peijian Cao²⁴, Ki-Hong Jung²⁶, Manoj Sharma²⁴, Miguel E. Vega-Sánchez²⁴, Pamela C. Ronald²⁴, Chris Dardick⁶, Stefanie De Bodt²⁷, Wim Verelst²⁷, Dirk Inzé²⁷, Maren Heese²⁸, Arp Schnittger²⁸, Xiaohan Yang²⁹, Udaya C. Kalluri²⁹, Gerald A. Tuskan²⁹, Zhihua Hua¹⁴, Richard D. Vierstra¹⁴, Yu Cui⁹, Shuhong Ouyang⁹, Qixin Sun⁹, Zhiyong Liu⁹, Alper Yilmaz³⁰, Erich Grotewold³⁰, Richard Sibout³¹, Kian Hématy³¹, Grégory Mouille³¹, Herman Höfte³¹, Todd P. Michael, Jérôme Pelloux³², Devin O'Connor³, James C. Schnable³, Scott C. Rowe³, Frank G. Harmon³, Cynthia L. Cass³³, John C. Sedbrook³³, Mary E. Byrne⁴, Sean Walsh⁴, Janet Higgins⁴, Pinghua Li¹⁶, Thomas P. Brutnell¹⁶, Turgay Unver³⁴, Hikmet Budak³⁴, Harry Belcram, Mathieu Charles, Boulos Chalhoub, Ivan Baxter³⁵•Institutions (35)

Agricultural Research Service¹, Oregon State University², University of California, Berkeley³, John Innes Centre⁴, United States Department of Energy⁵, United States Department of Agriculture⁶, University of California, Davis⁷, University of Silesia in Katowice⁸, China Agricultural University⁹, Iowa State University¹⁰, Washington State University¹¹, University of Florida¹², University of Massachusetts Amherst¹³, University of Wisconsin-Madison¹⁴, Technische Universität München¹⁵, Cornell University¹⁶, University of Zurich¹⁷, University of Helsinki¹⁸, Universidade Federal de Pelotas¹⁹, Purdue University²⁰, University of Texas at Arlington²¹, National Center for Genome Resources²², University of Delaware²³, Joint BioEnergy Institute²⁴, University of Copenhagen²⁵, Kyung Hee University²⁶, Ghent University²⁷, Centre national de la recherche scientifique²⁸, Oak Ridge National Laboratory²⁹, Ohio State University³⁰, Institut national de la recherche agronomique³¹, University of Picardie Jules Verne³², Illinois State University³³, Sabancı University³⁴, Donald Danforth Plant Science Center³⁵

11 Feb 2010-Nature

TL;DR: The high-quality genome sequence will help Brachypodium reach its potential as an important model system for developing new energy and food crops and establishes a template for analysis of the large genomes of economically important pooid grasses such as wheat.

Abstract: Three subfamilies of grasses, the Ehrhartoideae, Panicoideae and Pooideae, provide the bulk of human nutrition and are poised to become major sources of renewable energy. Here we describe the genome sequence of the wild grass Brachypodium distachyon (Brachypodium), which is, to our knowledge, the first member of the Pooideae subfamily to be sequenced. Comparison of the Brachypodium, rice and sorghum genomes shows a precise history of genome evolution across a broad diversity of the grasses, and establishes a template for analysis of the large genomes of economically important pooid grasses such as wheat. The high-quality genome sequence, coupled with ease of cultivation and transformation, small size and rapid life cycle, will help Brachypodium reach its potential as an important model system for developing new energy and food crops.

1,603 citations
