Author

Yoko Sato

Bio: Yoko Sato is an academic researcher at Fujitsu. The author has contributed to research in topics including KEGG and Genome. The author has an h-index of 10 and has co-authored 14 publications receiving 13,681 citations. Previous affiliations of Yoko Sato include Kyoto University.

Topics: KEGG, Genome, Comparative genomics, Faraday cage, The Internet ...

Papers

Journal Article

KEGG: new perspectives on genomes, pathways, diseases and drugs

Minoru Kanehisa¹, Miho Furumichi¹, Mao Tanabe¹, Yoko Sato², Kanae Morishima¹

Kyoto University¹, Fujitsu²

04 Jan 2017-Nucleic Acids Research

TL;DR: The content has been expanded and the quality improved irrespective of whether or not the KOs appear in the three molecular network databases, and the newly introduced addendum category of the GENES database is a collection of individual proteins whose functions are experimentally characterized and from which an increasing number of KOs are defined.

Abstract: KEGG (http://www.kegg.jp/ or http://www.genome.jp/kegg/) is an encyclopedia of genes and genomes. Assigning functional meanings to genes and genomes both at the molecular and higher levels is the primary objective of the KEGG database project. Molecular-level functions are stored in the KO (KEGG Orthology) database, where each KO is defined as a functional ortholog of genes and proteins. Higher-level functions are represented by networks of molecular interactions, reactions and relations in the forms of KEGG pathway maps, BRITE hierarchies and KEGG modules. In the past the KO database was developed for the purpose of defining nodes of molecular networks, but now the content has been expanded and the quality improved irrespective of whether or not the KOs appear in the three molecular network databases. The newly introduced addendum category of the GENES database is a collection of individual proteins whose functions are experimentally characterized and from which an increasing number of KOs are defined. Furthermore, the DISEASE and DRUG databases have been improved by systematic analysis of drug labels for better integration of diseases and drugs with the KEGG molecular networks. KEGG is moving towards becoming a comprehensive knowledge base for both functional interpretation and practical application of genomic information.

5,741 citations

Journal Article

KEGG as a reference resource for gene and protein annotation

Minoru Kanehisa¹, Yoko Sato², Masayuki Kawashima², Miho Furumichi¹, Mao Tanabe¹

Kyoto University¹, Fujitsu²

04 Jan 2016-Nucleic Acids Research

TL;DR: The KEGG GENES database now includes viruses, plasmids, and the addendum category for functionally characterized proteins that are not represented in complete genomes, and new automatic annotation servers, BlastKOALA and GhostKOALA, are made available utilizing the non-redundant pangenome data set generated from the GENES database.

Abstract: KEGG (http://www.kegg.jp/ or http://www.genome.jp/kegg/) is an integrated database resource for biological interpretation of genome sequences and other high-throughput data. Molecular functions of genes and proteins are associated with ortholog groups and stored in the KEGG Orthology (KO) database. The KEGG pathway maps, BRITE hierarchies and KEGG modules are developed as networks of KO nodes, representing high-level functions of the cell and the organism. Currently, more than 4000 complete genomes are annotated with KOs in the KEGG GENES database, which can be used as a reference data set for KO assignment and subsequent reconstruction of KEGG pathways and other molecular networks. As an annotation resource, the following improvements have been made. First, each KO record is re-examined and associated with protein sequence data used in experiments of functional characterization. Second, the GENES database now includes viruses, plasmids, and the addendum category for functionally characterized proteins that are not represented in complete genomes. Third, new automatic annotation servers, BlastKOALA and GhostKOALA, are made available utilizing the non-redundant pangenome data set generated from the GENES database. As a resource for translational bioinformatics, various data sets are created for antimicrobial resistance and drug interaction networks.

4,847 citations

Journal Article

Data, information, knowledge and principle: back to metabolism in KEGG

Minoru Kanehisa¹, Susumu Goto¹, Yoko Sato¹, Masayuki Kawashima¹, Miho Furumichi¹, Mao Tanabe¹

Kyoto University¹

01 Jan 2014-Nucleic Acids Research

TL;DR: The reaction modules, which represent chemical units of reactions, have been used to analyze design principles of metabolic networks and also to improve the definition of K numbers and associated annotations.

Abstract: In the hierarchy of data, information and knowledge, computational methods play a major role in the initial processing of data to extract information, but they alone become less effective to compile knowledge from information. The Kyoto Encyclopedia of Genes and Genomes (KEGG) resource (http://www.kegg.jp/ or http://www.genome.jp/kegg/) has been developed as a reference knowledge base to assist this latter process. In particular, the KEGG pathway maps are widely used for biological interpretation of genome sequences and other high-throughput data. The link from genomes to pathways is made through the KEGG Orthology system, a collection of manually defined ortholog groups identified by K numbers. To better automate this interpretation process the KEGG modules defined by Boolean expressions of K numbers have been expanded and improved. Once genes in a genome are annotated with K numbers, the KEGG modules can be computationally evaluated revealing metabolic capacities and other phenotypic features. The reaction modules, which represent chemical units of reactions, have been used to analyze design principles of metabolic networks and also to improve the definition of K numbers and associated annotations. For translational bioinformatics, the KEGG MEDICUS resource has been developed by integrating drug labels (package inserts) used in society.

2,808 citations

Journal Article

BlastKOALA and GhostKOALA: KEGG Tools for Functional Characterization of Genome and Metagenome Sequences.

Minoru Kanehisa¹, Yoko Sato², Kanae Morishima¹

Kyoto University¹, Fujitsu²

22 Feb 2016-Journal of Molecular Biology

TL;DR: Both BlastKOALA and GhostKOALA are automatic annotation servers for genome and metagenome sequences, which perform KO (KEGG Orthology) assignments to characterize individual gene functions and reconstruct KEGG pathways, BRITE hierarchies and KEGG modules to infer high-level functions of the organism or the ecosystem.

2,247 citations

Journal Article

KEGG: integrating viruses and cellular organisms.

Minoru Kanehisa¹, Miho Furumichi¹, Yoko Sato², Mari Ishiguro-Watanabe³, Mao Tanabe¹

Kyoto University¹, Fujitsu², University of Tokyo³

08 Jan 2021-Nucleic Acids Research

TL;DR: The KEGG pathway maps are now integrated with network variation maps in the NETWORK database, as well as with conserved functional units of KEGG modules and reaction modules in the MODULE database, and the KO database for functional orthologs continues to be improved.

Abstract: KEGG (https://www.kegg.jp/) is a manually curated resource integrating eighteen databases categorized into systems, genomic, chemical and health information. It also provides KEGG mapping tools, which enable understanding of cellular and organism-level functions from genome sequences and other molecular datasets. KEGG mapping is a predictive method of reconstructing molecular network systems from molecular building blocks based on the concept of functional orthologs. Since the introduction of the KEGG NETWORK database, various diseases have been associated with network variants, which are perturbed molecular networks caused by human gene variants, viruses, other pathogens and environmental factors. The network variation maps are created as aligned sets of related networks showing, for example, how different viruses inhibit or activate specific cellular signaling pathways. The KEGG pathway maps are now integrated with network variation maps in the NETWORK database, as well as with conserved functional units of KEGG modules and reaction modules in the MODULE database. The KO database for functional orthologs continues to be improved and virus KOs are being expanded for better understanding of virus-cell interactions and for enabling prediction of viral perturbations.

2,087 citations

Cited by

Journal Article

STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets.

Damian Szklarczyk¹, Annika L. Gable¹, David Lyon¹, Alexander Junge², Stefan Wyder¹, Jaime Huerta-Cepas³, Milan Simonovic¹, Nadezhda Tsankova Doncheva², John H. Morris⁴, Peer Bork, Lars Juhl Jensen², Christian von Mering¹

Swiss Institute of Bioinformatics¹, University of Copenhagen², Technical University of Madrid³, University of California, San Francisco⁴

08 Jan 2019-Nucleic Acids Research

TL;DR: The latest version of STRING more than doubles the number of organisms it covers, and offers an option to upload entire, genome-wide datasets as input, allowing users to visualize subsets as interaction networks and to perform gene-set enrichment analysis on the entire input.

Abstract: Proteins and their functional interactions form the backbone of the cellular machinery. Their connectivity network needs to be considered for the full understanding of biological phenomena, but the available information on protein-protein associations is incomplete and exhibits varying levels of annotation granularity and reliability. The STRING database aims to collect, score and integrate all publicly available sources of protein-protein interaction information, and to complement these with computational predictions. Its goal is to achieve a comprehensive and objective global network, including direct (physical) as well as indirect (functional) interactions. The latest version of STRING (11.0) more than doubles the number of organisms it covers, to 5090. The most important new feature is an option to upload entire, genome-wide datasets as input, allowing users to visualize subsets as interaction networks and to perform gene-set enrichment analysis on the entire input. For the enrichment analysis, STRING implements well-known classification systems such as Gene Ontology and KEGG, but also offers additional, new classification systems based on high-throughput text-mining as well as on a hierarchical clustering of the association network itself. The STRING resource is available online at https://string-db.org/.

10,584 citations

Journal Article

STRING v10: protein–protein interaction networks, integrated over the tree of life

Damian Szklarczyk¹, Andrea Franceschini¹, Stefan Wyder¹, Kristoffer Forslund, Davide Heller¹, Jaime Huerta-Cepas, Milan Simonovic¹, Alexander Roth¹, Alberto Santos², Kalliopi Tsafou², Michael Kuhn³, Peer Bork, Lars Juhl Jensen², Christian von Mering¹

Swiss Institute of Bioinformatics¹, University of Copenhagen², Dresden University of Technology³

28 Jan 2015-Nucleic Acids Research

TL;DR: Hierarchical and self-consistent orthology annotations are introduced for all interacting proteins, grouping the proteins into families at various levels of phylogenetic resolution in the STRING database.

Abstract: The many functional partnerships and interactions that occur between proteins are at the core of cellular processing and their systematic characterization helps to provide context in molecular systems biology. However, known and predicted interactions are scattered over multiple resources, and the available data exhibit notable differences in terms of quality and completeness. The STRING database (http://string-db.org) aims to provide a critical assessment and integration of protein-protein interactions, including direct (physical) as well as indirect (functional) associations. The new version 10.0 of STRING covers more than 2000 organisms, which has necessitated novel, scalable algorithms for transferring interaction information between organisms. For this purpose, we have introduced hierarchical and self-consistent orthology annotations for all interacting proteins, grouping the proteins into families at various levels of phylogenetic resolution. Further improvements in version 10.0 include a completely redesigned prediction pipeline for inferring protein-protein associations from co-expression data, an API interface for the R computing environment and improved statistical analysis for enrichment tests in user-provided networks.

8,224 citations

Journal Article

KEGG: new perspectives on genomes, pathways, diseases and drugs

Minoru Kanehisa¹, Miho Furumichi¹, Mao Tanabe¹, Yoko Sato², Kanae Morishima¹

Kyoto University¹, Fujitsu²

04 Jan 2017-Nucleic Acids Research

5,741 citations

Journal Article

The Perseus computational platform for comprehensive analysis of (prote)omics data.

Stefka Tyanova¹, Tikira Temu¹, Pavel Sinitcyn¹, Arthur Carlson¹, Marco Y. Hein², Tamar Geiger³, Matthias Mann¹, Jürgen Cox¹

Max Planck Society¹, University of California, San Francisco², Tel Aviv University³

01 Sep 2016-Nature Methods

TL;DR: The Perseus software platform was developed to support biological and biomedical researchers in interpreting protein quantification, interaction and post-translational modification data and it is anticipated that Perseus's arsenal of algorithms and its intuitive usability will empower interdisciplinary analysis of complex large data sets.

Abstract: A main bottleneck in proteomics is the downstream biological analysis of highly multivariate quantitative protein abundance data generated using mass-spectrometry-based analysis. We developed the Perseus software platform (http://www.perseus-framework.org) to support biological and biomedical researchers in interpreting protein quantification, interaction and post-translational modification data. Perseus contains a comprehensive portfolio of statistical tools for high-dimensional omics data analysis covering normalization, pattern recognition, time-series analysis, cross-omics comparisons and multiple-hypothesis testing. A machine learning module supports the classification and validation of patient groups for diagnosis and prognosis, and it also detects predictive protein signatures. Central to Perseus is a user-friendly, interactive workflow environment that provides complete documentation of computational methods used in a publication. All activities in Perseus are realized as plugins, and users can extend the software by programming their own, which can be shared through a plugin store. We anticipate that Perseus's arsenal of algorithms and its intuitive usability will empower interdisciplinary analysis of complex large data sets.

5,165 citations