Author
Alejandra Gonzalez-Beltran
Other affiliations: University College London, European Bioinformatics Institute, Queen's University Belfast
Bio: Alejandra Gonzalez-Beltran is an academic researcher at the Science and Technology Facilities Council. The author has contributed to research on metadata and ontology (information science), has an h-index of 23, and has co-authored 79 publications receiving 7248 citations. Previous affiliations include University College London and the European Bioinformatics Institute.
Papers
••
Technical University of Madrid1, Stanford University2, Elsevier3, VU University Amsterdam4, National Institutes of Health5, University of Leicester6, Harvard University7, Beijing Genomics Institute8, Maastricht University9, Wageningen University and Research Centre10, University of Oxford11, Heriot-Watt University12, University of Manchester13, University of California, San Diego14, Leiden University Medical Center15, Leiden University16, Federal University of São Paulo17, Science for Life Laboratory18, Bayer19, Swiss Institute of Bioinformatics20, Cray21, University Medical Center Groningen22, Erasmus University Rotterdam23
TL;DR: The FAIR Data Principles are a set of data reuse principles that emphasize enhancing the ability of machines to automatically find and use data, in addition to supporting its reuse by individuals.
Abstract: There is an urgent need to improve the infrastructure supporting the reuse of scholarly data. A diverse set of stakeholders—representing academia, industry, funding agencies, and scholarly publishers—have come together to design and jointly endorse a concise and measureable set of principles that we refer to as the FAIR Data Principles. The intent is that these may act as a guideline for those wishing to enhance the reusability of their data holdings. Distinct from peer initiatives that focus on the human scholar, the FAIR Principles put specific emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals. This Comment is the first formal publication of the FAIR Principles, and includes the rationale behind them, and some exemplar implementations in the community.
7,602 citations
••
TL;DR: The MetaboLights repository, powered by the open source ISA framework, is cross-species and cross-technique and will cover metabolite structures and their reference spectra as well as their biological roles, locations, concentrations and raw data from metabolic experiments.
Abstract: MetaboLights (http://www.ebi.ac.uk/metabolights) is the first general-purpose, open-access repository for metabolomics studies, their raw experimental data and associated metadata, maintained by one of the major open-access data providers in molecular biology. Metabolomic profiling is an important tool for research into biological functioning and into the systemic perturbations caused by diseases, diet and the environment. The effectiveness of such methods depends on the availability of public open data across a broad range of experimental methods and conditions. The MetaboLights repository, powered by the open source ISA framework, is cross-species and cross-technique. It will cover metabolite structures and their reference spectra as well as their biological roles, locations, concentrations and raw data from metabolic experiments. Studies automatically receive a stable unique accession number that can be used as a publication reference (e.g. MTBLS1). At present, the repository includes 15 submitted studies, encompassing 93 protocols for 714 assays, and spans 8 different species, including human, Caenorhabditis elegans, Mus musculus and Arabidopsis thaliana. Eight hundred twenty-seven of the metabolites identified in these studies have been mapped to ChEBI. These studies cover a variety of techniques, including NMR spectroscopy and mass spectrometry.
585 citations
••
University of California, San Diego1, BC Cancer Research Centre2, University of Arkansas for Medical Sciences3, Oregon Health & Science University4, Drexel University5, University of Maryland, Baltimore6, Thermo Fisher Scientific7, Simon Fraser University8, Vrije Universiteit Brussel9, Stanford University10, Research Triangle Park11, National Institutes of Health12, Royal Society of Chemistry13, University of Oxford14, University of Michigan15, University at Buffalo16, Newcastle University17, European Bioinformatics Institute18, University of Pennsylvania19, Southern Methodist University20, University of Manchester21, La Jolla Institute for Allergy and Immunology22, J. Craig Venter Institute23, Leibniz Association24, Brunel University London25, Georgia State University26
TL;DR: The state of OBI and several applications that are using it are described, such as adding semantic expressivity to existing databases, building data entry forms, and enabling interoperability between knowledge resources.
Abstract: The Ontology for Biomedical Investigations (OBI) is an ontology that provides terms with precisely defined meanings to describe all aspects of how investigations in the biological and medical domains are conducted. OBI re-uses ontologies that provide a representation of biomedical knowledge from the Open Biological and Biomedical Ontologies (OBO) project and adds the ability to describe how this knowledge was derived. We here describe the state of OBI and several applications that are using it, such as adding semantic expressivity to existing databases, building data entry forms, and enabling interoperability between knowledge resources. OBI covers all phases of the investigation process, such as planning, execution and reporting. It represents information and material entities that participate in these processes, as well as roles and functions. Prior to OBI, it was not possible to use a single internally consistent resource that could be applied to multiple types of experiments for these applications. OBI has made this possible by creating terms for entities involved in biological and medical investigations and by importing parts of other biomedical ontologies such as GO, Chemical Entities of Biological Interest (ChEBI) and Phenotype Attribute and Trait Ontology (PATO) without altering their meaning. OBI is being used in a wide range of projects covering genomics, multi-omics, immunology, and catalogs of services. OBI has also spawned other ontologies (Information Artifact Ontology) and methods for importing parts of ontologies (Minimum information to reference an external ontology term (MIREOT)). The OBI project is an open cross-disciplinary collaborative effort, encompassing multiple research communities from around the globe. To date, OBI has created 2366 classes and 40 relations along with textual and formal definitions. 
The OBI Consortium maintains a web resource (http://obi-ontology.org) providing details on the people, policies, and issues being addressed in association with OBI. The current release of OBI is available at http://purl.obolibrary.org/obo/obi.owl.
265 citations
••
Technical University of Madrid1, Stanford University2, Elsevier3, VU University Amsterdam4, National Institutes of Health5, University of Leicester6, Harvard University7, Beijing Genomics Institute8, Maastricht University9, Wageningen University and Research Centre10, University of Oxford11, Heriot-Watt University12, University of Manchester13, University of California, San Diego14, Leiden University Medical Center15, Leiden University16, Federal University of São Paulo17, Science for Life Laboratory18, Bayer19, Swiss Institute of Bioinformatics20, Cray21, University Medical Center Groningen22, Erasmus University Rotterdam23
TL;DR: The FAIR Data Principles are a set of data reuse principles that emphasize enhancing the ability of machines to automatically find and use data, in addition to supporting its reuse by individuals.
Abstract: There is an urgent need to improve the infrastructure supporting the reuse of scholarly data. A diverse set of stakeholders-representing academia, industry, funding agencies, and scholarly publishers-have come together to design and jointly endorse a concise and measureable set of principles that we refer to as the FAIR Data Principles. The intent is that these may act as a guideline for those wishing to enhance the reusability of their data holdings. Distinct from peer initiatives that focus on the human scholar, the FAIR Principles put specific emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals. This Comment is the first formal publication of the FAIR Principles, and includes the rationale behind them, and some exemplar implementations in the community.
220 citations
Cited by
••
Technical University of Madrid1, Stanford University2, Elsevier3, VU University Amsterdam4, National Institutes of Health5, University of Leicester6, Harvard University7, Beijing Genomics Institute8, Maastricht University9, Wageningen University and Research Centre10, University of Oxford11, Heriot-Watt University12, University of Manchester13, University of California, San Diego14, Leiden University Medical Center15, Leiden University16, Federal University of São Paulo17, Science for Life Laboratory18, Bayer19, Swiss Institute of Bioinformatics20, Cray21, University Medical Center Groningen22, Erasmus University Rotterdam23
TL;DR: The FAIR Data Principles are a set of data reuse principles that emphasize enhancing the ability of machines to automatically find and use data, in addition to supporting its reuse by individuals.
Abstract: There is an urgent need to improve the infrastructure supporting the reuse of scholarly data. A diverse set of stakeholders—representing academia, industry, funding agencies, and scholarly publishers—have come together to design and jointly endorse a concise and measureable set of principles that we refer to as the FAIR Data Principles. The intent is that these may act as a guideline for those wishing to enhance the reusability of their data holdings. Distinct from peer initiatives that focus on the human scholar, the FAIR Principles put specific emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals. This Comment is the first formal publication of the FAIR Principles, and includes the rationale behind them, and some exemplar implementations in the community.
7,602 citations
••
TL;DR: A comparison with the ECOD structural classification database led to the creation of 825 new Pfam families based on its set of uncharacterized families (EUFs), and Pfam entries were connected to the Sequence Ontology (SO) by mapping the Pfam type definitions to SO terms.
Abstract: The last few years have witnessed significant changes in Pfam (https://pfam.xfam.org). The number of families has grown substantially to a total of 17,929 in release 32.0. New additions have been coupled with efforts to improve existing families, including refinement of domain boundaries, their classification into Pfam clans, as well as their functional annotation. We recently began to collaborate with the RepeatsDB resource to improve the definition of tandem repeat families within Pfam. We carried out a significant comparison to the structural classification database, namely the Evolutionary Classification of Protein Domains (ECOD) that led to the creation of 825 new families based on their set of uncharacterized families (EUFs). Furthermore, we also connected Pfam entries to the Sequence Ontology (SO) through mapping of the Pfam type definitions to SO terms. Since Pfam has many community contributors, we recently enabled the linking between authorship of all Pfam entries with the corresponding authors' ORCID identifiers. This effectively permits authors to claim credit for their Pfam curation and link them to their ORCID record.
3,617 citations
••
TL;DR: Data access to the GWAS Catalog is improved with the release of a new RESTful API to support high-throughput programmatic access, an improved web interface and a new summary statistics database.
Abstract: The GWAS Catalog delivers a high-quality curated collection of all published genome-wide association studies enabling investigations to identify causal variants, understand disease mechanisms, and establish targets for novel therapies. The scope of the Catalog has also expanded to targeted and exome arrays with 1000 new associations added for these technologies. As of September 2018, the Catalog contains 5687 GWAS comprising 71673 variant-trait associations from 3567 publications. New content includes 284 full P-value summary statistics datasets for genome-wide and new targeted array studies, representing 6 × 10^9 individual variant-trait statistics. In the last 12 months, the Catalog's user interface was accessed by ∼90000 unique users who viewed >1 million pages. We have improved data access with the release of a new RESTful API to support high-throughput programmatic access, an improved web interface and a new summary statistics database. Summary statistics provision is supported by a new format proposed as a community standard for summary statistics data representation. This format was derived from our experience in standardizing heterogeneous submissions, mapping formats and in harmonizing content. Availability: https://www.ebi.ac.uk/gwas/.
2,878 citations
••
University of California, San Diego1, University of Montana2, Stanford University3, Scripps Institution of Oceanography4, National Autonomous University of Mexico5, Salk Institute for Biological Studies6, San Diego State University7, Strathclyde Institute of Pharmacy and Biomedical Sciences8, Lawrence Berkeley National Laboratory9, Harvard University10, University of Rennes11, University of Minnesota12, University of Lorraine13, Technical University of Denmark14, J. Craig Venter Institute15, University of California, Los Angeles16, University of Washington17, ETH Zurich18, University of Illinois at Chicago19, National Sun Yat-sen University20, Academia Sinica21, University of Münster22, Victoria University of Wellington23, University of North Carolina at Chapel Hill24, Indiana University25, Smithsonian Tropical Research Institute26, Federal University of Mato Grosso do Sul27, University of São Paulo28, University of Notre Dame29, University of California, Santa Cruz30, Oregon State University31, University of California, Berkeley32, Florida International University33, University of Hawaii at Manoa34, University of Geneva35, Institut de Chimie des Substances Naturelles36, Pacific Northwest National Laboratory37, National Institutes of Health38, Chinese Academy of Sciences39
TL;DR: In GNPS, crowdsourced curation of freely available community-wide reference MS libraries will underpin improved annotations and data-driven social-networking should facilitate identification of spectra and foster collaborations.
Abstract: The potential of the diverse chemistries present in natural products (NP) for biotechnology and medicine remains untapped because NP databases are not searchable with raw data and the NP community has no way to share data other than in published papers. Although mass spectrometry (MS) techniques are well-suited to high-throughput characterization of NP, there is a pressing need for an infrastructure to enable sharing and curation of data. We present Global Natural Products Social Molecular Networking (GNPS; http://gnps.ucsd.edu), an open-access knowledge base for community-wide organization and sharing of raw, processed or identified tandem mass (MS/MS) spectrometry data. In GNPS, crowdsourced curation of freely available community-wide reference MS libraries will underpin improved annotations. Data-driven social-networking should facilitate identification of spectra and foster collaborations. We also introduce the concept of 'living data' through continuous reanalysis of deposited data.
2,365 citations
••
Monash University1, University of Ottawa2, University of Amsterdam3, University of Paris4, Bond University5, University of Texas Health Science Center at San Antonio6, American University of Beirut7, Oregon Health & Science University8, University of York9, Ottawa Hospital Research Institute10, University of Southern Denmark11, Johns Hopkins University12, Brigham and Women's Hospital13, Indiana University14, University of Bristol15, University College London16, University of Toronto17
TL;DR: The Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) statement was developed to facilitate transparent and complete reporting of systematic reviews and has been updated (to PRISMA 2020) to reflect recent advances in systematic review methodology and terminology.
Abstract: The methods and results of systematic reviews should be reported in sufficient detail to allow users to assess the trustworthiness and applicability of the review findings. The Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) statement was developed to facilitate transparent and complete reporting of systematic reviews and has been updated (to PRISMA 2020) to reflect recent advances in systematic review methodology and terminology. Here, we present the explanation and elaboration paper for PRISMA 2020, where we explain why reporting of each item is recommended, present bullet points that detail the reporting recommendations, and present examples from published reviews. We hope that changes to the content and structure of PRISMA 2020 will facilitate uptake of the guideline and lead to more transparent, complete, and accurate reporting of systematic reviews.
2,217 citations