Author

Alejandra Gonzalez-Beltran

Bio: Alejandra Gonzalez-Beltran is an academic researcher from the Science and Technology Facilities Council. The author has contributed to research on the topics Metadata and Ontology (information science), has an h-index of 23, and has co-authored 79 publications receiving 7248 citations. Previous affiliations of Alejandra Gonzalez-Beltran include University College London and the European Bioinformatics Institute.


Papers
Journal ArticleDOI
TL;DR: The FAIR Data Principles are a set of data reuse principles that emphasize enhancing the ability of machines to automatically find and use data, in addition to supporting its reuse by individuals.
Abstract: There is an urgent need to improve the infrastructure supporting the reuse of scholarly data. A diverse set of stakeholders, representing academia, industry, funding agencies, and scholarly publishers, have come together to design and jointly endorse a concise and measurable set of principles that we refer to as the FAIR Data Principles. The intent is that these may act as a guideline for those wishing to enhance the reusability of their data holdings. Distinct from peer initiatives that focus on the human scholar, the FAIR Principles put specific emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals. This Comment is the first formal publication of the FAIR Principles, and includes the rationale behind them, and some exemplar implementations in the community.

7,602 citations

Journal ArticleDOI
TL;DR: The MetaboLights repository, powered by the open source ISA framework, is cross-species and cross-technique and will cover metabolite structures and their reference spectra as well as their biological roles, locations, concentrations and raw data from metabolic experiments.
Abstract: MetaboLights (http://www.ebi.ac.uk/metabolights) is the first general-purpose, open-access repository for metabolomics studies, their raw experimental data and associated metadata, maintained by one of the major open-access data providers in molecular biology. Metabolomic profiling is an important tool for research into biological functioning and into the systemic perturbations caused by diseases, diet and the environment. The effectiveness of such methods depends on the availability of public open data across a broad range of experimental methods and conditions. The MetaboLights repository, powered by the open source ISA framework, is cross-species and cross-technique. It will cover metabolite structures and their reference spectra as well as their biological roles, locations, concentrations and raw data from metabolic experiments. Studies automatically receive a stable unique accession number that can be used as a publication reference (e.g. MTBLS1). At present, the repository includes 15 submitted studies, encompassing 93 protocols for 714 assays and spanning 8 different species, including human, Caenorhabditis elegans, Mus musculus and Arabidopsis thaliana. Eight hundred twenty-seven of the metabolites identified in these studies have been mapped to ChEBI. These studies cover a variety of techniques, including NMR spectroscopy and mass spectrometry.

585 citations

Journal ArticleDOI
29 Apr 2016-PLOS ONE
TL;DR: The state of OBI and several applications that are using it are described, such as adding semantic expressivity to existing databases, building data entry forms, and enabling interoperability between knowledge resources.
Abstract: The Ontology for Biomedical Investigations (OBI) is an ontology that provides terms with precisely defined meanings to describe all aspects of how investigations in the biological and medical domains are conducted. OBI re-uses ontologies that provide a representation of biomedical knowledge from the Open Biological and Biomedical Ontologies (OBO) project and adds the ability to describe how this knowledge was derived. We here describe the state of OBI and several applications that are using it, such as adding semantic expressivity to existing databases, building data entry forms, and enabling interoperability between knowledge resources. OBI covers all phases of the investigation process, such as planning, execution and reporting. It represents information and material entities that participate in these processes, as well as roles and functions. Prior to OBI, it was not possible to use a single internally consistent resource that could be applied to multiple types of experiments for these applications. OBI has made this possible by creating terms for entities involved in biological and medical investigations and by importing parts of other biomedical ontologies such as GO, Chemical Entities of Biological Interest (ChEBI) and Phenotype Attribute and Trait Ontology (PATO) without altering their meaning. OBI is being used in a wide range of projects covering genomics, multi-omics, immunology, and catalogs of services. OBI has also spawned other ontologies (Information Artifact Ontology) and methods for importing parts of ontologies (Minimum information to reference an external ontology term (MIREOT)). The OBI project is an open cross-disciplinary collaborative effort, encompassing multiple research communities from around the globe. To date, OBI has created 2366 classes and 40 relations along with textual and formal definitions. The OBI Consortium maintains a web resource (http://obi-ontology.org) providing details on the people, policies, and issues being addressed in association with OBI. The current release of OBI is available at http://purl.obolibrary.org/obo/obi.owl.

265 citations

Journal ArticleDOI
TL;DR: The FAIR Data Principles are a set of data reuse principles that emphasize enhancing the ability of machines to automatically find and use data, in addition to supporting its reuse by individuals.
Abstract: There is an urgent need to improve the infrastructure supporting the reuse of scholarly data. A diverse set of stakeholders, representing academia, industry, funding agencies, and scholarly publishers, have come together to design and jointly endorse a concise and measurable set of principles that we refer to as the FAIR Data Principles. The intent is that these may act as a guideline for those wishing to enhance the reusability of their data holdings. Distinct from peer initiatives that focus on the human scholar, the FAIR Principles put specific emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals. This Comment is the first formal publication of the FAIR Principles, and includes the rationale behind them, and some exemplar implementations in the community.

220 citations


Cited by
Journal ArticleDOI
TL;DR: The FAIR Data Principles are a set of data reuse principles that emphasize enhancing the ability of machines to automatically find and use data, in addition to supporting its reuse by individuals.
Abstract: There is an urgent need to improve the infrastructure supporting the reuse of scholarly data. A diverse set of stakeholders, representing academia, industry, funding agencies, and scholarly publishers, have come together to design and jointly endorse a concise and measurable set of principles that we refer to as the FAIR Data Principles. The intent is that these may act as a guideline for those wishing to enhance the reusability of their data holdings. Distinct from peer initiatives that focus on the human scholar, the FAIR Principles put specific emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals. This Comment is the first formal publication of the FAIR Principles, and includes the rationale behind them, and some exemplar implementations in the community.

7,602 citations

Journal ArticleDOI
TL;DR: Pfam was compared against the Evolutionary Classification of Protein Domains (ECOD) structural classification database, which led to the creation of 825 new families based on ECOD's set of uncharacterized families (EUFs), and Pfam entries were connected to the Sequence Ontology (SO) by mapping the Pfam type definitions to SO terms.
Abstract: The last few years have witnessed significant changes in Pfam (https://pfam.xfam.org). The number of families has grown substantially to a total of 17,929 in release 32.0. New additions have been coupled with efforts to improve existing families, including refinement of domain boundaries, their classification into Pfam clans, as well as their functional annotation. We recently began to collaborate with the RepeatsDB resource to improve the definition of tandem repeat families within Pfam. We carried out a significant comparison to the structural classification database, namely the Evolutionary Classification of Protein Domains (ECOD) that led to the creation of 825 new families based on their set of uncharacterized families (EUFs). Furthermore, we also connected Pfam entries to the Sequence Ontology (SO) through mapping of the Pfam type definitions to SO terms. Since Pfam has many community contributors, we recently enabled the linking between authorship of all Pfam entries with the corresponding authors' ORCID identifiers. This effectively permits authors to claim credit for their Pfam curation and link them to their ORCID record.

3,617 citations

Journal ArticleDOI
TL;DR: Data access to the GWAS Catalog is improved with the release of a new RESTful API to support high-throughput programmatic access, an improved web interface and a new summary statistics database.
Abstract: The GWAS Catalog delivers a high-quality curated collection of all published genome-wide association studies enabling investigations to identify causal variants, understand disease mechanisms, and establish targets for novel therapies. The scope of the Catalog has also expanded to targeted and exome arrays with 1000 new associations added for these technologies. As of September 2018, the Catalog contains 5687 GWAS comprising 71673 variant-trait associations from 3567 publications. New content includes 284 full P-value summary statistics datasets for genome-wide and new targeted array studies, representing 6 × 10^9 individual variant-trait statistics. In the last 12 months, the Catalog's user interface was accessed by ∼90000 unique users who viewed >1 million pages. We have improved data access with the release of a new RESTful API to support high-throughput programmatic access, an improved web interface and a new summary statistics database. Summary statistics provision is supported by a new format proposed as a community standard for summary statistics data representation. This format was derived from our experience in standardizing heterogeneous submissions, mapping formats and in harmonizing content. Availability: https://www.ebi.ac.uk/gwas/.

2,878 citations

Journal ArticleDOI
Mingxun Wang, Jeremy Carver, Vanessa V. Phelan, Laura M. Sanchez, Neha Garg, Yao Peng, Don D. Nguyen, Jeramie D. Watrous, Clifford A. Kapono, Tal Luzzatto-Knaan, Carla Porto, Amina Bouslimani, Alexey V. Melnik, Michael J. Meehan, Wei-Ting Liu, Max Crüsemann, Paul D. Boudreau, Eduardo Esquenazi, Mario Sandoval-Calderón, Roland D. Kersten, Laura A. Pace, Robert A. Quinn, Katherine R. Duncan, Cheng-Chih Hsu, Dimitrios J. Floros, Ronnie G. Gavilan, Karin Kleigrewe, Trent R. Northen, Rachel J. Dutton, Delphine Parrot, Erin E. Carlson, Bertrand Aigle, Charlotte Frydenlund Michelsen, Lars Jelsbak, Christian Sohlenkamp, Pavel A. Pevzner, Anna Edlund, Jeffrey S. McLean, Jörn Piel, Brian T. Murphy, Lena Gerwick, Chih-Chuang Liaw, Yu-Liang Yang, Hans-Ulrich Humpf, Maria Maansson, Robert A. Keyzers, Amy C. Sims, Andrew R. Johnson, Ashley M. Sidebottom, Brian E. Sedio, Andreas Klitgaard, Charles B. Larson, Cristopher A. Boya P., Daniel Torres-Mendoza, David Gonzalez, Denise Brentan Silva, Lucas Miranda Marques, Daniel P. Demarque, Egle Pociute, Ellis C. O’Neill, Enora Briand, Eric J. N. Helfrich, Eve A. Granatosky, Evgenia Glukhov, Florian Ryffel, Hailey Houson, Hosein Mohimani, Jenan J. Kharbush, Yi Zeng, Julia A. Vorholt, Kenji L. Kurita, Pep Charusanti, Kerry L. McPhail, Kristian Fog Nielsen, Lisa Vuong, Maryam Elfeki, Matthew F. Traxler, Niclas Engene, Nobuhiro Koyama, Oliver B. Vining, Ralph S. Baric, Ricardo Pianta Rodrigues da Silva, Samantha J. Mascuch, Sophie Tomasi, Stefan Jenkins, Venkat R. Macherla, Thomas Hoffman, Vinayak Agarwal, Philip G. Williams, Jingqui Dai, Ram P. Neupane, Joshua R. Gurr, Andrés M. C. Rodríguez, Anne Lamsa, Chen Zhang, Kathleen Dorrestein, Brendan M. Duggan, Jehad Almaliti, Pierre-Marie Allard, Prasad Phapale, Louis-Félix Nothias, Theodore Alexandrov, Marc Litaudon, Jean-Luc Wolfender, Jennifer E. Kyle, Thomas O. Metz, Tyler Peryea, Dac-Trung Nguyen, Danielle VanLeer, Paul Shinn, Ajit Jadhav, Rolf Müller, Katrina M. Waters, Wenyuan Shi, Xueting Liu, Lixin Zhang, Rob Knight, Paul R. Jensen, Bernhard O. Palsson, Kit Pogliano, Roger G. Linington, Marcelino Gutiérrez, Norberto Peporine Lopes, William H. Gerwick, Bradley S. Moore, Pieter C. Dorrestein, Nuno Bandeira
TL;DR: In GNPS, crowdsourced curation of freely available community-wide reference MS libraries will underpin improved annotations and data-driven social-networking should facilitate identification of spectra and foster collaborations.
Abstract: The potential of the diverse chemistries present in natural products (NP) for biotechnology and medicine remains untapped because NP databases are not searchable with raw data and the NP community has no way to share data other than in published papers. Although mass spectrometry (MS) techniques are well-suited to high-throughput characterization of NP, there is a pressing need for an infrastructure to enable sharing and curation of data. We present Global Natural Products Social Molecular Networking (GNPS; http://gnps.ucsd.edu), an open-access knowledge base for community-wide organization and sharing of raw, processed or identified tandem mass (MS/MS) spectrometry data. In GNPS, crowdsourced curation of freely available community-wide reference MS libraries will underpin improved annotations. Data-driven social-networking should facilitate identification of spectra and foster collaborations. We also introduce the concept of 'living data' through continuous reanalysis of deposited data.

2,365 citations

Journal ArticleDOI
29 Mar 2021-BMJ
TL;DR: The Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) statement was developed to facilitate transparent and complete reporting of systematic reviews, and has been updated (to PRISMA 2020) to reflect recent advances in systematic review methodology and terminology.
Abstract: The methods and results of systematic reviews should be reported in sufficient detail to allow users to assess the trustworthiness and applicability of the review findings. The Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) statement was developed to facilitate transparent and complete reporting of systematic reviews and has been updated (to PRISMA 2020) to reflect recent advances in systematic review methodology and terminology. Here, we present the explanation and elaboration paper for PRISMA 2020, where we explain why reporting of each item is recommended, present bullet points that detail the reporting recommendations, and present examples from published reviews. We hope that changes to the content and structure of PRISMA 2020 will facilitate uptake of the guideline and lead to more transparent, complete, and accurate reporting of systematic reviews.

2,217 citations