Author
Chris F. Taylor
Other affiliations: Laboratory of Molecular Biology, Wellcome Trust, Natural Environment Research Council
Bio: Chris F. Taylor is an academic researcher at the European Bioinformatics Institute. The author has contributed to research on topics including the Proteomics Standards Initiative and ontology (information science). The author has an h-index of 37 and has co-authored 73 publications receiving 9535 citations. Previous affiliations of Chris F. Taylor include the Laboratory of Molecular Biology and the Wellcome Trust.
Papers
30 Oct 2006
TL;DR: The review examines the evidence on the economic impacts of climate change itself and explores the economics of stabilizing greenhouse gases in the atmosphere, concluding that the benefits of strong, early action on climate change considerably outweigh the costs.
Abstract: The Review's executive summary states that "the Review first examines the evidence on the economic impacts of climate change itself, and explores the economics of stabilizing greenhouse gases in the atmosphere. The second half of the Review considers the complex policy challenges involved in managing the transition to a low-carbon economy and in ensuring that societies can adapt to the consequences of climate change that can no longer be avoided".
The report's main conclusion is that the benefits of strong, early action on climate change considerably outweigh the costs.
1,472 citations
Michigan State University, J. Craig Venter Institute, National Institutes of Health, Wellcome Trust Sanger Institute, Plymouth Marine Laboratory, University of Maryland, Baltimore, University of Cambridge, University of York, United States Department of Energy, Ghent University, Pennsylvania State University, Argonne National Laboratory, University of California, San Diego, Jacobs University Bremen, University of Colorado Boulder, National Science Foundation, Edinburgh Napier University, Boston Children's Hospital, University of Georgia, University of California, Berkeley, Newcastle University, Lawrence Berkeley National Laboratory, University of California, Irvine, University of Oxford, Howard University, Abertay University, University of Manchester, Technical University of Denmark, University of Wyoming, University of Pennsylvania, University of New Mexico
TL;DR: Here, the minimum information about a genome sequence (MIGS) specification is introduced with the intent of promoting participation in its development and discussing the resources that will be required to develop improved mechanisms of metadata capture and exchange.
Abstract: With the quantity of genomic data increasing at an exponential rate, it is imperative that these data be captured electronically, in a standard format. Standardization activities must proceed within the auspices of open-access and international working bodies. To tackle the issues surrounding the development of better descriptions of genomic investigations, we have formed the Genomic Standards Consortium (GSC). Here, we introduce the minimum information about a genome sequence (MIGS) specification with the intent of promoting participation in its development and discussing the resources that will be required to develop improved mechanisms of metadata capture and exchange. As part of its wider goals, the GSC also supports improving the 'transparency' of the information contained in existing genomic databases.
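A minimum-information checklist like MIGS lends itself to programmatic validation. The sketch below, a hypothetical illustration using a simplified field subset (the real MIGS/MIxS checklist is considerably larger), checks a metadata record for missing required fields:

```python
# Sketch of checklist-style validation in the spirit of MIGS/MIxS.
# The field names below are a simplified, illustrative subset, not
# the authoritative checklist.
REQUIRED_FIELDS = {
    "investigation_type",   # e.g. "bacteria_archaea"
    "project_name",
    "collection_date",
    "geo_loc_name",
    "lat_lon",
    "env_biome",
}

def missing_fields(record: dict) -> set:
    """Return the required checklist fields absent from a metadata record."""
    return REQUIRED_FIELDS - record.keys()

record = {
    "investigation_type": "bacteria_archaea",
    "project_name": "Example soil metagenome",
    "collection_date": "2006-10-30",
    "geo_loc_name": "USA: Michigan",
}
print(sorted(missing_fields(record)))  # ['env_biome', 'lat_lon']
```

A submission pipeline could reject or flag records until this set is empty, which is one way standard formats make metadata capture enforceable rather than advisory.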
1,097 citations
TL;DR: The 'mzXML' format is introduced, an open, generic XML (extensible markup language) representation of MS data that will facilitate data management, interpretation and dissemination in proteomics research.
Abstract: A broad range of mass spectrometers are used in mass spectrometry (MS)-based proteomics research. Each type of instrument possesses a unique design, data system and performance specifications, resulting in strengths and weaknesses for different types of experiments. Unfortunately, the native binary data formats produced by each type of mass spectrometer also differ and are usually proprietary. The diverse, nontransparent nature of the data structure complicates the integration of new instruments into preexisting infrastructure, impedes the analysis, exchange, comparison and publication of results from different experiments and laboratories, and prevents the bioinformatics community from accessing data sets required for software development. Here, we introduce the 'mzXML' format, an open, generic XML (extensible markup language) representation of MS data. We have also developed an accompanying suite of supporting programs. We expect that this format will facilitate data management, interpretation and dissemination in proteomics research.
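Because mzXML is plain XML, standard tooling suffices to read it. The following sketch builds a toy mzXML-like scan (peak lists in mzXML are base64-encoded, network-byte-order 32-bit floats, interleaved m/z and intensity) and decodes it; the snippet is illustrative, not a complete or schema-valid mzXML document:

```python
import base64
import struct
import xml.etree.ElementTree as ET

# Two (m/z, intensity) pairs packed as big-endian 32-bit floats,
# as mzXML prescribes for the <peaks> element.
peaks = [100.0, 250.0, 200.0, 500.0]            # m/z, int, m/z, int
b64 = base64.b64encode(struct.pack(">4f", *peaks)).decode()
doc = f"""<msRun>
  <scan num="1" msLevel="2" peaksCount="2">
    <peaks precision="32">{b64}</peaks>
  </scan>
</msRun>"""

root = ET.fromstring(doc)
for scan in root.iter("scan"):
    raw = base64.b64decode(scan.find("peaks").text)
    n = len(raw) // 4                            # 4 bytes per float32
    values = struct.unpack(f">{n}f", raw)
    pairs = list(zip(values[0::2], values[1::2]))  # (m/z, intensity)
    print(scan.get("num"), scan.get("msLevel"), pairs)
```

This openness is the point of the format: any language with XML and base64 support can consume the data, with no vendor library in the loop.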
788 citations
Wellcome Trust, European Bioinformatics Institute, University of Manchester, University of Cambridge, Swiss Institute of Bioinformatics, École Polytechnique Fédérale de Lausanne, Institute for Systems Biology, University College Dublin, Utrecht University, University of Vienna, Max Planck Society, New York University, Amgen, University of California, Los Angeles, Applied Biosystems, Semel Institute for Neuroscience and Human Behavior, Flanders Institute for Biotechnology, University of New South Wales, Scripps Research Institute
TL;DR: The processes and principles underpinning the development of guidance modules for reporting the use of techniques such as gel electrophoresis and mass spectrometry are described and the ramifications for various interest groups such as experimentalists, funders, publishers and the private sector are discussed.
Abstract: Both the generation and the analysis of proteomics data are now widespread, and high-throughput approaches are commonplace. Protocols continue to increase in complexity as methods and technologies evolve and diversify. To encourage the standardized collection, integration, storage and dissemination of proteomics data, the Human Proteome Organization's Proteomics Standards Initiative develops guidance modules for reporting the use of techniques such as gel electrophoresis and mass spectrometry. This paper describes the processes and principles underpinning the development of these modules; discusses the ramifications for various interest groups such as experimentalists, funders, publishers and the private sector; addresses the issue of overlap with other reporting guidelines; and highlights the criticality of appropriate tools and resources in enabling 'MIAPE-compliant' reporting.
703 citations
TL;DR: The proteomics identifications (PRIDE) database is proposed as a means to finally turn publicly available data into publicly accessible data and offers a web‐based query interface, a user‐friendly data upload facility, and a documented application programming interface for direct computational access.
Abstract: The advent of high-throughput proteomics has enabled the identification of ever increasing numbers of proteins. Correspondingly, the number of publications centered on these protein identifications has increased dramatically. With the first results of the HUPO Plasma Proteome Project being analyzed and many other large-scale proteomics projects about to disseminate their data, this trend is not likely to flatten out any time soon. However, the publication mechanism of these identified proteins has lagged behind in technical terms. Often very long lists of identifications are either published directly with the article, resulting in both a voluminous and rather tedious read, or are included on the publisher's website as supplementary information. In either case, these lists are typically only provided as portable document format documents with a custom-made layout, making it practically impossible for computer programs to interpret them, let alone efficiently query them. Here we propose the proteomics identifications (PRIDE) database (http://www.ebi.ac.uk/pride) as a means to finally turn publicly available data into publicly accessible data. PRIDE offers a web-based query interface, a user-friendly data upload facility, and a documented application programming interface for direct computational access. The complete PRIDE database, source code, data, and support tools are freely available for web access or download and local installation.
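The programmatic access the abstract describes might be exercised as follows. The base URL matches the PRIDE address given in the abstract, but the endpoint path and parameter names here are assumptions for illustration, not the documented API:

```python
from urllib.parse import urlencode, urljoin

# Hypothetical sketch of composing a query against a REST-style
# interface such as PRIDE's. The endpoint name ("projects") and
# parameters ("keyword", "pageSize") are illustrative assumptions.
BASE = "https://www.ebi.ac.uk/pride/ws/archive/"

def project_search_url(keyword: str, page_size: int = 10) -> str:
    """Compose a search URL; fetch it with urllib.request in real use."""
    query = urlencode({"keyword": keyword, "pageSize": page_size})
    return urljoin(BASE, "projects") + "?" + query

print(project_search_url("plasma proteome"))
```

The contrast the paper draws is exactly this: a PDF supplement cannot be queried at all, whereas a documented web API turns the same identifications into data a script can retrieve and filter.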
566 citations
Cited by
TL;DR: The Minimum Information for Publication of Quantitative Real-Time PCR Experiments (MIQE) guidelines target the reliability of results to help ensure the integrity of the scientific literature, promote consistency between laboratories, and increase experimental transparency.
Abstract: Background: Currently, a lack of consensus exists on how best to perform and interpret quantitative real-time PCR (qPCR) experiments. The problem is exacerbated by a lack of sufficient experimental detail in many publications, which impedes a reader’s ability to evaluate critically the quality of the results presented or to repeat the experiments.
Content: The Minimum Information for Publication of Quantitative Real-Time PCR Experiments (MIQE) guidelines target the reliability of results to help ensure the integrity of the scientific literature, promote consistency between laboratories, and increase experimental transparency. MIQE is a set of guidelines that describe the minimum information necessary for evaluating qPCR experiments. Included is a checklist to accompany the initial submission of a manuscript to the publisher. By providing all relevant experimental conditions and assay characteristics, reviewers can assess the validity of the protocols used. Full disclosure of all reagents, sequences, and analysis methods is necessary to enable other investigators to reproduce results. MIQE details should be published either in abbreviated form or as an online supplement.
Summary: Following these guidelines will encourage better experimental practice, allowing more reliable and unequivocal interpretation of qPCR results.
12,469 citations
Technical University of Madrid, Stanford University, Elsevier, VU University Amsterdam, National Institutes of Health, University of Leicester, Harvard University, Beijing Genomics Institute, Maastricht University, Wageningen University and Research Centre, University of Oxford, Heriot-Watt University, University of Manchester, University of California, San Diego, Leiden University Medical Center, Leiden University, Federal University of São Paulo, Science for Life Laboratory, Bayer, Swiss Institute of Bioinformatics, Cray, University Medical Center Groningen, Erasmus University Rotterdam
TL;DR: The FAIR Data Principles are a set of data-reuse principles that focus on enhancing the ability of machines to automatically find and use data, in addition to supporting its reuse by individuals.
Abstract: There is an urgent need to improve the infrastructure supporting the reuse of scholarly data. A diverse set of stakeholders—representing academia, industry, funding agencies, and scholarly publishers—have come together to design and jointly endorse a concise and measurable set of principles that we refer to as the FAIR Data Principles. The intent is that these may act as a guideline for those wishing to enhance the reusability of their data holdings. Distinct from peer initiatives that focus on the human scholar, the FAIR Principles put specific emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals. This Comment is the first formal publication of the FAIR Principles, and includes the rationale behind them, and some exemplar implementations in the community.
7,602 citations
TL;DR: Primer3’s current capabilities are described, including more accurate thermodynamic models in the primer design process, both to improve melting temperature prediction and to reduce the likelihood that primers will form hairpins or dimers.
Abstract: Polymerase chain reaction (PCR) is a basic molecular biology technique with a multiplicity of uses, including deoxyribonucleic acid cloning and sequencing, functional analysis of genes, diagnosis of diseases, genotyping and discovery of genetic variants. Reliable primer design is crucial for successful PCR, and for over a decade, the open-source Primer3 software has been widely used for primer design, often in high-throughput genomics applications. It has also been incorporated into numerous publicly available software packages and web services. During this period, we have greatly expanded Primer3’s functionality. In this article, we describe Primer3’s current capabilities, emphasizing recent improvements. The most notable enhancements incorporate more accurate thermodynamic models in the primer design process, both to improve melting temperature prediction and to reduce the likelihood that primers will form hairpins or dimers. Additional enhancements include more precise control of primer placement—a change motivated partly by opportunities to use whole-genome sequences to improve primer specificity. We also added features to increase ease of use, including the ability to save and re-use parameter settings and the ability to require that individual primers not be used in more than one primer pair. We have made the core code more modular and provided cleaner programming interfaces to further ease integration with other software. These improvements position Primer3 for continued use with genome-scale data in the decade ahead.
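To make the role of melting-temperature prediction concrete, here is a toy estimate using the simple Wallace rule. This is deliberately cruder than what the abstract describes: Primer3 uses far more accurate nearest-neighbor thermodynamic models, while this formula only shows why base composition drives Tm:

```python
# Illustrative melting-temperature estimate via the Wallace rule,
# Tm = 2*(A+T) + 4*(G+C), valid only as a rough guide for short
# oligos. Primer3 itself uses nearest-neighbor thermodynamics.
def wallace_tm(primer: str) -> int:
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc

print(wallace_tm("ATGCATGCATGCATGCATGC"))  # 20-mer, 50% GC -> 60
```

Nearest-neighbor models improve on this by scoring adjacent base pairs (stacking interactions) rather than treating each base independently, which is what allows Primer3 to also assess hairpin and dimer propensity.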
7,286 citations
TL;DR: An objective measure of genome quality is proposed that can be used to select genomes suitable for specific gene- and genome-centric analyses of microbial communities and is shown to provide accurate estimates of genome completeness and contamination and to outperform existing approaches.
Abstract: Large-scale recovery of genomes from isolates, single cells, and metagenomic data has been made possible by advances in computational methods and substantial reductions in sequencing costs. Although this increasing breadth of draft genomes is providing key information regarding the evolutionary and functional diversity of microbial life, it has become impractical to finish all available reference genomes. Making robust biological inferences from draft genomes requires accurate estimates of their completeness and contamination. Current methods for assessing genome quality are ad hoc and generally make use of a limited number of “marker” genes conserved across all bacterial or archaeal genomes. Here we introduce CheckM, an automated method for assessing the quality of a genome using a broader set of marker genes specific to the position of a genome within a reference genome tree and information about the collocation of these genes. We demonstrate the effectiveness of CheckM using synthetic data and a wide range of isolate-, single-cell-, and metagenome-derived genomes. CheckM is shown to provide accurate estimates of genome completeness and contamination and to outperform existing approaches. Using CheckM, we identify a diverse range of errors currently impacting publicly available isolate genomes and demonstrate that genomes obtained from single cells and metagenomic data vary substantially in quality. In order to facilitate the use of draft genomes, we propose an objective measure of genome quality that can be used to select genomes suitable for specific gene- and genome-centric analyses of microbial communities.
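The intuition behind marker-gene quality estimates can be sketched in a few lines. The toy formulas below are an illustration in the spirit of the abstract, not CheckM's actual algorithm, which uses lineage-specific, collocated marker sets rather than a flat gene list:

```python
# Toy marker-gene quality estimate: completeness is the fraction of
# expected single-copy marker genes observed at least once, and
# contamination counts extra copies of markers that should be unique.
# Simplified illustration only; CheckM's real method is more involved.
def genome_quality(marker_counts: dict) -> tuple:
    """marker_counts maps each expected marker gene to its copy number."""
    n = len(marker_counts)
    present = sum(1 for c in marker_counts.values() if c >= 1)
    extra = sum(c - 1 for c in marker_counts.values() if c > 1)
    completeness = 100.0 * present / n
    contamination = 100.0 * extra / n
    return completeness, contamination

counts = {"rpoB": 1, "gyrA": 1, "recA": 2, "dnaK": 0}
print(genome_quality(counts))  # (75.0, 25.0)
```

A missing marker (dnaK) lowers completeness, while a duplicated one (recA) suggests sequence from more than one organism, raising the contamination estimate.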
5,788 citations
TL;DR: Key statistics on the current data contents and volume of downloads are outlined, and how PRIDE data are starting to be disseminated to added-value resources including Ensembl, UniProt and Expression Atlas are outlined.
Abstract: The PRoteomics IDEntifications (PRIDE) database (https://www.ebi.ac.uk/pride/) is the world's largest data repository of mass spectrometry-based proteomics data, and is one of the founding members of the global ProteomeXchange (PX) consortium. In this manuscript, we summarize the developments in PRIDE resources and related tools since the previous update manuscript was published in Nucleic Acids Research in 2016. In the last 3 years, public data sharing through PRIDE (as part of PX) has definitely become the norm in the field. In parallel, data re-use of public proteomics data has increased enormously, with multiple applications. We first describe the new architecture of PRIDE Archive, the archival component of PRIDE. PRIDE Archive and the related data submission framework have been further developed to support the increase in submitted data volumes and additional data types. A new scalable and fault-tolerant storage backend, Application Programming Interface and web interface have been implemented, as part of an ongoing process. Additionally, we emphasize the improved support for quantitative proteomics data through the mzTab format. Finally, we outline key statistics on the current data contents and volume of downloads, and how PRIDE data are starting to be disseminated to added-value resources including Ensembl, UniProt and Expression Atlas.
5,735 citations