Author
John A. Wrobel
Other affiliations: College of the Holy Cross, Florida State University College of Arts and Sciences
Bio: John A. Wrobel is an academic researcher from University of North Carolina at Chapel Hill. The author has contributed to research in topics: Human genome & Proteomics. The author has an h-index of 10 and has co-authored 18 publications receiving 7205 citations. Previous affiliations of John A. Wrobel include College of the Holy Cross & Florida State University College of Arts and Sciences.
Papers
••
Cold Spring Harbor Laboratory1, University of California, Irvine2, California Institute of Technology3, Florida State University College of Arts and Sciences4, Yale University5, Wellcome Trust Sanger Institute6, Norwegian University of Science and Technology7, Affymetrix8, University of North Carolina at Chapel Hill9, University of Lausanne10, University of Geneva11, Genome Institute of Singapore12, Stanford University13, Pompeu Fabra University14
TL;DR: Evidence that three-quarters of the human genome is capable of being transcribed is reported, as well as observations about the range and levels of expression, localization, processing fates, regulatory regions and modifications of almost all currently annotated and thousands of previously unannotated RNAs that prompt a redefinition of the concept of a gene.
Abstract: Eukaryotic cells make many types of primary and processed RNAs that are found either in specific subcellular compartments or throughout the cells. A complete catalogue of these RNAs is not yet available and their characteristic subcellular localizations are also poorly understood. Because RNA represents the direct output of the genetic information encoded by genomes and a significant proportion of a cell's regulatory capabilities are focused on its synthesis, processing, transport, modification and translation, the generation of such a catalogue is crucial for understanding genome function. Here we report evidence that three-quarters of the human genome is capable of being transcribed, as well as observations about the range and levels of expression, localization, processing fates, regulatory regions and modifications of almost all currently annotated and thousands of previously unannotated RNAs. These observations, taken together, prompt a redefinition of the concept of a gene.
4,450 citations
01 Sep 2012
TL;DR: The Encyclopedia of DNA Elements project provides new insights into the organization and regulation of human genes and the human genome, and is an expansive resource of functional annotations for biomedical research.
2,767 citations
••
University of Washington1, Fred Hutchinson Cancer Research Center2, Broad Institute3, Pacific Northwest National Laboratory4, Johns Hopkins University5, Washington University in St. Louis6, Vanderbilt University7, National Institutes of Health8, University of North Carolina at Chapel Hill9, Eli Lilly and Company10, ETH Zurich11, Mayo Clinic12, National Institute of Standards and Technology13, Quest Diagnostics14, Institute for Systems Biology15, University of Utah16, Pfizer17, Thermo Fisher Scientific18, Cedars-Sinai Medical Center19, Wake Forest University20, Baylor College of Medicine21
TL;DR: The Clinical Proteomic Tumor Analysis Consortium of the National Cancer Institute has collaborated with clinical laboratorians, peptide manufacturers, metrologists, representatives of the pharmaceutical industry, and other professionals to develop a consensus set of recommendations for peptide procurement, characterization, storage, and handling.
Abstract: BACKGROUND: For many years, basic and clinical researchers have taken advantage of the analytical sensitivity and specificity afforded by mass spectrometry in the measurement of proteins. Clinical laboratories are now beginning to deploy these work flows as well. For assays that use proteolysis to generate peptides for protein quantification and characterization, synthetic stable isotope–labeled internal standard peptides are of central importance. No general recommendations are currently available surrounding the use of peptides in protein mass spectrometric assays.
CONTENT: The Clinical Proteomic Tumor Analysis Consortium of the National Cancer Institute has collaborated with clinical laboratorians, peptide manufacturers, metrologists, representatives of the pharmaceutical industry, and other professionals to develop a consensus set of recommendations for peptide procurement, characterization, storage, and handling, as well as approaches to the interpretation of the data generated by mass spectrometric protein assays. Additionally, the importance of carefully characterized reference materials—in particular, peptide standards for the improved concordance of amino acid analysis methods across the industry—is highlighted. The alignment of practices around the use of peptides and the transparency of sample preparation protocols should allow for the harmonization of peptide and protein quantification in research and clinical care.
176 citations
••
Fred Hutchinson Cancer Research Center1, Leidos2, University of Washington3, University of North Carolina at Chapel Hill4, Broad Institute5, Vanderbilt University6, Washington University in St. Louis7, National Institutes of Health8, Johns Hopkins University9, New York University10, Pacific Northwest National Laboratory11
TL;DR: The Clinical Proteomic Tumor Analysis Consortium (CPTAC) of the National Cancer Institute has launched an Assay Portal to serve as a public repository of well-characterized quantitative, MS-based, targeted proteomic assays.
Abstract: To address these issues, the Clinical Proteomic Tumor Analysis Consortium (CPTAC) of the National Cancer Institute (NCI) has launched an Assay Portal (http://assays.cancer.gov) to serve as a public repository of well-characterized quantitative, MS-based, targeted proteomic assays. The purpose of the CPTAC Assay Portal is to facilitate widespread adoption of targeted MS assays by disseminating SOPs, reagents, and assay characterization data for highly characterized assays. A primary aim of the NCI-supported portal is to bring together clinicians or biologists and analytical chemists to answer hypothesis-driven questions using targeted, MS-based assays. Assay content is easily accessed through queries and filters, enabling investigators to find assays to proteins relevant to their areas of interest. Detailed characterization data are available for each assay, enabling researchers to evaluate assay performance prior to launching the assay in their own laboratory.
141 citations
••
TL;DR: This study demonstrates that MS-based proteomics can identify therapeutic targets and highlights the potential of PDX drug response evaluation to annotate MS-based pathway activities.
Abstract: Recent advances in mass spectrometry (MS) have enabled extensive analysis of cancer proteomes. Here, we employed quantitative proteomics to profile protein expression across 24 breast cancer patient-derived xenograft (PDX) models. Integrated proteogenomic analysis shows positive correlation between expression measurements from transcriptomic and proteomic analyses; further, gene expression-based intrinsic subtypes are largely re-capitulated using non-stromal protein markers. Proteogenomic analysis also validates a number of predicted genomic targets in multiple receptor tyrosine kinases. However, several protein/phosphoprotein events such as overexpression of AKT proteins and ARAF, BRAF, HSP90AB1 phosphosites are not readily explainable by genomic analysis, suggesting that druggable translational and/or post-translational regulatory events may be uniquely diagnosed by MS. Drug treatment experiments targeting HER2 and components of the PI3K pathway supported proteogenomic response predictions in seven xenograft models. Our study demonstrates that MS-based proteomics can identify therapeutic targets and highlights the potential of PDX drug response evaluation to annotate MS-based pathway activities.
114 citations
Cited by
••
TL;DR: The Spliced Transcripts Alignment to a Reference (STAR) software, based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by a seed clustering and stitching procedure, outperforms other aligners by a factor of >50 in mapping speed.
Abstract: Motivation: Accurate alignment of high-throughput RNA-seq data is a challenging and yet unsolved problem because of the non-contiguous transcript structure, relatively short read lengths and constantly increasing throughput of the sequencing technologies. Currently available RNA-seq aligners suffer from high mapping error rates, low mapping speed, read length limitation and mapping biases. Results: To align our large (>80 billion reads) ENCODE Transcriptome RNA-seq dataset, we developed the Spliced Transcripts Alignment to a Reference (STAR) software based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by seed clustering and stitching procedure. STAR outperforms other aligners by a factor of >50 in mapping speed, aligning to the human genome 550 million 2 × 76 bp paired-end reads per hour on a modest 12-core server, while at the same time improving alignment sensitivity and precision. In addition to unbiased de novo detection of canonical junctions, STAR can discover non-canonical splices and chimeric (fusion) transcripts, and is also capable of mapping full-length RNA sequences. Using Roche 454 sequencing of reverse transcription polymerase chain reaction amplicons, we experimentally validated 1960 novel intergenic splice junctions with an 80-90% success rate, corroborating the high precision of the STAR mapping strategy. Availability and implementation: STAR is implemented as a standalone C++ code. STAR is free open source software distributed under GPLv3 license and can be downloaded from http://code.google.com/p/rna-star/.
30,684 citations
••
TL;DR: The Encyclopedia of DNA Elements project provides new insights into the organization and regulation of human genes and the human genome, and is an expansive resource of functional annotations for biomedical research.
Abstract: The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall, the project provides new insights into the organization and regulation of our genes and genome, and is an expansive resource of functional annotations for biomedical research.
13,548 citations
••
TL;DR: This paper presents a map of the human tissue proteome based on an integrated omics approach that combines quantitative transcriptomics at the tissue and organ level with tissue microarray-based immunohistochemistry to achieve spatial localization of proteins down to the single-cell level.
Abstract: Resolving the molecular details of proteome variation in the different tissues and organs of the human body will greatly increase our knowledge of human biology and disease. Here, we present a map of the human tissue proteome based on an integrated omics approach that involves quantitative transcriptomics at the tissue and organ level, combined with tissue microarray-based immunohistochemistry, to achieve spatial localization of proteins down to the single-cell level. Our tissue-based analysis detected more than 90% of the putative protein-coding genes. We used this approach to explore the human secretome, the membrane proteome, the druggable proteome, the cancer proteome, and the metabolic functions in 32 different tissues and organs. All the data are integrated in an interactive Web-based database that allows exploration of individual proteins, as well as navigation of global expression patterns, in all major tissues and organs in the human body.
9,745 citations
•
TL;DR: The Encyclopedia of DNA Elements project provides new insights into the organization and regulation of human genes and the human genome, and is an expansive resource of functional annotations for biomedical research.
Abstract: The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall, the project provides new insights into the organization and regulation of our genes and genome, and is an expansive resource of functional annotations for biomedical research.
8,106 citations
••
TL;DR: The Gene Expression Omnibus is an international public repository for high-throughput microarray and next-generation sequence functional genomic data sets submitted by the research community and supports archiving of raw data, processed data and metadata which are indexed, cross-linked and searchable.
Abstract: The Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo/) is an international public repository for high-throughput microarray and next-generation sequence functional genomic data sets submitted by the research community. The resource supports archiving of raw data, processed data and metadata which are indexed, cross-linked and searchable. All data are freely available for download in a variety of formats. GEO also provides several web-based tools and strategies to assist users to query, analyse and visualize data. This article reports current status and recent database developments, including the release of GEO2R, an R-based web application that helps users analyse GEO data.
6,683 citations