Author

Arun H. Patil

Bio: Arun H. Patil is an academic researcher from Yenepoya University. The author has contributed to research on the topics Proteomics and Proteome. The author has an h-index of 16 and has co-authored 42 publications receiving 2,425 citations. Previous affiliations of Arun H. Patil include Johns Hopkins University School of Medicine and Johns Hopkins University.

Papers
Journal Article
29 May 2014-Nature
TL;DR: A draft map of the human proteome is presented using high-resolution Fourier-transform mass spectrometry, revealing a number of novel protein-coding regions, including translated pseudogenes, non-coding RNAs and upstream open reading frames.
Abstract: The availability of human genome sequence has transformed biomedical research over the past decade. However, an equivalent map for the human proteome with direct measurements of proteins and peptides does not exist yet. Here we present a draft map of the human proteome using high-resolution Fourier-transform mass spectrometry. In-depth proteomic profiling of 30 histologically normal human samples, including 17 adult tissues, 7 fetal tissues and 6 purified primary haematopoietic cells, resulted in identification of proteins encoded by 17,294 genes accounting for approximately 84% of the total annotated protein-coding genes in humans. A unique and comprehensive strategy for proteogenomic analysis enabled us to discover a number of novel protein-coding regions, which includes translated pseudogenes, non-coding RNAs and upstream open reading frames. This large human proteome catalogue (available as an interactive web-based resource at http://www.humanproteomemap.org) will complement available human genome and transcriptome data to accelerate biomedical research in health and disease.

1,965 citations

Journal Article
TL;DR: This project evaluated 8 billion small RNA-seq reads and identified both specific and ubiquitous patterns of expression that strongly correlate with adjacent superenhancer activity, establishing the landscape of human cell-specific microRNA expression.
Abstract: MicroRNAs are short RNAs that serve as regulators of gene expression and are essential components of normal development as well as modulators of disease. MicroRNAs generally act cell-autonomously, and thus their localization to specific cell types is needed to guide our understanding of microRNA activity. Current tissue-level data have caused considerable confusion, and comprehensive cell-level data do not yet exist. Here, we establish the landscape of human cell-specific microRNA expression. This project evaluated 8 billion small RNA-seq reads from 46 primary cell types, 42 cancer or immortalized cell lines, and 26 tissues. It identified both specific and ubiquitous patterns of expression that strongly correlate with adjacent superenhancer activity. Analysis of unaligned RNA reads uncovered 207 unknown minor strand (passenger) microRNAs of known microRNA loci and 495 novel putative microRNA loci. Although cancer cell lines generally recapitulated the expression patterns of matched primary cells, their isomiR sequence families exhibited increased disorder, suggesting DROSHA- and DICER1-dependent microRNA processing variability. Cell-specific patterns of microRNA expression were used to de-convolute variable cellular composition of colon and adipose tissue samples, highlighting one use of these cell-specific microRNA expression data. Characterization of cellular microRNA expression across a wide variety of cell types provides a new understanding of this critical regulatory RNA species.

135 citations

Journal Article
TL;DR: This large catalog of vitreous proteins should facilitate biomedical research into pathological conditions of the eye including diabetic retinopathy, retinal detachment and cataract.
Abstract: Background: The vitreous humor is a transparent, gelatinous mass whose main constituent is water. It plays an important role in providing metabolic nutrient requirements of the lens, coordinating eye growth and providing support to the retina. It is in close proximity to the retina and reflects many of the changes occurring in this tissue. The biochemical changes occurring in the vitreous could provide a better understanding of the pathophysiological processes that occur in vitreoretinopathy. In this study, we investigated the proteome of normal human vitreous humor using high resolution Fourier transform mass spectrometry.

118 citations

Journal Article
TL;DR: The data provide a framework for how future genome sequencing efforts should incorporate transcriptomic and proteomic analysis in combination with simultaneous manual curation to achieve near complete assembly and accurate annotation of genomes.
Abstract: Complementing genome sequence with deep transcriptome and proteome data could enable more accurate assembly and annotation of newly sequenced genomes. Here, we provide a proof-of-concept of an integrated approach for analysis of the genome and proteome of Anopheles stephensi, which is one of the most important vectors of the malaria parasite. To achieve broad coverage of genes, we carried out transcriptome sequencing and deep proteome profiling of multiple anatomically distinct sites. Based on transcriptomic data alone, we identified and corrected 535 events of incomplete genome assembly involving 1196 scaffolds and 868 protein-coding gene models. This proteogenomic approach enabled us to add 365 genes that were missed during genome annotation and identify 917 gene correction events through discovery of 151 novel exons, 297 protein extensions, 231 exon extensions, 192 novel protein start sites, 19 novel translational frames, 28 events of joining of exons, and 76 events of joining of adjacent genes as a single gene. Incorporation of proteomic evidence allowed us to change the designation of more than 87 predicted "noncoding RNAs" to conventional mRNAs coded by protein-coding genes. Importantly, extension of the newly corrected genome assemblies and gene models to 15 other newly assembled Anopheline genomes led to the discovery of a large number of apparent discrepancies in assembly and annotation of these genomes. Our data provide a framework for how future genome sequencing efforts should incorporate transcriptomic and proteomic analysis in combination with simultaneous manual curation to achieve near complete assembly and accurate annotation of genomes.

55 citations

Journal Article
TL;DR: The findings significantly expand the understanding of IL-33-mediated signaling events and have the potential to provide novel therapeutic targets pertaining to immune-related diseases such as asthma where dysregulation of IL-33 is observed.
Abstract: Interleukin-33 (IL-33) is a novel member of the IL-1 family of cytokines that plays diverse roles in the regulation of immune responses. IL-33 exerts its effects through a heterodimeric receptor complex resulting in the production and release of proinflammatory cytokines. A detailed understanding of the signaling pathways activated by IL-33 is still unclear. To gain insights into the IL-33-mediated signaling mechanisms, we carried out a SILAC-based global quantitative phosphoproteomic analysis that resulted in the identification of 7191 phosphorylation sites derived from 2746 proteins. We observed alterations in the level of phosphorylation in 1050 sites corresponding to 672 proteins upon IL-33 stimulation. We report, for the first time, phosphorylation of multiple protein kinases, including mitogen-activated protein kinase activated protein kinase 2 (Mapkapk2), receptor (TNFRSF) interacting serine-threonine kinase 1 (Ripk1), and NAD kinase (Nadk) that are induced by IL-33. In addition, we observed IL-33-induced phosphorylation of several protein phosphatases including protein tyrosine phosphatase, nonreceptor-type 12 (Ptpn12), and inositol polyphosphate-5-phosphatase D (Inpp5d), which have not been reported previously. Network analysis revealed an enrichment of actin binding and cytoskeleton reorganization that could be important in macrophage activation induced by IL-33. Our study is the first quantitative analysis of IL-33-regulated phosphoproteome. Our findings significantly expand the understanding of IL-33-mediated signaling events and have the potential to provide novel therapeutic targets pertaining to immune-related diseases such as asthma where dysregulation of IL-33 is observed. All MS data have been deposited in the ProteomeXchange with identifier PXD000984 (http://proteomecentral.proteomexchange.org/dataset/PXD000984).

51 citations


Cited by
Journal Article
23 Jan 2015-Science
TL;DR: This paper presents a map of the human tissue proteome based on an integrated omics approach that combines quantitative transcriptomics at the tissue and organ level with tissue microarray-based immunohistochemistry to achieve spatial localization of proteins down to the single-cell level.
Abstract: Resolving the molecular details of proteome variation in the different tissues and organs of the human body will greatly increase our knowledge of human biology and disease. Here, we present a map of the human tissue proteome based on an integrated omics approach that involves quantitative transcriptomics at the tissue and organ level, combined with tissue microarray-based immunohistochemistry, to achieve spatial localization of proteins down to the single-cell level. Our tissue-based analysis detected more than 90% of the putative protein-coding genes. We used this approach to explore the human secretome, the membrane proteome, the druggable proteome, the cancer proteome, and the metabolic functions in 32 different tissues and organs. All the data are integrated in an interactive Web-based database that allows exploration of individual proteins, as well as navigation of global expression patterns, in all major tissues and organs in the human body.

9,745 citations

Journal Article
TL;DR: A significant update to one of the tools in this domain called Enrichr, a comprehensive resource for curated gene sets and a search engine that accumulates biological knowledge for further biological discoveries is presented.
Abstract: Enrichment analysis is a popular method for analyzing gene sets generated by genome-wide experiments. Here we present a significant update to one of the tools in this domain called Enrichr. Enrichr currently contains a large collection of diverse gene set libraries available for analysis and download. In total, Enrichr currently contains 180,184 annotated gene sets from 102 gene set libraries. New features have been added to Enrichr including the ability to submit fuzzy sets, upload BED files, improved application programming interface and visualization of the results as clustergrams. Overall, Enrichr is a comprehensive resource for curated gene sets and a search engine that accumulates biological knowledge for further biological discoveries. Enrichr is freely available at: http://amp.pharm.mssm.edu/Enrichr.

6,201 citations

Journal Article
TL;DR: The developments in PRIDE resources and related tools are summarized, and a brief update is given on the resources under development, 'PRIDE Cluster' and 'PRIDE Proteomes', which provide a complementary view and quality-scored information of the peptide and protein identification data available in PRIDE Archive.
Abstract: The PRoteomics IDEntifications (PRIDE) database is one of the world-leading data repositories of mass spectrometry (MS)-based proteomics data. Since the beginning of 2014, PRIDE Archive (http://www.ebi.ac.uk/pride/archive/) is the new PRIDE archival system, replacing the original PRIDE database. Here we summarize the developments in PRIDE resources and related tools since the previous update manuscript in the Database Issue in 2013. PRIDE Archive constitutes a complete redevelopment of the original PRIDE, comprising a new storage backend, data submission system and web interface, among other components. PRIDE Archive supports the most widely used PSI (Proteomics Standards Initiative) data standard formats (mzML and mzIdentML) and implements the data requirements and guidelines of the ProteomeXchange Consortium. The wide adoption of ProteomeXchange within the community has triggered an unprecedented increase in the number of submitted data sets (around 150 data sets per month). We outline some statistics on the current PRIDE Archive data contents. We also report on the status of the PRIDE related stand-alone tools: PRIDE Inspector, PRIDE Converter 2 and the ProteomeXchange submission tool. Finally, we will give a brief update on the resources under development 'PRIDE Cluster' and 'PRIDE Proteomes', which provide a complementary view and quality-scored information of the peptide and protein identification data available in PRIDE Archive.

3,375 citations

Journal Article
TL;DR: The lncRNA landscape characterized here may shed light on normal biology and cancer pathogenesis and may be valuable for future biomarker development.
Abstract: Long noncoding RNAs (lncRNAs) are emerging as important regulators of tissue physiology and disease processes including cancer. To delineate genome-wide lncRNA expression, we curated 7,256 RNA sequencing (RNA-seq) libraries from tumors, normal tissues and cell lines comprising over 43 Tb of sequence from 25 independent studies. We applied ab initio assembly methodology to this data set, yielding a consensus human transcriptome of 91,013 expressed genes. Over 68% (58,648) of genes were classified as lncRNAs, of which 79% were previously unannotated. About 1% (597) of the lncRNAs harbored ultraconserved elements, and 7% (3,900) overlapped disease-associated SNPs. To prioritize lineage-specific, disease-associated lncRNA expression, we employed non-parametric differential expression testing and nominated 7,942 lineage- or cancer-associated lncRNA genes. The lncRNA landscape characterized here may shed light on normal biology and cancer pathogenesis and may be valuable for future biomarker development.

2,209 citations

01 Jan 2011
TL;DR: The sheer volume and scope of diverse genome-wide data pose a significant challenge to the development of efficient and intuitive visualization tools able to scale to very large data sets and to flexibly integrate multiple data types, including clinical data.
Abstract: Rapid improvements in sequencing and array-based platforms are resulting in a flood of diverse genome-wide data, including data from exome and whole-genome sequencing, epigenetic surveys, expression profiling of coding and noncoding RNAs, single nucleotide polymorphism (SNP) and copy number profiling, and functional assays. Analysis of these large, diverse data sets holds the promise of a more comprehensive understanding of the genome and its relation to human disease. Experienced and knowledgeable human review is an essential component of this process, complementing computational approaches. This calls for efficient and intuitive visualization tools able to scale to very large data sets and to flexibly integrate multiple data types, including clinical data. However, the sheer volume and scope of data pose a significant challenge to the development of such tools.

2,187 citations