Author

Angel Pizarro

Other affiliations: Amazon.com, Genomics Institute of the Novartis Research Foundation

Bio: Angel Pizarro is an academic researcher from the University of Pennsylvania. The author has contributed to research on the topics of the Proteomics Standards Initiative and mass spectrometry data formats. The author has an h-index of 21 and has co-authored 39 publications receiving 2997 citations. Previous affiliations of Angel Pizarro include Amazon.com and the Genomics Institute of the Novartis Research Foundation.

Papers

Journal Article

mzML - a Community Standard for Mass Spectrometry Data

Lennart Martens, Matthew C. Chambers¹, Marc Sturm², Darren Kessner³, Fredrik Levander⁴, Jim Shofstahl⁵, Wilfred H. Tang⁶, Andreas Römpp⁷, Steffen Neumann⁸, Angel Pizarro⁹, Luisa Montecchi-Palazzi¹⁰, Natalie Tasman, Michael K. Coleman¹¹, Florian Reisinger¹⁰, Puneet Souda, Henning Hermjakob¹⁰, Pierre-Alain Binz¹², Eric W. Deutsch¹³

Vanderbilt University¹, University of Tübingen², University of Southern California³, Lund University⁴, Thermo Fisher Scientific⁵, Agilent Technologies⁶, University of Giessen⁷, Leibniz Association⁸, University of Pennsylvania⁹, European Bioinformatics Institute¹⁰, Stowers Institute for Medical Research¹¹, Swiss Institute of Bioinformatics¹², Institute for Systems Biology¹³

01 Jan 2011-Molecular & Cellular Proteomics

TL;DR: The resulting standard data format, mzML, is a well-tested open-source format for mass spectrometer output files that can be readily utilized by the community and easily adapted for incremental advances in mass spectrometry technology.

627 citations

Journal Article

Design and implementation of microarray gene expression markup language (MAGE-ML)

Paul T. Spellman¹, Michael W. Miller, Jason E. Stewart, Charles Troup², Ugis Sarkans³, Steve Chervitz⁴, Derek Bernhart⁴, Gavin Sherlock⁵, Catherine A. Ball⁵, Marc Lepage, Marcin Swiatek, WL Marks, Jason Goncalves, Scott Markel, Daniel Iordan, Mohammadreza Shojatalab³, Angel Pizarro⁶, Joseph White⁷, Robert Hubley⁸, Eric W. Deutsch⁸, Martin Senger⁹, Bruce J. Aronow⁹, Alan J. Robinson³, Doug Bassett, Christian J. Stoeckert⁶, Alvis Brazma³

University of California, Berkeley¹, Agilent Technologies², European Bioinformatics Institute³, Affymetrix⁴, Stanford University⁵, University of Pennsylvania⁶, Research Medical Center⁷, Institute for Systems Biology⁸, University of Cincinnati Academic Health Center⁹

23 Aug 2002-Genome Biology

TL;DR: MAGE will help microarray data producers and users to exchange information by providing a common platform for data exchange, and MAGE-STK will make the adoption of MAGE easier.

Abstract: Background: Meaningful exchange of microarray data is currently difficult because it is rare that published data provide sufficient information depth or are even in the same format from one publication to another. Only when data can be easily exchanged will the entire biological community be able to derive the full benefit from such microarray studies.

474 citations

Journal Article

Comparative analysis of RNA-Seq alignment algorithms and the RNA-Seq unified mapper (RUM)

Gregory R. Grant¹, Michael H. Farkas¹, Angel Pizarro¹, Nicholas F. Lahens¹, Jonathan Schug¹, Brian P. Brunk¹, Christian J. Stoeckert¹, John B. Hogenesch¹, Eric A. Pierce¹

University of Pennsylvania¹

01 Sep 2011-Bioinformatics

TL;DR: An RNA-Seq simulator is developed that models the main impediments to RNA alignment, including alternative splicing, insertions, deletions, substitutions, sequencing errors and intron signal, and a pipeline based on BLAT is developed to explore the performance of established tools for this problem, and to compare it to the recently developed methods.

Abstract: Motivation: A critical task in high-throughput sequencing is aligning millions of short reads to a reference genome. Alignment is especially complicated for RNA sequencing (RNA-Seq) because of RNA splicing. A number of RNA-Seq algorithms are available, and claim to align reads with high accuracy and efficiency while detecting splice junctions. RNA-Seq data are discrete in nature; therefore, with reasonable gene models and comparative metrics RNA-Seq data can be simulated to sufficient accuracy to enable meaningful benchmarking of alignment algorithms. The exercise to rigorously compare all viable published RNA-Seq algorithms has not been performed previously. Results: We developed an RNA-Seq simulator that models the main impediments to RNA alignment, including alternative splicing, insertions, deletions, substitutions, sequencing errors and intron signal. We used this simulator to measure the accuracy and robustness of available algorithms at the base and junction levels. Additionally, we used reverse transcription–polymerase chain reaction (RT–PCR) and Sanger sequencing to validate the ability of the algorithms to detect novel transcript features such as novel exons and alternative splicing in RNA-Seq data from mouse retina. A pipeline based on BLAT was developed to explore the performance of established tools for this problem, and to compare it to the recently developed methods. This pipeline, the RNA-Seq Unified Mapper (RUM), performs comparably to the best current aligners and provides an advantageous combination of accuracy, speed and usability. Availability: The RUM pipeline is distributed via the Amazon Cloud and for computing clusters using the Sun Grid Engine (http://cbil.upenn.edu/RUM). Contact: ggrant@pcbi.upenn.edu; epierce@mail.med.upenn.edu. Supplementary Information: The RNA-Seq sequence reads described in the article are deposited at GEO, accession GSE26248.

353 citations

Journal Article

CircaDB: a database of mammalian circadian gene expression profiles

Angel Pizarro¹, Katharina E. Hayer¹, Nicholas F. Lahens¹, John B. Hogenesch¹

University of Pennsylvania¹

24 Nov 2012-Nucleic Acids Research

TL;DR: A new database of circadian transcriptional profiles from time course expression experiments from mice and humans, where each transcript’s expression was evaluated by three separate algorithms, JTK_Cycle, Lomb Scargle and DeLichtenberg.

Abstract: CircaDB (http://circadb.org) is a new database of circadian transcriptional profiles from time course expression experiments from mice and humans. Each transcript’s expression was evaluated by three separate algorithms, JTK_Cycle, Lomb Scargle and DeLichtenberg. Users can query the gene annotations using simple and powerful full text search terms, restrict results to specific data sets and provide probability thresholds for each algorithm. Visualizations of the data are intuitive charts that convey profile information more effectively than a table of probabilities. The CircaDB web application is open source and available at http://github.com/itmat/circadb.

276 citations

Journal Article

The mzIdentML Data Standard for Mass Spectrometry-Based Proteomics Results

Andrew R. Jones¹, Martin Eisenacher², Gerhard Mayer², Oliver Kohlbacher³, Jennifer A. Siepen⁴, Simon J. Hubbard⁴, Julian N. Selley⁴, Brian C. Searle, James Shofstahl⁵, Sean L. Seymour, Randall K. Julian, Pierre-Alain Binz⁶, Eric W. Deutsch⁷, Henning Hermjakob⁸, Florian Reisinger⁸, Johannes Griss⁸, Juan Antonio Vizcaíno⁸, Matthew C. Chambers⁹, Angel Pizarro¹⁰, David M. Creasy

University of Liverpool¹, Ruhr University Bochum², University of Tübingen³, University of Manchester⁴, Thermo Fisher Scientific⁵, Swiss Institute of Bioinformatics⁶, Institute for Systems Biology⁷, Wellcome Trust⁸, Vanderbilt University⁹, University of Pennsylvania¹⁰

01 Jul 2012-Molecular & Cellular Proteomics

TL;DR: The release of mzIdentML enables proteomics scientists to start working with the standard for exchanging and publishing data sets in support of publications, and provides a stable platform for bioinformatics groups and commercial software vendors to work with a single file format for identification data.

188 citations

Cited by

Journal Article

STAR: ultrafast universal RNA-seq aligner

Alexander Dobin¹, Carrie A. Davis¹, Felix Schlesinger¹, Jorg Drenkow¹, Chris Zaleski¹, Sonali Jha¹, Philippe Batut¹, Mark Chaisson¹, Thomas R. Gingeras¹

Cold Spring Harbor Laboratory¹

01 Jan 2013-Bioinformatics

TL;DR: The Spliced Transcripts Alignment to a Reference (STAR) software based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by seed clustering and stitching procedure outperforms other aligners by a factor of >50 in mapping speed.

Abstract: Motivation: Accurate alignment of high-throughput RNA-seq data is a challenging and yet unsolved problem because of the non-contiguous transcript structure, relatively short read lengths and constantly increasing throughput of the sequencing technologies. Currently available RNA-seq aligners suffer from high mapping error rates, low mapping speed, read length limitation and mapping biases. Results: To align our large (>80 billion reads) ENCODE Transcriptome RNA-seq dataset, we developed the Spliced Transcripts Alignment to a Reference (STAR) software based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by seed clustering and stitching procedure. STAR outperforms other aligners by a factor of >50 in mapping speed, aligning to the human genome 550 million 2 × 76 bp paired-end reads per hour on a modest 12-core server, while at the same time improving alignment sensitivity and precision. In addition to unbiased de novo detection of canonical junctions, STAR can discover non-canonical splices and chimeric (fusion) transcripts, and is also capable of mapping full-length RNA sequences. Using Roche 454 sequencing of reverse transcription polymerase chain reaction amplicons, we experimentally validated 1960 novel intergenic splice junctions with an 80-90% success rate, corroborating the high precision of the STAR mapping strategy. Availability and implementation: STAR is implemented as a standalone C++ code. STAR is free open source software distributed under GPLv3 license and can be downloaded from http://code.google.com/p/rna-star/.

30,684 citations

Journal Article

HISAT: a fast spliced aligner with low memory requirements

Daehwan Kim¹, Ben Langmead¹, Steven L. Salzberg¹

Johns Hopkins University School of Medicine¹

01 Apr 2015-Nature Methods

TL;DR: Tests showed that HISAT is the fastest system currently available, with equal or better accuracy than any other method, and requires only 4.3 gigabytes of memory.

Abstract: HISAT (hierarchical indexing for spliced alignment of transcripts) is a highly efficient system for aligning reads from RNA sequencing experiments. HISAT uses an indexing scheme based on the Burrows-Wheeler transform and the Ferragina-Manzini (FM) index, employing two types of indexes for alignment: a whole-genome FM index to anchor each alignment and numerous local FM indexes for very rapid extensions of these alignments. HISAT's hierarchical index for the human genome contains 48,000 local FM indexes, each representing a genomic region of ∼64,000 bp. Tests on real and simulated data sets showed that HISAT is the fastest system currently available, with equal or better accuracy than any other method. Despite its large number of indexes, HISAT requires only 4.3 gigabytes of memory. HISAT supports genomes of any size, including those larger than 4 billion bases.

13,192 citations

Journal Article

TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions

Daehwan Kim¹,², Geo Pertea³, Cole Trapnell⁴,⁵, Harold Pimentel⁶, Ryan Kelley⁷, Steven L. Salzberg¹,³

Johns Hopkins University School of Medicine¹, University of Maryland, College Park², Johns Hopkins University³, Broad Institute⁴, Harvard University⁵, University of California, Berkeley⁶, Illumina⁷

25 Apr 2013-Genome Biology

TL;DR: TopHat2 is described, which incorporates many significant enhancements to TopHat, and combines the ability to identify novel splice sites with direct mapping to known transcripts, producing sensitive and accurate alignments, even for highly repetitive genomes or in the presence of pseudogenes.

Abstract: TopHat is a popular spliced aligner for RNA-sequence (RNA-seq) experiments. In this paper, we describe TopHat2, which incorporates many significant enhancements to TopHat. TopHat2 can align reads of various lengths produced by the latest sequencing technologies, while allowing for variable-length indels with respect to the reference genome. In addition to de novo spliced alignment, TopHat2 can align reads across fusion breaks, which can occur after genomic translocations. TopHat2 combines the ability to identify novel splice sites with direct mapping to known transcripts, producing sensitive and accurate alignments, even for highly repetitive genomes or in the presence of pseudogenes. TopHat2 is available at http://ccb.jhu.edu/software/tophat.

11,380 citations

Introducing the “Bioinformatics” special issue

장병탁, 김삼묘, 허철구

01 Aug 2000

TL;DR: Assessment of medical technology in the context of commercialization with Bioentrepreneur course, which addresses many issues unique to biomedical products.

Abstract: BIOE 402. Medical Technology Assessment. 2 or 3 hours. Bioentrepreneur course. Assessment of medical technology in the context of commercialization. Objectives, competition, market share, funding, pricing, manufacturing, growth, and intellectual property; many issues unique to biomedical products. Course Information: 2 undergraduate hours. 3 graduate hours. Prerequisite(s): Junior standing or above and consent of the instructor.

4,833 citations

Journal Article

2016 update of the PRIDE database and its related tools

Juan Antonio Vizcaíno¹, Attila Csordas¹, Noemi del-Toro¹, José A. Dianes¹, Johannes Griss², Ilias Lavidas¹, Gerhard Mayer¹, Yasset Perez-Riverol¹, Florian Reisinger¹, Tobias Ternent¹, Qing Wei Xu¹, Rui Wang¹, Henning Hermjakob¹

European Bioinformatics Institute¹, Medical University of Vienna²

04 Jan 2016-Nucleic Acids Research

TL;DR: The developments in PRIDE resources and related tools are summarized, and a brief update is given on the resources under development, 'PRIDE Cluster' and 'PRIDE Proteomes', which provide a complementary view and quality-scored information of the peptide and protein identification data available in PRIDE Archive.

Abstract: The PRoteomics IDEntifications (PRIDE) database is one of the world-leading data repositories of mass spectrometry (MS)-based proteomics data. Since the beginning of 2014, PRIDE Archive (http://www.ebi.ac.uk/pride/archive/) is the new PRIDE archival system, replacing the original PRIDE database. Here we summarize the developments in PRIDE resources and related tools since the previous update manuscript in the Database Issue in 2013. PRIDE Archive constitutes a complete redevelopment of the original PRIDE, comprising a new storage backend, data submission system and web interface, among other components. PRIDE Archive supports the most-widely used PSI (Proteomics Standards Initiative) data standard formats (mzML and mzIdentML) and implements the data requirements and guidelines of the ProteomeXchange Consortium. The wide adoption of ProteomeXchange within the community has triggered an unprecedented increase in the number of submitted data sets (around 150 data sets per month). We outline some statistics on the current PRIDE Archive data contents. We also report on the status of the PRIDE related stand-alone tools: PRIDE Inspector, PRIDE Converter 2 and the ProteomeXchange submission tool. Finally, we will give a brief update on the resources under development 'PRIDE Cluster' and 'PRIDE Proteomes', which provide a complementary view and quality-scored information of the peptide and protein identification data available in PRIDE Archive.

3,375 citations
