José M. Peregrín-Alvarez

Other affiliations: Hospital for Sick Children, University of Pittsburgh, Lawrence Berkeley National Laboratory

Bio: José M. Peregrín-Alvarez is an academic researcher from the University of Toronto. The author has contributed to research on the topics Genome and Conserved sequence, has an h-index of 11, and has co-authored 11 publications that have received 4,917 citations. Previous affiliations of José M. Peregrín-Alvarez include the Hospital for Sick Children and the University of Pittsburgh.
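
For readers unfamiliar with the metric, the h-index is the largest number h such that at least h of an author's papers have h or more citations each. The short sketch below illustrates that definition only; the function name `h_index` is hypothetical, and the input is just the five citation counts shown on this page rather than the author's full record, so the result is not the profile's reported value.

```python
def h_index(citations):
    """Return the largest h such that at least h papers have >= h citations."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Citation counts of the five papers listed on this page (a partial record).
listed = [2975, 1175, 583, 96, 71]
print(h_index(listed))  # prints 5
```

With only five papers in the input the value cannot exceed 5, which is why a partial listing understates the h-index computed over all publications.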

Papers

Journal Article•DOI•

Global landscape of protein complexes in the yeast Saccharomyces cerevisiae

Nevan J. Krogan¹, Gerard Cagney¹,², Haiyuan Yu³, Gouqing Zhong¹, Xinghua Guo¹, Alexandr Ignatchenko¹, Joyce Li¹, Shuye Pu¹, Nira Datta¹, Aaron Tikuisis¹, Thanuja Punna¹, José M. Peregrín-Alvarez¹, Michael Shales¹, Xin Zhang¹, Michael Davey¹, Mark D. Robinson¹, Alberto Paccanaro³, James E. Bray¹, Anthony Sheung¹, Bryan Beattie, Dawn Richards, Veronica Canadien, Atanas Iliev Lalev¹, Frank Mena, Peter D. Wong¹, Andrei Starostine¹, Myra M. Canete¹, James Vlasblom¹, Samuel Wu¹, Chris Orsi¹, Sean R. Collins⁴, Shamanta Chandran¹, Robin Haw¹, Jennifer J. Rilstone¹, Kiran Gandi¹, Natalie J. Thompson¹, Gabe Musso¹, Peter St Onge¹, Shaun Ghanny¹, Mandy H. Y. Lam¹, Gareth Butland¹, Amin M. Altaf-Ul⁵, Shigehiko Kanaya⁵, Ali Shilatifard⁶, Erin K. O'Shea⁷, Jonathan S. Weissman⁸, C. James Ingles¹, Timothy P. Hughes¹, John Parkinson¹, Mark Gerstein³, Shoshana J. Wodak¹, Andrew Emili¹, Jack Greenblatt¹•Institutions (8)

University of Toronto¹, University College Dublin², Yale University³, Howard Hughes Medical Institute⁴, Nara Institute of Science and Technology⁵, Saint Louis University⁶, Harvard University⁷, University of California, San Francisco⁸

30 Mar 2006-Nature

TL;DR: Tandem affinity purification was used to process 4,562 different tagged proteins of the yeast Saccharomyces cerevisiae to identify protein–protein interactions; the resulting data will help future studies on individual proteins as well as functional genomics and systems biology.

Abstract: Identification of protein-protein interactions often provides insight into protein function, and many cellular processes are performed by stable protein complexes. We used tandem affinity purification to process 4,562 different tagged proteins of the yeast Saccharomyces cerevisiae. Each preparation was analysed by both matrix-assisted laser desorption/ionization-time of flight mass spectrometry and liquid chromatography tandem mass spectrometry to increase coverage and accuracy. Machine learning was used to integrate the mass spectrometry scores and assign probabilities to the protein-protein interactions. Among 4,087 different proteins identified with high confidence by mass spectrometry from 2,357 successful purifications, our core data set (median precision of 0.69) comprises 7,123 protein-protein interactions involving 2,708 proteins. A Markov clustering algorithm organized these interactions into 547 protein complexes averaging 4.9 subunits per complex, about half of them absent from the MIPS database, as well as 429 additional interactions between pairs of complexes. The data (all of which are available online) will help future studies on individual proteins as well as functional genomics and systems biology.

2,975 citations

Journal Article•DOI•

Interaction network containing conserved and essential protein complexes in Escherichia coli

Gareth Butland¹, José M. Peregrín-Alvarez², Joyce Li¹, Wehong Yang¹, Xiaochun Yang¹, Veronica Canadien, Andrei Starostine¹, Dawn Richards, Bryan Beattie, Nevan J. Krogan¹, Michael Davey¹, John Parkinson¹,², Jack Greenblatt¹, Andrew Emili¹•Institutions (2)

University of Toronto¹, Hospital for Sick Children²

03 Feb 2005-Nature

TL;DR: Insight is provided into the function of previously uncharacterized bacterial proteins and the overall topology of a microbial interaction network, the core components of which are broadly conserved across Prokaryota.

Abstract: Proteins often function as components of multi-subunit complexes. Despite its long history as a model organism, no large-scale analysis of protein complexes in Escherichia coli has yet been reported. To this end, we have targeted DNA cassettes into the E. coli chromosome to create carboxy-terminal, affinity-tagged alleles of 1,000 open reading frames (approximately 23% of the genome). A total of 857 proteins, including 198 of the most highly conserved, soluble non-ribosomal proteins essential in at least one bacterial species, were tagged successfully, whereas 648 could be purified to homogeneity and their interacting protein partners identified by mass spectrometry. An interaction network of protein complexes involved in diverse biological processes was uncovered and validated by sequential rounds of tagging and purification. This network includes many new interactions as well as interactions predicted based solely on genomic inference or limited phenotypic data. This study provides insight into the function of previously uncharacterized bacterial proteins and the overall topology of a microbial interaction network, the core components of which are broadly conserved across Prokaryota.

1,175 citations

Journal Article•DOI•

Draft Genome of the Filarial Nematode Parasite Brugia malayi

Elodie Ghedin¹,², Shiliang Wang¹, David J. Spiro¹, Elisabet Caler¹, Qi Zhao¹, Jonathan Crabtree¹, Jonathan E. Allen¹, Arthur L. Delcher¹, David B. Guiliano³, Diego Miranda-Saavedra⁴, Samuel V. Angiuoli¹, Todd Creasy¹, Paolo Amedeo¹, Brian J. Haas¹, Najib M. El-Sayed¹, Jennifer R. Wortman¹, Tamara Feldblyum¹, Luke J. Tallon¹, Michael C. Schatz¹, Martin Shumway¹, Hean Koo¹, Steven L. Salzberg¹, Seth Schobel¹, Mihaela Pertea¹, Mihai Pop¹, Owen White¹, Geoffrey J. Barton⁴, Clotilde K. S. Carlow⁵, Michael J. Crawford, Jennifer Daub⁶, Matt W. Dimmic, Chris F. Estes⁷, Jeremy M. Foster⁵, Mehul B. Ganatra⁵, William F. Gregory⁶, Nicholas M. Johnson⁸, Jinming Jin⁵, Richard Komuniecki⁹, Ian F. Korf¹⁰, Sanjay Kumar⁵, Sandra J. Laney¹¹, Ben-Wen Li¹², Wen Li¹¹, Tim H. Lindblom⁷, Sara Lustigman¹³, Dong Ma⁵, Claude V. Maina⁵, David M. A. Martin⁴, James P. McCarter¹², Larry A. McReynolds⁵, Makedonka Mitreva¹², Thomas B. Nutman¹⁴, John Parkinson, José M. Peregrín-Alvarez², Catherine B. Poole⁵, Qinghu Ren¹, Lori Saunders¹¹, Ann E. Sluder, Katherine A. Smith⁹, Mario Stanke¹⁵, Thomas R. Unnasch¹⁶, Jenna Ware⁵, Aguan Wei¹², Gary J. Weil¹², Deryck J. Williams⁶, Yinhua Zhang⁵, Steven A. Williams¹¹, Claire M. Fraser-Liggett¹, Barton E. Slatko⁵, Mark Blaxter⁶, Alan L. Scott¹⁷•Institutions (17)

J. Craig Venter Institute¹, University of Pittsburgh², Imperial College London³, University of Dundee⁴, New England Biolabs⁵, University of Edinburgh⁶, Lyon College⁷, Australian National University⁸, University of Toledo⁹, University of California, Davis¹⁰, Smith College¹¹, Washington University in St. Louis¹², New York Blood Center¹³, National Institutes of Health¹⁴, University of Göttingen¹⁵, University of Alabama at Birmingham¹⁶, Johns Hopkins University¹⁷

21 Sep 2007-Science

TL;DR: In this article, the authors sequenced the ∼90 megabase (Mb) genome of the human filarial parasite Brugia malayi and predicted ∼11,500 protein coding genes in 71 Mb of robustly assembled sequence.

Abstract: Parasitic nematodes that cause elephantiasis and river blindness threaten hundreds of millions of people in the developing world. We have sequenced the ∼90 megabase (Mb) genome of the human filarial parasite Brugia malayi and predict ∼11,500 protein coding genes in 71 Mb of robustly assembled sequence. Comparative analysis with the free-living, model nematode Caenorhabditis elegans revealed that, despite these genes having maintained little conservation of local synteny during ∼350 million years of evolution, they largely remain in linkage on chromosomal units. More than 100 conserved operons were identified. Analysis of the predicted proteome provides evidence for adaptations of B. malayi to niches in its human and vector hosts and insights into the molecular basis of a mutualistic relationship with its Wolbachia endosymbiont. These findings offer a foundation for rational drug design.

583 citations

Journal Article•DOI•

The Phylogenetic Extent of Metabolic Enzymes and Pathways

José M. Peregrín-Alvarez¹, Sophia Tsoka, Christos A. Ouzounis•Institutions (1)

European Bioinformatics Institute¹

01 Mar 2003-Genome Research

TL;DR: This work examines the phylogenetic distribution of the full-known metabolic complement of Escherichia coli, using sequence comparison against taxa-specific databases to show for the first time and in a comprehensive way that metabolism is conserved at the enzyme level.

Abstract: The evolution of metabolic enzymes and pathways has been a subject of intense study for more than half a century. Yet, so far, previous studies have focused on a small number of enzyme families or biochemical pathways. Here, we examine the phylogenetic distribution of the full-known metabolic complement of Escherichia coli, using sequence comparison against taxa-specific databases. Half of the metabolic enzymes have homologs in all domains of life, representing families involved in some of the most fundamental cellular processes. We thus show for the first time and in a comprehensive way that metabolism is conserved at the enzyme level. In addition, our analysis suggests that despite the sequence conservation and the extensive phylogenetic distribution of metabolic enzymes, their groupings into biochemical pathways are much more variable than previously thought.

96 citations

Journal Article•DOI•

The origins of apicomplexan sequence innovation

James D. Wasmuth, Jennifer Daub, José M. Peregrín-Alvarez, Constance A. M. Finney, John Parkinson

01 Jul 2009-Genome Research

TL;DR: A systematic analysis of protein family and domain incidence across the Apicomplexa highlights domains that may be important to parasite survival, and the taxonomic profiles constructed from the searches reveal the extent of apicomplexan sequence diversity.

Abstract: The Apicomplexa are a group of phylogenetically related parasitic protists that include Plasmodium, Cryptosporidium, and Toxoplasma. Together they are a major global burden on human health and economics. To meet this challenge, several international consortia have generated vast amounts of sequence data for many of these parasites. Here, we exploit these data to perform a systematic analysis of protein family and domain incidence across the phylum. A total of 87,736 protein sequences were collected from 15 apicomplexan species. These were compared with three protein databases, including the partial genome database, PartiGeneDB, which increases the breadth of taxonomic coverage. From these searches we constructed taxonomic profiles that reveal the extent of apicomplexan sequence diversity. Sequences without a significant match outside the phylum were denoted as apicomplexan specialized. These were collated into 9134 discrete protein families and placed in the context of the apicomplexan phylogeny, identifying the putative origin of each family. Most apicomplexan families were associated with an individual genus or species. Interestingly, many genera-specific innovations were associated with specialized host cell invasion and/or parasite survival processes. Contrastingly, those families reflecting more ancestral relationships were enriched in generalized housekeeping functions such as translation and transcription, which have diverged within the apicomplexan lineage. Protein domain searches revealed 192 domains not previously reported in apicomplexans together with a number of novel domain combinations. We highlight domains that may be important to parasite survival.

71 citations

Cited by

Journal Article•DOI•

Machine learning

Thomas G. Dietterich¹•Institutions (1)

Oregon State University¹

01 Dec 1996-ACM Computing Surveys

TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.

Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. 
Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

Journal Article•DOI•

Global landscape of protein complexes in the yeast Saccharomyces cerevisiae

Nevan J. Krogan¹, Gerard Cagney¹,², Haiyuan Yu³, Gouqing Zhong¹, Xinghua Guo¹, Alexandr Ignatchenko¹, Joyce Li¹, Shuye Pu¹, Nira Datta¹, Aaron Tikuisis¹, Thanuja Punna¹, José M. Peregrín-Alvarez¹, Michael Shales¹, Xin Zhang¹, Michael Davey¹, Mark D. Robinson¹, Alberto Paccanaro³, James E. Bray¹, Anthony Sheung¹, Bryan Beattie, Dawn Richards, Veronica Canadien, Atanas Iliev Lalev¹, Frank Mena, Peter D. Wong¹, Andrei Starostine¹, Myra M. Canete¹, James Vlasblom¹, Samuel Wu¹, Chris Orsi¹, Sean R. Collins⁴, Shamanta Chandran¹, Robin Haw¹, Jennifer J. Rilstone¹, Kiran Gandi¹, Natalie J. Thompson¹, Gabe Musso¹, Peter St Onge¹, Shaun Ghanny¹, Mandy H. Y. Lam¹, Gareth Butland¹, Amin M. Altaf-Ul⁵, Shigehiko Kanaya⁵, Ali Shilatifard⁶, Erin K. O'Shea⁷, Jonathan S. Weissman⁸, C. James Ingles¹, Timothy P. Hughes¹, John Parkinson¹, Mark Gerstein³, Shoshana J. Wodak¹, Andrew Emili¹, Jack Greenblatt¹•Institutions (8)

30 Mar 2006-Nature

2,975 citations

Journal Article•DOI•

Proteome survey reveals modularity of the yeast cell machinery

Anne-Claude Gavin, Patrick Aloy¹, Paola Grandi, Roland Krause², Markus Boesche, Martina Marzioch, Christina Rau, Lars Juhl Jensen¹, Sonja Bastuck, Birgit Dümpelfeld, Angela Edelmann, Marie-Anne Heurtier, Verena Hoffman, Christian Hoefert, Karin Klein, Manuela Hudak, Anne-Marie Michon, Malgorzata Schelder, Markus Schirle, Marita Remor, Tatjana Rudi, Sean D. Hooper¹, Andreas Bauer, Tewis Bouwmeester, Georg Casari, Gerard Drewes, Gitte Neubauer, Jens Rick, Bernhard Kuster, Peer Bork¹, Robert B. Russell¹, Giulio Superti-Furga³•Institutions (3)

European Bioinformatics Institute¹, Charité², Austrian Academy of Sciences³

30 Mar 2006-Nature

TL;DR: This study reports the first genome-wide screen for complexes in an organism, budding yeast, using affinity purification and mass spectrometry and provides the largest collection of physically determined eukaryotic cellular machines so far and a platform for biological data integration and modelling.

Abstract: Protein complexes are key molecular entities that integrate multiple gene products to perform cellular functions. Here we report the first genome-wide screen for complexes in an organism, budding yeast, using affinity purification and mass spectrometry. Through systematic tagging of open reading frames (ORFs), the majority of complexes were purified several times, suggesting screen saturation. The richness of the data set enabled a de novo characterization of the composition and organization of the cellular machinery. The ensemble of cellular proteins partitions into 491 complexes, of which 257 are novel, that differentially combine with additional attachment proteins or protein modules to enable a diversification of potential functions. Support for this modular organization of the proteome comes from integration with available data on expression, localization, function, evolutionary conservation, protein structure and binary interactions. This study provides the largest collection of physically determined eukaryotic cellular machines so far and a platform for biological data integration and modelling.

2,640 citations

Journal Article•DOI•

Integration of biological networks and gene expression data using Cytoscape

Melissa S. Cline¹, Michael E. Smoot, Ethan Cerami², Allan Kuchinsky³, Nerius Landys, Christopher T. Workman⁴, Rowan H. Christmas⁵, Iliana Avila-Campilo⁵,⁶, Michael L. Creech, Benjamin Gross², Kristina Hanspers, Ruth Isserlin⁷, Ryan Kelley, Sarah Killcoyne⁵, Samad Lotia, Steven Maere⁸, John H. Morris⁹, Keiichiro Ono, Vuk Pavlovic⁷, Alexander R. Pico, Aditya Vailaya³, Peng-Liang Wang, Annette M. Adler³, Bruce R. Conklin, Leroy Hood⁵, Martin Kuiper⁸, Chris Sander², Ilya Schmulevich⁵, Benno Schwikowski¹, Guy J. Warner, Trey Ideker, Gary D. Bader⁷•Institutions (9)

Pasteur Institute¹, Memorial Sloan Kettering Cancer Center², Agilent Technologies³, Technical University of Denmark⁴, Institute for Systems Biology⁵, Merck & Co.⁶, University of Toronto⁷, Ghent University⁸, University of California, San Francisco⁹

01 Jan 2007-Nature Protocols

TL;DR: This protocol explains how to use Cytoscape to analyze the results of mRNA expression profiling, and other functional genomics and proteomics experiments, in the context of an interaction network obtained for genes of interest.

Abstract: Cytoscape is a free software package for visualizing, modeling and analyzing molecular and genetic interaction networks. This protocol explains how to use Cytoscape to analyze the results of mRNA expression profiling, and other functional genomics and proteomics experiments, in the context of an interaction network obtained for genes of interest. Five major steps are described: (i) obtaining a gene or protein network, (ii) displaying the network using layout algorithms, (iii) integrating with gene expression and other functional attributes, (iv) identifying putative complexes and functional modules and (v) identifying enriched Gene Ontology annotations in the network. These steps provide a broad sample of the types of analyses performed by Cytoscape.

2,313 citations

Journal Article•DOI•

The genetic landscape of a cell.

Michael Costanzo¹, Anastasia Baryshnikova¹, Jeremy Bellay², Yungil Kim², Eric D. Spear³, Carolyn S. Sevier³, Huiming Ding¹, Judice L. Y. Koh¹, Kiana Toufighi¹, Sara Mostafavi¹, Jeany Prinz¹, Robert P. St.Onge⁴, Benjamin VanderSluis², Taras Makhnevych¹, Franco J. Vizeacoumar¹, Solmaz Alizadeh¹, Sondra Bahr¹, Renee L. Brost¹, Yiqun Chen¹, Murat Cokol⁵, Raamesh Deshpande², Zhijian Li¹, Zhen Yuan Lin¹, Wendy Liang¹, Michaela Marback¹, Jadine Paw¹, Bryan Joseph San Luis¹, Ermira Shuteriqi¹, Amy Hin Yan Tong¹, Nydia Van Dyk¹, Iain M. Wallace¹, Joseph Whitney¹, Matthew T. Weirauch⁶, Guoqing Zhong¹, Hongwei Zhu¹, Walid A. Houry¹, Michael Brudno¹, Sasan Ragibizadeh, Balázs Papp⁷, Csaba Pál⁷, Frederick P. Roth⁵, Guri Giaever¹, Corey Nislow¹, Olga G. Troyanskaya⁸, Howard Bussey⁹, Gary D. Bader¹, Anne-Claude Gingras¹, Quaid Morris¹, Philip M. Kim¹, Chris A. Kaiser³, Chad L. Myers², Brenda J. Andrews¹, Charles Boone¹•Institutions (9)

University of Toronto¹, University of Minnesota², Massachusetts Institute of Technology³, Stanford University⁴, Harvard University⁵, University of California, Santa Cruz⁶, Hungarian Academy of Sciences⁷, Princeton University⁸, McGill University⁹

22 Jan 2010-Science

TL;DR: A network based on genetic interaction profiles reveals a functional map of the cell in which genes of similar biological processes cluster together in coherent subsets, and highly correlated profiles delineate specific pathways to define gene function.

Abstract: A genome-scale genetic interaction map was constructed by examining 5.4 million gene-gene pairs for synthetic genetic interactions, generating quantitative genetic interaction profiles for ~75% of all genes in the budding yeast, Saccharomyces cerevisiae. A network based on genetic interaction profiles reveals a functional map of the cell in which genes of similar biological processes cluster together in coherent subsets, and highly correlated profiles delineate specific pathways to define gene function. The global network identifies functional cross-connections between all bioprocesses, mapping a cellular wiring diagram of pleiotropy. Genetic interaction degree correlated with a number of different gene attributes, which may be informative about genetic network hubs in other organisms. We also demonstrate that extensive and unbiased mapping of the genetic landscape provides a key for interpretation of chemical-genetic interactions and drug target identification.

2,225 citations
