Author

Misha Kapushesky

Other affiliations: University of Pennsylvania, University of Cambridge, Wellcome Trust Sanger Institute ...

Bio: Misha Kapushesky is an academic researcher from the European Bioinformatics Institute. The author has contributed to research on topics including Ontology (information science) and Gene expression profiling. The author has an h-index of 20 and has co-authored 34 publications receiving 4209 citations. Previous affiliations of Misha Kapushesky include University of Pennsylvania and University of Cambridge.

Papers

Journal Article

ArrayExpress—a public repository for microarray gene expression data at the EBI

Helen Parkinson¹, Ugis Sarkans¹, Mohammadreza Shojatalab¹, Niran Abeygunawardena¹, Sergio Contrino¹, Richard M.R. Coulson¹, Anna Farne¹, Gonzalo Garcia Lara¹, Ele Holloway¹, Misha Kapushesky¹, P. Lilja¹, Gaurab Mukherjee¹, Ahmet Oezcimen¹, Tim F. Rayner¹, Philippe Rocca-Serra¹, Anjan Sharma¹, Susanna-Assunta Sansone¹, Alvis Brazma¹

European Bioinformatics Institute¹

01 Jan 2003-Nucleic Acids Research

TL;DR: ArrayExpress is a public repository for microarray data that supports the MIAME (Minimum Information About a Microarray Experiment) requirements and stores well-annotated raw and normalized data.

Abstract: ArrayExpress is a new public database of microarray gene expression data at the EBI, which is a generic gene expression database designed to hold data from all microarray platforms. ArrayExpress uses the annotation standard Minimum Information About a Microarray Experiment (MIAME) and the associated XML data exchange format Microarray Gene Expression Markup Language (MAGE-ML), and it is designed to store well annotated data in a structured way. The ArrayExpress infrastructure consists of the database itself, data submissions in MAGE-ML format or via the online submission tool MIAMExpress, an online database query interface, and the Expression Profiler online analysis tool. ArrayExpress accepts three types of submission (arrays, experiments and protocols), each of which is assigned an accession number. Help on data submission and annotation is provided by the curation team. The database can be queried on parameters such as author, laboratory, organism, experiment or array types. With an increasing number of organisations adopting the MAGE-ML standard, the volume of submissions to ArrayExpress is increasing rapidly. The database can be accessed at http://www.ebi.ac.uk/arrayexpress.

1,183 citations

Journal Article

ArrayExpress—a public database of microarray experiments and gene expression profiles

Helen Parkinson¹, Misha Kapushesky¹, Mohammadreza Shojatalab¹, Niran Abeygunawardena¹, Richard M.R. Coulson¹, Anna Farne¹, Ele Holloway¹, Nikolay Kolesnykov¹, P. Lilja¹, Margus Lukk¹, Roby Mani¹, Tim F. Rayner¹, Anjan Sharma¹, E. William¹, Ugis Sarkans¹, Alvis Brazma¹

European Bioinformatics Institute¹

01 Jan 2007-Nucleic Acids Research

TL;DR: ArrayExpress comprises the Repository, a MIAME-supportive public archive of microarray data, and the Data Warehouse, a database of gene expression profiles selected from the repository and consistently re-annotated; it contains data from >50,000 hybridizations and >1,500,000 individual expression profiles.

Abstract: ArrayExpress is a public database for high throughput functional genomics data. ArrayExpress consists of two parts: the ArrayExpress Repository, which is a MIAME supportive public archive of microarray data, and the ArrayExpress Data Warehouse, which is a database of gene expression profiles selected from the repository and consistently re-annotated. Archived experiments can be queried by experiment attributes, such as keywords, species, array platform, authors, journals or accession numbers. Gene expression profiles can be queried by gene names and properties, such as Gene Ontology terms, and gene expression profiles can be visualized. ArrayExpress is a rapidly growing database; it currently contains data from >50,000 hybridizations and >1,500,000 individual expression profiles. ArrayExpress supports community standards, including MIAME, MAGE-ML and more recently the proposal for a spreadsheet based data exchange format: MAGE-TAB. Availability: www.ebi.ac.uk/arrayexpress.

644 citations

Journal Article

Modeling sample variables with an Experimental Factor Ontology

James Malone¹, Ele Holloway², Tomasz Adamusiak², Misha Kapushesky², Jie Zheng², Nikolay Kolesnikov², Anna Zhukova², Alvis Brazma², Helen Parkinson²

Wellcome Trust¹, University of Pennsylvania²

01 Apr 2010-Bioinformatics

TL;DR: The application of reference ontologies to data is a key problem, and this work presents guidelines on how community ontologies can be presented in an application ontology in a data-driven way.

Abstract: Motivation: Describing biological sample variables with ontologies is complex due to the cross-domain nature of experiments. Ontologies provide annotation solutions; however, for cross-domain investigations, multiple ontologies are needed to represent the data. These are subject to rapid change, are often not interoperable and present complexities that are a barrier to biological resource users. Results: We present the Experimental Factor Ontology, designed to meet cross-domain, application focused use cases for gene expression data. We describe our methodology and open source tools used to create the ontology. These include tools for creating ontology mappings, ontology views, detecting ontology changes and using ontologies in interfaces to enhance querying. The application of reference ontologies to data is a key problem, and this work presents guidelines on how community ontologies can be presented in an application ontology in a data-driven way. Availability: http://www.ebi.ac.uk/efo Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.

468 citations

Journal Article

ArrayExpress update—from an archive of functional genomics experiments to the atlas of gene expression

Helen Parkinson¹, Misha Kapushesky², Nikolay Kolesnikov², Gabriella Rustici², Mohammadreza Shojatalab², Niran Abeygunawardena², Hugo Bérubé², Miroslaw Dylag², Ibrahim Emam², Anna Farne², Ele Holloway², Margus Lukk², James Malone², Roby Mani², Ekaterina Pilicheva², Tim F. Rayner², Faisal I. Rezwan², Anjan Sharma², Eleanor Williams², Xiangqun Zheng Bradley², Tomasz Adamusiak², Marco Brandizi², Tony Burdett², Richard M.R. Coulson², Maria Krestyaninova², Pavel Kurnosov², Eamonn Maguire², Sudeshna Guha Neogi², Philippe Rocca-Serra², Susanna-Assunta Sansone², Nataliya Sklyar², Mengyao Zhao², Ugis Sarkans², Alvis Brazma²

European Bioinformatics Institute¹, University of Cambridge²

01 Jan 2009-Nucleic Acids Research

TL;DR: This update describes the ArrayExpress developments over the last two years and describes the new summary database and meta-analytical tool of ranked gene expression across multiple experiments and different biological conditions.

Abstract: ArrayExpress (http://www.ebi.ac.uk/arrayexpress) consists of three components: the ArrayExpress Repository, a public archive of functional genomics experiments and supporting data; the ArrayExpress Warehouse, a database of gene expression profiles and other bio-measurements; and the ArrayExpress Atlas, a new summary database and meta-analytical tool of ranked gene expression across multiple experiments and different biological conditions. The Repository contains data from over 6000 experiments comprising approximately 200000 assays, and the database doubles in size every 15 months. The majority of the data are array based, but other data types are included, most recently ultra high-throughput sequencing transcriptomics and epigenetic data. The Warehouse and Atlas allow users to query for differentially expressed genes by gene names and properties, experimental conditions and sample properties, or a combination of both. In this update, we describe the ArrayExpress developments over the last two years.

456 citations

Journal Article

A global map of human gene expression

Margus Lukk¹, Misha Kapushesky¹, Janne Nikkilä², Helen Parkinson¹, Angela Goncalves¹, Wolfgang Huber¹, Esko Ukkonen², Alvis Brazma¹

European Bioinformatics Institute¹, University of Helsinki²

01 Apr 2010-Nature Biotechnology

TL;DR: A global gene expression map is constructed by integrating microarray data from 5,372 human samples representing 369 different cell and tissue types, disease states and cell lines and reveals that it can be described by a small number of distinct expression profile classes.

Abstract: To the Editor: Although there is only one human genome sequence, different genes are expressed in many different cell types and tissues, as well as in different developmental stages or diseases. The structure of this ‘expression space’ is still largely unknown, as most transcriptomics experiments focus on sampling small regions. We have constructed a global gene expression map by integrating microarray data from 5,372 human samples representing 369 different cell and tissue types, disease states and cell lines. These have been compiled in an online resource (http://www.ebi.ac.uk/gxa/array/U133A) that allows the user to search for a gene of interest and find the conditions in which it is over- or underexpressed, or, conversely, to find which genes are over- or underexpressed in a particular condition. An analysis of the structure of the expression space reveals that it can be described by a small number of distinct expression profile classes and that the first three principal components of this space have biological interpretations. The hematopoietic system, solid tissues and incompletely differentiated cell types are arranged on the first principal axis; cell lines, neoplastic samples and nonneoplastic primary tissue–derived samples are on the second principal axis; and nervous system is separated from the rest of the samples on the third axis. We also show below that most cell lines cluster together rather than with their tissues of origin.

368 citations

Cited by

Journal Article

Tissue-based map of the human proteome

Mathias Uhlén¹, Mathias Uhlén², Linn Fagerberg², Björn M. Hallström², Cecilia Lindskog³, Per Oksvold², Adil Mardinoglu⁴, Åsa Sivertsson², Caroline Kampf³, Evelina Sjöstedt³, Evelina Sjöstedt², Anna Asplund³, IngMarie Olsson³, Karolina Edlund, Emma Lundberg², Sanjay Navani, Cristina Al-Khalili Szigyarto², Jacob Odeberg², Dijana Djureinovic³, Jenny Ottosson Takanen², Sophia Hober², Tove Alm², Per-Henrik Edqvist³, Holger Berling², Hanna Tegel², Jan Mulder³, Johan Rockberg², Peter Nilsson², Jochen M. Schwenk², Marica Hamsten², Kalle von Feilitzen², Mattias Forsberg², Lukas Persson², Fredric Johansson², Martin Zwahlen², Gunnar von Heijne⁵, Jens Nielsen¹, Jens Nielsen⁴, Fredrik Pontén³

Technical University of Denmark¹, Royal Institute of Technology², Science for Life Laboratory³, Chalmers University of Technology⁴, Stockholm University⁵

23 Jan 2015-Science

TL;DR: This paper presents a map of the human tissue proteome based on an integrated omics approach that combines quantitative transcriptomics at the tissue and organ level with tissue microarray-based immunohistochemistry to achieve spatial localization of proteins down to the single-cell level.

Abstract: Resolving the molecular details of proteome variation in the different tissues and organs of the human body will greatly increase our knowledge of human biology and disease. Here, we present a map of the human tissue proteome based on an integrated omics approach that involves quantitative transcriptomics at the tissue and organ level, combined with tissue microarray-based immunohistochemistry, to achieve spatial localization of proteins down to the single-cell level. Our tissue-based analysis detected more than 90% of the putative protein-coding genes. We used this approach to explore the human secretome, the membrane proteome, the druggable proteome, the cancer proteome, and the metabolic functions in 32 different tissues and organs. All the data are integrated in an interactive Web-based database that allows exploration of individual proteins, as well as navigation of global expression patterns, in all major tissues and organs in the human body.

9,745 citations

Journal Article

Robust enumeration of cell subsets from tissue expression profiles

Aaron M. Newman¹, Chih Long Liu¹, Michael R. Green¹, Andrew J. Gentles¹, Weiguo Feng¹, Yue Xu¹, Chuong D. Hoang¹, Maximilian Diehn¹, Arash Ash Alizadeh¹

Stanford University¹

01 May 2015-Nature Methods

TL;DR: CIBERSORT outperformed other methods with respect to noise, unknown mixture content and closely related cell types when applied to enumeration of hematopoietic subsets in RNA mixtures from fresh, frozen and fixed tissues, including solid tumors.

Abstract: We introduce CIBERSORT, a method for characterizing cell composition of complex tissues from their gene expression profiles. When applied to enumeration of hematopoietic subsets in RNA mixtures from fresh, frozen and fixed tissues, including solid tumors, CIBERSORT outperformed other methods with respect to noise, unknown mixture content and closely related cell types. CIBERSORT should enable large-scale analysis of RNA mixtures for cellular biomarkers and therapeutic targets (http://cibersort.stanford.edu/).

6,967 citations

Journal Article

The Reactome Pathway Knowledgebase.

Antonio Fabregat¹, Konstantinos Sidiropoulos¹, Phani V. Garapati¹, Marc Gillespie², Marc Gillespie³, Kerstin Hausmann¹, Robin Haw², Bijay Jassal², S Jupe¹, Florian Korninger¹, Sheldon J. McKay², Lisa Matthews⁴, Bruce May², Marija Milacic², Karen Rothfels², Veronica Shamovsky⁴, Marissa Webber², Joel Weiser², Mark Williams¹, Guanming Wu², Lincoln Stein⁵, Lincoln Stein⁶, Lincoln Stein², Henning Hermjakob⁷, Henning Hermjakob¹, Peter D'Eustachio⁴

European Bioinformatics Institute¹, Ontario Institute for Cancer Research², St. John's University³, New York University⁴, University of Toronto⁵, Cold Spring Harbor Laboratory⁶, Protein Sciences⁷

01 Jan 2014-Nucleic Acids Research

TL;DR: The Reactome Knowledgebase provides molecular details of signal transduction, transport, DNA replication, metabolism and other cellular processes as an ordered network of molecular transformations—an extended version of a classic metabolic map, in a single consistent data model.

Abstract: The Reactome Knowledgebase (www.reactome.org) provides molecular details of signal transduction, transport, DNA replication, metabolism and other cellular processes as an ordered network of molecular transformations, an extended version of a classic metabolic map, in a single consistent data model. Reactome functions both as an archive of biological processes and as a tool for discovering unexpected functional relationships in data such as gene expression pattern surveys or somatic mutation catalogues from tumour cells. Over the last two years we redeveloped major components of the Reactome web interface to improve usability, responsiveness and data visualization. A new pathway diagram viewer provides a faster, clearer interface and smooth zooming from the entire reaction network to the details of individual reactions. Tool performance for analysis of user datasets has been substantially improved, now generating detailed results for genome-wide expression datasets within seconds. The analysis module can now be accessed through a RESTful interface, facilitating its inclusion in third party applications. A new overview module allows the visualization of analysis results on a genome-wide Reactome pathway hierarchy using a single screen page. The search interface now provides auto-completion as well as a faceted search to narrow result lists efficiently.

5,065 citations

Journal Article

Molecular signatures database (MSigDB) 3.0

Arthur Liberzon¹, Aravind Subramanian¹, Reid M. Pinchback¹, Helga Thorvaldsdottir¹, Pablo Tamayo¹, Jill P. Mesirov¹

Broad Institute¹

01 Jun 2011-Bioinformatics

TL;DR: A new version of the database, MSigDB 3.0, is reported, with over 6700 gene sets, a complete revision of the collection of canonical pathways and experimental signatures from publications, enhanced annotations and upgrades to the web site.

Abstract: Motivation: Well-annotated gene sets representing the universe of the biological processes are critical for meaningful and insightful interpretation of large-scale genomic data. The Molecular Signatures Database (MSigDB) is one of the most widely used repositories of such sets. Results: We report the availability of a new version of the database, MSigDB 3.0, with over 6700 gene sets, a complete revision of the collection of canonical pathways and experimental signatures from publications, enhanced annotations and upgrades to the web site. Availability and Implementation: MSigDB is freely available for non-commercial use at http://www.broadinstitute.org/msigdb. Contact: gsea@broadinstitute.org

4,128 citations

Journal Article

The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019.

Annalisa Buniello¹, Jacqueline A. L. MacArthur¹, Maria Cerezo¹, Laura W. Harris¹, James D. Hayhurst¹, Cinzia Malangone¹, Aoife McMahon¹, Joannella Morales¹, Edward Mountjoy², Edward Mountjoy³, Elliot Sollis¹, Daniel Suveges¹, Olga Vrousgou¹, Patricia L. Whetzel¹, M. Ridwan Amode¹, Jose A. Guillen¹, Harpreet Singh Riat¹, Stephen J. Trevanion¹, Peggy Hall⁴, Heather Junkins⁴, Paul Flicek¹, Tony Burdett¹, Lucia A. Hindorff⁴, Fiona Cunningham¹, Helen Parkinson¹

European Bioinformatics Institute¹, University of Oxford², Wellcome Trust Sanger Institute³, National Institutes of Health⁴

08 Jan 2019-Nucleic Acids Research

TL;DR: Data access is improved with the release of a new RESTful API to support high-throughput programmatic access, an improved web interface and a new summary statistics database.

Abstract: The GWAS Catalog delivers a high-quality curated collection of all published genome-wide association studies enabling investigations to identify causal variants, understand disease mechanisms, and establish targets for novel therapies. The scope of the Catalog has also expanded to targeted and exome arrays with 1000 new associations added for these technologies. As of September 2018, the Catalog contains 5687 GWAS comprising 71673 variant-trait associations from 3567 publications. New content includes 284 full P-value summary statistics datasets for genome-wide and new targeted array studies, representing 6 × 10⁹ individual variant-trait statistics. In the last 12 months, the Catalog's user interface was accessed by ∼90000 unique users who viewed >1 million pages. We have improved data access with the release of a new RESTful API to support high-throughput programmatic access, an improved web interface and a new summary statistics database. Summary statistics provision is supported by a new format proposed as a community standard for summary statistics data representation. This format was derived from our experience in standardizing heterogeneous submissions, mapping formats and in harmonizing content. Availability: https://www.ebi.ac.uk/gwas/.

2,878 citations
