Author

Andrew Tikhonov

Other affiliations: Wellcome Trust Sanger Institute, Harvard University

Bio: Andrew Tikhonov is an academic researcher from the European Bioinformatics Institute. The author has contributed to research in topics including Bioconductor and Gene. The author has an h-index of 6 and has co-authored 7 publications receiving 2,931 citations. Previous affiliations of Andrew Tikhonov include the Wellcome Trust Sanger Institute and Harvard University.

Topics: Bioconductor, Gene, Genome, Functional genomics, Transcriptome ...

Papers

Journal Article

Transcriptome and genome sequencing uncovers functional variation in humans

Tuuli Lappalainen¹, Michael Sammeth, Marc R. Friedländer, Peter A C 't Hoen², Jean Monlong³, Manuel A. Rivas⁴, Mar Gonzàlez-Porta⁵, Natalja Kurbatova⁵, Thasso Griebel, Pedro G. Ferreira³, Matthias Barann⁶, Thomas Wieland, Liliana Greger⁵, Maarten van Iterson², Jonas Carlsson Almlöf⁷, Paolo Ribeca, Irina Pulyakhina², Daniela Esser⁶, Thomas Giger¹, Andrew Tikhonov⁵, Marc Sultan⁸, Gabrielle Bertier³, Daniel G. MacArthur⁹,¹⁰, Monkol Lek⁹,¹⁰, Esther Lizano, Henk P. J. Buermans², Ismael Padioleau¹,¹¹, Thomas Schwarzmayr, Olof Karlberg⁷, Halit Ongen¹,¹¹, Helena Kilpinen¹,¹¹, Sergi Beltran, Marta Gut, Katja Kahlem, Vyacheslav Amstislavskiy⁸, Oliver Stegle⁵, Matti Pirinen⁴, Stephen B. Montgomery¹,¹², Peter Donnelly⁴, Mark I. McCarthy⁴,¹³, Paul Flicek⁵, Tim M. Strom¹⁴, Hans Lehrach⁸, Stefan Schreiber⁶, Ralf Sudbrak⁸, Angel Carracedo¹⁵, Stylianos E. Antonarakis¹, Robert Häsler⁶, Ann-Christine Syvänen⁷, Gert-Jan B. van Ommen², Alvis Brazma⁵, Thomas Meitinger¹⁴, Philip Rosenstiel⁶, Roderic Guigó³, Ivo Gut, Xavier Estivill, Emmanouil T. Dermitzakis¹,¹¹

University of Geneva¹, Leiden University Medical Center², Pompeu Fabra University³, Wellcome Trust Centre for Human Genetics⁴, European Bioinformatics Institute⁵, University of Kiel⁶, Science for Life Laboratory⁷, Max Planck Society⁸, Broad Institute⁹, Harvard University¹⁰, Swiss Institute of Bioinformatics¹¹, Stanford University¹², University of Oxford¹³, Technische Universität München¹⁴, University of Santiago de Compostela¹⁵

26 Sep 2013-Nature

TL;DR: Sequencing and deep analysis of messenger RNA and microRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project, the first uniformly processed high-throughput RNA-sequencing data from multiple human populations with high-quality genome sequences, reveals extremely widespread genetic variation affecting the regulation of most genes.

Abstract: Genome sequencing projects are discovering millions of genetic variants in humans, and interpretation of their functional effects is essential for understanding the genetic basis of variation in human traits. Here we report sequencing and deep analysis of messenger RNA and microRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project--the first uniformly processed high-throughput RNA-sequencing data from multiple human populations with high-quality genome sequences. We discover extremely widespread genetic variation affecting the regulation of most genes, with transcript structure and expression level variation being equally common but genetically largely independent. Our characterization of causal regulatory variation sheds light on the cellular mechanisms of regulatory and loss-of-function variation, and allows us to infer putative causal variants for dozens of disease-associated loci. Altogether, this study provides a deep understanding of the cellular mechanisms of transcriptome variation and of the landscape of functional variants in the human genome.

1,892 citations

Journal Article

ArrayExpress update—simplifying data submissions

Nikolay Kolesnikov¹, Emma Hastings¹, Maria Keays¹, Olga Melnichuk¹, Y. Amy Tang¹, Eleanor Williams¹, Miroslaw Dylag¹, Natalja Kurbatova¹, Marco Brandizi¹, Tony Burdett¹, Karyn Megy¹, Ekaterina Pilicheva¹, Gabriella Rustici¹, Andrew Tikhonov¹, Helen Parkinson¹, Robert Petryszak¹, Ugis Sarkans¹, Alvis Brazma¹

European Bioinformatics Institute¹

28 Jan 2015-Nucleic Acids Research

TL;DR: The main development over the last two years has been the release of a new data submission tool, Annotare, which has reduced the average submission time almost 3-fold and which, in the near future, will become the only submission route into ArrayExpress, alongside MAGE-TAB format-based pipelines.

Abstract: The ArrayExpress Archive of Functional Genomics Data (http://www.ebi.ac.uk/arrayexpress) is an international functional genomics database at the European Bioinformatics Institute (EMBL-EBI) recommended by most journals as a repository for data supporting peer-reviewed publications. It contains data from over 7000 public sequencing and 42 000 array-based studies comprising over 1.5 million assays in total. The proportion of sequencing-based submissions has grown significantly over the last few years and has doubled in the last 18 months, whilst the rate of microarray submissions is growing slightly. All data in ArrayExpress are available in the MAGE-TAB format, which allows robust linking to data analysis and visualization tools and standardized analysis. The main development over the last two years has been the release of a new data submission tool Annotare, which has reduced the average submission time almost 3-fold. In the near future, Annotare will become the only submission route into ArrayExpress, alongside MAGE-TAB format-based pipelines. ArrayExpress is a stable and highly accessed resource. Our future tasks include automation of data flows and further integration with other EMBL-EBI resources for the representation of multi-omics data.

676 citations

Journal Article

ArrayExpress update—trends in database growth and links to data analysis tools

Gabriella Rustici¹, Nikolay Kolesnikov¹, Marco Brandizi¹, Tony Burdett¹, Miroslaw Dylag¹, Ibrahim Emam¹, Anna Farne¹, Emma Hastings¹, Jon Ison¹, Maria Keays¹, Natalja Kurbatova¹, James Malone¹, Roby Mani¹, Annalisa Mupo¹, Rui Pedro Pereira¹, Ekaterina Pilicheva¹, Johan Rung¹, Anjan Sharma¹, Y. Amy Tang¹, Tobias Ternent¹, Andrew Tikhonov¹, Danielle Welter¹, Eleanor Williams¹, Alvis Brazma¹, Helen Parkinson¹, Ugis Sarkans¹

Wellcome Trust Sanger Institute¹

27 Nov 2012-Nucleic Acids Research

TL;DR: The ArrayExpress Archive of Functional Genomics Data (ArrayExpress) is one of three international functional genomics public data repositories, alongside the Gene Expression Omnibus at NCBI and the DDBJ Omics Archive, supporting peer-reviewed publications.

Abstract: The ArrayExpress Archive of Functional Genomics Data (http://www.ebi.ac.uk/arrayexpress) is one of three international functional genomics public data repositories, alongside the Gene Expression Omnibus at NCBI and the DDBJ Omics Archive, supporting peer-reviewed publications. It accepts data generated by sequencing or array-based technologies and currently contains data from almost a million assays, from over 30 000 experiments. The proportion of sequencing-based submissions has grown significantly over the last 2 years and has reached, in 2012, 15% of all new data. All data are available from ArrayExpress in MAGE-TAB format, which allows robust linking to data analysis and visualization tools, including Bioconductor and GenomeSpace. Additionally, R objects, for microarray data, and binary alignment format files, for sequencing data, have been generated for a significant proportion of ArrayExpress data.

377 citations

Journal Article

Gene Expression Atlas update—a value-added database of microarray and sequencing-based functional genomics experiments

Misha Kapushesky¹, Tomasz Adamusiak¹, Tony Burdett¹, Aedín C. Culhane¹, Anna Farne¹, Alexey Filippov¹, Ele Holloway¹, Andrey Klebanov¹, Nataliya Kryvych¹, Natalja Kurbatova¹, Pavel Kurnosov¹, James Malone¹, Olga Melnichuk¹, Robert Petryszak¹, Nikolay Pultsin¹, Gabriella Rustici¹, Andrew Tikhonov¹, Ravensara S. Travillian¹, Eleanor Williams¹, Andrey Zorin¹, Helen Parkinson¹, Alvis Brazma¹

Harvard University¹

01 Jan 2012-Nucleic Acids Research

TL;DR: The Gene Expression Atlas is an added-value database providing information about gene expression in different cell types, organism parts, developmental stages, disease states, sample treatments and other biological/experimental conditions.

Abstract: Gene Expression Atlas (http://www.ebi.ac.uk/gxa) is an added-value database providing information about gene expression in different cell types, organism parts, developmental stages, disease states, sample treatments and other biological/experimental conditions. The content of this database derives from curation, re-annotation and statistical analysis of selected data from the ArrayExpress Archive and the European Nucleotide Archive. A simple interface allows the user to query for differential gene expression either by gene names or attributes or by biological conditions, e.g. diseases, organism parts or cell types. Since our previous report we made 20 monthly releases and, as of Release 11.08 (August 2011), the database supports 19 species, which contains expression data measured for 19,014 biological conditions in 136,551 assays from 5598 independent studies.

166 citations

Journal Article

The BioStudies database-one stop shop for all data supporting a life sciences study.

Ugis Sarkans¹, Mikhail Gostev¹, Awais Athar¹, Ehsan Behrangi¹, Olga Melnichuk¹, Ahmed Ali¹, Jasmine Minguet¹, Juan Camillo Rada¹, Catherine Snow¹, Andrew Tikhonov¹, Alvis Brazma¹, Johanna McEntyre¹

European Bioinformatics Institute¹

04 Jan 2018-Nucleic Acids Research

TL;DR: BioStudies offers a simple way to describe the study structure, and provides flexible data deposition tools and data access interfaces, and is a resource for authors and publishers for packaging data during the manuscript preparation process.

Abstract: BioStudies (www.ebi.ac.uk/biostudies) is a new public database that organizes data from biological studies. Typically, but not exclusively, a study is associated with a publication. BioStudies offers a simple way to describe the study structure, and provides flexible data deposition tools and data access interfaces. The actual data can be stored either in BioStudies or remotely, or both. BioStudies imports supplementary data from Europe PMC, and is a resource for authors and publishers for packaging data during the manuscript preparation process. It also can support data management needs of collaborative projects. The growth in multiomics experiments and other multi-faceted approaches to life sciences research mean that studies result in a diversity of data outputs in multiple locations. BioStudies presents a solution to ensuring that all these data and the associated publication(s) can be found coherently in the longer term.

85 citations

Cited by

Journal Article

Near-optimal probabilistic RNA-seq quantification

Nicolas Bray¹, Harold Pimentel¹, Páll Melsted², Lior Pachter¹

University of California, Berkeley¹, University of Iceland²

01 May 2016-Nature Biotechnology

TL;DR: Kallisto pseudoaligns reads to a reference, producing a list of transcripts that are compatible with each read while avoiding alignment of individual bases, which removes a major computational bottleneck in RNA-seq analysis.

Abstract: We present kallisto, an RNA-seq quantification program that is two orders of magnitude faster than previous approaches and achieves similar accuracy. Kallisto pseudoaligns reads to a reference, producing a list of transcripts that are compatible with each read while avoiding alignment of individual bases. We use kallisto to analyze 30 million unaligned paired-end RNA-seq reads in <10 min on a standard laptop computer. This removes a major computational bottleneck in RNA-seq analysis.

6,468 citations

Journal Article

Salmon provides fast and bias-aware quantification of transcript expression

Rob Patro¹, Geet Duggal, Michael I. Love², Rafael A. Irizarry², Carl Kingsford³

Stony Brook University¹, Harvard University², Carnegie Mellon University³

01 Apr 2017-Nature Methods

TL;DR: Salmon is the first transcriptome-wide quantifier to correct for fragment GC-content bias, which substantially improves the accuracy of abundance estimates and the sensitivity of subsequent differential expression analysis.

Abstract: We introduce Salmon, a lightweight method for quantifying transcript abundance from RNA-seq reads. Salmon combines a new dual-phase parallel inference algorithm and feature-rich bias models with an ultra-fast read mapping procedure. It is the first transcriptome-wide quantifier to correct for fragment GC-content bias, which, as we demonstrate here, substantially improves the accuracy of abundance estimates and the sensitivity of subsequent differential expression analysis.

6,095 citations

Journal Article

The Molecular Signatures Database Hallmark Gene Set Collection

Arthur Liberzon¹, Chet Birger¹, Helga Thorvaldsdottir¹, Mahmoud Ghandi¹, Jill P. Mesirov², Pablo Tamayo²

Broad Institute¹, University of California, San Diego²

23 Dec 2015-Cell systems

TL;DR: A combination of automated approaches and expert curation is used to develop a collection of "hallmark" gene sets as part of MSigDB; each hallmark is a refined gene set, derived from multiple "founder" sets, that conveys a specific biological state or process and displays coherent expression.

Abstract: The Molecular Signatures Database (MSigDB) is one of the most widely used and comprehensive databases of gene sets for performing gene set enrichment analysis. Since its creation, MSigDB has grown beyond its roots in metabolic disease and cancer to include >10,000 gene sets. These better represent a wider range of biological processes and diseases, but the utility of the database is reduced by increased redundancy across, and heterogeneity within, gene sets. To address this challenge, here we use a combination of automated approaches and expert curation to develop a collection of “hallmark” gene sets as part of MSigDB. Each hallmark in this collection consists of a “refined” gene set, derived from multiple “founder” sets, that conveys a specific biological state or process and displays coherent expression. The hallmarks effectively summarize most of the relevant information of the original founder sets and, by reducing both variation and redundancy, provide more refined and concise inputs for gene set enrichment analysis.

6,062 citations

Journal Article

The PRIDE database and related tools and resources in 2019: improving support for quantification data.

Yasset Perez-Riverol¹, Attila Csordas¹, Jingwen Bai¹, Manuel Bernal-Llinares¹, Suresh Hewapathirana¹, Deepti J. Kundu¹, Avinash Inuganti¹, Johannes Griss¹,², Gerhard Mayer³, Martin Eisenacher³, Enrique Perez¹, Julian Uszkoreit³, Julianus Pfeuffer⁴, Timo Sachsenberg⁴, Şule Yılmaz⁵, Shivani Tiwary⁵, Juergen Cox⁵, Enrique Audain, Mathias Walzer¹, Andrew F. Jarnuczak¹, Tobias Ternent¹, Alvis Brazma¹, Juan Antonio Vizcaíno¹

European Bioinformatics Institute¹, Medical University of Vienna², Ruhr University Bochum³, University of Tübingen⁴, Max Planck Society⁵

08 Jan 2019-Nucleic Acids Research

TL;DR: Key statistics on the current data contents and volume of downloads are outlined, along with how PRIDE data are starting to be disseminated to added-value resources including Ensembl, UniProt and Expression Atlas.

Abstract: The PRoteomics IDEntifications (PRIDE) database (https://www.ebi.ac.uk/pride/) is the world’s largest data repository of mass spectrometry-based proteomics data, and is one of the founding members of the global ProteomeXchange (PX) consortium. In this manuscript, we summarize the developments in PRIDE resources and related tools since the previous update manuscript was published in Nucleic Acids Research in 2016. In the last 3 years, public data sharing through PRIDE (as part of PX) has definitely become the norm in the field. In parallel, data re-use of public proteomics data has increased enormously, with multiple applications. We first describe the new architecture of PRIDE Archive, the archival component of PRIDE. PRIDE Archive and the related data submission framework have been further developed to support the increase in submitted data volumes and additional data types. A new scalable and fault tolerant storage backend, Application Programming Interface and web interface have been implemented, as a part of an ongoing process. Additionally, we emphasize the improved support for quantitative proteomics data through the mzTab format. At last, we outline key statistics on the current data contents and volume of downloads, and how PRIDE data are starting to be disseminated to added-value resources including Ensembl, UniProt and Expression Atlas.

5,735 citations

Journal Article

The Reactome Pathway Knowledgebase.

Antonio Fabregat¹, Konstantinos Sidiropoulos¹, Phani V. Garapati¹, Marc Gillespie²,³, Kerstin Hausmann¹, Robin Haw³, Bijay Jassal³, S Jupe¹, Florian Korninger¹, Sheldon J. McKay³, Lisa Matthews⁴, Bruce May³, Marija Milacic³, Karen Rothfels³, Veronica Shamovsky⁴, Marissa Webber³, Joel Weiser³, Mark Williams¹, Guanming Wu³, Lincoln Stein³,⁵,⁶, Henning Hermjakob¹,⁷, Peter D'Eustachio⁴

European Bioinformatics Institute¹, St. John's University², Ontario Institute for Cancer Research³, New York University⁴, Cold Spring Harbor Laboratory⁵, University of Toronto⁶, Protein Sciences⁷

01 Jan 2014-Nucleic Acids Research

TL;DR: The Reactome Knowledgebase provides molecular details of signal transduction, transport, DNA replication, metabolism and other cellular processes as an ordered network of molecular transformations—an extended version of a classic metabolic map, in a single consistent data model.

Abstract: The Reactome Knowledgebase (www.reactome.org) provides molecular details of signal transduction, transport, DNA replication, metabolism and other cellular processes as an ordered network of molecular transformations-an extended version of a classic metabolic map, in a single consistent data model. Reactome functions both as an archive of biological processes and as a tool for discovering unexpected functional relationships in data such as gene expression pattern surveys or somatic mutation catalogues from tumour cells. Over the last two years we redeveloped major components of the Reactome web interface to improve usability, responsiveness and data visualization. A new pathway diagram viewer provides a faster, clearer interface and smooth zooming from the entire reaction network to the details of individual reactions. Tool performance for analysis of user datasets has been substantially improved, now generating detailed results for genome-wide expression datasets within seconds. The analysis module can now be accessed through a RESTFul interface, facilitating its inclusion in third party applications. A new overview module allows the visualization of analysis results on a genome-wide Reactome pathway hierarchy using a single screen page. The search interface now provides auto-completion as well as a faceted search to narrow result lists efficiently.

5,065 citations
