scispace - formally typeset
Search or ask a question
Author

Helena Kilpinen

Bio: Helena Kilpinen is an academic researcher from European Bioinformatics Institute. The author has contributed to research in topics: Induced pluripotent stem cell & Biology. The author has an hindex of 21, co-authored 30 publications receiving 6354 citations. Previous affiliations of Helena Kilpinen include Wellcome Trust & Wellcome Trust Sanger Institute.

Papers
More filters
Journal ArticleDOI
Silvia De Rubeis1, Xin-Xin He2, Arthur P. Goldberg1, Christopher S. Poultney1, Kaitlin E. Samocha3, A. Ercument Cicek2, Yan Kou1, Li Liu2, Menachem Fromer1, Menachem Fromer3, R. Susan Walker4, Tarjinder Singh5, Lambertus Klei6, Jack A. Kosmicki3, Shih-Chen Fu1, Branko Aleksic7, Monica Biscaldi8, Patrick Bolton9, Jessica M. Brownfeld1, Jinlu Cai1, Nicholas G. Campbell10, Angel Carracedo11, Angel Carracedo12, Maria H. Chahrour3, Andreas G. Chiocchetti, Hilary Coon13, Emily L. Crawford10, Lucy Crooks5, Sarah Curran9, Geraldine Dawson14, Eftichia Duketis, Bridget A. Fernandez15, Louise Gallagher16, Evan T. Geller17, Stephen J. Guter18, R. Sean Hill3, R. Sean Hill19, Iuliana Ionita-Laza20, Patricia Jiménez González, Helena Kilpinen, Sabine M. Klauck21, Alexander Kolevzon1, Irene Lee22, Jing Lei2, Terho Lehtimäki, Chiao-Feng Lin17, Avi Ma'ayan1, Christian R. Marshall4, Alison L. McInnes23, Benjamin M. Neale24, Michael John Owen25, Norio Ozaki7, Mara Parellada26, Jeremy R. Parr27, Shaun Purcell1, Kaija Puura, Deepthi Rajagopalan4, Karola Rehnström5, Abraham Reichenberg1, Aniko Sabo28, Michael Sachse, Stephen Sanders29, Chad M. Schafer2, Martin Schulte-Rüther30, David Skuse22, David Skuse31, Christine Stevens24, Peter Szatmari32, Kristiina Tammimies4, Otto Valladares17, Annette Voran33, Li-San Wang17, Lauren A. Weiss29, A. Jeremy Willsey29, Timothy W. Yu19, Timothy W. Yu3, Ryan K. C. Yuen4, Edwin H. Cook18, Christine M. Freitag, Michael Gill16, Christina M. Hultman34, Thomas Lehner35, Aarno Palotie36, Aarno Palotie24, Aarno Palotie3, Gerard D. Schellenberg17, Pamela Sklar1, Matthew W. State29, James S. Sutcliffe10, Christopher A. Walsh19, Christopher A. Walsh3, Stephen W. Scherer4, Michael E. Zwick37, Jeffrey C. Barrett5, David J. Cutler37, Kathryn Roeder2, Bernie Devlin6, Mark J. Daly3, Mark J. Daly24, Joseph D. Buxbaum1 
13 Nov 2014-Nature
TL;DR: Using exome sequencing, it is shown that analysis of rare coding variation in 3,871 autism cases and 9,937 ancestry-matched or parental controls implicates 22 autosomal genes at a false discovery rate of < 0.05, plus a set of 107 genes strongly enriched for those likely to affect risk (FDR < 0.30).
Abstract: The genetic architecture of autism spectrum disorder involves the interplay of common and rare variants and their impact on hundreds of genes. Using exome sequencing, here we show that analysis of rare coding variation in 3,871 autism cases and 9,937 ancestry-matched or parental controls implicates 22 autosomal genes at a false discovery rate (FDR) < 0.05, plus a set of 107 autosomal genes strongly enriched for those likely to affect risk (FDR < 0.30). These 107 genes, which show unusual evolutionary constraint against mutations, incur de novo loss-of-function mutations in over 5% of autistic subjects. Many of the genes implicated encode proteins for synaptic formation, transcriptional regulation and chromatin-remodelling pathways. These include voltage-gated ion channels regulating the propagation of action potentials, pacemaking and excitability-transcription coupling, as well as histone-modifying enzymes and chromatin remodellers-most prominently those that mediate post-translational lysine methylation/demethylation modifications of histones.

2,228 citations

Journal ArticleDOI
26 Sep 2013-Nature
TL;DR: Se sequencing and deep analysis of messenger RNA and microRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project—the first uniformly processed high-throughput RNA-sequencing data from multiple human populations with high-quality genome sequences discover extremely widespread genetic variation affecting the regulation of most genes.
Abstract: Genome sequencing projects are discovering millions of genetic variants in humans, and interpretation of their functional effects is essential for understanding the genetic basis of variation in human traits. Here we report sequencing and deep analysis of messenger RNA and microRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project--the first uniformly processed high-throughput RNA-sequencing data from multiple human populations with high-quality genome sequences. We discover extremely widespread genetic variation affecting the regulation of most genes, with transcript structure and expression level variation being equally common but genetically largely independent. Our characterization of causal regulatory variation sheds light on the cellular mechanisms of regulatory and loss-of-function variation, and allows us to infer putative causal variants for dozens of disease-associated loci. Altogether, this study provides a deep understanding of the cellular mechanisms of transcriptome variation and of the landscape of functional variants in the human genome.

1,892 citations

Journal ArticleDOI
Peter J. Campbell1, Gad Getz2, Jan O. Korbel3, Joshua M. Stuart4  +1329 moreInstitutions (238)
06 Feb 2020-Nature
TL;DR: The flagship paper of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium describes the generation of the integrative analyses of 2,658 whole-cancer genomes and their matching normal tissues across 38 tumour types, the structures for international data sharing and standardized analyses, and the main scientific findings from across the consortium studies.
Abstract: Cancer is driven by genetic change, and the advent of massively parallel sequencing has enabled systematic documentation of this variation at the whole-genome scale1,2,3. Here we report the integrative analysis of 2,658 whole-cancer genomes and their matching normal tissues across 38 tumour types from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). We describe the generation of the PCAWG resource, facilitated by international data sharing using compute clouds. On average, cancer genomes contained 4–5 driver mutations when combining coding and non-coding genomic elements; however, in around 5% of cases no drivers were identified, suggesting that cancer driver discovery is not yet complete. Chromothripsis, in which many clustered structural variants arise in a single catastrophic event, is frequently an early event in tumour evolution; in acral melanoma, for example, these events precede most somatic point mutations and affect several cancer-associated genes simultaneously. Cancers with abnormal telomere maintenance often originate from tissues with low replicative activity and show several mechanisms of preventing telomere attrition to critical levels. Common and rare germline variants affect patterns of somatic mutation, including point mutations, structural variants and somatic retrotransposition. A collection of papers from the PCAWG Consortium describes non-coding mutations that drive cancer beyond those in the TERT promoter4; identifies new signatures of mutational processes that cause base substitutions, small insertions and deletions and structural variation5,6; analyses timings and patterns of tumour evolution7; describes the diverse transcriptional consequences of somatic mutation on splicing, expression levels, fusion genes and promoter activity8,9; and evaluates a range of more-specialized features of cancer genomes8,10,11,12,13,14,15,16,17,18.

1,600 citations

Journal ArticleDOI
15 Jun 2017-Nature
TL;DR: This study outlines the major sources of genetic and phenotypic variation in iPS cells and establishes their suitability as models of complex human traits and cancer.
Abstract: Technology utilizing human induced pluripotent stem cells (iPS cells) has enormous potential to provide improved cellular models of human disease. However, variable genetic and phenotypic characterization of many existing iPS cell lines limits their potential use for research and therapy. Here we describe the systematic generation, genotyping and phenotyping of 711 iPS cell lines derived from 301 healthy individuals by the Human Induced Pluripotent Stem Cells Initiative. Our study outlines the major sources of genetic and phenotypic variation in iPS cells and establishes their suitability as models of complex human traits and cancer. Through genome-wide profiling we find that 5-46% of the variation in different iPS cell phenotypes, including differentiation capacity and cellular morphology, arises from differences between individuals. Additionally, we assess the phenotypic consequences of genomic copy-number alterations that are repeatedly observed in iPS cells. In addition, we present a comprehensive map of common regulatory variants affecting the transcriptome of human pluripotent cells.

462 citations

Journal ArticleDOI
08 Nov 2013-Science
TL;DR: This work quantified allelic variability in transcription factor binding, histone modifications, and gene expression within humans and found abundant allelic specificity in chromatin and extensive local, short-range, and long-range allelic coordination among the studied molecular phenotypes.
Abstract: DNA sequence variation has been associated with quantitative changes in molecular phenotypes such as gene expression, but its impact on chromatin states is poorly characterized. To understand the interplay between chromatin and genetic control of gene regulation, we quantified allelic variability in transcription factor binding, histone modifications, and gene expression within humans. We found abundant allelic specificity in chromatin and extensive local, short-range, and long-range allelic coordination among the studied molecular phenotypes. We observed genetic influence on most of these phenotypes, with histone modifications exhibiting strong context-dependent behavior. Our results implicate transcription factors as primary mediators of sequence-specific regulation of gene expression programs, with histone modifications frequently reflecting the primary regulatory event.

363 citations


Cited by
More filters
Journal ArticleDOI
TL;DR: Kallisto pseudoaligns reads to a reference, producing a list of transcripts that are compatible with each read while avoiding alignment of individual bases, which removes a major computational bottleneck in RNA-seq analysis.
Abstract: We present kallisto, an RNA-seq quantification program that is two orders of magnitude faster than previous approaches and achieves similar accuracy. Kallisto pseudoaligns reads to a reference, producing a list of transcripts that are compatible with each read while avoiding alignment of individual bases. We use kallisto to analyze 30 million unaligned paired-end RNA-seq reads in <10 min on a standard laptop computer. This removes a major computational bottleneck in RNA-seq analysis.

6,468 citations

Journal ArticleDOI
TL;DR: Salmon is the first transcriptome-wide quantifier to correct for fragment GC-content bias, which substantially improves the accuracy of abundance estimates and the sensitivity of subsequent differential expression analysis.
Abstract: We introduce Salmon, a lightweight method for quantifying transcript abundance from RNA-seq reads. Salmon combines a new dual-phase parallel inference algorithm and feature-rich bias models with an ultra-fast read mapping procedure. It is the first transcriptome-wide quantifier to correct for fragment GC-content bias, which, as we demonstrate here, substantially improves the accuracy of abundance estimates and the sensitivity of subsequent differential expression analysis.

6,095 citations

Journal ArticleDOI
TL;DR: This work presents a method named HISAT2 (hierarchical indexing for spliced alignment of transcripts 2) that can align both DNA and RNA sequences using a graph Ferragina Manzini index, and uses it to represent and search an expanded model of the human reference genome.
Abstract: The human reference genome represents only a small number of individuals, which limits its usefulness for genotyping. We present a method named HISAT2 (hierarchical indexing for spliced alignment of transcripts 2) that can align both DNA and RNA sequences using a graph Ferragina Manzini index. We use HISAT2 to represent and search an expanded model of the human reference genome in which over 14.5 million genomic variants in combination with haplotypes are incorporated into the data structure used for searching and alignment. We benchmark HISAT2 using simulated and real datasets to demonstrate that our strategy of representing a population of genomes, together with a fast, memory-efficient search algorithm, provides more detailed and accurate variant analyses than other methods. We apply HISAT2 for HLA typing and DNA fingerprinting; both applications form part of the HISAT-genotype software that enables analysis of haplotype-resolved genes or genomic regions. HISAT-genotype outperforms other computational methods and matches or exceeds the performance of laboratory-based assays. A graph-based genome indexing scheme enables variant-aware alignment of sequences with very low memory requirements.

4,855 citations

Journal ArticleDOI
Kristin G. Ardlie, David S. DeLuca, Ayellet V. Segrè, Timothy J. Sullivan, Taylor Young, Ellen Gelfand, Casandra A. Trowbridge, Julian Maller, Taru Tukiainen, Monkol Lek, Lucas D. Ward, Pouya Kheradpour, Benjamin Iriarte, Yan Meng, Cameron D. Palmer, Tõnu Esko, Wendy Winckler, Joel N. Hirschhorn, Manolis Kellis, Daniel G. MacArthur, Gad Getz, Andrey A. Shabalin, Gen Li, Yi-Hui Zhou, Andrew B. Nobel, Ivan Rusyn, Fred A. Wright, Tuuli Lappalainen, Pedro G. Ferreira, Halit Ongen, Manuel A. Rivas, Alexis Battle, Sara Mostafavi, Jean Monlong, Michael Sammeth, Marta Melé, Ferran Reverter, Jakob M. Goldmann, Daphne Koller, Roderic Guigó, Mark I. McCarthy, Emmanouil T. Dermitzakis, Eric R. Gamazon, Hae Kyung Im, Anuar Konkashbaev, Dan L. Nicolae, Nancy J. Cox, Timothée Flutre, Xiaoquan Wen, Matthew Stephens, Jonathan K. Pritchard, Zhidong Tu, Bin Zhang, Tao Huang, Quan Long, Luan Lin, Jialiang Yang, Jun Zhu, Jun Liu, Amanda Brown, Bernadette Mestichelli, Denee Tidwell, Edmund Lo, Mike Salvatore, Saboor Shad, Jeffrey A. Thomas, John T. Lonsdale, Michael T. Moser, Bryan Gillard, Ellen Karasik, Kimberly Ramsey, Christopher Choi, Barbara A. Foster, John Syron, Johnell Fleming, Harold Magazine, Rick Hasz, Gary Walters, Jason Bridge, Mark Miklos, Susan L. Sullivan, Laura Barker, Heather M. Traino, Maghboeba Mosavel, Laura A. Siminoff, Dana R. Valley, Daniel C. Rohrer, Scott D. Jewell, Philip A. Branton, Leslie H. Sobin, Mary Barcus, Liqun Qi, Jeffrey McLean, Pushpa Hariharan, Ki Sung Um, Shenpei Wu, David Tabor, Charles Shive, Anna M. Smith, Stephen A. Buia, Anita H. Undale, Karna Robinson, Nancy Roche, Kimberly M. Valentino, Angela Britton, Robin Burges, Debra Bradbury, Kenneth W. Hambright, John Seleski, Greg E. Korzeniewski, Kenyon Erickson, Yvonne Marcus, Jorge Tejada, Mehran Taherian, Chunrong Lu, Margaret J. Basile, Deborah C. Mash, Simona Volpi, Jeffery P. Struewing, Gary F. Temple, Joy T. Boyer, Deborah Colantuoni, Roger Little, Susan E. Koester, Latarsha J. Carithers, Helen M. Moore, Ping Guan, Carolyn C. Compton, Sherilyn Sawyer, Joanne P. Demchok, Jimmie B. Vaught, Chana A. Rabiner, Nicole C. Lockhart 
08 May 2015-Science
TL;DR: The landscape of gene expression across tissues is described, thousands of tissue-specific and shared regulatory expression quantitative trait loci (eQTL) variants are cataloged, complex network relationships are described, and signals from genome-wide association studies explained by eQTLs are identified.
Abstract: Understanding the functional consequences of genetic variation, and how it affects complex human disease and quantitative traits, remains a critical challenge for biomedicine. We present an analysi...

4,418 citations

01 Feb 2015
TL;DR: In this article, the authors describe the integrative analysis of 111 reference human epigenomes generated as part of the NIH Roadmap Epigenomics Consortium, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression.
Abstract: The reference human genome sequence set the stage for studies of genetic variation and its association with human disease, but epigenomic studies lack a similar reference. To address this need, the NIH Roadmap Epigenomics Consortium generated the largest collection so far of human epigenomes for primary cells and tissues. Here we describe the integrative analysis of 111 reference human epigenomes generated as part of the programme, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression. We establish global maps of regulatory elements, define regulatory modules of coordinated activity, and their likely activators and repressors. We show that disease- and trait-associated genetic variants are enriched in tissue-specific epigenomic marks, revealing biologically relevant cell types for diverse human traits, and providing a resource for interpreting the molecular basis of human disease. Our results demonstrate the central role of epigenomic information for understanding gene regulation, cellular differentiation and human disease.

4,409 citations