Molecular and functional variation in iPSC-derived sensory neurons.
TL;DR: It is estimated that recall-by-genotype studies that use iPSC-derived cells will require cells from at least 20–80 individuals to detect the effects of regulatory variants with moderately large effect sizes, despite high differentiation-induced variability.
Abstract: Induced pluripotent stem cells (iPSCs), and cells derived from them, have become key tools for modeling biological processes, particularly in cell types that are difficult to obtain from living donors. Here we present a map of regulatory variants in iPSC-derived neurons, based on 123 differentiations of iPSCs to a sensory neuronal fate. Gene expression was more variable across cultures than in primary dorsal root ganglion, particularly for genes related to nervous system development. Using single-cell RNA-sequencing, we found that the number of neuronal versus contaminating cells was influenced by iPSC culture conditions before differentiation. Despite high differentiation-induced variability, our allele-specific method detected thousands of quantitative trait loci (QTLs) that influenced gene expression, chromatin accessibility, and RNA splicing. On the basis of these detected QTLs, we estimate that recall-by-genotype studies that use iPSC-derived cells will require cells from at least 20-80 individuals to detect the effects of regulatory variants with moderately large effect sizes.
Citations
••
VU University Amsterdam, University of Oslo, Oslo University Hospital, University of California, San Diego, University of Bergen, Norwegian University of Science and Technology, University of Michigan, Namsos Hospital, Statens Serum Institut, Harvard University, King's College London, Vanderbilt University Medical Center, Karolinska Institutet, University of California, Riverside, Jönköping University, deCODE genetics, University of Iceland, University of Gothenburg, Sahlgrenska University Hospital, Akershus University Hospital, Stavanger University Hospital, Broad Institute, Charité, University of Amsterdam
TL;DR: This study highlights microglia, immune cells and protein catabolism as relevant to late-onset Alzheimer's disease, and identifies and prioritizes previously unidentified genes of potential interest.
Abstract: Late-onset Alzheimer's disease is a prevalent age-related polygenic disease that accounts for 50-70% of dementia cases. Currently, only a fraction of the genetic variants underlying Alzheimer's disease have been identified. Here we show that increased sample sizes allowed identification of seven previously unidentified genetic loci contributing to Alzheimer's disease. This study highlights microglia, immune cells and protein catabolism as relevant to late-onset Alzheimer's disease, while identifying and prioritizing previously unidentified genes of potential interest. We anticipate that these results can be included in larger meta-analyses of Alzheimer's disease to identify further genetic variants that contribute to Alzheimer's pathology.
269 citations
••
TL;DR: It is concluded that a focus on patient stratification is needed to achieve the goals of precision medicine, and the recent "omnigenic" or "core genes" model may underestimate the biological complexity of common disease.
219 citations
••
TL;DR: This study uses induced pluripotent stem cells from 125 donors to track gene expression changes and expression quantitative trait loci at single-cell resolution during in vitro endoderm differentiation, and identifies molecular markers that predict the differentiation efficiency of individual lines.
Abstract: Recent developments in stem cell biology have enabled the study of cell fate decisions in early human development that are impossible to study in vivo. However, understanding how development varies across individuals and, in particular, the influence of common genetic variants during this process has not been characterised. Here, we exploit human iPS cell lines from 125 donors, a pooled experimental design, and single-cell RNA-sequencing to study population variation of endoderm differentiation. We identify molecular markers that are predictive of differentiation efficiency of individual lines, and utilise heterogeneity in the genetic background across individuals to map hundreds of expression quantitative trait loci that influence expression dynamically during differentiation and across cellular contexts.
196 citations
••
TL;DR: In this paper, the authors performed an updated genome-wide AD meta-analysis, which identified 37 risk loci, including new associations near CCDC6, TSPAN14, NCK2 and SPRED2.
Abstract: Genome-wide association studies have discovered numerous genomic loci associated with Alzheimer's disease (AD); yet the causal genes and variants are incompletely identified. We performed an updated genome-wide AD meta-analysis, which identified 37 risk loci, including new associations near CCDC6, TSPAN14, NCK2 and SPRED2. Using three SNP-level fine-mapping methods, we identified 21 SNPs with >50% probability each of being causally involved in AD risk and others strongly suggested by functional annotation. We followed this with colocalization analyses across 109 gene expression quantitative trait loci datasets and prioritization of genes by using protein interaction networks and tissue-specific expression. Combining this information into a quantitative score, we found that evidence converged on likely causal genes, including the above four genes, and those at previously discovered AD loci, including BIN1, APH1B, PTK2B, PILRA and CASS4.
189 citations
••
TL;DR: It is proposed that the links between rare and common variants implicated in psychiatric disease risk constitute a potentially generalizable phenomenon occurring more widely in complex genetic disorders.
Abstract: The mechanisms by which common risk variants of small effect interact to contribute to complex genetic disorders are unclear. Here, we apply a genetic approach, using isogenic human induced pluripotent stem cells, to evaluate the effects of schizophrenia (SZ)-associated common variants predicted to function as SZ expression quantitative trait loci (eQTLs). By integrating CRISPR-mediated gene editing, activation and repression technologies to study one putative SZ eQTL (FURIN rs4702) and four top-ranked SZ eQTL genes (FURIN, SNAP91, TSNARE1 and CLCN3), our platform resolves pre- and postsynaptic neuronal deficits, recapitulates genotype-dependent gene expression differences and identifies convergence downstream of SZ eQTL gene perturbations. Our observations highlight the cell-type-specific effects of common variants and demonstrate a synergistic effect between SZ eQTL genes that converges on synaptic function. We propose that the links between rare and common variants implicated in psychiatric disease risk constitute a potentially generalizable phenomenon occurring more widely in complex genetic disorders.
160 citations
References
••
TL;DR: featureCounts is a read summarization program suitable for counting reads generated from either RNA or genomic DNA sequencing experiments. It implements highly efficient chromosome hashing and feature blocking techniques.
Abstract: MOTIVATION: Next-generation sequencing technologies generate millions of short sequence reads, which are usually aligned to a reference genome. In many applications, the key information required for downstream analysis is the number of reads mapping to each genomic feature, for example to each exon or each gene. The process of counting reads is called read summarization. Read summarization is required for a great variety of genomic analyses but has so far received relatively little attention in the literature. RESULTS: We present featureCounts, a read summarization program suitable for counting reads generated from either RNA or genomic DNA sequencing experiments. featureCounts implements highly efficient chromosome hashing and feature blocking techniques. It is considerably faster than existing methods (by an order of magnitude for gene-level summarization) and requires far less computer memory. It works with either single or paired-end reads and provides a wide range of options appropriate for different sequencing applications. AVAILABILITY AND IMPLEMENTATION: featureCounts is available under GNU General Public License as part of the Subread (http://subread.sourceforge.net) or Rsubread (http://www.bioconductor.org) software packages.
14,103 citations
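Read summarization, as the featureCounts abstract defines it, is the counting of aligned reads per genomic feature. The sketch below is a minimal illustration of that idea and is not the featureCounts implementation: it groups features by chromosome in a dictionary (a loose stand-in for chromosome hashing) and does not attempt feature blocking. The function name, the closed 1-based coordinates, and the rule that a read overlapping zero or several features is left unassigned are all assumptions made for this example.

```python
from collections import defaultdict

def summarize_reads(features, reads):
    """Count reads per feature with a naive interval-overlap test.

    features: iterable of (name, chrom, start, end), closed coordinates.
    reads:    iterable of (chrom, start, end), closed coordinates.
    Returns (counts_per_feature, number_of_unassigned_reads).
    """
    # Group features by chromosome so each read is compared only
    # against features on its own chromosome.
    by_chrom = defaultdict(list)
    for name, chrom, start, end in features:
        by_chrom[chrom].append((start, end, name))

    counts = {name: 0 for name, _, _, _ in features}
    unassigned = 0
    for chrom, r_start, r_end in reads:
        # Two closed intervals overlap if each starts before the other ends.
        hits = {name for f_start, f_end, name in by_chrom.get(chrom, [])
                if r_start <= f_end and f_start <= r_end}
        if len(hits) == 1:
            counts[hits.pop()] += 1
        else:
            # Assumption for this sketch: reads hitting no feature or
            # more than one feature are not counted.
            unassigned += 1
    return counts, unassigned

# Hypothetical toy data.
features = [("geneA", "chr1", 100, 200),
            ("geneB", "chr1", 180, 300),
            ("geneC", "chr2", 50, 150)]
reads = [("chr1", 110, 160), ("chr1", 190, 240), ("chr1", 250, 290),
         ("chr2", 10, 40), ("chr2", 140, 190)]
counts, unassigned = summarize_reads(features, reads)
```

With this toy input each gene receives one read, and two reads (one ambiguous, one overlapping nothing) are left unassigned. A linear scan per read is quadratic in the worst case; the speed claims in the abstract rest on the indexing techniques it names, which this sketch leaves out.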
••
TL;DR: This work presents Model-based Analysis of ChIP-Seq data, MACS, which analyzes data generated by short read sequencers such as Solexa's Genome Analyzer, and uses a dynamic Poisson distribution to effectively capture local biases in the genome, allowing for more robust predictions.
Abstract: We present Model-based Analysis of ChIP-Seq data, MACS, which analyzes data generated by short read sequencers such as Solexa's Genome Analyzer. MACS empirically models the shift size of ChIP-Seq tags, and uses it to improve the spatial resolution of predicted binding sites. MACS also uses a dynamic Poisson distribution to effectively capture local biases in the genome, allowing for more robust predictions. MACS compares favorably to existing ChIP-Seq peak-finding algorithms, and is freely available.
13,008 citations
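The "dynamic Poisson distribution" in the MACS abstract can be illustrated with a small sketch: rather than testing a candidate region against one genome-wide background rate, the expected tag count is taken as the largest of several estimates, including rates from control windows around the region. The code below shows that general idea and is not the MACS implementation; the function names, window sizes, and toy numbers are assumptions chosen for this example.

```python
import math

def poisson_sf(k, lam):
    """Upper tail P(X >= k) for X ~ Poisson(lam), with lam > 0."""
    if k <= 0:
        return 1.0
    # Sum the lower tail P(X <= k-1) in log space for stability.
    cdf = sum(math.exp(-lam + i * math.log(lam) - math.lgamma(i + 1))
              for i in range(k))
    return max(0.0, 1.0 - cdf)

def local_lambda(genome_rate, control_window_counts, region_len):
    """Expected tags in a region under a locally adjusted background.

    genome_rate:           genome-wide background tags per bp.
    control_window_counts: dict of window length (bp) -> control tag
                           count in a window centred on the region.
    region_len:            candidate region length in bp.
    """
    candidates = [genome_rate * region_len]
    for window_len, count in control_window_counts.items():
        # Scale each window's control count to the region length.
        candidates.append(count * region_len / window_len)
    # Taking the maximum makes the test conservative where the
    # control shows a locally elevated background.
    return max(candidates)

# Hypothetical numbers: a 200 bp region holding 8 ChIP tags.
lam = local_lambda(genome_rate=0.001,
                   control_window_counts={5000: 50, 10000: 60},
                   region_len=200)
p_local = poisson_sf(8, lam)           # tested against the local rate
p_global = poisson_sf(8, 0.001 * 200)  # tested against the genome-wide rate only
```

Here the local rate exceeds the genome-wide rate, so `p_local` is larger than `p_global`: a region sitting in a locally noisy part of the control data is called with less confidence than a single global rate would suggest.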
••
TL;DR: VCFtools is a software suite that implements various utilities for processing VCF files, including validation, merging and comparing, and also provides a general Perl API.
Abstract: Summary: The variant call format (VCF) is a generic format for storing DNA polymorphism data such as SNPs, insertions, deletions and structural variants, together with rich annotations. VCF is usually stored in a compressed manner and can be indexed for fast data retrieval of variants from a range of positions on the reference genome. The format was developed for the 1000 Genomes Project, and has also been adopted by other projects such as UK10K, dbSNP and the NHLBI Exome Project. VCFtools is a software suite that implements various utilities for processing VCF files, including validation, merging, comparing and also provides a general Perl API.
Availability: http://vcftools.sourceforge.net
10,164 citations
••
TL;DR: CIBERSORT outperformed other methods with respect to noise, unknown mixture content and closely related cell types when applied to enumeration of hematopoietic subsets in RNA mixtures from fresh, frozen and fixed tissues, including solid tumors.
Abstract: We introduce CIBERSORT, a method for characterizing cell composition of complex tissues from their gene expression profiles. When applied to enumeration of hematopoietic subsets in RNA mixtures from fresh, frozen and fixed tissues, including solid tumors, CIBERSORT outperformed other methods with respect to noise, unknown mixture content and closely related cell types. CIBERSORT should enable large-scale analysis of RNA mixtures for cellular biomarkers and therapeutic targets (http://cibersort.stanford.edu/).
6,967 citations
••
TL;DR: The feasibility of analyzing an individual's epigenome on a timescale compatible with clinical decision-making is demonstrated and classes of DNA-binding factors that strictly avoided, could tolerate or tended to overlap with nucleosomes are discovered.
Abstract: We describe an assay for transposase-accessible chromatin using sequencing (ATAC-seq), based on direct in vitro transposition of sequencing adaptors into native chromatin, as a rapid and sensitive method for integrative epigenomic analysis. ATAC-seq captures open chromatin sites using a simple two-step protocol with 500-50,000 cells and reveals the interplay between genomic locations of open chromatin, DNA-binding proteins, individual nucleosomes and chromatin compaction at nucleotide resolution. We discovered classes of DNA-binding factors that strictly avoided, could tolerate or tended to overlap with nucleosomes. Using ATAC-seq maps of human CD4(+) T cells from a proband obtained on consecutive days, we demonstrated the feasibility of analyzing an individual's epigenome on a timescale compatible with clinical decision-making.
4,984 citations
"Molecular and functional variation ..." refers to methods in this paper
...Tagmentation, PCR amplification and size selection: The tagmentation and PCR methods used here are in principle the same as those described in (Buenrostro et al. 2013), but with some modifications as described in (Kumasaka, Knights, and Gaffney 2016)....
[...]