Author

Siddarth Selvaraj

Other affiliations: University of California, San Diego

Bio: Siddarth Selvaraj is an academic researcher at the Ludwig Institute for Cancer Research. The author has contributed to research on topics including the genome and chromatin, has an h-index of 17, and has co-authored 26 publications that have received 9,551 citations. Previous affiliations of Siddarth Selvaraj include University of California, San Diego.

Topics: Genome, Chromatin, Epigenomics, Reference genome, Human genome

Papers

Journal Article

Topological domains in mammalian genomes identified by analysis of chromatin interactions

Jesse R. Dixon¹, Siddarth Selvaraj¹,², Feng Yue¹, Audrey Kim¹, Yan-Yan Li¹, Yin-Zhong Shen¹, Ming Hu³, Jun Liu³, Bing Ren¹,²

Ludwig Institute for Cancer Research¹, University of California, San Diego², Harvard University³

17 May 2012-Nature

TL;DR: It is found that the boundaries of topological domains are enriched for the insulator binding protein CTCF, housekeeping genes, transfer RNAs and short interspersed element (SINE) retrotransposons, indicating that these factors may have a role in establishing the topological domain structure of the genome.

Abstract: The spatial organization of the genome is intimately linked to its biological function, yet our understanding of higher order genomic structure is coarse, fragmented and incomplete. In the nucleus of eukaryotic cells, interphase chromosomes occupy distinct chromosome territories, and numerous models have been proposed for how chromosomes fold within chromosome territories. These models, however, provide only few mechanistic details about the relationship between higher order chromatin structure and genome function. Recent advances in genomic technologies have led to rapid advances in the study of three-dimensional genome organization. In particular, Hi-C has been introduced as a method for identifying higher order chromatin interactions genome wide. Here we investigate the three-dimensional organization of the human and mouse genomes in embryonic stem cells and terminally differentiated cell types at unprecedented resolution. We identify large, megabase-sized local chromatin interaction domains, which we term 'topological domains', as a pervasive structural feature of the genome organization. These domains correlate with regions of the genome that constrain the spread of heterochromatin. The domains are stable across different cell types and highly conserved across species, indicating that topological domains are an inherent property of mammalian genomes. Finally, we find that the boundaries of topological domains are enriched for the insulator binding protein CTCF, housekeeping genes, transfer RNAs and short interspersed element (SINE) retrotransposons, indicating that these factors may have a role in establishing the topological domain structure of the genome.

5,774 citations

Journal Article

Chromatin architecture reorganization during stem cell differentiation

Jesse R. Dixon¹, Inkyung Jung¹, Siddarth Selvaraj¹, Yin Shen¹, Jessica Antosiewicz-Bourget², Ah Young Lee¹, Zhen Ye¹, Audrey Kim¹, Nisha Rajagopal¹, Wei Xie³, Yarui Diao¹, Jing Liang⁴, Huimin Zhao⁴, Victor V. Lobanenkov⁵, Joseph R. Ecker⁶, James A. Thomson⁷, Bing Ren⁸

Ludwig Institute for Cancer Research¹, Morgridge Institute for Research², Tsinghua University³, University of Illinois at Urbana–Champaign⁴, National Institutes of Health⁵, Salk Institute for Biological Studies⁶, University of Wisconsin-Madison⁷, University of California, San Diego⁸

19 Feb 2015-Nature

TL;DR: Mapping genome-wide chromatin interactions in human embryonic stem cells and four human ES-cell-derived lineages reveals extensive chromatin reorganization during lineage specification, providing a global view of chromatin dynamics and a resource for studying long-range control of gene expression in distinct human cell lineages.

Abstract: Higher-order chromatin structure is emerging as an important regulator of gene expression. Although dynamic chromatin structures have been identified in the genome, the full scope of chromatin dynamics during mammalian development and lineage specification remains to be determined. By mapping genome-wide chromatin interactions in human embryonic stem (ES) cells and four human ES-cell-derived lineages, we uncover extensive chromatin reorganization during lineage specification. We observe that although self-associating chromatin domains are stable during differentiation, chromatin interactions both within and between domains change in a striking manner, altering 36% of active and inactive chromosomal compartments throughout the genome. By integrating chromatin interaction maps with haplotype-resolved epigenome and transcriptome data sets, we find widespread allelic bias in gene expression correlated with allele-biased chromatin states of linked promoters and distal enhancers. Our results therefore provide a global view of chromatin dynamics and a resource for studying long-range control of gene expression in distinct human cell lineages.

1,393 citations

Journal Article

A high-resolution map of the three-dimensional chromatin interactome in human cells

Fulai Jin¹, Yan Li¹, Jesse R. Dixon¹,², Siddarth Selvaraj¹,², Zhen Ye¹, Ah Young Lee¹, Chia-An Yen¹, Anthony D. Schmitt¹,², Celso A. Espinoza¹, Bing Ren¹,²

Ludwig Institute for Cancer Research¹, University of California, San Diego²

14 Nov 2013-Nature

TL;DR: A comprehensive chromatin interaction map generated in human fibroblasts using a genome-wide 3C analysis method (Hi-C) is reported and suggests that the three-dimensional chromatin landscape, once established in a particular cell type, is relatively stable and could influence the selection of target genes by a ubiquitous transcription activator in a cell-specific manner.

Abstract: A large number of cis-regulatory sequences have been annotated in the human genome, but defining their target genes remains a challenge. One strategy is to identify the long-range looping interactions at these elements with the use of chromosome conformation capture (3C)-based techniques. However, previous studies lack either the resolution or coverage to permit a whole-genome, unbiased view of chromatin interactions. Here we report a comprehensive chromatin interaction map generated in human fibroblasts using a genome-wide 3C analysis method (Hi-C). We determined over one million long-range chromatin interactions at 5-10-kb resolution, and uncovered general principles of chromatin organization at different types of genomic features. We also characterized the dynamics of promoter-enhancer contacts after TNF-α signalling in these cells. Unexpectedly, we found that TNF-α-responsive enhancers are already in contact with their target promoters before signalling. Such pre-existing chromatin looping, which also exists in other cell types with different extracellular signalling, is a strong predictor of gene induction. Our observations suggest that the three-dimensional chromatin landscape, once established in a particular cell type, is relatively stable and could influence the selection or activation of target genes by a ubiquitous transcription activator in a cell-specific manner.

1,144 citations

Journal Article

Towards complete and error-free genome assemblies of all vertebrate species

Arang Rhie¹, Shane A. McCarthy²,³, Olivier Fedrigo⁴, Joana Damas⁵, Giulio Formenti⁴, Sergey Koren¹, Marcela Uliano-Silva⁶, William Chow², Arkarachai Fungtammasan, J. H. Kim⁷, Chul Hee Lee⁷, Byung June Ko⁷, Mark Chaisson⁸, Gregory Gedman⁴, Lindsey J. Cantin⁴, Françoise Thibaud-Nissen¹, Leanne Haggerty⁹, Iliana Bista²,³, Michelle Smith², Bettina Haase⁴, Jacquelyn Mountcastle⁴, Sylke Winkler¹⁰,¹¹, Sadye Paez⁴, Jason T. Howard, Sonja C. Vernes¹¹,¹²,¹³, Tanya M. Lama¹⁴, Frank Grützner¹⁵, Wesley C. Warren¹⁶, Christopher N. Balakrishnan¹⁷, Dave W Burt¹⁸, Jimin George¹⁹, Matthew T. Biegler⁴, David Iorns, Andrew Digby, Daryl Eason, Bruce C. Robertson²⁰, Taylor Edwards²¹, Mark Wilkinson²², George F. Turner²³, Axel Meyer²⁴, Andreas F. Kautt²⁴,²⁵, Paolo Franchini²⁴, H. William Detrich²⁶, Hannes Svardal²⁷,²⁸, Maximilian Wagner²⁹, Gavin J. P. Naylor³⁰, Martin Pippel¹¹, Milan Malinsky²,³¹, Mark Mooney, Maria Simbirsky, Brett T. Hannigan, Trevor Pesout³², Marlys L. Houck³³, Ann C Misuraca³³, Sarah B. Kingan³⁴, Richard Hall³⁴, Zev N. Kronenberg³⁴, Ivan Sović³⁴, Christopher Dunn³⁴, Zemin Ning², Alex Hastie, Joyce V. Lee, Siddarth Selvaraj, Richard E. Green³², Nicholas H. Putnam, Ivo Gut³⁵, Jay Ghurye³⁶, Erik Garrison³², Ying Sims², Joanna Collins², Sarah Pelan², James Torrance², Alan Tracey², Jonathan Wood², Robel E. Dagnew⁸, Dengfeng Guan³,³⁷, Sarah E. London³⁸, David F. Clayton¹⁹, Claudio V. Mello³⁹, Samantha R. Friedrich³⁹, Peter V. Lovell³⁹, Ekaterina Osipova¹¹, Farooq O. Al-Ajli⁴⁰,⁴¹, Simona Secomandi⁴², Heebal Kim⁷, Constantina Theofanopoulou⁴, Michael Hiller⁴³, Yang Zhou, Robert S. Harris⁴⁴, Kateryna D. Makova⁴⁴, Paul Medvedev⁴⁴, Jinna Hoffman¹, Patrick Masterson¹, Karen Clark¹, Fergal J. Martin⁹, Kevin L. Howe⁹, Paul Flicek⁹, Brian P. Walenz¹, Woori Kwak, Hiram Clawson³², Mark Diekhans³², Luis R Nassar³², Benedict Paten³², Robert H. S. Kraus¹¹,²⁴, Andrew J. Crawford⁴⁵, M. Thomas P. Gilbert⁴⁶,⁴⁷, Guojie Zhang, Byrappa Venkatesh⁴⁸, Robert W. Murphy⁴⁹, Klaus-Peter Koepfli⁵⁰, Beth Shapiro³²,⁵¹, Warren E. Johnson⁵⁰,⁵², Federica Di Palma⁵³, Tomas Marques-Bonet, Emma C. Teeling⁵⁴, Tandy Warnow⁵⁵, Jennifer A. Marshall Graves⁵⁶, Oliver A. Ryder³³,⁵⁷, David Haussler³², Stephen J. O'Brien⁵⁸, Jonas Korlach³⁴, Harris A. Lewin⁵, Kerstin Howe², Eugene W. Myers¹⁰,¹¹, Richard Durbin²,³, Adam M. Phillippy¹, Erich D. Jarvis⁴,⁵¹

National Institutes of Health¹, Wellcome Trust Sanger Institute², University of Cambridge³, Rockefeller University⁴, University of California, Davis⁵, Leibniz Association⁶, Seoul National University⁷, University of Southern California⁸, European Bioinformatics Institute⁹, Dresden University of Technology¹⁰, Max Planck Society¹¹, Radboud University Nijmegen¹², University of St Andrews¹³, University of Massachusetts Amherst¹⁴, University of Adelaide¹⁵, University of Missouri¹⁶, East Carolina University¹⁷, University of Queensland¹⁸, Clemson University¹⁹, University of Otago²⁰, University of Arizona²¹, Natural History Museum²², Bangor University²³, University of Konstanz²⁴, Harvard University²⁵, Northeastern University²⁶, National Museum of Natural History²⁷, University of Antwerp²⁸, University of Graz²⁹, University of Florida³⁰, University of Basel³¹, University of California, Santa Cruz³², Zoological Society of San Diego³³, Pacific Biosciences³⁴, Pompeu Fabra University³⁵, University of Maryland, College Park³⁶, Harbin Institute of Technology³⁷, University of Chicago³⁸, Oregon Health & Science University³⁹, Monash University Malaysia Campus⁴⁰, Qatar Airways⁴¹, University of Milan⁴², Goethe University Frankfurt⁴³, Pennsylvania State University⁴⁴, University of Los Andes⁴⁵, University of Copenhagen⁴⁶, Norwegian University of Science and Technology⁴⁷, Agency for Science, Technology and Research⁴⁸, Royal Ontario Museum⁴⁹, Smithsonian Institution⁵⁰, Howard Hughes Medical Institute⁵¹, Walter Reed Army Institute of Research⁵², University of East Anglia⁵³, University College Dublin⁵⁴, University of Illinois at Urbana–Champaign⁵⁵, La Trobe University⁵⁶, University of California, San Diego⁵⁷, Nova Southeastern University⁵⁸

28 Apr 2021-Nature

TL;DR: The Vertebrate Genomes Project (VGP) is an international effort to generate high-quality, complete reference genomes for all of the roughly 70,000 extant vertebrate species and to help enable a new era of discovery across the life sciences.

Abstract: High-quality and complete reference genome assemblies are fundamental for the application of genomics to biology, disease, and biodiversity conservation. However, such assemblies are available for only a few non-microbial species [1-4]. To address this issue, the international Genome 10K (G10K) consortium [5,6] has worked over a five-year period to evaluate and develop cost-effective methods for assembling highly accurate and nearly complete reference genomes. Here we present lessons learned from generating assemblies for 16 species that represent six major vertebrate lineages. We confirm that long-read sequencing technologies are essential for maximizing genome quality, and that unresolved complex repeats and haplotype heterozygosity are major sources of assembly error when not handled correctly. Our assemblies correct substantial errors, add missing sequence in some of the best historical reference genomes, and reveal biological discoveries. These include the identification of many false gene duplications, increases in gene sizes, chromosome rearrangements that are specific to lineages, a repeated independent chromosome breakpoint in bat genomes, and a canonical GC-rich pattern in protein-coding genes and their regulatory regions. Adopting these lessons, we have embarked on the Vertebrate Genomes Project (VGP), an international effort to generate high-quality, complete reference genomes for all of the roughly 70,000 extant vertebrate species and to help to enable a new era of discovery across the life sciences.

647 citations

Journal Article

Human body epigenome maps reveal noncanonical DNA methylation variation

Matthew D. Schultz¹, Yupeng He¹, John W. Whitaker², Manoj Hariharan¹, Eran A. Mukamel¹, Danny Leung³, Nisha Rajagopal³, Joseph R. Nery¹, Mark A. Urich¹, Huaming Chen¹, Shin Lin⁴, Yiing Lin⁵, Inkyung Jung³, Anthony D. Schmitt³, Siddarth Selvaraj², Bing Ren², Terrence J. Sejnowski¹, Wei Wang², Joseph R. Ecker¹

Salk Institute for Biological Studies¹, University of California, San Diego², Ludwig Institute for Cancer Research³, Stanford University⁴, Washington University in St. Louis⁵

09 Jul 2015-Nature

TL;DR: High coverage methylomes are reported that catalogue cytosine methylation in all contexts for the major human organ systems, integrated with matched transcriptomes and genomic sequence.

Abstract: Understanding the diversity of human tissues is fundamental to disease and requires linking genetic information, which is identical in most of an individual's cells, with epigenetic mechanisms that could have tissue-specific roles. Surveys of DNA methylation in human tissues have established a complex landscape including both tissue-specific and invariant methylation patterns. Here we report high coverage methylomes that catalogue cytosine methylation in all contexts for the major human organ systems, integrated with matched transcriptomes and genomic sequence. By combining these diverse data types with each individual's phased genome, we identified widespread tissue-specific differential CG methylation (mCG), partially methylated domains, allele-specific methylation and transcription, and the unexpected presence of non-CG methylation (mCH) in almost all human tissues. mCH correlated with tissue-specific functions, and using this mark, we made novel predictions of genes that escape X-chromosome inactivation in specific tissues. Overall, DNA methylation in several genomic contexts varies substantially among human tissues.

577 citations

Cited by

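Changes in the switching rate of Plasmodium var genes lead to antigenic variation [English] / Paul H, Robert P, Christodoulou Z, et al. // Proc Natl Acad Sci U S A

Ning Beifang (宁北芳), Zhu Huaimin (朱淮民)

28 Jul 2005

TL;DR: Plasmodium falciparum erythrocyte membrane protein 1 (PfEMP1) interacts with one or more receptors on infected erythrocytes, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion.

Abstract: Antigenic variation allows many pathogenic microorganisms to evade the host immune response. Plasmodium falciparum erythrocyte membrane protein 1 (PfEMP1), expressed on the surface of infected erythrocytes, interacts with one or more receptors on infected erythrocytes, endothelial cells, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion. The var gene family encodes about 60 members per haploid genome, and initiating transcription of different var gene variants provides the molecular basis for antigenic variation.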

18,940 citations

Journal Article

A 3D Map of the Human Genome at Kilobase Resolution Reveals Principles of Chromatin Looping

Suhas S.P. Rao¹, Miriam H. Huntley¹, Neva C. Durand, Elena K. Stamenova, Ivan D. Bochkov¹, James T. Robinson¹,², Adrian L. Sanborn¹, Ido Machol¹,³, Arina D. Omer¹,³, Eric S. Lander²,⁴,⁵, Erez Lieberman Aiden

Baylor College of Medicine¹, Broad Institute², Rice University³, Massachusetts Institute of Technology⁴, Harvard University⁵

18 Dec 2014-Cell

TL;DR: In situ Hi-C is used to probe the 3D architecture of genomes, constructing haploid and diploid maps of nine cell types, identifying ∼10,000 loops that frequently link promoters and enhancers, correlate with gene activation, and show conservation across cell types and species.

5,945 citations

Journal Article

Integrative analysis of 111 reference human epigenomes

Anshul Kundaje¹, Wouter Meuleman¹,², Jason Ernst³, Misha Bilenky⁴, Angela Yen¹,², Alireza Heravi-Moussavi⁴, Pouya Kheradpour¹,², Zhizhuo Zhang¹,², Jianrong Wang¹,², Michael J. Ziller², Viren Amin⁵, John W. Whitaker, Matthew D. Schultz⁶, Lucas D. Ward¹,², Abhishek Sarkar¹,², Gerald Quon¹,², Richard Sandstrom⁷, Matthew L. Eaton¹,², Yi-Chieh Wu¹,², Andreas R. Pfenning¹,², Xinchen Wang¹,², Melina Claussnitzer¹,², Yaping Liu¹,², Cristian Coarfa⁵, R. Alan Harris⁵, Noam Shoresh², Charles B. Epstein², Elizabeta Gjoneska¹,², Danny Leung⁸, Wei Xie⁸, R. David Hawkins⁸, Ryan Lister⁶, Chibo Hong⁹, Philippe Gascard⁹, Andrew J. Mungall⁴, Richard A. Moore⁴, Eric Chuah⁴, Angela Tam⁴, Theresa K. Canfield⁷, R. Scott Hansen⁷, Rajinder Kaul⁷, Peter J. Sabo⁷, Mukul S. Bansal¹,²,¹⁰, Annaick Carles⁴, Jesse R. Dixon⁸, Kai How Farh², Soheil Feizi¹,², Rosa Karlic¹¹, Ah Ram Kim¹,², Ashwinikumar Kulkarni¹², Daofeng Li¹³, Rebecca F. Lowdon¹³, Ginell Elliott¹³, Tim R. Mercer¹⁴, Shane Neph⁷, Vitor Onuchic⁵, Paz Polak²,¹⁵, Nisha Rajagopal⁸, Pradipta R. Ray¹², Richard C Sallari¹,², Kyle Siebenthall⁷, Nicholas A Sinnott-Armstrong¹,², Michael Stevens¹³, Robert E. Thurman⁷, Jie Wu¹⁶, Bo Zhang¹³, Xin Zhou¹³, Arthur E. Beaudet⁵, Laurie A. Boyer¹, Philip L. De Jager²,¹⁵, Peggy J. Farnham¹⁷, Susan J. Fisher⁹, David Haussler¹⁸, Steven J.M. Jones⁴,¹⁹, Wei Li⁵, Marco A. Marra⁴, Michael T. McManus⁹, Shamil R. Sunyaev²,¹⁵, James A. Thomson²⁰, Thea D. Tlsty⁹, Li-Huei Tsai¹,², Wei Wang, Robert A. Waterland⁵, Michael Q. Zhang²¹, Lisa Helbling Chadwick²², Bradley E. Bernstein²,⁶,¹⁵, Joseph F. Costello⁹, Joseph R. Ecker¹¹, Martin Hirst⁴, Alexander Meissner², Aleksandar Milosavljevic⁵, Bing Ren⁸, John A. Stamatoyannopoulos⁷, Ting Wang¹³, Manolis Kellis¹,²

Massachusetts Institute of Technology¹, Broad Institute², University of California, Los Angeles³, University of British Columbia⁴, Baylor College of Medicine⁵, Howard Hughes Medical Institute⁶, University of Washington⁷, Ludwig Institute for Cancer Research⁸, University of California, San Francisco⁹, University of Connecticut¹⁰, University of Zagreb¹¹, University of Texas at Austin¹², Washington University in St. Louis¹³, University of Queensland¹⁴, Harvard University¹⁵, Cold Spring Harbor Laboratory¹⁶, University of Southern California¹⁷, University of California, Santa Cruz¹⁸, Simon Fraser University¹⁹, Morgridge Institute for Research²⁰, University of Texas at Dallas²¹, National Institutes of Health²²

19 Feb 2015-Nature

TL;DR: It is shown that disease- and trait-associated genetic variants are enriched in tissue-specific epigenomic marks, revealing biologically relevant cell types for diverse human traits, and providing a resource for interpreting the molecular basis of human disease.

Abstract: The reference human genome sequence set the stage for studies of genetic variation and its association with human disease, but epigenomic studies lack a similar reference. To address this need, the NIH Roadmap Epigenomics Consortium generated the largest collection so far of human epigenomes for primary cells and tissues. Here we describe the integrative analysis of 111 reference human epigenomes generated as part of the programme, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression. We establish global maps of regulatory elements, define regulatory modules of coordinated activity, and their likely activators and repressors. We show that disease- and trait-associated genetic variants are enriched in tissue-specific epigenomic marks, revealing biologically relevant cell types for diverse human traits, and providing a resource for interpreting the molecular basis of human disease. Our results demonstrate the central role of epigenomic information for understanding gene regulation, cellular differentiation and human disease.

5,037 citations

Journal Article

Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation.

Sergey Koren¹, Brian P. Walenz¹, Konstantin Berlin², Jason R. Miller³, Nicholas H. Bergman, Adam M. Phillippy¹

National Institutes of Health¹, Invincea², J. Craig Venter Institute³

15 Mar 2017-Genome Research

TL;DR: Canu, a successor of Celera Assembler that is specifically designed for noisy single-molecule sequences, is presented, demonstrating that Canu can reliably assemble complete microbial genomes and near-complete eukaryotic chromosomes using either Pacific Biosciences or Oxford Nanopore technologies.

Abstract: Long-read single-molecule sequencing has revolutionized de novo genome assembly and enabled the automated reconstruction of reference-quality genomes. However, given the relatively high error rates of such technologies, efficient and accurate assembly of large repeats and closely related haplotypes remains challenging. We address these issues with Canu, a successor of Celera Assembler that is specifically designed for noisy single-molecule sequences. Canu introduces support for nanopore sequencing, halves depth-of-coverage requirements, and improves assembly continuity while simultaneously reducing runtime by an order of magnitude on large genomes versus Celera Assembler 8.2. These advances result from new overlapping and assembly algorithms, including an adaptive overlapping strategy based on tf-idf weighted MinHash and a sparse assembly graph construction that avoids collapsing diverged repeats and haplotypes. We demonstrate that Canu can reliably assemble complete microbial genomes and near-complete eukaryotic chromosomes using either Pacific Biosciences (PacBio) or Oxford Nanopore technologies and achieves a contig NG50 of >21 Mbp on both human and Drosophila melanogaster PacBio data sets. For assembly structures that cannot be linearly represented, Canu provides graph-based assembly outputs in graphical fragment assembly (GFA) format for analysis or integration with complementary phasing and scaffolding techniques. The combination of such highly resolved assembly graphs with long-range scaffolding information promises the complete and automated assembly of complex genomes.

4,806 citations

Integrative analysis of 111 reference human epigenomes

Anshul Kundaje, Wouter Meuleman, Jason Ernst, Angela Yen, Pouya Kheradpour, Zhizhuo Zhang, Jianrong Wang, Lucas D. Ward, Abhishek Sarkar, Gerald Quon, Matthew L. Eaton, Yi-Chieh Wu, Andreas R. Pfenning, Xinchen Wang, Melina Claussnitzer, Yaping Liu, Mukul S. Bansal, Soheil Feizi-Khankandi, Ah Ram Kim, Richard C Sallari, Nicholas A Sinnott-Armstrong, Laurie A. Boyer, Elizabeta Gjoneska, Li-Huei Tsai, Manolis Kellis

01 Feb 2015

TL;DR: In this article, the authors describe the integrative analysis of 111 reference human epigenomes generated as part of the NIH Roadmap Epigenomics Consortium, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression.

4,409 citations
