Open Access, Journal Article

Telomere-to-telomere assembly of a complete human X chromosome

TLDR
High-coverage, ultra-long-read nanopore sequencing is used to create a new human genome assembly that improves on the coverage and accuracy of the current reference (GRCh38) and includes the gap-free, telomere-to-telomere sequence of the X chromosome.
Abstract
After two decades of improvements, the current human reference genome (GRCh38) is the most accurate and complete vertebrate genome ever produced. However, no single chromosome has been finished end to end, and hundreds of unresolved gaps persist [1,2]. Here we present a human genome assembly that surpasses the continuity of GRCh38 [2], along with a gapless, telomere-to-telomere assembly of a human chromosome. This was enabled by high-coverage, ultra-long-read nanopore sequencing of the complete hydatidiform mole CHM13 genome, combined with complementary technologies for quality improvement and validation. Focusing our efforts on the human X chromosome [3], we reconstructed the centromeric satellite DNA array (approximately 3.1 Mb) and closed the 29 remaining gaps in the current reference, including new sequences from the human pseudoautosomal regions and from cancer-testis ampliconic gene families (CT-X and GAGE). These sequences will be integrated into future human reference genome releases. In addition, the complete chromosome X, combined with the ultra-long nanopore data, allowed us to map methylation patterns across complex tandem repeats and satellite arrays. Our results demonstrate that finishing the entire human genome is now within reach, and the data presented here will facilitate ongoing efforts to complete the other human chromosomes.


Citations

Integrative Genomics Viewer

TL;DR: The sheer volume and scope of genome-wide data pose a significant challenge to the development of efficient and intuitive visualization tools able to scale to very large data sets and to flexibly integrate multiple data types, including clinical data.

The complete sequence of a human genome

TL;DR: The Telomere-to-Telomere (T2T) Consortium presents a complete 3.055 billion-base-pair sequence of a human genome, T2T-CHM13, that includes gapless assemblies for all chromosomes except Y, corrects errors in the prior references, and introduces nearly 200 million base pairs of sequence containing 1956 gene predictions, 99 of which are predicted to be protein coding.

Towards complete and error-free genome assemblies of all vertebrate species

Arang Rhie, +144 more
- 28 Apr 2021 - 
TL;DR: The Vertebrate Genomes Project (VGP) is an international effort to generate high-quality, complete reference genomes for all of the roughly 70,000 extant vertebrate species and to help enable a new era of discovery across the life sciences.

Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies

TL;DR: This work presents Merqury, a novel tool for reference-free assembly evaluation based on efficient k-mer set operations, and demonstrates on both human and plant genomes that it is a fast and robust method for assembly validation.

Long-read human genome sequencing and its applications.

TL;DR: This review discusses the currently available platforms, how the technologies are being applied to assemble and phase human genomes, and their impact on improving our understanding of human genetic variation.
References

Fast and accurate short read alignment with Burrows–Wheeler transform

TL;DR: The Burrows–Wheeler Alignment tool (BWA) is a read alignment package based on backward search with the Burrows–Wheeler Transform (BWT) that efficiently aligns short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps.

Initial sequencing and analysis of the human genome.

Eric S. Lander, +248 more
- 15 Feb 2001 - 
TL;DR: The results of an international collaboration to produce and make freely available a draft sequence of the human genome are reported and an initial analysis is presented, describing some of the insights that can be gleaned from the sequence.

A global reference for human genetic variation.

Adam Auton, +517 more
- 01 Oct 2015 - 
TL;DR: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. It reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.

Mapping and quantifying mammalian transcriptomes by RNA-Seq.

TL;DR: Although >90% of uniquely mapped reads fell within known exons, the remaining data suggest new and revised gene models, including changed or additional promoters, exons and 3′ untranscribed regions, as well as new candidate microRNA precursors.

Integrative genomics viewer

TL;DR: In this article, the authors present an approach for efficient and intuitive visualization tools able to scale to very large data sets and to flexibly integrate multiple data types, including clinical data.