Author

Jiayou Chu

Bio: Jiayou Chu is an academic researcher from Peking Union Medical College. The author has contributed to research on the topics Population and Haplotype. The author has an h-index of 20 and has co-authored 68 publications that have received 14,039 citations.


Papers
Journal Article
Adam Auton1, Gonçalo R. Abecasis2, David Altshuler3, Richard Durbin4, +514 more (90 institutions)
01 Oct 2015-Nature
TL;DR: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations, and has reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.
Abstract: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.

12,661 citations

01 Oct 2015
TL;DR: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations; this paper reports completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.
Abstract: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.

3,247 citations

Journal Article
12 Apr 2002-Science
TL;DR: A resource of 1064 cultured lymphoblastoid cell lines from individuals in different world populations and corresponding milligram quantities of DNA is deposited at the Foundation Jean Dausset (CEPH) in Paris.
Abstract: A resource of 1064 cultured lymphoblastoid cell lines (LCLs) from individuals in different world populations and corresponding milligram quantities of DNA is deposited at the Foundation Jean Dausset (CEPH) in Paris. LCLs were collected from various laboratories by the Human Genome

1,002 citations

Journal Article
Mahmood Ameen Abdulla1, Ikhlak Ahmed2, Anunchai Assawamakin3, Anunchai Assawamakin4, Jong Bhak5, Samir K. Brahmachari2, Gayvelline C. Calacal6, Amit Kumar Chaurasia2, Chien-Hsiun Chen7, Jieming Chen8, Yuan-Tsong Chen7, Jiayou Chu9, Eva Maria Cutiongco-de la Paz6, Maria Corazon A. De Ungria6, Frederick C. Delfin6, Juli Edo1, Suthat Fuchareon3, Ho Ghang5, Takashi Gojobori10, Junsong Han, Sheng Feng Ho7, Boon Peng Hoh11, Wei Huang12, Hidetoshi Inoko13, Pankaj Jha2, Timothy A. Jinam1, Li Jin14, Jongsun Jung, Daoroong Kangwanpong15, Jatupol Kampuansai15, Giulia C. Kennedy16, Preeti Khurana2, Hyung Lae Kim, Kwangjoong Kim, Sangsoo Kim17, Woo Yeon Kim5, Kuchan Kimm18, Ryosuke Kimura19, Tomohiro Koike, Supasak Kulawonganunchai4, Vikrant Kumar8, Poh San Lai20, Jong-Young Lee, Sunghoon Lee5, Edison T. Liu8, Partha P. Majumder21, Kiran Kumar Mandapati2, Sangkot Marzuki22, Wayne Mitchell8, Wayne Mitchell23, Mitali Mukerji2, Kenji Naritomi24, Chumpol Ngamphiw4, Norio Niikawa25, Nao Nishida19, Bermseok Oh, Sangho Oh5, Jun Ohashi19, Akira Oka13, Rick Twee-Hee Ong8, Carmencita Padilla6, Prasit Palittapongarnpim4, Henry B. Perdigon6, Maude E. Phipps1, Maude E. Phipps26, Eileen Png8, Yoshiyuki Sakaki, Jazelyn M. Salvador6, Yuliana Sandraling22, Vinod Scaria2, Mark Seielstad8, Mohd Ros Sidek11, Amit Sinha2, Metawee Srikummool15, Herawati Sudoyo22, Sumio Sugano19, Helena Suryadi22, Yoshiyuki Suzuki, Kristina A. Tabbada6, Adrian Tan8, Katsushi Tokunaga19, Sissades Tongsima4, Lilian P. Villamor6, Eric Wang16, Ying Wang12, Haifeng Wang12, Jer-Yuarn Wu7, Huasheng Xiao, Shuhua Xu, Jin Ok Yang5, Yin Yao Shugart27, Hyang Sook Yoo5, Wentao Yuan12, Guoping Zhao12, Bin Alwi Zilfalil11 
11 Dec 2009-Science
TL;DR: The results suggest that there may have been a single major migration of people into Asia and a subsequent south-to-north migration across the continent, and that genetic ancestry is strongly correlated with linguistic affiliations as well as geography.
Abstract: Asia harbors substantial cultural and linguistic diversity, but the geographic structure of genetic variation across the continent remains enigmatic. Here we report a large-scale survey of autosomal variation from a broad geographic sample of Asian human populations. Our results show that genetic ancestry is strongly correlated with linguistic affiliations as well as geography. Most populations show relatedness within ethnic/linguistic groups, despite prevalent gene flow among populations. More than 90% of East Asian (EA) haplotypes could be found in either Southeast Asian (SEA) or Central-South Asian (CSA) populations and show clinal structure with haplotype diversity decreasing from south to north. Furthermore, 50% of EA haplotypes were found in SEA only and 5% were found in CSA only, indicating that SEA was a major geographic source of EA populations.

545 citations

Journal Article
TL;DR: This pattern indicates that the first settlement of modern humans in eastern Asia occurred in mainland Southeast Asia during the last Ice Age, coinciding with the absence of human fossils in eastern Asia, 50,000–100,000 years ago.
Abstract: The timing and nature of the arrival and the subsequent expansion of modern humans into eastern Asia remain controversial. Using Y-chromosome biallelic markers, we investigated the ancient human-migration patterns in eastern Asia. Our data indicate that southern populations in eastern Asia are much more polymorphic than northern populations, which have only a subset of the southern haplotypes. This pattern indicates that the first settlement of modern humans in eastern Asia occurred in mainland Southeast Asia during the last Ice Age, coinciding with the absence of human fossils in eastern Asia, 50,000–100,000 years ago. After the initial peopling, a great northward migration extended into northern China and Siberia.

404 citations


Cited by
Journal Article
Fumio Tajima1
30 Oct 1989-Genomics
TL;DR: It is suggested that the natural selection against large insertion/deletion is so weak that a large amount of variation is maintained in a population.

11,521 citations

Journal Article
TL;DR: A unified analytic framework to discover and genotype variation among multiple samples simultaneously that achieves sensitive and specific results across five sequencing technologies and three distinct, canonical experimental designs is presented.
Abstract: Recent advances in sequencing technology make it possible to comprehensively catalogue genetic variation in population samples, creating a foundation for understanding human disease, ancestry and evolution. The amounts of raw data produced are prodigious and many computational steps are required to translate this output into high-quality variant calls. We present a unified analytic framework to discover and genotype variation among multiple samples simultaneously that achieves sensitive and specific results across five sequencing technologies and three distinct, canonical experimental designs. Our process includes (1) initial read mapping; (2) local realignment around indels; (3) base quality score recalibration; (4) SNP discovery and genotyping to find all potential variants; and (5) machine learning to separate true segregating variation from machine artifacts common to next-generation sequencing technologies. We discuss the application of these tools, instantiated in the Genome Analysis Toolkit (GATK), to deep whole-genome, whole-exome capture, and multi-sample low-pass (~4×) 1000 Genomes Project datasets.

10,056 citations

Journal Article
Monkol Lek, Konrad J. Karczewski1, Konrad J. Karczewski2, Eric Vallabh Minikel1, Eric Vallabh Minikel2, Kaitlin E. Samocha, Eric Banks1, Timothy Fennell1, Anne H. O’Donnell-Luria1, Anne H. O’Donnell-Luria2, Anne H. O’Donnell-Luria3, James S. Ware, Andrew J. Hill4, Andrew J. Hill1, Andrew J. Hill2, Beryl B. Cummings2, Beryl B. Cummings1, Taru Tukiainen1, Taru Tukiainen2, Daniel P. Birnbaum1, Jack A. Kosmicki, Laramie E. Duncan1, Laramie E. Duncan2, Karol Estrada1, Karol Estrada2, Fengmei Zhao2, Fengmei Zhao1, James Zou1, Emma Pierce-Hoffman2, Emma Pierce-Hoffman1, Joanne Berghout5, David Neil Cooper6, Nicole A. Deflaux7, Mark A. DePristo1, Ron Do, Jason Flannick2, Jason Flannick1, Menachem Fromer, Laura D. Gauthier1, Jackie Goldstein2, Jackie Goldstein1, Namrata Gupta1, Daniel P. Howrigan2, Daniel P. Howrigan1, Adam Kiezun1, Mitja I. Kurki2, Mitja I. Kurki1, Ami Levy Moonshine1, Pradeep Natarajan, Lorena Orozco, Gina M. Peloso1, Gina M. Peloso2, Ryan Poplin1, Manuel A. Rivas1, Valentin Ruano-Rubio1, Samuel A. Rose1, Douglas M. Ruderfer8, Khalid Shakir1, Peter D. Stenson6, Christine Stevens1, Brett Thomas1, Brett Thomas2, Grace Tiao1, María Teresa Tusié-Luna, Ben Weisburd1, Hong-Hee Won9, Dongmei Yu, David Altshuler1, David Altshuler10, Diego Ardissino, Michael Boehnke11, John Danesh12, Stacey Donnelly1, Roberto Elosua, Jose C. Florez2, Jose C. Florez1, Stacey Gabriel1, Gad Getz2, Gad Getz1, Stephen J. Glatt13, Christina M. Hultman14, Sekar Kathiresan, Markku Laakso15, Steven A. McCarroll1, Steven A. McCarroll2, Mark I. McCarthy16, Mark I. McCarthy17, Dermot P.B. McGovern18, Ruth McPherson19, Benjamin M. Neale1, Benjamin M. Neale2, Aarno Palotie, Shaun Purcell8, Danish Saleheen20, Jeremiah M. Scharf, Pamela Sklar, Patrick F. Sullivan14, Patrick F. Sullivan21, Jaakko Tuomilehto22, Ming T. Tsuang23, Hugh Watkins16, Hugh Watkins17, James G. Wilson24, Mark J. Daly2, Mark J. Daly1, Daniel G. MacArthur2, Daniel G. MacArthur1 
18 Aug 2016-Nature
TL;DR: The aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC) provides direct evidence for the presence of widespread mutational recurrence.
Abstract: Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human 'knockout' variants in protein-coding genes.

8,758 citations

Journal Article
11 Oct 2018-Nature
TL;DR: Deep phenotypic and genome-wide genetic data from approximately 500,000 UK Biobank participants are described, including population structure and relatedness in the cohort, and imputation that increases the number of testable variants to around 96 million.
Abstract: The UK Biobank project is a prospective cohort study with deep genetic and phenotypic data collected on approximately 500,000 individuals from across the United Kingdom, aged between 40 and 69 at recruitment. The open resource is unique in its size and scope. A rich variety of phenotypic and health-related information is available on each participant, including biological measurements, lifestyle indicators, biomarkers in blood and urine, and imaging of the body and brain. Follow-up information is provided by linking health and medical records. Genome-wide genotype data have been collected on all participants, providing many opportunities for the discovery of new genetic associations and the genetic bases of complex traits. Here we describe the centralized analysis of the genetic data, including genotype quality, properties of population structure and relatedness of the genetic data, and efficient phasing and genotype imputation that increases the number of testable variants to around 96 million. Classical allelic variation at 11 human leukocyte antigen genes was imputed, resulting in the recovery of signals with known associations between human leukocyte antigen alleles and many diseases.

4,489 citations

Journal Article
13 Sep 2012-Nature
TL;DR: Viewing the microbiota from an ecological perspective could provide insight into how to promote health by targeting this microbial community in clinical treatments.
Abstract: Trillions of microbes inhabit the human intestine, forming a complex ecological community that influences normal physiology and susceptibility to disease through its collective metabolic activities and host interactions. Understanding the factors that underlie changes in the composition and function of the gut microbiota will aid in the design of therapies that target it. This goal is formidable. The gut microbiota is immensely diverse, varies between individuals and can fluctuate over time — especially during disease and early development. Viewing the microbiota from an ecological perspective could provide insight into how to promote health by targeting this microbial community in clinical treatments.

3,890 citations