Author
Yu Wang
Other affiliations: Vanderbilt University, Zhejiang University, University of Virginia
Bio: Yu Wang is an academic researcher from Vanderbilt University Medical Center. The author has contributed to research in topics: Population & Gene. The author has an h-index of 24 and has co-authored 49 publications receiving 5248 citations. Previous affiliations of Yu Wang include Vanderbilt University & Zhejiang University.
Topics: Population, Gene, Domestication, Immunotherapy, Small RNA
Papers
University of Georgia1, Rutgers University2, United States Department of Energy3, Stanford University4, University of California, Berkeley5, North China University of Science and Technology6, University of Zurich7, Clemson University8, University of Düsseldorf9, Cold Spring Harbor Laboratory10, Purdue University11, International Crops Research Institute for the Semi-Arid Tropics12, Texas A&M University13, Cornell University14, University of Illinois at Urbana–Champaign15, Mississippi State University16, National Institute for Biotechnology and Genetic Engineering17, United States Department of Agriculture18
TL;DR: An initial analysis of the ∼730-megabase Sorghum bicolor (L.) Moench genome is presented, placing ∼98% of genes in their chromosomal context using whole-genome shotgun sequence validated by genetic, physical and syntenic information.
Abstract: Sorghum, an African grass related to sugar cane and maize, is grown for food, feed, fibre and fuel. We present an initial analysis of the approximately 730-megabase Sorghum bicolor (L.) Moench genome, placing approximately 98% of genes in their chromosomal context using whole-genome shotgun sequence validated by genetic, physical and syntenic information. Genetic recombination is largely confined to about one-third of the sorghum genome with gene order and density similar to those of rice. Retrotransposon accumulation in recombinationally recalcitrant heterochromatin explains the approximately 75% larger genome size of sorghum compared with rice. Although gene and repetitive DNA distributions have been preserved since palaeopolyploidization approximately 70 million years ago, most duplicated gene sets lost one member before the sorghum-rice divergence. Concerted evolution makes one duplicated chromosomal segment appear to be only a few million years old. About 24% of genes are grass-specific and 7% are sorghum-specific. Recent gene and microRNA duplications may contribute to sorghum's drought tolerance.
2,809 citations
01 Jan 2015
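TL;DR: Whole-genome and exome sequencing of nearly 10,000 individuals from population-based and disease collections characterizes over 24 million novel sequence variants, generates a highly accurate imputation reference panel, and identifies novel alleles associated with levels of triglycerides, adiponectin and low-density lipoprotein cholesterol.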
Abstract: The contribution of rare and low-frequency variants to human traits is largely unexplored. Here we describe insights from sequencing whole genomes (low read depth, 7×) or exomes (high read depth, 80×) of nearly 10,000 individuals from population-based and disease collections. In extensively phenotyped cohorts we characterize over 24 million novel sequence variants, generate a highly accurate imputation reference panel and identify novel alleles associated with levels of triglycerides (APOB), adiponectin (ADIPOQ) and low-density lipoprotein cholesterol (LDLR and RGAG1) from single-marker and rare variant aggregation tests. We describe population structure and functional annotation of rare and low-frequency variants, use the data to estimate the benefits of sequencing for association studies, and summarize lessons from disease-specific collections. Finally, we make available an extensive resource, including individual-level genetic and phenotypic data and web-based tools to facilitate the exploration of association results.
824 citations
TL;DR: quanTIseq is a method to quantify the fractions of ten immune cell types from bulk RNA-sequencing data; it was extensively validated in blood and tumor samples using simulated, flow cytometry, and immunohistochemistry data.
Abstract: We introduce quanTIseq, a method to quantify the fractions of ten immune cell types from bulk RNA-sequencing data. quanTIseq was extensively validated in blood and tumor samples using simulated, flow cytometry, and immunohistochemistry data. quanTIseq analysis of 8000 tumor samples revealed that cytotoxic T cell infiltration is more strongly associated with the activation of the CXCR3/CXCL9 axis than with mutational load and that deconvolution-based cell scores have prognostic value in several solid cancers. Finally, we used quanTIseq to show how kinase inhibitors modulate the immune contexture and to reveal immune-cell types that underlie differential patients’ responses to checkpoint blockers. Availability: quanTIseq is available at http://icbi.at/quantiseq.
572 citations
TL;DR: quanTIseq, a deconvolution method that quantifies the densities of ten immune cell types from bulk RNA sequencing data and tissue imaging data, is developed and used to show how kinase inhibitors modulate the immune contexture; the authors suggest that it might have predictive value for immunotherapy.
Abstract: The immune contexture has a prognostic value in several cancers and the study of its pharmacological modulation could identify drugs acting synergistically with immune checkpoint blockers. However, the quantification of the immune contexture is hampered by the lack of simple and efficient methods. We developed quanTIseq, a deconvolution method that quantifies the densities of ten immune cell types from bulk RNA sequencing data and tissue imaging data. We performed extensive validation using simulated data, flow cytometry data, and immunohistochemistry data from three cancer cohorts. Analysis of 8,000 samples showed that the activation of the CXCR3/CXCL9 axis, rather than the mutational load, is associated with cytotoxic T cell infiltration. We also show the prognostic value of deconvolution-based immunoscore and T cell/B cell score in several solid cancers. Finally, we used quanTIseq to show how kinase inhibitors modulate the immune contexture, and we suggest that it might have predictive value for immunotherapy.
337 citations
Vanderbilt University Medical Center1, Vanderbilt University2, University of Pennsylvania3, Queen Mary University of London4, National Institute for Health Research5, Veterans Health Administration6, Emory University7, VA Boston Healthcare System8, Imperial College London9, University of Ioannina10, University of Leicester11, University of Bordeaux12, University of Michigan13, University of Cambridge14, McMaster University15, COMSATS Institute of Information Technology16, University of Dundee17, University of Newcastle18, Lund University19, Leiden University Medical Center20, University Medical Center Groningen21, University of Edinburgh22, King's College London23, Guy's and St Thomas' NHS Foundation Trust24, University of Texas Health Science Center at Houston25, University of Liverpool26, Broad Institute27, Boston University28, University of London29, University of Bristol30, Washington University in St. Louis31, University of Lille32, Wellcome Trust Centre for Human Genetics33, University of Eastern Finland34, Wellcome Trust Sanger Institute35, National Institutes of Health36, Population Health Research Institute37, Brigham and Women's Hospital38, University of Sassari39, Wellcome Trust40, University of Oxford41, Harokopio University42, University of Washington43, Harvard University44, Stanford University45, VA Palo Alto Healthcare System46
TL;DR: Analysis of blood pressure data from the Million Veteran Program trans-ethnic cohort identifies common and rare variants, and genetically predicted gene expression across multiple tissues associated with systolic, diastolic and pulse pressure in over 775,000 individuals.
Abstract: In this trans-ethnic multi-omic study, we reinterpret the genetic architecture of blood pressure to identify genes, tissues, phenomes and medication contexts of blood pressure homeostasis. We discovered 208 novel common blood pressure SNPs and 53 rare variants in genome-wide association studies of systolic, diastolic and pulse pressure in up to 776,078 participants from the Million Veteran Program (MVP) and collaborating studies, with analysis of the blood pressure clinical phenome in MVP. Our transcriptome-wide association study detected 4,043 blood pressure associations with genetically predicted gene expression of 840 genes in 45 tissues, and mouse renal single-cell RNA sequencing identified upregulated blood pressure genes in kidney tubule cells.
310 citations
Cited by
TL;DR: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations, and has reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.
Abstract: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.
12,661 citations
TL;DR: It is suggested that natural selection against large insertions/deletions is so weak that a large amount of variation is maintained in a population.
11,521 citations
01 Feb 2015
TL;DR: In this article, the authors describe the integrative analysis of 111 reference human epigenomes generated as part of the NIH Roadmap Epigenomics Consortium, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression.
Abstract: The reference human genome sequence set the stage for studies of genetic variation and its association with human disease, but epigenomic studies lack a similar reference. To address this need, the NIH Roadmap Epigenomics Consortium generated the largest collection so far of human epigenomes for primary cells and tissues. Here we describe the integrative analysis of 111 reference human epigenomes generated as part of the programme, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression. We establish global maps of regulatory elements, define regulatory modules of coordinated activity, and their likely activators and repressors. We show that disease- and trait-associated genetic variants are enriched in tissue-specific epigenomic marks, revealing biologically relevant cell types for diverse human traits, and providing a resource for interpreting the molecular basis of human disease. Our results demonstrate the central role of epigenomic information for understanding gene regulation, cellular differentiation and human disease.
4,409 citations
TL;DR: The sequence of the maize genome reveals it to be the most complex genome known to date. The authors report the correlation of methylation-poor regions with Mu transposon insertions and recombination, and how uneven gene losses between duplicated regions were involved in returning an ancient allotetraploid to a genetically diploid state.
Abstract: We report an improved draft nucleotide sequence of the 2.3-gigabase genome of maize, an important crop plant and model for biological research. Over 32,000 genes were predicted, of which 99.8% were placed on reference chromosomes. Nearly 85% of the genome is composed of hundreds of families of transposable elements, dispersed nonuniformly across the genome. These were responsible for the capture and amplification of numerous gene fragments and affect the composition, sizes, and positions of centromeres. We also report on the correlation of methylation-poor regions with Mu transposon insertions and recombination, and copy number variants with insertions and/or deletions, as well as how uneven gene losses between duplicated regions were involved in returning an ancient allotetraploid to a genetically diploid state. These analyses inform and set the stage for further investigations to improve our understanding of the domestication and agricultural improvements of maize.
3,761 citations
Agricultural Research Service1, Purdue University2, University of North Carolina at Charlotte3, University of California, Berkeley4, University of Arizona5, University of Maryland, College Park6, University of Missouri7, Joint Genome Institute8, National Center for Genome Resources9, Iowa State University10, University of Wisconsin–Stevens Point11, University of Nebraska–Lincoln12
TL;DR: An accurate soybean genome sequence will facilitate the identification of the genetic basis of many soybean traits, and accelerate the creation of improved soybean varieties.
Abstract: Soybean (Glycine max) is one of the most important crop plants for seed protein and oil content, and for its capacity to fix atmospheric nitrogen through symbioses with soil-borne microorganisms. We sequenced the 1.1-gigabase genome by a whole-genome shotgun approach and integrated it with physical and high-density genetic maps to create a chromosome-scale draft sequence assembly. We predict 46,430 protein-coding genes, 70% more than Arabidopsis and similar to the poplar genome which, like soybean, is an ancient polyploid (palaeopolyploid). About 78% of the predicted genes occur in chromosome ends, which comprise less than one-half of the genome but account for nearly all of the genetic recombination. Genome duplications occurred at approximately 59 and 13 million years ago, resulting in a highly duplicated genome with nearly 75% of the genes present in multiple copies. The two duplication events were followed by gene diversification and loss, and numerous chromosome rearrangements. An accurate soybean genome sequence will facilitate the identification of the genetic basis of many soybean traits, and accelerate the creation of improved soybean varieties.
3,743 citations