Journal ISSN: 1061-4036

Nature Genetics 

Nature Portfolio
About: Nature Genetics is an academic journal published by Nature Portfolio. Its publications fall mainly in the areas of genome-wide association studies and genes. Its ISSN is 1061-4036. Over its lifetime, the journal has published 9,108 papers, which have received 2,555,850 citations. The journal is also known as Nat Genet and Nat. Genet.


Papers
Journal Article
TL;DR: The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing.
Abstract: Genomic sequencing has made it clear that a large fraction of the genes specifying the core biological functions are shared by all eukaryotes. Knowledge of the biological role of such shared proteins in one organism can often be transferred to other organisms. The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing. To this end, three independent ontologies accessible on the World-Wide Web (http://www.geneontology.org) are being constructed: biological process, molecular function and cellular component.

35,225 citations

Journal Article
TL;DR: A unified analytic framework to discover and genotype variation among multiple samples simultaneously that achieves sensitive and specific results across five sequencing technologies and three distinct, canonical experimental designs is presented.
Abstract: Recent advances in sequencing technology make it possible to comprehensively catalogue genetic variation in population samples, creating a foundation for understanding human disease, ancestry and evolution. The amounts of raw data produced are prodigious and many computational steps are required to translate this output into high-quality variant calls. We present a unified analytic framework to discover and genotype variation among multiple samples simultaneously that achieves sensitive and specific results across five sequencing technologies and three distinct, canonical experimental designs. Our process includes (1) initial read mapping; (2) local realignment around indels; (3) base quality score recalibration; (4) SNP discovery and genotyping to find all potential variants; and (5) machine learning to separate true segregating variation from machine artifacts common to next-generation sequencing technologies. We discuss the application of these tools, instantiated in the Genome Analysis Toolkit (GATK), to deep whole-genome, whole-exome capture, and multi-sample low-pass (~4×) 1000 Genomes Project datasets.

10,056 citations

Journal Article
TL;DR: This work describes a method that enables explicit detection and correction of population stratification on a genome-wide scale and uses principal components analysis to explicitly model ancestry differences between cases and controls.
Abstract: Population stratification (allele frequency differences between cases and controls due to systematic ancestry differences) can cause spurious associations in disease studies. We describe a method that enables explicit detection and correction of population stratification on a genome-wide scale. Our method uses principal components analysis to explicitly model ancestry differences between cases and controls. The resulting correction is specific to a candidate marker’s variation in frequency across ancestral populations, minimizing spurious associations while maximizing power to detect true associations. Our simple, efficient approach can easily be applied to disease studies with hundreds of thousands of markers.

Because the effects of stratification vary in proportion to the number of samples [9], stratification will be an increasing problem in the large-scale association studies of the future, which will analyze thousands of samples in an effort to detect common genetic variants of weak effect. The two prevailing methods for dealing with stratification are genomic control and structured association [9-14]. Although genomic control and structured association have proven useful in a variety of contexts, they have limitations. Genomic control corrects for stratification by adjusting association statistics at each marker by a uniform overall inflation factor. However, some markers differ in their allele frequencies across ancestral populations more than others. Thus, the uniform adjustment applied by genomic control may be insufficient at markers having unusually strong differentiation across ancestral populations and may be superfluous at markers devoid of such differentiation, leading to a loss in power.

Structured association uses a program such as STRUCTURE [15] to assign the samples to discrete subpopulation clusters and then aggregates evidence of association within each cluster. If fractional membership in more than one cluster is allowed, the method cannot currently be applied to genome-wide association studies because of its intensive computational cost on large data sets. Furthermore, assignments of individuals to clusters are highly sensitive to the number of clusters, which is not well defined [14,16].

9,387 citations
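
The principal-components correction described in the abstract above can be illustrated with a small sketch. It is not the authors' implementation: it extracts only one ancestry axis (by power iteration), skips any per-marker normalization, and uses a simplified statistic chosen here for illustration, namely (N - K - 1) times the squared correlation between ancestry-adjusted genotype and ancestry-adjusted phenotype, where N is the number of samples and K the number of axes. All function names are hypothetical.

```python
import math
import random


def center(values):
    """Return the values with their mean subtracted."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def top_ancestry_axis(genotypes, iterations=200, seed=0):
    """Leading principal component across samples, found by power iteration.

    genotypes: list of markers, each a list of per-sample allele counts.
    Assumes at least one marker varies across samples.
    """
    rows = [center(marker) for marker in genotypes]
    n_samples = len(rows[0])
    rng = random.Random(seed)
    axis = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    for _ in range(iterations):
        # Apply X^T X to the current vector without forming the n x n matrix.
        scores = [sum(g * a for g, a in zip(row, axis)) for row in rows]
        nxt = [sum(row[j] * s for row, s in zip(rows, scores))
               for j in range(n_samples)]
        norm = math.sqrt(sum(c * c for c in nxt))
        axis = [c / norm for c in nxt]
    return axis


def remove_ancestry(values, axes):
    """Center the values, then subtract their projection onto each axis.

    Sequential projection is only valid when the axes are mutually orthogonal.
    """
    resid = center(values)
    for axis in axes:
        gamma = (sum(r * a for r, a in zip(resid, axis))
                 / sum(a * a for a in axis))
        resid = [r - gamma * a for r, a in zip(resid, axis)]
    return resid


def association_statistic(genotype, phenotype, axes=()):
    """Illustrative statistic: (N - K - 1) * r^2 on ancestry-adjusted values."""
    g = remove_ancestry(genotype, axes)
    p = remove_ancestry(phenotype, axes)
    sgg = sum(x * x for x in g)
    spp = sum(x * x for x in p)
    if sgg < 1e-12 or spp < 1e-12:
        return 0.0  # nothing left to test once ancestry is removed
    sgp = sum(x * y for x, y in zip(g, p))
    return (len(g) - len(axes) - 1) * sgp * sgp / (sgg * spp)


if __name__ == "__main__":
    # Toy data: samples 0-3 and 4-7 come from two populations.
    ancestry_marker = [0, 0, 0, 0, 2, 2, 2, 2]
    other_marker = [0, 2, 0, 2, 0, 2, 0, 2]
    markers = [ancestry_marker, ancestry_marker, ancestry_marker, other_marker]
    phenotype = [0, 0, 0, 1, 1, 1, 1, 0]  # disease status tracks population
    axis = top_ancestry_axis(markers)
    print(association_statistic(ancestry_marker, phenotype))
    print(association_statistic(ancestry_marker, phenotype, [axis]))
```

In the toy data, the marker that only tracks population membership shows an association before adjustment and none after, because the adjustment is specific to how strongly each marker varies along the ancestry axis, unlike a uniform inflation factor.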

Journal Article
TL;DR: An analytical strategy is introduced, Gene Set Enrichment Analysis, designed to detect modest but coordinate changes in the expression of groups of functionally related genes, which identifies a set of genes involved in oxidative phosphorylation whose expression is coordinately decreased in human diabetic muscle.
Abstract: DNA microarrays can be used to identify gene expression changes characteristic of human disease. This is challenging, however, when relevant differences are subtle at the level of individual genes. We introduce an analytical strategy, Gene Set Enrichment Analysis, designed to detect modest but coordinate changes in the expression of groups of functionally related genes. Using this approach, we identify a set of genes involved in oxidative phosphorylation whose expression is coordinately decreased in human diabetic muscle. Expression of these genes is high at sites of insulin-mediated glucose disposal, activated by PGC-1α and correlated with total-body aerobic capacity. Our results associate this gene set with clinically important variation in human metabolism and illustrate the value of pathway relationships in the analysis of genomic profiling experiments.

7,997 citations
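
The abstract above does not spell out how coordinate changes in a gene set are scored. One common way to do it, assumed here rather than taken from the abstract, is a Kolmogorov-Smirnov-style running sum over a list of genes ranked by differential expression: the sum rises at genes in the set, falls at genes outside it, and the enrichment score is its maximum. The function name and step weights below are illustrative, and significance testing (for example by permuting class labels) is omitted.

```python
import math


def enrichment_score(ranked_genes, gene_set):
    """Maximum of a running sum walked down a ranked gene list.

    ranked_genes: gene names ordered from most to least differentially expressed.
    gene_set: names of the functionally related genes being tested.
    """
    n = len(ranked_genes)
    members = set(gene_set) & set(ranked_genes)
    g = len(members)
    if g == 0 or g == n:
        raise ValueError("gene set must cover some, but not all, ranked genes")
    # Step sizes are chosen so the running sum returns to zero at the end.
    hit = math.sqrt((n - g) / g)
    miss = math.sqrt(g / (n - g))
    running = 0.0
    best = 0.0
    for gene in ranked_genes:
        running += hit if gene in members else -miss
        best = max(best, running)
    return best


if __name__ == "__main__":
    ranked = list("abcdefgh")
    print(enrichment_score(ranked, {"a", "b"}))  # set concentrated at the top
    print(enrichment_score(ranked, {"g", "h"}))  # set concentrated at the bottom
```

A set whose members cluster near the top of the ranking gets a high score even if no single member stands out, which is the sense in which modest but coordinated changes become detectable.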

Journal Article
John T. Lonsdale, Jeffrey Thomas, Mike Salvatore, Rebecca Phillips, Edmund Lo, Saboor Shad, Richard Hasz, Gary Walters, Fernando U. Garcia, Nancy Young, Barbara A. Foster, Mike Moser, Ellen Karasik, Bryan Gillard, Kimberley Ramsey, Susan L. Sullivan, Jason Bridge, Harold Magazine, John Syron, Johnelle Fleming, Laura A. Siminoff, Heather M. Traino, Maghboeba Mosavel, Laura Barker, Scott D. Jewell, Daniel C. Rohrer, Dan Maxim, Dana Filkins, Philip Harbach, Eddie Cortadillo, Bree Berghuis, Lisa Turner, Eric Hudson, Kristin Feenstra, Leslie H. Sobin, James A. Robb, Phillip Branton, Greg E. Korzeniewski, Charles Shive, David Tabor, Liqun Qi, Kevin Groch, Sreenath Nampally, Steve Buia, Angela Zimmerman, Anna M. Smith, Robin Burges, Karna Robinson, Kim Valentino, Deborah Bradbury, Mark Cosentino, Norma Diaz-Mayoral, Mary Kennedy, Theresa Engel, Penelope Williams, Kenyon Erickson, Kristin G. Ardlie, Wendy Winckler, Gad Getz, David S. DeLuca, Daniel MacArthur, Manolis Kellis, Alexander Thomson, Taylor Young, Ellen Gelfand, Molly Donovan, Yan Meng, George B. Grant, Deborah C. Mash, Yvonne Marcus, Margaret J. Basile, Jun Liu, Jun Zhu, Zhidong Tu, Nancy J. Cox, Dan L. Nicolae, Eric R. Gamazon, Hae Kyung Im, Anuar Konkashbaev, Jonathan K. Pritchard, Matthew Stevens, Timothée Flutre, Xiaoquan Wen, Emmanouil T. Dermitzakis, Tuuli Lappalainen, Roderic Guigó, Jean Monlong, Michael Sammeth, Daphne Koller, Alexis Battle, Sara Mostafavi, Mark I. McCarthy, Manuel Rivas, Julian Maller, Ivan Rusyn, Andrew B. Nobel, Fred A. Wright, Andrey A. Shabalin, Mike Feolo, Nataliya Sharopova, Anne Sturcke, Justin Paschal, James M. Anderson, Elizabeth L. Wilder, Leslie Derr, Eric D. Green, Jeffery P. Struewing, Gary F. Temple, Simona Volpi, Joy T. Boyer, Elizabeth J. Thomson, Mark S. Guyer, Cathy Ng, Assya Abdallah, Deborah Colantuoni, Thomas R. Insel, Susan E. Koester, Roger Little, Patrick Bender, Thomas Lehner, Yin Yao, Carolyn C. Compton, Jimmie B. Vaught, Sherilyn Sawyer, Nicole C. Lockhart, Joanne P. Demchok, Helen F. Moore
TL;DR: The Genotype-Tissue Expression (GTEx) project is described, which will establish a resource database and associated tissue bank for the scientific community to study the relationship between genetic variation and gene expression in human tissues.
Abstract: Genome-wide association studies have identified thousands of loci for common diseases, but, for the majority of these, the mechanisms underlying disease susceptibility remain unknown. Most associated variants are not correlated with protein-coding changes, suggesting that polymorphisms in regulatory regions probably contribute to many disease phenotypes. Here we describe the Genotype-Tissue Expression (GTEx) project, which will establish a resource database and associated tissue bank for the scientific community to study the relationship between genetic variation and gene expression in human tissues.

6,545 citations

Performance
Metrics
Number of papers from the journal in previous years

Year    Papers
2023    177
2022    318
2021    198
2020    184
2019    241
2018    258