scispace - formally typeset
Search or ask a question
Journal ArticleDOI

A positively-selected MAGEE2 LoF allele is associated with sexual dimorphism in human brain size, and shows similar phenotypes in Magee2 null mice.

TL;DR: In this article, the Magee2 null mice did not exhibit gross abnormalities apart from enlarged brain structures (13% increased total brain area, P = 0.0022) in hemizygous males.
Abstract: A nonsense allele at rs1343879 in human MAGEE2 on chromosome X has previously been reported as a strong candidate for positive selection in East Asia. This premature stop codon causing ∼80% protein truncation is characterized by a striking geographical pattern of high population differentiation: common in Asia and the Americas (up to 84% in the 1000 Genomes Project East Asians) but rare elsewhere. Here, we generated a Magee2 mouse knockout mimicking the human loss-of-function mutation to study its functional consequences. The Magee2 null mice did not exhibit gross abnormalities apart from enlarged brain structures (13% increased total brain area, P = 0.0022) in hemizygous males. The area of the granular retrosplenial cortex responsible for memory, navigation and spatial information processing was the most severely affected, exhibiting an enlargement of 34% (P = 3.4x10-6). The brain size in homozygous females showed the opposite trend of reduced brain size, although this did not reach statistical significance. With these insights, we performed human association analyses between brain size measurements and rs1343879 genotypes in 141 Chinese volunteers with brain MRI scans, replicating the sexual dimorphism seen in the knockout mouse model. The derived stop gain allele was significantly associated with a larger volume of grey matter in males (P = 0.00094), and smaller volumes of grey (P = 0.00021) and white (P = 0.0015) matter in females. It is unclear whether or not the observed neuroanatomical phenotypes affect behaviour or cognition, but it might have been the driving force underlying the positive selection in humans.

Content maybe subject to copyright    Report

Citations
More filters
Journal ArticleDOI
TL;DR: Investigating the nucleotide frequency at a large scale and results from 19,911 prokaryote genomes revealed that nucleotides coinciding with stop codons indeed have the lowest frequency in most genomes, and the T-G-A deficiency pattern observed would help to understand the evolution of codon usage tactics in extant organisms.
Abstract: If a stop codon appears within one gene, then its translation will be terminated earlier than expected. False folding of premature protein will be adverse to the host; hence, all functional genes would tend to avoid the intragenic stop codons. Therefore, we hypothesize that there will be less frequency of nucleotides corresponding to stop codons at each codon position of genes. Here, we validate this inference by investigating the nucleotide frequency at a large scale and results from 19,911 prokaryote genomes revealed that nucleotides coinciding with stop codons indeed have the lowest frequency in most genomes. Interestingly, genes with three types of stop codons all tend to follow a T-G-A deficiency pattern, suggesting that the property of avoiding intragenic termination pressure is the same and the major stop codon TGA plays a dominant role in this effect. Finally, a positive correlation between the TGA deficiency extent and the base length was observed in start-experimentally verified genes of Escherichia coli (E. coli). This strengthens the proof of our hypothesis. The T-G-A deficiency pattern observed would help to understand the evolution of codon usage tactics in extant organisms.
References
More filters
Journal ArticleDOI
TL;DR: Burrows-Wheeler Alignment tool (BWA) is implemented, a new read alignment package that is based on backward search with Burrows–Wheeler Transform (BWT), to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps.
Abstract: Motivation: The enormous amount of short reads generated by the new DNA sequencing technologies call for the development of fast and accurate read alignment programs. A first generation of hash table-based methods has been developed, including MAQ, which is accurate, feature rich and fast enough to align short reads from a single individual. However, MAQ does not support gapped alignment for single-end reads, which makes it unsuitable for alignment of longer reads where indels may occur frequently. The speed of MAQ is also a concern when the alignment is scaled up to the resequencing of hundreds of individuals. Results: We implemented Burrows-Wheeler Alignment tool (BWA), a new read alignment package that is based on backward search with Burrows–Wheeler Transform (BWT), to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps. BWA supports both base space reads, e.g. from Illumina sequencing machines, and color space reads from AB SOLiD machines. Evaluations on both simulated and real data suggest that BWA is ~10–20× faster than MAQ, while achieving similar accuracy. In addition, BWA outputs alignment in the new standard SAM (Sequence Alignment/Map) format. Variant calling and other downstream analyses after the alignment can be achieved with the open source SAMtools software package. Availability: http://maq.sourceforge.net Contact: [email protected]

43,862 citations

Journal ArticleDOI
Adam Auton1, Gonçalo R. Abecasis2, David Altshuler3, Richard Durbin4  +514 moreInstitutions (90)
01 Oct 2015-Nature
TL;DR: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations, and has reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-generation sequencing, deep exome sequencing, and dense microarray genotyping.
Abstract: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.

12,661 citations

Journal ArticleDOI
TL;DR: A unified analytic framework to discover and genotype variation among multiple samples simultaneously that achieves sensitive and specific results across five sequencing technologies and three distinct, canonical experimental designs is presented.
Abstract: Recent advances in sequencing technology make it possible to comprehensively catalogue genetic variation in population samples, creating a foundation for understanding human disease, ancestry and evolution. The amounts of raw data produced are prodigious and many computational steps are required to translate this output into high-quality variant calls. We present a unified analytic framework to discover and genotype variation among multiple samples simultaneously that achieves sensitive and specific results across five sequencing technologies and three distinct, canonical experimental designs. Our process includes (1) initial read mapping; (2) local realignment around indels; (3) base quality score recalibration; (4) SNP discovery and genotyping to find all potential variants; and (5) machine learning to separate true segregating variation from machine artifacts common to next-generation sequencing technologies. We discuss the application of these tools, instantiated in the Genome Analysis Toolkit (GATK), to deep whole-genome, whole-exome capture, and multi-sample low-pass (~4×) 1000 Genomes Project datasets.

10,056 citations

Journal ArticleDOI
Kristin G. Ardlie, David S. DeLuca, Ayellet V. Segrè, Timothy J. Sullivan, Taylor Young, Ellen Gelfand, Casandra A. Trowbridge, Julian Maller, Taru Tukiainen, Monkol Lek, Lucas D. Ward, Pouya Kheradpour, Benjamin Iriarte, Yan Meng, Cameron D. Palmer, Tõnu Esko, Wendy Winckler, Joel N. Hirschhorn, Manolis Kellis, Daniel G. MacArthur, Gad Getz, Andrey A. Shabalin, Gen Li, Yi-Hui Zhou, Andrew B. Nobel, Ivan Rusyn, Fred A. Wright, Tuuli Lappalainen, Pedro G. Ferreira, Halit Ongen, Manuel A. Rivas, Alexis Battle, Sara Mostafavi, Jean Monlong, Michael Sammeth, Marta Melé, Ferran Reverter, Jakob M. Goldmann, Daphne Koller, Roderic Guigó, Mark I. McCarthy, Emmanouil T. Dermitzakis, Eric R. Gamazon, Hae Kyung Im, Anuar Konkashbaev, Dan L. Nicolae, Nancy J. Cox, Timothée Flutre, Xiaoquan Wen, Matthew Stephens, Jonathan K. Pritchard, Zhidong Tu, Bin Zhang, Tao Huang, Quan Long, Luan Lin, Jialiang Yang, Jun Zhu, Jun Liu, Amanda Brown, Bernadette Mestichelli, Denee Tidwell, Edmund Lo, Mike Salvatore, Saboor Shad, Jeffrey A. Thomas, John T. Lonsdale, Michael T. Moser, Bryan Gillard, Ellen Karasik, Kimberly Ramsey, Christopher Choi, Barbara A. Foster, John Syron, Johnell Fleming, Harold Magazine, Rick Hasz, Gary Walters, Jason Bridge, Mark Miklos, Susan L. Sullivan, Laura Barker, Heather M. Traino, Maghboeba Mosavel, Laura A. Siminoff, Dana R. Valley, Daniel C. Rohrer, Scott D. Jewell, Philip A. Branton, Leslie H. Sobin, Mary Barcus, Liqun Qi, Jeffrey McLean, Pushpa Hariharan, Ki Sung Um, Shenpei Wu, David Tabor, Charles Shive, Anna M. Smith, Stephen A. Buia, Anita H. Undale, Karna Robinson, Nancy Roche, Kimberly M. Valentino, Angela Britton, Robin Burges, Debra Bradbury, Kenneth W. Hambright, John Seleski, Greg E. Korzeniewski, Kenyon Erickson, Yvonne Marcus, Jorge Tejada, Mehran Taherian, Chunrong Lu, Margaret J. Basile, Deborah C. Mash, Simona Volpi, Jeffery P. Struewing, Gary F. Temple, Joy T. Boyer, Deborah Colantuoni, Roger Little, Susan E. Koester, Latarsha J. Carithers, Helen M. Moore, Ping Guan, Carolyn C. Compton, Sherilyn Sawyer, Joanne P. Demchok, Jimmie B. Vaught, Chana A. Rabiner, Nicole C. Lockhart 
08 May 2015-Science
TL;DR: The landscape of gene expression across tissues is described, thousands of tissue-specific and shared regulatory expression quantitative trait loci (eQTL) variants are cataloged, complex network relationships are described, and signals from genome-wide association studies explained by eQTLs are identified.
Abstract: Understanding the functional consequences of genetic variation, and how it affects complex human disease and quantitative traits, remains a critical challenge for biomedicine. We present an analysi...

4,418 citations

Journal ArticleDOI
TL;DR: This work has expanded and refined existing algorithms for the subdivision of MRI datasets into anatomical structures, and used the normalized superimposed atlases to create a maximum probability map in stereotaxic space, which retains quantitative information regarding inter‐subject variability.
Abstract: Probabilistic atlases of neuroanatomy are more representative of population anatomy than single brain atlases. They allow anatomical labeling of the results of group studies in stereotaxic space, automated anatomical labeling of individual brain imaging datasets, and the statistical assessment of normal ranges for structure volumes and extents. No such manually constructed atlas is currently available for the frequently studied group of young adults. We studied 20 normal subjects (10 women, median age 31 years) with high-resolution magnetic resonance imaging (MRI) scanning. Images were nonuniformity corrected and reoriented along both the anterior-posterior commissure (AC-PC) line horizontally and the midsagittal plane sagittally. Building on our previous work, we have expanded and refined existing algorithms for the subdivision of MRI datasets into anatomical structures. The resulting algorithm is presented in the Appendix. Forty-nine structures were interactively defined as three-dimensional volumes-of-interest (VOIs). The resulting 20 individual atlases were spatially transformed (normalized) into standard stereotaxic space, using SPM99 software and the MNI/ICBM 152 template. We evaluated volume data for all structures both in native space and after spatial normalization, and used the normalized superimposed atlases to create a maximum probability map in stereotaxic space, which retains quantitative information regarding inter-subject variability. Its potential applications range from the automatic labeling of new scans to the detection of anatomical abnormalities in patients. Further data can be extracted from the atlas for the detailed analysis of individual structures.

1,084 citations