Showing papers on "Locus (genetics)" published in 2021


Journal Article
Alexander Kurilshikov1, Carolina Medina-Gomez2, Rodrigo Bacigalupe3, Djawad Radjabzadeh2, Jun Wang4, Jun Wang3, Ayse Demirkan5, Ayse Demirkan1, Caroline I. Le Roy6, Juan Antonio Raygoza Garay7, Casey T. Finnicum8, Xingrong Liu9, Daria V. Zhernakova1, Marc Jan Bonder1, Tue H. Hansen10, Fabian Frost11, Malte C. Rühlemann12, Williams Turpin7, Jee-Young Moon13, Han-Na Kim14, Kreete Lüll15, Elad Barkan16, Shiraz A. Shah17, Myriam Fornage18, Joanna Szopinska-Tokov, Zachary D. Wallen19, Dmitrii Borisevich10, Lars Agréus9, Anna Andreasson20, Corinna Bang12, Larbi Bedrani7, Jordana T. Bell6, Hans Bisgaard17, Michael Boehnke21, Dorret I. Boomsma22, Robert D. Burk13, Annique Claringbould1, Kenneth Croitoru7, Gareth E. Davies22, Gareth E. Davies8, Cornelia M. van Duijn23, Cornelia M. van Duijn2, Liesbeth Duijts2, Gwen Falony3, Jingyuan Fu1, Adriaan van der Graaf1, Torben Hansen10, Georg Homuth11, David A. Hughes24, Richard G. IJzerman25, Matthew A. Jackson23, Matthew A. Jackson6, Vincent W. V. Jaddoe2, Marie Joossens3, Torben Jørgensen10, Daniel Keszthelyi26, Rob Knight27, Markku Laakso28, Matthias Laudes, Lenore J. Launer29, Wolfgang Lieb12, Aldons J. Lusis30, Ad A.M. Masclee26, Henriette A. Moll2, Zlatan Mujagic26, Qi Qibin13, Daphna Rothschild16, Hocheol Shin14, Søren J. Sørensen10, Claire J. Steves6, Jonathan Thorsen17, Nicholas J. Timpson24, Raul Y. Tito3, Sara Vieira-Silva3, Uwe Völker11, Henry Völzke11, Urmo Võsa1, Kaitlin H Wade24, Susanna Walter31, Kyoko Watanabe22, Stefan Weiss11, Frank Ulrich Weiss11, Omer Weissbrod32, Harm-Jan Westra1, Gonneke Willemsen22, Haydeh Payami19, Daisy Jonkers26, Alejandro Arias Vasquez33, Eco J. C. de Geus22, Katie A. Meyer34, Jakob Stokholm17, Eran Segal16, Elin Org15, Cisca Wijmenga1, Hyung Lae Kim35, Robert C. Kaplan36, Tim D. Spector6, André G. Uitterlinden2, Fernando Rivadeneira2, Andre Franke12, Markus M. Lerch11, Lude Franke1, Serena Sanna37, Serena Sanna1, Mauro D'Amato, Oluf Pedersen10, Andrew D. Paterson7, Robert Kraaij2, Jeroen Raes3, Alexandra Zhernakova1
TL;DR: In this article, the MiBioGen consortium curated and analyzed genome-wide genotypes and 16S fecal microbiome data from 18,340 individuals (24 cohorts) and found high variability across cohorts: only 9 of 410 genera were detected in more than 95% of samples.
Abstract: To study the effect of host genetics on gut microbiome composition, the MiBioGen consortium curated and analyzed genome-wide genotypes and 16S fecal microbiome data from 18,340 individuals (24 cohorts). Microbial composition showed high variability across cohorts: only 9 of 410 genera were detected in more than 95% of samples. A genome-wide association study of host genetic variation regarding microbial taxa identified 31 loci affecting the microbiome at a genome-wide significant (P < 5 × 10⁻⁸) threshold. One locus, the lactase (LCT) gene locus, reached study-wide significance (genome-wide association study signal: P = 1.28 × 10⁻²⁰), and it showed an age-dependent association with Bifidobacterium abundance. Other associations were suggestive (1.95 × 10⁻¹⁰ < P < 5 × 10⁻⁸) but enriched for taxa showing high heritability and for genes expressed in the intestine and brain. A phenome-wide association study and Mendelian randomization identified enrichment of microbiome trait loci in the metabolic, nutrition and environment domains and suggested the microbiome might have causal effects in ulcerative colitis and rheumatoid arthritis.
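
The two significance tiers quoted in this abstract can be illustrated with a small sketch. The thresholds are taken from the abstract; the function name, the labels and the second example p-value are hypothetical, and reading the stricter threshold as a Bonferroni-style correction of the genome-wide one is an assumption, not something the abstract states.

```python
# Thresholds quoted in the abstract above.
GENOME_WIDE = 5e-8
STUDY_WIDE = 1.95e-10


def classify(p_value: float) -> str:
    """Label a GWAS p-value using the two thresholds from the abstract."""
    if p_value < STUDY_WIDE:
        return "study-wide significant"
    if p_value < GENOME_WIDE:
        return "genome-wide significant (suggestive at study level)"
    return "not significant"


# Ratio of the two thresholds. If the stricter threshold is a Bonferroni
# correction of the genome-wide one (an assumption), this ratio approximates
# the number of tests that were corrected for.
implied_tests = GENOME_WIDE / STUDY_WIDE

print(classify(1.28e-20))  # the LCT signal reported in the abstract
print(classify(3e-9))      # hypothetical p-value between the two thresholds
print(round(implied_tests))
```

Under that assumption the ratio works out to roughly 256 tests, which gives a feel for how much stricter the study-wide tier is than the conventional genome-wide one.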

287 citations


Journal Article
TL;DR: SARS-CoV-2, the causative agent of COVID-19, is undergoing constant mutation; the authors used an integrative approach combining epidemiology, virus genome sequencing, clinical phenotyping, and experimental validation to locate mutations of clinical importance.

79 citations


Journal Article
TL;DR: The authors profiled gene expression variation in primary human microglia isolated from 141 patients undergoing neurosurgery and used expression quantitative trait loci (eQTL) mapping to identify genetic variants with microglia-specific functions.
Abstract: Microglia, the tissue-resident macrophages of the central nervous system (CNS), play critical roles in immune defense, development and homeostasis. However, isolating microglia from humans in large numbers is challenging. Here, we profiled gene expression variation in primary human microglia isolated from 141 patients undergoing neurosurgery. Using single-cell and bulk RNA sequencing, we identify how age, sex and clinical pathology influence microglia gene expression and which genetic variants have microglia-specific functions using expression quantitative trait loci (eQTL) mapping. We follow up one of our findings using a human induced pluripotent stem cell-based macrophage model to fine-map a candidate causal variant for Alzheimer's disease at the BIN1 locus. Our study provides a population-scale transcriptional map of a critically important cell for human CNS development and disease.

79 citations


Journal Article
TL;DR: In this paper, a transgene cassette of five resistance genes was introduced into bread wheat as a single locus, and at least four of the five genes were shown to be functional. However, a new Pgt isolate with virulence to several genes at this locus suggests gene stacks will need strategic deployment to maintain their effectiveness.
Abstract: Breeding wheat with durable resistance to the fungal pathogen Puccinia graminis f. sp. tritici (Pgt), a major threat to cereal production, is challenging due to the rapid evolution of pathogen virulence. Increased durability and broad-spectrum resistance can be achieved by introducing more than one resistance gene, but combining numerous unlinked genes by breeding is laborious. Here we generate polygenic Pgt resistance by introducing a transgene cassette of five resistance genes into bread wheat as a single locus and show that at least four of the five genes are functional. These wheat lines are resistant to aggressive and highly virulent Pgt isolates from around the world and show very high levels of resistance in the field. The simple monogenic inheritance of this multigene locus greatly simplifies its use in breeding. However, a new Pgt isolate with virulence to several genes at this locus suggests gene stacks will need strategic deployment to maintain their effectiveness. Combining fungal-resistance genes into a single cassette enables the generation of highly resistant wheat lines.

76 citations


Journal Article
TL;DR: In this paper, a combined multi-omics and machine learning approach was used to identify the gain-of-function risk A allele of an SNP, rs17713054G>A, as a probable causative variant.
Abstract: The severe acute respiratory syndrome coronavirus 2 (SARS‑CoV‑2) disease (COVID-19) pandemic has caused millions of deaths worldwide. Genome-wide association studies identified the 3p21.31 region as conferring a twofold increased risk of respiratory failure. Here, using a combined multiomics and machine learning approach, we identify the gain-of-function risk A allele of an SNP, rs17713054G>A, as a probable causative variant. We show with chromosome conformation capture and gene-expression analysis that the rs17713054-affected enhancer upregulates the interacting gene, leucine zipper transcription factor like 1 (LZTFL1). Selective spatial transcriptomic analysis of lung biopsies from patients with COVID-19 shows the presence of signals associated with epithelial-mesenchymal transition (EMT), a viral response pathway that is regulated by LZTFL1. We conclude that pulmonary epithelial cells undergoing EMT, rather than immune cells, are likely responsible for the 3p21.31-associated risk. Since the 3p21.31 effect is conferred by a gain-of-function, LZTFL1 may represent a therapeutic target.

63 citations


Book
20 May 2021
TL;DR: Using random amplified polymorphic DNA (RAPD), amplified fragment length polymorphism (AFLP), simple sequence repeats (SSR), and morphological traits, the first genetic maps for Cucurbita pepo were constructed and compared.
Abstract: Using random amplified polymorphic DNA (RAPD), amplified fragment length polymorphism (AFLP), simple sequence repeats (SSR), and morphological traits, the first genetic maps for Cucurbita pepo (2n=2x=40) were constructed and compared. The two mapping populations consisted of 92 F2 individuals each. One map was developed from a cross between an oil-seed pumpkin breeding line and a zucchini accession, into which genes for resistance to Zucchini Yellow Mosaic Virus (ZYMV) from a related species, C. moschata, had been introgressed. The other map was developed from a cross between an oil-seed pumpkin and a crookneck variety. A total of 332 and 323 markers were mapped in the two populations. Markers were distributed in each map over 21 linkage groups and covered an average of 2,200 cM of the C. pepo genome. The two maps had 62 loci in common, which enabled identification of 14 homologous linkage groups. Polyacrylamide gel analyses allowed detection of a high number of markers suitable for mapping, 10% of which were co-dominant RAPD loci. In the Pumpkin-Zucchini population, bulked segregant analysis (BSA) identified seven markers less than 7 cM distant from the locus n, affecting lignification of the seed coat. One of these markers, linked to the recessive hull-less allele (AW11-420), was also found in the Pumpkin-Crookneck population, 4 cM from n. In the Pumpkin-Zucchini population, 24 RAPD markers, previously introduced into C. pepo from C. moschata, were mapped in two linkage groups (13 and 11 markers in LGpz1 and LGpz2, respectively), together with two sequence characterized amplified region (SCAR) markers linked to genes for resistance to ZYMV.

55 citations


Journal Article
10 Feb 2021-Nature
TL;DR: In this paper, the authors identify homozygous 27-63-kilobase deletions located 300 kilobases upstream of the engrailed-1 gene (EN1) in patients with a complex limb malformation featuring mesomelic shortening, syndactyly and ventral nails (dorsal dimelia).
Abstract: Long non-coding RNAs (lncRNAs) can be important components in gene-regulatory networks (ref. 1), but the exact nature and extent of their involvement in human Mendelian disease is largely unknown. Here we show that genetic ablation of a lncRNA locus on human chromosome 2 causes a severe congenital limb malformation. We identified homozygous 27-63-kilobase deletions located 300 kilobases upstream of the engrailed-1 gene (EN1) in patients with a complex limb malformation featuring mesomelic shortening, syndactyly and ventral nails (dorsal dimelia). Re-engineering of the human deletions in mice resulted in a complete loss of En1 expression in the limb and a double dorsal-limb phenotype that recapitulates the human disease phenotype. Genome-wide transcriptome analysis in the developing mouse limb revealed a four-exon-long non-coding transcript within the deleted region, which we named Maenli. Functional dissection of the Maenli locus showed that its transcriptional activity is required for limb-specific En1 activation in cis, thereby fine-tuning the gene-regulatory networks controlling dorso-ventral polarity in the developing limb bud. Its loss results in the En1-related dorsal ventral limb phenotype, a subset of the full En1-associated phenotype. Our findings demonstrate that mutations involving lncRNA loci can result in human Mendelian disease.

49 citations


Journal Article
23 Apr 2021-iScience
TL;DR: In this article, the authors performed an in-depth genetic analysis of chromosome 21 exploiting the genome-wide association study data, including 6,406 individuals hospitalized for COVID-19 and 902,088 controls with European genetic ancestry from the COVID-19 Host Genetics Initiative.

46 citations


Journal Article
TL;DR: The authors identify both the Sr27 resistance gene in wheat and the corresponding AvrSr27 gene in Pgt, and show that virulence to Sr27 can arise experimentally and in the field through deletion mutations, copy number variation and expression level polymorphisms at the AvrSr27 locus.
Abstract: Stem rust caused by the fungus Puccinia graminis f. sp. tritici (Pgt) is a devastating disease of the global staple crop wheat. Although this disease was largely controlled in the latter half of the twentieth century, new virulent strains of Pgt, such as Ug99, have recently evolved (refs. 1,2). These strains have caused notable losses worldwide and their continued spread threatens global wheat production. Breeding for disease resistance provides the most cost-effective control of wheat rust diseases (ref. 3). A number of rust resistance genes have been characterized in wheat and most encode immune receptors of the nucleotide-binding leucine-rich repeat (NLR) class (ref. 4), which recognize pathogen effector proteins known as avirulence (Avr) proteins (ref. 5). However, only two Avr genes have been identified in Pgt so far, AvrSr35 and AvrSr50 (refs. 6,7), and none in other cereal rusts (refs. 8,9). The Sr27 resistance gene was first identified in a wheat line carrying an introgression of the 3R chromosome from Imperial rye (ref. 10). Although not deployed widely in wheat, Sr27 is widespread in the artificial crop species Triticosecale (triticale), which is a wheat-rye hybrid and is a host for Pgt (refs. 11,12). Sr27 is effective against Ug99 (ref. 13) and other recent Pgt strains (refs. 14,15). Here, we identify both the Sr27 gene in wheat and the corresponding AvrSr27 gene in Pgt and show that virulence to Sr27 can arise experimentally and in the field through deletion mutations, copy number variation and expression level polymorphisms at the AvrSr27 locus.

45 citations


Journal Article
TL;DR: The authors identified 33 plant gene families containing intronless and intron-poor sub-families and, using RNA-seq analyses in Arabidopsis and rice, linked several of them to drought and salt stress responses.
Abstract: Eukaryotic genes can be classified into intronless (no introns), intron-poor (three or fewer introns per gene) or intron-rich. Early eukaryotic genes were mostly intron-rich, and their alternative splicing into multiple transcripts, giving rise to different proteins, might have played pivotal roles in adaptation and evolution. Interestingly, extant plant genomes contain many gene families with one or sometimes few sub-families with genes that are intron-poor or intronless, and it remains unknown when and how these intron-poor or intronless genes have originated and evolved, and what their possible functions are. In this study, we identified 33 such gene families that contained intronless and intron-poor sub-families. Intronless genes seemed to have first emerged in early land plant evolution, while intron-poor sub-families seemed first to have appeared in green algae. In contrast to intron-rich genes, intronless genes in intron-poor sub-families occurred later, and were subject to stronger functional constraints. Based on RNA-seq analyses in Arabidopsis and rice, intronless or intron-poor genes in AP2, EF-hand_7, bZIP, FAD_binding_4, STE_STE11, CAMK_CAMKL-CHK1 and C2 gene families were more likely to play a role in response to drought and salt stress, compared with intron-rich genes in the same gene families, whereas intronless genes in the B_lectin and S_locus_glycop gene family were more likely to participate in epigenetic processes and plant development. Understanding the origin and evolutionary trajectory, as well as the potential functions, of intronless and intron-poor sub-families provides further insight into plant genome evolution and the functional divergence of genes.

45 citations


Journal Article
TL;DR: The results demonstrate that high‐resolution SNP‐based GWAS enables the rapid identification of putative resistance genes and can be used to improve the efficiency of marker‐assisted selection in wheat disease resistance breeding.
Abstract: The incorporation of resistance genes into wheat commercial varieties is the ideal strategy to combat stripe or yellow rust (YR). In a search for novel resistance genes, we performed a large-scale genomic association analysis with high-density 660K single nucleotide polymorphism (SNP) arrays to determine the genetic components of YR resistance in 411 spring wheat lines. Following quality control, 371 972 SNPs were screened, covering over 50% of the high-confidence annotated gene space. Nineteen stable genomic regions harbouring 292 significant SNPs were associated with adult-plant YR resistance across nine environments. Of these, 14 SNPs were localized in the proximity of known loci widely used in breeding. Obvious candidate SNP variants were identified in certain confidence intervals, such as the cloned gene Yr18 and the major locus on chromosome 2BL, despite a large extent of linkage disequilibrium. The number of causal SNP variants was refined using an independent validation panel and consideration of the estimated functional importance of each nucleotide polymorphism. Interestingly, four natural polymorphisms causing amino acid changes in the gene TraesCS2B01G513100 that encodes a serine/threonine protein kinase (STPK) were significantly involved in YR responses. Gene expression and mutation analysis confirmed that STPK played an important role in YR resistance. PCR markers were developed to identify the favourable TraesCS2B01G513100 haplotype for marker-assisted breeding. These results demonstrate that high-resolution SNP-based GWAS enables the rapid identification of putative resistance genes and can be used to improve the efficiency of marker-assisted selection in wheat disease resistance breeding.

Journal Article
TL;DR: In this paper, the authors conducted whole genome sequencing of 80 skull-base chordomas and identified PBRM1, a SWI/SNF (SWItch/Sucrose Non-Fermentable) complex subunit gene, as a significantly mutated driver gene.
Abstract: Chordoma is a rare bone tumor with an unknown etiology and high recurrence rate. Here we conduct whole genome sequencing of 80 skull-base chordomas and identify PBRM1, a SWI/SNF (SWItch/Sucrose Non-Fermentable) complex subunit gene, as a significantly mutated driver gene. Genomic alterations in PBRM1 (12.5%) and homozygous deletions of the CDKN2A/2B locus are the most prevalent events. The combination of PBRM1 alterations and the chromosome 22q deletion, which involves another SWI/SNF gene (SMARCB1), shows strong associations with poor chordoma-specific survival (Hazard ratio [HR] = 10.55, 95% confidence interval [CI] = 2.81-39.64, p = 0.001) and recurrence-free survival (HR = 4.30, 95% CI = 2.34-7.91, p = 2.77 × 10⁻⁶). Despite the low mutation rate, extensive somatic copy number alterations frequently occur, most of which are clonal and showed highly concordant profiles between paired primary and recurrence/metastasis samples, indicating their importance in chordoma initiation. In this work, our findings provide important biological and clinical insights into skull-base chordoma.

Journal Article
Sophie Garnier1, Sophie Garnier2, Magdalena Harakalova3, Stefan Weiss4, Michal Mokry3, Vera Regitz-Zagrosek5, Christian Hengstenberg6, Christian Hengstenberg7, Thomas P. Cappola8, Richard Isnard2, Richard Isnard1, Eloisa Arbustini, Stuart A. Cook9, Stuart A. Cook10, Jessica van Setten3, Jorg J. A. Calis3, Hakon Hakonarson11, Michael Morley8, Klaus Stark7, Sanjay K Prasad, Jin Li11, Declan P. O'Regan12, Maurizia Grasso, Martina Müller-Nurasyid13, Thomas Meitinger14, Thomas Meitinger13, Jean-Philippe Empana1, Konstantin Strauch13, Konstantin Strauch15, Melanie Waldenberger, Kenneth B. Marguiles8, Christine E. Seidman16, Christine E. Seidman17, Georgios Kararigas18, Benjamin Meder19, Benjamin Meder20, Jan Haas19, Pierre Boutouyrie1, Patrick Lacolley21, Xavier Jouven1, J. Erdmann15, Stefan Blankenberg22, Thomas Wichter, Volker Ruppert, Luigi Tavazzi, Olivier Dubourg, Gérard Roizès23, Richard Dorent, Pascal de Groote, Laurent Fauchier, Jean-Noël Trochu24, Jean-François Aupetit, Zofia T. Bilińska, Marine Germain21, Uwe Völker4, Daiane Hemerich3, Ibticem Raji, Delphine Bacq-Daian25, Carole Proust21, Paloma Remior, Manuel Gómez-Bueno, K Lehnert4, Renee Maas3, Robert Olaso25, Ganapathi Varma Saripella26, Ganapathi Varma Saripella1, Stephan B. Felix4, Steven McGinn25, L. Duboscq-Bidot1, L. Duboscq-Bidot2, Alain van Mil3, Céline Besse25, Vincent Fontaine2, Vincent Fontaine1, Hélène Blanché, Flavie Ader1, Flavie Ader27, Brendan J. Keating8, Angélique Curjol, Anne Boland25, Michel Komajda1, Michel Komajda2, François Cambien21, Jean-François Deleuze25, Marcus Dörr4, Folkert W. Asselbergs3, Folkert W. Asselbergs28, Eric Villard1, Eric Villard2, David-Alexandre Trégouët21, Philippe Charron 
TL;DR: In this article, the authors conducted the largest genome-wide association study performed so far in DCM, with 2719 cases and 4440 controls in the discovery population, and identified and replicated two new DCM-associated loci on chromosome 3p25.1 and chromosome 22q11.23.
Abstract: Aims: Our objective was to better understand the genetic bases of dilated cardiomyopathy (DCM), a leading cause of systolic heart failure. Methods and results: We conducted the largest genome-wide association study performed so far in DCM, with 2719 cases and 4440 controls in the discovery population. We identified and replicated two new DCM-associated loci on chromosome 3p25.1 [lead single-nucleotide polymorphism (SNP) rs62232870, P = 8.7 × 10⁻¹¹ and 7.7 × 10⁻⁴ in the discovery and replication steps, respectively] and chromosome 22q11.23 (lead SNP rs7284877, P = 3.3 × 10⁻⁸ and 1.4 × 10⁻³ in the discovery and replication steps, respectively), while confirming two previously identified DCM loci on chromosomes 10 and 1, BAG3 and HSPB7. A genetic risk score constructed from the number of risk alleles at these four DCM loci revealed a 27% increased risk of DCM for individuals with 8 risk alleles compared to individuals with 5 risk alleles (median of the referral population). In silico annotation and functional 4C-sequencing analyses on iPSC-derived cardiomyocytes identify SLC6A6 as the most likely DCM gene at the 3p25.1 locus. This gene encodes a taurine transporter whose involvement in myocardial dysfunction and DCM is supported by numerous observations in humans and animals. At the 22q11.23 locus, in silico and data mining annotations, and to a lesser extent functional analysis, strongly suggest SMARCB1 as the candidate culprit gene. Conclusion: This study provides a better understanding of the genetic architecture of DCM and sheds light on novel biological pathways underlying heart failure.
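
The genetic risk score described in this abstract is a count of risk alleles across the four DCM loci. A minimal sketch of such an unweighted count follows, assuming each genotype is coded as 0, 1 or 2 risk alleles; the function name, the dictionary layout and the example individual are hypothetical and not taken from the study.

```python
from typing import Mapping

# The four loci named in the abstract. Each genotype is coded as the number
# of risk alleles carried at that locus (0, 1 or 2), which is an assumption
# about the encoding rather than something the abstract specifies.
DCM_LOCI = ("3p25.1", "22q11.23", "BAG3", "HSPB7")


def risk_allele_count(genotypes: Mapping[str, int]) -> int:
    """Unweighted genetic risk score: total risk alleles across the loci."""
    total = 0
    for locus in DCM_LOCI:
        count = genotypes[locus]
        if count not in (0, 1, 2):
            raise ValueError(f"{locus}: expected 0, 1 or 2 risk alleles, got {count}")
        total += count
    return total


# Hypothetical individual, not taken from the study.
person = {"3p25.1": 2, "22q11.23": 1, "BAG3": 2, "HSPB7": 0}
print(risk_allele_count(person))  # 5
```

With four biallelic loci the score ranges from 0 to 8, so the comparison in the abstract (8 versus 5 risk alleles) contrasts the maximum possible score with the reported median.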

Journal Article
TL;DR: In this article, the authors show that the single expressed antigen-coding gene displays a specific inter-chromosomal interaction with a major messenger RNA splicing locus, where antigen transcription and splicing occur in a specific nuclear compartment.
Abstract: Highly selective gene expression is a key requirement for antigenic variation in several pathogens, allowing evasion of host immune responses and maintenance of persistent infections (ref. 1). African trypanosomes, parasites that cause lethal diseases in humans and livestock, employ an antigenic variation mechanism that involves monogenic antigen expression from a pool of >2,600 antigen-coding genes (ref. 2). In other eukaryotes, the expression of individual genes can be enhanced by mechanisms involving the juxtaposition of otherwise distal chromosomal loci in the three-dimensional nuclear space (refs. 3-5). However, trypanosomes lack classical enhancer sequences or regulated transcription initiation (refs. 6,7). In this context, it has remained unclear how genome architecture contributes to monogenic transcription elongation and transcript processing. Here, we show that the single expressed antigen-coding gene displays a specific inter-chromosomal interaction with a major messenger RNA splicing locus. Chromosome conformation capture (Hi-C) revealed a dynamic reconfiguration of this inter-chromosomal interaction upon activation of another antigen. Super-resolution microscopy showed the interaction to be heritable and splicing dependent. We found a specific association of the two genomic loci with the antigen exclusion complex, whereby VSG exclusion 1 (VEX1) occupied the splicing locus and VEX2 occupied the antigen-coding locus. Following VEX2 depletion, loss of monogenic antigen expression was accompanied by increased interactions between previously silent antigen genes and the splicing locus. Our results reveal a mechanism to ensure monogenic expression, where antigen transcription and messenger RNA splicing occur in a specific nuclear compartment. These findings suggest a new means of post-transcriptional gene regulation.

Journal Article
TL;DR: This work performed the first genome-wide association study of Latino PD patients from South America, demonstrating that SNCA plays a significant role in PD etiology in a Latino cohort and identifying a suggestive locus near NRROS on chromosome 3 that appeared to be driven by Peruvian subjects.
Abstract: OBJECTIVE: This work was undertaken in order to identify Parkinson's disease (PD) risk variants in a Latino cohort, to describe the overlap in the genetic architecture of PD in Latinos compared to European-ancestry subjects, and to increase the diversity in PD genome-wide association (GWAS) data. METHODS: We genotyped and imputed 1,497 PD cases and controls recruited from nine clinical sites across South America. We performed a GWAS using logistic mixed models; variants with a p-value <1 × 10⁻⁵ were tested in a replication cohort of 1,234 self-reported Latino PD cases and 439,522 Latino controls from 23andMe, Inc. We also performed an admixture mapping analysis where local ancestry blocks were tested for association with PD status. RESULTS: One locus, SNCA, achieved genome-wide significance (p-value <5 × 10⁻⁸); rs356182 achieved genome-wide significance in both the discovery and the replication cohorts (discovery, G allele: 1.58 OR, 95% CI 1.35-1.86, p-value 2.48 × 10⁻⁸; 23andMe, G allele: 1.26 OR, 95% CI 1.16-1.37, p-value 4.55 × 10⁻⁸). In our admixture mapping analysis, a locus on chromosome 14, containing the gene STXBP6, achieved significance in a joint test of ancestries and in the Native American single-ancestry test (p-value <5 × 10⁻⁵). A second locus on chromosome 6, containing the gene RPS6KA2, achieved significance in the African single-ancestry test (p-value <5 × 10⁻⁵). INTERPRETATION: This study demonstrated the importance of the SNCA locus for the etiology of PD in Latinos. By leveraging the demographic history of our cohort via admixture mapping, we identified two potential PD risk loci that merit further study. ANN NEUROL 2021;90:353-365.
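
The odds ratios, confidence intervals and p-values reported for rs356182 can be cross-checked with a simple normal (Wald) approximation on the log-odds scale. This is only a rough consistency check under that assumption: the study itself used logistic mixed models, the inputs here are rounded, and the function name is hypothetical.

```python
import math


def wald_p_from_or_ci(odds_ratio: float, ci_low: float, ci_high: float) -> float:
    """Approximate two-sided p-value from an odds ratio and its 95% CI."""
    # On the log scale a 95% Wald interval spans 2 * 1.96 standard errors.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.959964)
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


# Discovery-cohort values for rs356182 quoted in the abstract.
p_discovery = wald_p_from_or_ci(1.58, 1.35, 1.86)
print(f"{p_discovery:.2e}")
```

Because the published odds ratio and interval are rounded to two decimals, the result should only be expected to land in the same order of magnitude as the reported p-value, not to match it exactly.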

Journal Article
08 Jan 2021-eLife
TL;DR: Using new methods based on the statistics of single-nucleotide polymorphism (SNP) splits, the authors show that robust whole-genome phylogenies of bacterial strains reflect distributions of recombination rates rather than clonal ancestry.
Abstract: Although recombination is accepted to be common in bacteria, for many species robust phylogenies with well-resolved branches can be reconstructed from whole genome alignments of strains, and these are generally interpreted to reflect clonal relationships. Using new methods based on the statistics of single-nucleotide polymorphism (SNP) splits, we show that this interpretation is incorrect. For many species, each locus has recombined many times along its line of descent, and instead of many loci supporting a common phylogeny, the phylogeny changes many thousands of times along the genome alignment. Analysis of the patterns of allele sharing among strains shows that bacterial populations cannot be approximated as either clonal or freely recombining, but are structured such that recombination rates between lineages vary over several orders of magnitude, with a unique pattern of rates for each lineage. Thus, rather than reflecting clonal ancestry, whole genome phylogenies reflect distributions of recombination rates.
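
To make the idea of SNP splits concrete: a biallelic SNP divides the strains into two groups, and two such splits can only sit on the same tree if they do not conflict. The sketch below uses the classical four-gamete test as a stand-in illustration of that idea; it is not the authors' method, and the function name and the toy alignment columns are hypothetical.

```python
def compatible(col_a: str, col_b: str) -> bool:
    """Four-gamete test for two biallelic SNP columns over the same strains.

    Returns True if the two splits can be explained by one tree with a single
    mutation per site, i.e. fewer than four allele combinations are observed.
    """
    if len(col_a) != len(col_b):
        raise ValueError("columns must cover the same strains")
    if len(set(col_a)) != 2 or len(set(col_b)) != 2:
        raise ValueError("each column must be biallelic")
    observed = set(zip(col_a, col_b))
    return len(observed) < 4


# Toy alignment columns for five strains (hypothetical data).
snp1 = "AAGGG"
snp2 = "CCCTT"
snp3 = "CTCTC"

print(compatible(snp1, snp2))  # True: the splits are nested
print(compatible(snp1, snp3))  # False: all four allele combinations occur
```

Pairs of sites that fail this test cannot share a single phylogeny without recombination or repeated mutation, which is the kind of signal that split-based statistics aggregate along a genome alignment.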

Journal Article
TL;DR: The data show that MACROD2, GOSR2, WNT3 and MSX1 play an essential functional role in heart development at the embryonic and newborn stages, and variant rs870142 related to septal defects is proposed to influence expression of MSX1.
Abstract: Genetic factors undoubtedly affect the development of congenital heart disease (CHD) but still remain ill defined. We sought to identify genetic risk factors associated with CHD and to accomplish a functional analysis of SNP-carrying genes. We performed a genome-wide association study (GWAS) of 4034 White patients with CHD and 8486 healthy controls. One SNP on chromosome 5q22.2 reached genome-wide significance across all CHD phenotypes and was also indicative for septal defects. One region on chromosome 20p12.1 pointing to the MACROD2 locus identified 4 highly significant SNPs in patients with transposition of the great arteries (TGA). Three highly significant risk variants on chromosome 17q21.32 within the GOSR2 locus were detected in patients with anomalies of thoracic arteries and veins (ATAV). Genetic variants associated with ATAV are suggested to influence the expression of WNT3, and the variant rs870142 related to septal defects is proposed to influence the expression of MSX1. We analyzed the expression of all 4 genes during cardiac differentiation of human and murine induced pluripotent stem cells in vitro and by single-cell RNA-Seq analyses of developing murine and human hearts. Our data show that MACROD2, GOSR2, WNT3, and MSX1 play an essential functional role in heart development at the embryonic and newborn stages.

Journal Article
TL;DR: The HD gene, later renamed HTT, and its vicinity on chromosome 4p16.3 served as a proving ground for technologies to clone and sequence genes based on their genomic location, helping to fuel the Human Genome Project.
Abstract: Historically, Huntington's disease (HD; OMIM #143100) has played an important role in the enormous advances in human genetics seen over the past four decades. This familial neurodegenerative disorder involves variable onset followed by consistent worsening of characteristic abnormal movements along with cognitive decline and psychiatric disturbances. HD was the first autosomal disease for which the genetic defect was assigned to a position on the human chromosomes using only genetic linkage analysis with common DNA polymorphisms. This discovery set off a multitude of similar studies in other diseases, while the HD gene, later renamed HTT, and its vicinity in chromosome 4p16.3 then acted as a proving ground for development of technologies to clone and sequence genes based upon their genomic location, with the growing momentum of such advances fueling the Human Genome Project. The identification of the HD gene has not yet led to an effective treatment, but continued human genetic analysis of genotype-phenotype relationships in large HD subject populations, first at the HTT locus and subsequently genome-wide, has provided insights into pathogenesis that divide the course of the disease into two sequential, mechanistically distinct components.

Posted Content
02 Mar 2021-bioRxiv
TL;DR: In this article, the authors harmonized and integrated 8,727 RNA-seq samples with accompanying genotype data from multiple brain regions from 14 datasets and performed both cis- and trans-expression quantitative trait locus (eQTL) mapping.
Abstract: Gaining insight into the downstream consequences of non-coding variants is an essential step towards the identification of therapeutic targets from genome-wide association study (GWAS) findings. Here we have harmonized and integrated 8,727 RNA-seq samples with accompanying genotype data from multiple brain regions from 14 datasets. This sample size enabled us to perform both cis- and trans-expression quantitative trait locus (eQTL) mapping. Upon comparing the brain cortex cis-eQTLs (for 12,307 unique genes at FDR We inferred the brain cell type for 1,515 cis-eQTLs by using cell type proportion information. We conducted Mendelian Randomization on 31 brain-related traits using cis-eQTLs as instruments and found 159 significant findings that also passed colocalization. Furthermore, two multiple sclerosis (MS) findings had cell-type-specific signals, a neuron-specific cis-eQTL for CYP24A1 and a macrophage-specific cis-eQTL for CLECL1. To further interpret GWAS hits, we performed trans-eQTL analysis. We identified 2,589 trans-eQTLs (at FDR We also generated a brain-specific gene-coregulation network that we used to predict which genes have brain-specific functions, and to perform a novel network analysis of Alzheimer’s disease (AD), amyotrophic lateral sclerosis (ALS), multiple sclerosis (MS) and Parkinson’s disease (PD) GWAS data. This resulted in the identification of distinct sets of genes that show significantly enriched co-regulation with genes inside the associated GWAS loci, and which might reflect drivers of these diseases.


Journal ArticleDOI
TL;DR: In this article, the authors integrate genome-scale CRISPR loss-of-function screens and eQTLs in diverse cell types and tissues to pinpoint genes underlying COVID-19 risk.
Abstract: To date, the locus with the most robust human genetic association to COVID-19 severity is 3p21.31. Here, we integrate genome-scale CRISPR loss-of-function screens and eQTLs in diverse cell types and tissues to pinpoint genes underlying COVID-19 risk. Our findings identify SLC6A20 and CXCR6 as putative causal genes that modulate COVID-19 risk and highlight the usefulness of this integrative approach to bridge the divide between correlational and causal studies of human biology.

Journal ArticleDOI
TL;DR: In this article, a novel locus, Time of Flowering 5 (Tof5), was identified, which promotes flowering and enhances adaptation to high latitudes in both wild and cultivated soybean.

Journal ArticleDOI
Abstract: Until recently, the field of sex chromosome evolution has been dominated by the canonical unidirectional scenario, first developed by Muller in 1918. This model postulates that sex chromosomes emerge from autosomes by acquiring a sex-determining locus. Recombination reduction then expands outwards from this locus, to maintain its linkage with sexually antagonistic/advantageous alleles, resulting in Y or W degeneration and potentially culminating in their disappearance. Based mostly on empirical vertebrate research, we challenge and expand each conceptual step of this canonical model and present observations by numerous experts in two parts of a theme issue of Phil. Trans. R. Soc. B. We suggest that greater theoretical and empirical insights into the events at the origins of sex-determining genes (rewiring of the gonadal differentiation networks), and a better understanding of the evolutionary forces responsible for recombination suppression are required. Among others, crucial questions are: Why do sex chromosome differentiation rates and the evolution of gene dose regulatory mechanisms between male versus female heterogametic systems not follow earlier theory? Why do several lineages not have sex chromosomes? And: What are the consequences of the presence of (differentiated) sex chromosomes for individual fitness, evolvability, hybridization and diversification? We conclude that the classical scenario appears too reductionistic. Instead of being unidirectional, we show that sex chromosome evolution is more complex than previously anticipated and principally forms networks, interconnected to potentially endless outcomes with restarts, deletions and additions of new genomic material. This article is part of the theme issue 'Challenging the paradigm in sex chromosome evolution: empirical and theoretical insights with a focus on vertebrates (Part II)'.

Journal ArticleDOI
TL;DR: In this article, the authors identify two kidney disease genes Dipeptidase 1 (DPEP1) and Charged Multivesicular Body Protein 1 A (CHMP1A) via the triangulation of kidney function GWAS, human kidney expression, and methylation quantitative trait loci.
Abstract: Genome-wide association studies (GWAS) have identified loci for kidney disease, but the causal variants, genes, and pathways remain unknown. Here we identify two kidney disease genes, Dipeptidase 1 (DPEP1) and Charged Multivesicular Body Protein 1A (CHMP1A), via the triangulation of kidney function GWAS, human kidney expression, and methylation quantitative trait loci. Using single-cell chromatin accessibility and genome editing, we fine map the region that controls the expression of both genes. Mouse genetic models demonstrate the causal roles of both genes in kidney disease. Cellular studies indicate that both Dpep1 and Chmp1a are important regulators of a single pathway, ferroptosis, and lead to kidney disease development by altering cellular iron trafficking. Identifying causal variants and genes is an essential step in interpreting GWAS loci. Here, the authors investigate a kidney disease GWAS locus with functional genomics data, CRISPR editing and mouse experiments to identify DPEP1 and CHMP1A as putative kidney disease genes via ferroptosis.

Journal ArticleDOI
TL;DR: In this paper, a multiethnic genome-wide association meta-analysis was conducted, combining results from the GERA and UK Biobank cohorts, and tested for replication in the 23andMe research cohort.
Abstract: Cataract is the leading cause of blindness among the elderly worldwide and cataract surgery is one of the most common operations performed in the United States. As the genetic etiology of cataract formation remains unclear, we conducted a multiethnic genome-wide association meta-analysis, combining results from the GERA and UK Biobank cohorts, and tested for replication in the 23andMe research cohort. We report 54 genome-wide significant loci, 37 of which were novel. Sex-stratified analyses identified CASP7 as an additional novel locus specific to women. We show that genes within or near 80% of the cataract-associated loci are significantly expressed and/or enriched-expressed in the mouse lens across various spatiotemporal stages as per iSyTE analysis. Furthermore, iSyTE shows 32 candidate genes in the associated loci have altered gene expression in 9 different gene perturbation mouse models of lens defects/cataract, suggesting their relevance to lens biology. Our work provides further insight into the complex genetic architecture of cataract susceptibility.

Journal ArticleDOI
TL;DR: In this paper, a quantitative trait locus map of the number of surviving offspring per F2 female detected a single, large-effect locus near Ectodysplasin (Eda), a gene having an ancient freshwater allele causing reduced bony armor and other changes.
Abstract: Mutations of small effect underlie most adaptation to new environments, but beneficial variants with large fitness effects are expected to contribute under certain conditions. Genes and genomic regions having large effects on phenotypic differences between populations are known from numerous taxa, but fitness effect sizes have rarely been estimated. We mapped fitness over a generation in an F2 intercross between a marine and a lake stickleback population introduced to a freshwater pond. A quantitative trait locus map of the number of surviving offspring per F2 female detected a single, large-effect locus near Ectodysplasin (Eda), a gene having an ancient freshwater allele causing reduced bony armor and other changes. F2 females homozygous for the freshwater allele had twice the number of surviving offspring as homozygotes for the marine allele, producing a large selection coefficient, s = 0.50 ± 0.09 SE. Correspondingly, the frequency of the freshwater allele increased from 0.50 in F2 mothers to 0.58 in surviving offspring. We compare these results to allele frequency changes at the Eda gene in an Alaskan lake population colonized by marine stickleback in the 1980s. The frequency of the freshwater Eda allele rose steadily over multiple generations and reached 95% within 20 y, yielding a similar estimate of selection, s = 0.49 ± 0.05, but a different degree of dominance. These findings are consistent with other studies suggesting strong selection on this gene (and/or linked genes) in fresh water. Selection on ancient genetic variants carried by colonizing ancestors is likely to increase the prevalence of large-effect fitness variants in adaptive evolution.
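
The arithmetic behind the reported selection coefficient can be sketched with the textbook one-locus viability model. This is a minimal illustration, not the authors' estimation procedure: the function names are hypothetical, and the dominance coefficient h = 0.5 (additive) is an assumption, since the abstract does not state the dominance in the pond experiment. With the favoured homozygote at relative fitness 1 and the other homozygote at 1 - s, a twofold difference in surviving offspring gives s = 0.5, and one generation of selection from a starting frequency of 0.50 moves the allele to roughly 0.58 under the assumed h.

```python
def selection_coefficient(w_favoured: float, w_disfavoured: float) -> float:
    """Return s such that the disfavoured homozygote has relative fitness 1 - s."""
    return 1.0 - w_disfavoured / w_favoured


def next_allele_frequency(p: float, s: float, h: float) -> float:
    """One generation of selection on a biallelic locus.

    p is the frequency of the favoured allele; genotype fitnesses are
    1 (favoured homozygote), 1 - h*s (heterozygote), 1 - s (other homozygote).
    h is an assumed dominance coefficient, not a value from the abstract.
    """
    q = 1.0 - p
    w_hom_fav, w_het, w_hom_dis = 1.0, 1.0 - h * s, 1.0 - s
    mean_fitness = p * p * w_hom_fav + 2 * p * q * w_het + q * q * w_hom_dis
    return (p * p * w_hom_fav + p * q * w_het) / mean_fitness


# Freshwater homozygotes leave twice as many surviving offspring as marine homozygotes.
s = selection_coefficient(w_favoured=2.0, w_disfavoured=1.0)
p_next = next_allele_frequency(p=0.50, s=s, h=0.5)
print(f"s = {s:.2f}, next-generation frequency = {p_next:.3f}")
```

Changing h shifts the predicted frequency change while leaving s unchanged, which is one way two data sets can yield similar selection estimates but different degrees of dominance.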

Journal ArticleDOI
01 Jan 2021-Leukemia
TL;DR: It is demonstrated that aberrant PROM1/CD133 expression is essential for leukemic cell growth, mediated by direct binding of MLL-AF4.
Abstract: MLL gene rearrangements (MLLr) are a common cause of aggressive, incurable acute lymphoblastic leukemias (ALL) in infants and children, most of which originate in utero. The most common MLLr produces an MLL-AF4 fusion protein. MLL-AF4 promotes leukemogenesis by activating key target genes, mainly through recruitment of DOT1L and increased histone H3 lysine-79 methylation (H3K79me2/3). One key MLL-AF4 target gene is PROM1, which encodes CD133 (Prominin-1). CD133 is a pentaspan transmembrane glycoprotein that represents a potential pan-cancer target as it is found on multiple cancer stem cells. Here we demonstrate that aberrant PROM1/CD133 expression is essential for leukemic cell growth, mediated by direct binding of MLL-AF4. Activation is controlled by an intragenic H3K79me2/3 enhancer element (KEE) leading to increased enhancer–promoter interactions between PROM1 and the nearby gene TAPT1. This dual locus regulation is reflected in a strong correlation of expression in leukemia. We find that in PROM1/CD133 non-expressing cells, the PROM1 locus is repressed by polycomb repressive complex 2 (PRC2) binding, associated with reduced expression of TAPT1, partially due to loss of interactions with the PROM1 locus. Together, these results provide the first detailed analysis of PROM1/CD133 regulation that explains CD133 expression in MLLr ALL.

Journal ArticleDOI
TL;DR: This study was undertaken to identify susceptibility loci for cluster headache and obtain insights into relevant disease pathways.
Abstract: OBJECTIVE: This study was undertaken to identify susceptibility loci for cluster headache and obtain insights into relevant disease pathways. METHODS: We carried out a genome-wide association study, where 852 UK and 591 Swedish cluster headache cases were compared with 5,614 and 1,134 controls, respectively. Following quality control and imputation, single variant association testing was conducted using a logistic mixed model for each cohort. The 2 cohorts were subsequently combined in a merged analysis. Downstream analyses, such as gene-set enrichment, functional variant annotation, prediction and pathway analyses, were performed. RESULTS: Initial independent analysis identified 2 replicable cluster headache susceptibility loci on chromosome 2. A merged analysis identified an additional locus on chromosome 1 and confirmed a locus significant in the UK analysis on chromosome 6, which overlaps with a previously known migraine locus. The lead single nucleotide polymorphisms were rs113658130 (p = 1.92 × 10⁻¹⁷, odds ratio [OR] = 1.51, 95% confidence interval [CI] = 1.37-1.66) and rs4519530 (p = 6.98 × 10⁻¹⁷, OR = 1.47, 95% CI = 1.34-1.61) on chromosome 2, rs12121134 on chromosome 1 (p = 1.66 × 10⁻⁸, OR = 1.36, 95% CI = 1.22-1.52), and rs11153082 (p = 1.85 × 10⁻⁸, OR = 1.30, 95% CI = 1.19-1.42) on chromosome 6. Downstream analyses implicated immunological processes in the pathogenesis of cluster headache. INTERPRETATION: We identified and replicated several genome-wide significant associations supporting a genetic predisposition in cluster headache in a genome-wide association study involving 1,443 cases. Replication in larger independent cohorts combined with comprehensive phenotyping, in relation to, for example, treatment response and cluster headache subtypes, could provide unprecedented insights into genotype-phenotype correlations and the pathophysiological pathways underlying cluster headache. ANN NEUROL 2021;90:193-202.
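
As a rough consistency check on figures like those reported above, an odds ratio and its 95% confidence interval can be converted back into a standard error, a Wald z statistic, and a two-sided p-value. The sketch below is an illustration only: the function name is hypothetical, it assumes a symmetric Wald interval on the log-odds scale, and the study itself used a logistic mixed model, so agreement with the published p-values is approximate (the inputs are rounded to two decimals).

```python
import math


def wald_from_or_ci(odds_ratio: float, ci_low: float, ci_high: float,
                    z_crit: float = 1.959964):
    """Recover (standard error, z, two-sided p) from an OR and its 95% CI.

    Assumes the interval is exp(log(OR) +/- z_crit * SE), i.e. a Wald
    interval on the log-odds scale; this is an assumption, not a detail
    given in the abstract.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided normal tail probability: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p_value


# Lead SNP on chromosome 2 as reported: OR = 1.51, 95% CI = 1.37-1.66.
se, z, p = wald_from_or_ci(1.51, 1.37, 1.66)
print(f"SE(log OR) = {se:.4f}, z = {z:.2f}, p = {p:.2e}")
```

The recovered p-value lands within the same order of magnitude as the reported one, which is about as close as rounded inputs allow.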

Journal ArticleDOI
TL;DR: Increasing the phylogenetic density of the target reference file results in improved recovery of target capture loci, and is a drop‐in replacement for the original Angiosperms353 file in HybPiper analyses.
Abstract: PREMISE: Universal target enrichment kits maximize utility across wide evolutionary breadth while minimizing the number of baits required to create a cost-efficient kit. The Angiosperms353 kit has been successfully used to capture loci throughout the angiosperms, but the default target reference file includes sequence information from only 6-18 taxa per locus. Consequently, reads sequenced from on-target DNA molecules may fail to map to references, resulting in fewer on-target reads for assembly, and reducing locus recovery. METHODS: We expanded the Angiosperms353 target file, incorporating sequences from 566 transcriptomes to produce a 'mega353' target file, with each locus represented by 17-373 taxa. This mega353 file is a drop-in replacement for the original Angiosperms353 file in HybPiper analyses. We provide tools to subsample the file based on user-selected taxon groups, and to incorporate other transcriptome or protein-coding gene data sets. RESULTS: Compared to the default Angiosperms353 file, the mega353 file increased the percentage of on-target reads by an average of 32%, increased locus recovery at 75% length by 49%, and increased the total length of the concatenated loci by 29%. DISCUSSION: Increasing the phylogenetic density of the target reference file results in improved recovery of target capture loci. The mega353 file and associated scripts are available at: https://github.com/chrisjackson-pellicle/NewTargets.

Journal ArticleDOI
TL;DR: In this paper, the authors report a self-compatible diploid potato, RH89-039-16 (RH), which can efficiently induce a mating transition from self-incompatibility to self-compatibility when crossed to self-incompatible lines.
Abstract: Potato is the third most important staple food crop. To address challenges associated with global food security, a hybrid potato breeding system, aimed at converting potato from a tuber-propagated tetraploid crop into a seed-propagated diploid crop through crossing inbred lines, is under development. However, most diploid potatoes are self-incompatible, which is a major obstacle to developing inbred lines. Here, we report on a self-compatible diploid potato, RH89-039-16 (RH), which can efficiently induce a mating transition from self-incompatibility to self-compatibility when crossed to self-incompatible lines. We identify the S-locus inhibitor (Sli) gene in RH, capable of interacting with multiple allelic variants of the pistil-specific S-ribonucleases (S-RNases). Further, the Sli gene functions like a general S-RNase inhibitor to impart self-compatibility (SC) to RH and other self-incompatible potatoes. Discovery of Sli now offers a path forward for the diploid hybrid breeding program. Diploid potatoes are typically self-incompatible, complicating efforts to breed diploid cultivars. Here the authors report map-based cloning of the S-locus inhibitor (Sli) gene in potato, which encodes a non-S-locus F-box protein that is expressed in pollen and can function like a general S-RNase inhibitor to overcome self-incompatibility.