UMAP reveals cryptic population structure and phenotype heterogeneity in large genomic cohorts.
TLDR
Uniform manifold approximation and projection (UMAP), a non-linear dimension reduction tool, is applied to three well-studied genotype datasets, revealing overlooked subpopulations within the American Hispanic population, fine-scale relationships between geography, genotypes, and phenotypes in the UK population, and cryptic structure in the Thousand Genomes Project data.
Abstract:
Human populations feature both discrete and continuous patterns of variation. Current analysis approaches struggle to jointly identify these patterns because of modelling assumptions, mathematical constraints, or numerical challenges. Here we apply uniform manifold approximation and projection (UMAP), a non-linear dimension reduction tool, to three well-studied genotype datasets and discover overlooked subpopulations within the American Hispanic population, fine-scale relationships between geography, genotypes, and phenotypes in the UK population, and cryptic structure in the Thousand Genomes Project data. This approach is well-suited to the influx of large and diverse data and opens new lines of inquiry in population-scale datasets.
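As a rough illustration of why a neighbour-based method can pick up discrete structure in genotype data, the sketch below builds a k-nearest-neighbour graph, which is the first stage of UMAP, on simulated genotypes. It is not the authors' pipeline and it does not compute a UMAP embedding. The two populations, their allele frequencies, the sample sizes, and the function names `simulate_genotypes` and `knn_graph` are all hypothetical, and it uses only the Python standard library. A real analysis would typically use a dedicated UMAP implementation on real genotype data.

```python
import random


def simulate_genotypes(freqs, n_samples, rng):
    """Draw diploid genotypes (0, 1, or 2 alternate alleles) for each SNP."""
    return [
        [int(rng.random() < p) + int(rng.random() < p) for p in freqs]
        for _ in range(n_samples)
    ]


def knn_graph(rows, k):
    """For each sample, return the indices of its k nearest neighbours
    by squared Euclidean distance (brute force, quadratic in sample count)."""
    neighbours = []
    for i, a in enumerate(rows):
        dists = [
            (sum((x - y) ** 2 for x, y in zip(a, b)), j)
            for j, b in enumerate(rows)
            if j != i
        ]
        dists.sort()
        neighbours.append([j for _, j in dists[:k]])
    return neighbours


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n_snps, n_per_pop, k = 500, 30, 5

# Hypothetical allele frequencies: population B is a perturbed copy of A.
freqs_a = [rng.uniform(0.05, 0.95) for _ in range(n_snps)]
freqs_b = [min(0.95, max(0.05, p + rng.uniform(-0.4, 0.4))) for p in freqs_a]

genotypes = (
    simulate_genotypes(freqs_a, n_per_pop, rng)
    + simulate_genotypes(freqs_b, n_per_pop, rng)
)
labels = [0] * n_per_pop + [1] * n_per_pop

graph = knn_graph(genotypes, k)

# Fraction of neighbour edges that stay within the same simulated population.
same = sum(labels[i] == labels[j] for i, nbrs in enumerate(graph) for j in nbrs)
same_pop_fraction = same / (len(graph) * k)
print(f"within-population neighbour edges: {same_pop_fraction:.2f}")
```

In this toy setting most neighbour edges connect samples from the same simulated population, which is the kind of local structure UMAP preserves when it lays the graph out in two dimensions. Shrinking the frequency perturbation makes the populations blend into a continuum instead.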