Open Access Proceedings Article

LAI-Net: Local-Ancestry Inference with Neural Networks

TLDR
LAI-Net, the first neural-network-based LAI method, achieves accuracy competitive with state-of-the-art methods and is robust to missing or noisy data, while using only a small number of layers.
Abstract
Local-ancestry inference (LAI), also referred to as ancestry deconvolution, provides high-resolution ancestry estimation along the human genome. In both research and industry, LAI is emerging as a critical step in DNA sequence analysis, with applications ranging from polygenic risk scores (used to predict traits in embryos and disease risk in adults) to genome-wide association studies, and from pharmacogenomics to inference of human population history. While many LAI methods have been developed, advances in computing hardware (GPUs) combined with machine learning techniques, such as neural networks, are enabling the development of new methods that are fast, robust, and easily shared and stored. In this paper, we develop the first neural-network-based LAI method, named LAI-Net, which provides accuracy competitive with state-of-the-art methods and robustness to missing or noisy data, while having a small number of layers.

Citations

Supplementary figures for the article: Case-control admixture mapping in Latino populations enriches for known asthma-associated genes

TL;DR: Case-control admixture mapping is a promising strategy for identifying novel asthma-associated loci in Latino populations and implicates genetic variation at 6q15 and 8q12 regions with asthma susceptibility.
Posted Content

Neural ADMIXTURE: rapid population clustering with autoencoders

TL;DR: Neural ADMIXTURE is a neural network autoencoder that follows the same modeling assumptions as ADMIXTURE, providing similar (or better) clustering while reducing compute time by orders of magnitude.
Posted Content

XGMix: Local-Ancestry Inference With Stacked XGBoost

TL;DR: This work presents XGMix, a method based on gradient-boosted trees that is accurate, simple to use, and fast to train, taking minutes on consumer-level laptops.
Posted Content

High Resolution Ancestry Deconvolution for Next Generation Genomic Data

TL;DR: This paper presents a set of algorithms that address each of these points, achieving higher accuracy and faster computational performance than any existing LAI method, while also enabling portable models that are particularly useful when training data are not shareable due to privacy or other restrictions.
References
Journal Article

Deep learning

TL;DR: Deep learning is making major advances in solving problems that have resisted the best attempts of the artificial intelligence community for many years, and will have many more successes in the near future because it requires very little engineering by hand and can easily take advantage of increases in the amount of available computation and data.
Journal Article

A global reference for human genetic variation.

Adam Auton, +517 more - 01 Oct 2015
TL;DR: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations, and has reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.
Journal Article

Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering.

TL;DR: This work presents a new method and software for inference of haplotype phase and missing data that can accurately phase data from whole-genome association studies, and presents the first comparison of haplotype-inference methods for real and simulated data sets with thousands of genotyped individuals.
Journal Article

Worldwide human relationships inferred from genome-wide patterns of variation.

TL;DR: A pattern of ancestral allele frequency distributions that reflects variation in population dynamics among geographic regions is observed and is consistent with the hypothesis of a serial founder effect with a single origin in sub-Saharan Africa.
Journal Article

A linear complexity phasing method for thousands of genomes

TL;DR: A method for estimating haplotypes, using genotype data from unrelated samples or small nuclear families, that leads to improved accuracy and speed compared to several widely used methods is presented.