Open Access Journal Article

FlashPCA2: principal component analysis of Biobank-scale genotype datasets

Gad Abraham, +2 more
01 Sep 2017 - Vol. 33, Iss. 17, pp. 2776-2778
TLDR
This work presents FlashPCA2, a tool that can perform partial PCA on 1 million individuals faster than competing approaches, while requiring substantially less memory.
Abstract
Motivation: Principal component analysis (PCA) is a crucial step in quality control of genomic data and a common approach for understanding population genetic structure. With the advent of large genotyping studies involving hundreds of thousands of individuals, standard approaches are no longer feasible. However, when the full decomposition is not required, substantial computational savings can be made.

Results: We present FlashPCA2, a tool that can perform partial PCA on 1 million individuals faster than competing approaches, while requiring substantially less memory.

Availability and implementation: https://github.com/gabraham/flashpca

Contact: gad.abraham@unimelb.edu.au

Supplementary information: Supplementary data are available at Bioinformatics online.
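To illustrate why a partial decomposition can be much cheaper than a full one, here is a minimal sketch in Python with NumPy. It is not FlashPCA2's implementation, and the abstract does not specify the algorithm used; the sketch swaps in a generic block power (subspace) iteration. The function names `standardize` and `partial_pca`, the iteration count, and the simulated three-population genotype matrix are all assumptions made for the example. The point it shows is that the top k components can be recovered using only products with the genotype matrix, without forming or fully decomposing the individuals-by-individuals matrix.

```python
import numpy as np


def standardize(genotypes):
    """Center and scale each SNP (column); drop monomorphic SNPs."""
    X = genotypes - genotypes.mean(axis=0)
    sd = X.std(axis=0)
    keep = sd > 0
    return X[:, keep] / sd[keep]


def partial_pca(genotypes, k, n_iter=50, seed=0):
    """Top-k eigenpairs of X X^T / p via block power (subspace) iteration.

    Illustrative only: a generic method, not the algorithm used by FlashPCA2.
    """
    rng = np.random.default_rng(seed)
    X = standardize(genotypes)
    n, p = X.shape
    # Random orthonormal starting basis of k columns.
    Q, _ = np.linalg.qr(rng.standard_normal((n, k)))
    for _ in range(n_iter):
        # Apply X X^T through two thin products; the n x n matrix is never built.
        Q, _ = np.linalg.qr(X @ (X.T @ Q))
    # Rayleigh-Ritz step: solve the small k x k eigenproblem in the subspace.
    B = X.T @ Q
    evals, V = np.linalg.eigh(B.T @ B / p)  # eigh returns ascending order
    order = np.argsort(evals)[::-1]
    return Q @ V[:, order], evals[order]


# Simulated data (an assumption for the example): 3 populations of 20
# individuals each, 500 SNPs, with independent allele frequencies per population.
rng = np.random.default_rng(1)
freqs = rng.uniform(0.1, 0.9, size=(3, 500))
G = np.vstack([rng.binomial(2, f, size=(20, 500)) for f in freqs]).astype(float)

U, evals = partial_pca(G, k=2)
```

Each iteration costs on the order of n·p·k operations for n individuals and p SNPs, whereas a full eigendecomposition of the n-by-n matrix scales cubically in n, which is where the savings come from when k is small.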

