A beta-mixture quantile normalization method for correcting probe design bias in Illumina Infinium 450 k DNA methylation data
Andrew E. Teschendorff, Francesco Marabita, Matthias Lechner, Thomas E. Bartlett, Jesper Tegnér, David Gomez-Cabrero, Stephan Beck
TLDR
A novel model-based intra-array normalization strategy for 450 k data, called BMIQ (Beta MIxture Quantile dilation), is proposed to adjust the beta-values of type2 design probes into a statistical distribution characteristic of type1 probes.
Abstract:
Motivation: The Illumina Infinium 450 k DNA Methylation Beadchip is a prime candidate technology for Epigenome-Wide Association Studies (EWAS). However, a difficulty associated with these beadarrays is that probes come in two different designs, characterized by widely different DNA methylation distributions and dynamic range, which may bias downstream analyses. A key statistical issue is therefore how best to adjust for the two different probe designs.
Results: Here we propose a novel model-based intra-array normalization strategy for 450 k data, called BMIQ (Beta MIxture Quantile dilation), to adjust the beta-values of type2 design probes into a statistical distribution characteristic of type1 probes. The strategy involves application of a three-state beta-mixture model to assign probes to methylation states, subsequent transformation of probabilities into quantiles and finally a methylation-dependent dilation transformation to preserve the monotonicity and continuity of the data. We validate our method on cell-line data, fresh frozen and paraffin-embedded tumour tissue samples and demonstrate that BMIQ compares favourably with two competing methods. Specifically, we show that BMIQ improves the robustness of the normalization procedure, reduces the technical variation and bias of type2 probe values and successfully eliminates the type1 enrichment bias caused by the lower dynamic range of type2 probes. BMIQ will be useful as a preprocessing step for any study using the Illumina Infinium 450 k platform.
Availability: BMIQ is freely available from http://code.google.com/p/bmiq/.
Contact: a.teschendorff@ucl.ac.uk
Supplementary information: Supplementary data are available at Bioinformatics online.
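The quantile and dilation steps named under Results can be illustrated with a minimal Python sketch. It is not the published BMIQ implementation. It assumes the beta-mixture fit has already assigned each probe to a state and produced per-state beta parameters, and it skips the fitting step entirely. All function names and parameter values below are hypothetical, and the numerical beta CDF assumes shape parameters of at least 1.

```python
import math


def beta_pdf(x, a, b):
    """Density of a Beta(a, b) distribution at 0 < x < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def beta_cdf(x, a, b, steps=2000):
    """Approximate P(X <= x) by midpoint-rule integration (assumes a, b >= 1)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    h = x / steps
    area = sum(beta_pdf((i + 0.5) * h, a, b) for i in range(steps)) * h
    return min(1.0, area)


def beta_ppf(p, a, b, tol=1e-6):
    """Invert beta_cdf by bisection: find x whose cumulative probability is p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if beta_cdf(mid, a, b) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def quantile_map(value, type2_params, type1_params):
    """Turn a type2 beta-value into a probability under the type2 state
    distribution, then read off the matching quantile of the type1 one."""
    p = beta_cdf(value, *type2_params)
    return beta_ppf(p, *type1_params)


def dilate(values, new_min, new_max):
    """Linearly rescale values onto [new_min, new_max]; order is preserved.
    Requires at least two distinct input values."""
    old_min, old_max = min(values), max(values)
    scale = (new_max - new_min) / (old_max - old_min)
    return [new_min + scale * (v - old_min) for v in values]


# Hypothetical fitted parameters for one methylation state (not from the paper).
TYPE2_U = (2.0, 20.0)
TYPE1_U = (1.5, 30.0)

type2_values = [0.03, 0.08, 0.15]
normalized = [quantile_map(v, TYPE2_U, TYPE1_U) for v in type2_values]
```

Because both the cumulative probability and its inverse are increasing functions, the mapped values keep the rank order of the input. The linear `dilate` step is one simple way to stretch an intermediate group of probes between the transformed extremes without breaking monotonicity or continuity.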