Showing papers in "International Journal of Epidemiology in 2015"


Journal ArticleDOI
TL;DR: An adaption of Egger regression (MR-Egger) can detect some violations of the standard instrumental variable assumptions and provide an effect estimate that is not subject to these violations; it also serves as a sensitivity analysis for the robustness of the findings from a Mendelian randomization investigation.
Abstract: Background: The number of Mendelian randomization analyses including large numbers of genetic variants is rapidly increasing. This is due to the proliferation of genome-wide association studies, and the desire to obtain more precise estimates of causal effects. However, some genetic variants may not be valid instrumental variables, in particular due to them having more than one proximal phenotypic correlate (pleiotropy). Methods: We view Mendelian randomization with multiple instruments as a meta-analysis, and show that bias caused by pleiotropy can be regarded as analogous to small study bias. Causal estimates using each instrument can be displayed visually by a funnel plot to assess potential asymmetry. Egger regression, a tool to detect small study bias in meta-analysis, can be adapted to test for bias from pleiotropy, and the slope coefficient from Egger regression provides an estimate of the causal effect. Under the assumption that the association of each genetic variant with the exposure is independent of the pleiotropic effect of the variant (not via the exposure), Egger’s test gives a valid test of the null causal hypothesis and a consistent causal effect estimate even when all the genetic variants are invalid instrumental variables. Results: We illustrate the use of this approach by re-analysing two published Mendelian randomization studies of the causal effect of height on lung function, and the causal effect of blood pressure on coronary artery disease risk. The conservative nature of this approach is illustrated with these examples. Conclusions: An adaption of Egger regression (which we call MR-Egger) can detect some violations of the standard instrumental variable assumptions, and provide an effect estimate which is not subject to these violations. The approach provides a sensitivity analysis for the robustness of the findings from a Mendelian randomization investigation.

3,392 citations
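
The regression described in the abstract above can be illustrated with a minimal sketch. It assumes summary-level inputs (one variant-exposure association, one variant-outcome association and one outcome standard error per variant), orients each variant so that its exposure association is positive, and fits an inverse-variance weighted line with an intercept. The function name `mr_egger` and the numbers are hypothetical, and standard errors and tests for the coefficients are omitted; this is not the authors' implementation.

```python
def mr_egger(beta_exposure, beta_outcome, se_outcome):
    """Weighted regression of variant-outcome on variant-exposure associations.

    Returns (intercept, slope). Under the assumptions stated in the abstract,
    the intercept reflects average directional pleiotropy and the slope is
    the causal effect estimate.
    """
    xs, ys, ws = [], [], []
    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome):
        # Orient each variant so its association with the exposure is positive.
        sign = -1.0 if bx < 0 else 1.0
        xs.append(sign * bx)
        ys.append(sign * by)
        ws.append(1.0 / se ** 2)  # inverse-variance weight

    total_w = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / total_w
    y_bar = sum(w * y for w, y in zip(ws, ys)) / total_w
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical summary statistics for four variants (illustration only).
intercept, slope = mr_egger(
    [0.1, -0.2, 0.3, 0.4],
    [0.15, -0.2, 0.25, 0.3],
    [0.05, 0.04, 0.06, 0.05],
)
print(intercept, slope)
```

Forcing the intercept to zero in the same weighted regression would give the conventional inverse-variance weighted estimate, so a clearly non-zero intercept is what signals possible directional pleiotropy.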


Journal ArticleDOI
TL;DR: The CPRD primary care database is a rich source of health data for research, including data on demographics, symptoms, tests, diagnoses, therapies, health-related behaviours and referrals to secondary care, but researchers must be aware of the complexity of routinely collected electronic health records.
Abstract: The Clinical Practice Research Datalink (CPRD) is an ongoing primary care database of anonymised medical records from general practitioners, with coverage of over 11.3 million patients from 674 practices in the UK. With 4.4 million active (alive, currently registered) patients meeting quality criteria, approximately 6.9% of the UK population are included and patients are broadly representative of the UK general population in terms of age, sex and ethnicity. General practitioners are the gatekeepers of primary care and specialist referrals in the UK. The CPRD primary care database is therefore a rich source of health data for research, including data on demographics, symptoms, tests, diagnoses, therapies, health-related behaviours and referrals to secondary care. For over half of patients, linkage with datasets from secondary care, disease-specific cohorts and mortality records enhance the range of data available for research. The CPRD is very widely used internationally for epidemiological research and has been used to produce over 1000 research studies, published in peer-reviewed journals across a broad range of health outcomes. However, researchers must be aware of the complexity of routinely collected electronic health records, including ways to manage variable completeness, misclassification and development of disease definitions for research.

1,894 citations


Journal ArticleDOI
TL;DR: The LifeLines Cohort Study is a large population-based cohort study and biobank that was established as a resource for research on complex interactions between environmental, phenotypic and genomic factors in the development of chronic diseases and healthy ageing.
Abstract: The LifeLines Cohort Study is a large population-based cohort study and biobank that was established as a resource for research on complex interactions between environmental, phenotypic and genomic factors in the development of chronic diseases and healthy ageing. Between 2006 and 2013, inhabitants of the northern part of The Netherlands and their families were invited to participate, thereby contributing to a three-generation design. Participants visited one of the LifeLines research sites for a physical examination, including lung function, ECG and cognition tests, and completed extensive questionnaires. Baseline data were collected for 167 729 participants, aged from 6 months to 93 years. Follow-up visits are scheduled every 5 years, and in between participants receive follow-up questionnaires. Linkage is being established with medical registries and environmental data. LifeLines contains information on biochemistry, medical history, psychosocial characteristics, lifestyle and more. Genomic data are available including genome-wide genetic data of 15 638 participants. Fasting blood and 24-h urine samples are processed on the day of collection and stored at -80 °C in a fully automated storage facility. The aim of LifeLines is to be a resource for the national and international scientific community. Requests for data and biomaterials can be submitted to the LifeLines Research Office [LLscience@umcg.nl].

558 citations


Journal ArticleDOI
TL;DR: The Brazilian Longitudinal Study of Adult Health (ELSA-Brasil) aims to contribute relevant information regarding the development and progression of clinical and subclinical chronic diseases, particularly cardiovascular diseases and diabetes, in one such setting.
Abstract: Chronic diseases are a global problem, yet information on their determinants is generally scant in low- and middle-income countries. The Brazilian Longitudinal Study of Adult Health (ELSA-Brasil) aims to contribute relevant information regarding the development and progression of clinical and subclinical chronic diseases, particularly cardiovascular diseases and diabetes, in one such setting. At Visit 1, we enrolled 15105 civil servants from predefined universities or research institutes. Baseline assessment (2008‐10) included detailed interviews and measurements to assess social and biological determinants of health, as well as various clinical and subclinical conditions related to diabetes, cardiovascular diseases and mental health. A second visit of interviews and examinations is under way (2012‐14) to enrich the assessment of cohort exposures and to detect initial incident events. Annual surveillance has been conducted since 2009 for the ascertainment of incident events. Biological samples (sera, plasma, urine and DNA) obtained at both visits have been placed in long-term storage. Baseline data are available for analyses, and collaboration via specific research proposals directed to study investigators is welcome.

441 citations


Journal ArticleDOI
TL;DR: Markers of physical and mental fitness are associated with the epigenetic clock (lower abilities associated with age acceleration), however, age acceleration does not associate with decline in these measures, at least over a relatively short follow-up.
Abstract: Background: The DNA methylation-based ‘epigenetic clock’ correlates strongly with chronological age, but it is currently unclear what drives individual differences. We examine cross-sectional and longitudinal associations between the epigenetic clock and four mortality-linked markers of physical and mental fitness: lung function, walking speed, grip strength and cognitive ability. Methods: DNA methylation-based age acceleration (residuals of the epigenetic clock estimate regressed on chronological age) was estimated in the Lothian Birth Cohort 1936 at ages 70 (n = 920), 73 (n = 299) and 76 (n = 273) years. General cognitive ability, walking speed, lung function and grip strength were measured concurrently. Cross-sectional correlations between age acceleration and the fitness variables were calculated. Longitudinal changes in the epigenetic clock estimates and the fitness variables were assessed via linear mixed models and latent growth curves. Epigenetic age acceleration at age 70 was used as a predictor of longitudinal change in fitness. Epigenome-wide association studies (EWASs) were conducted on the four fitness measures. Results: Cross-sectional correlations were significant between greater age acceleration and poorer performance on the lung function, cognition and grip strength measures (r range: −0.07 to −0.05, P range: 9.7 × 10^-3 to 0.024). All of the fitness variables declined over time but age acceleration did not correlate with subsequent change over 6 years. There were no EWAS hits for the fitness traits. Conclusions: Markers of physical and mental fitness are associated with the epigenetic clock (lower abilities associated with age acceleration). However, age acceleration does not associate with decline in these measures, at least over a relatively short follow-up.

440 citations
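
The abstract above defines age acceleration as the residual from regressing the epigenetic clock estimate on chronological age. A minimal sketch of that definition follows, using ordinary least squares; the function name `age_acceleration` and the example values are hypothetical, and the study's actual modelling may differ.

```python
def age_acceleration(chronological_age, dnam_age):
    """Residuals of DNA methylation age regressed on chronological age (OLS)."""
    n = len(chronological_age)
    mean_age = sum(chronological_age) / n
    mean_dnam = sum(dnam_age) / n
    sxx = sum((a - mean_age) ** 2 for a in chronological_age)
    sxy = sum(
        (a - mean_age) * (d - mean_dnam)
        for a, d in zip(chronological_age, dnam_age)
    )
    slope = sxy / sxx
    intercept = mean_dnam - slope * mean_age
    # Positive residual: epigenetic age is higher than predicted for that age.
    return [
        d - (intercept + slope * a)
        for a, d in zip(chronological_age, dnam_age)
    ]


# Hypothetical ages and clock estimates in years (illustration only).
residuals = age_acceleration([60, 65, 70, 75], [62, 64, 73, 74])
print(residuals)
```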


Journal ArticleDOI
TL;DR: Several modelling techniques for analysis of recurrent time-to-event data are explored, including conditional models for multivariate survival data, marginal means/rates models, frailty and multi-state models, and recommendations for modelling strategy selection are made.
Abstract: In many biomedical studies, the event of interest can occur more than once in a participant. These events are termed recurrent events. However, the majority of analyses focus only on time to the first event, ignoring the subsequent events. Several statistical models have been proposed for analysing multiple events. In this paper we explore and illustrate several modelling techniques for analysis of recurrent time-to-event data, including conditional models for multivariate survival data (AG, PWP-TT and PWP-GT), marginal means/rates models, frailty and multi-state models. We also provide a tutorial for analysing this type of data with three widely used statistical software programmes. Different approaches and software are illustrated using data from a bladder cancer project and from a study on lower respiratory tract infection in children in Brazil. Finally, we make recommendations for modelling strategy selection for analysis of recurrent event data.

328 citations


Journal ArticleDOI
TL;DR: About two-thirds of the major risk factors associated with pancreatic cancer (PC) are potentially modifiable, affording a unique opportunity for preventing one of the deadliest cancers.
Abstract: Background The aetiology of pancreatic cancer (PC) has been extensively studied and is the subject of numerous meta-analyses and pooled analyses. We have summarized results from these pooled and meta-analytical studies to estimate the fraction of PCs attributable to each of the identified risk factors. Methods Using a comprehensive strategy, we retrieved 117 meta-analytical or pooled reports dealing with the association between specific risk factors and PC risk. We combined estimates of relative risk and estimates of exposure to calculate the fraction of PCs caused or prevented by a particular exposure. Results Tobacco smoking ('strong' evidence) and Helicobacter pylori infection ('moderate' evidence) are the major risk factors associated with PC, with respective estimated population attributable fractions of 11-32% and 4-25%. The major protective factors are history of allergy ('strong' evidence) and increasing fruit or folate intake ('moderate' evidence), with respective population preventable fractions of 3-7% and 0-12%. Conclusions We summarized results of 117 meta-analytical or pooled data reports dealing with 37 aetiological exposures, to obtain robust information about the suspected causes of PC. By combining these estimates with their prevalences in the population, we calculated population attributable or population preventable fractions. About two-thirds of the major risk factors associated with PC are potentially modifiable, affording a unique opportunity for preventing one of our deadliest cancers.

309 citations
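
The abstract above combines relative risks with exposure prevalences to obtain population attributable fractions. One standard way to do this is Levin's formula, PAF = p(RR − 1) / (1 + p(RR − 1)), sketched below. Whether the authors used exactly this formula is an assumption, and the prevalence and relative risk in the example are hypothetical, not taken from the paper.

```python
def population_attributable_fraction(prevalence, relative_risk):
    """Levin's formula for the population attributable fraction.

    prevalence: proportion of the population exposed (0 to 1).
    relative_risk: relative risk in exposed vs unexposed (> 1 for a risk factor).
    """
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Hypothetical inputs: 25% of the population exposed, relative risk of 1.8.
paf = population_attributable_fraction(0.25, 1.8)
print(f"{paf:.1%}")
```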


Journal ArticleDOI
TL;DR: The Estonian Genome Center of the University of Tartu is actively collaborating with many universities, research institutes and consortia and encourages fellow scientists worldwide to co-initiate new academic or industrial joint projects with us.
Abstract: The Estonian Biobank cohort is a volunteer-based sample of the Estonian resident adult population (aged ≥ 18 years). The current number of participants, close to 52 000, represents a large proportion, 5%, of the Estonian adult population, making it ideally suited to population-based studies. General practitioners (GPs) and medical personnel in the special recruitment offices have recruited participants throughout the country. At baseline, the GPs performed a standardized health examination of the participants, who also donated blood samples for DNA, white blood cells and plasma tests and filled out a 16-module questionnaire on health-related topics such as lifestyle, diet and clinical diagnoses described in WHO ICD-10. A significant part of the cohort has whole genome sequencing (100), genome-wide single nucleotide polymorphism (SNP) array data (20 000) and/or NMR metabolome data (11 000) available (http://www.geenivaramu.ee/for-scientists/data-release/). The data are continuously updated through periodical linking to national electronic databases and registries. A part of the cohort has been recontacted for follow-up purposes and resampling, and targeted invitations are possible for specific purposes, for example people with a specific diagnosis. The Estonian Genome Center of the University of Tartu is actively collaborating with many universities, research institutes and consortia and encourages fellow scientists worldwide to co-initiate new academic or industrial joint projects with us.

291 citations


Journal ArticleDOI
TL;DR: Primary study outcomes include pregnancy outcomes, maternal mental and cardiometabolic health and child neurodevelopment, asthma/atopy and obesity/cardiometabolic health.
Abstract: We established Project Viva to examine prenatal diet and other factors in relation to maternal and child health. We recruited pregnant women at their initial prenatal visit in eastern Massachusetts between 1999 and 2002. Exclusion criteria included multiple gestation, inability to answer questions in English, gestational age ≥22 weeks at recruitment and plans to move away before delivery. We completed in-person visits with mothers during pregnancy in the late first (median 9.9 weeks of gestation) and second (median 27.9 weeks) trimesters. We saw mothers and children in the hospital during the delivery admission and during infancy (median age 6.3 months), early childhood (median 3.2 years) and mid-childhood (median 7.7 years). We collected information from mothers via interviews and questionnaires, performed anthropometric and neurodevelopmental assessments and collected biosamples. We have collected additional information from medical records and from mailed questionnaires sent annually to mothers between in-person visits and to children beginning at age 9 years. From 2341 eligible women, there were 2128 live births; 1279 mother-child pairs provided data at the mid-childhood visit. Primary study outcomes include pregnancy outcomes, maternal mental and cardiometabolic health and child neurodevelopment, asthma/atopy and obesity/cardiometabolic health. Investigators interested in learning more about how to obtain Project Viva data can contact Project_Viva@hphc.org.

285 citations


Journal ArticleDOI
TL;DR: Data Resource Profile: Accessible Resource for Integrated Epigenomic Studies (ARIES) Caroline L Relton, Tom Gaunt, Wendy McArdle, Karen Ho, Aparna Duggirala, Hashem Shihab, Geoff Woodward, Oliver Lyttleton, David M Evans, Wolf Reik, Yu-Lee Paul, Gabriella Ficz, Susan E Ozanne and Susan M Ring.
Abstract: Data Resource Profile: Accessible Resource for Integrated Epigenomic Studies (ARIES) Caroline L Relton, Tom Gaunt, Wendy McArdle, Karen Ho, Aparna Duggirala, Hashem Shihab, Geoff Woodward, Oliver Lyttleton, David M Evans, Wolf Reik, Yu-Lee Paul, Gabriella Ficz, Susan E Ozanne, Anil Wipat, Keith Flanagan, Allyson Lister, Bastiaan T Heijmans, Susan M Ring and George Davey Smith MRC Integrative Epidemiology Unit, and School of Social and Community Medicine, University of Bristol, Bristol, UK, Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne, UK, University of Queensland Diamantina Institute, Translational Research Institute, Brisbane, WA, Australia, Babraham Institute, Cambridge, UK, Wellcome Trust Sanger Institute, Cambridge, UK, Barts Cancer Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, UK, University of Cambridge Institute of Metabolic Sciences and MRC Metabolic Diseases Unit, Cambridge, UK, School of Computer Science, Newcastle University, Newcastle upon Tyne, UK and Molecular Epidemiology, Leiden University Medical Center, Leiden, The Netherlands

265 citations


Journal ArticleDOI
TL;DR: The Framingham Heart Study has conducted seminal research defining cardiovascular disease (CVD) risk factors and fundamentally shaping public health guidelines for CVD prevention over the past five decades, with staggering expanded breadth and depth that have far greater implications in the study of the epidemiology of a wide spectrum of human diseases.
Abstract: The Framingham Heart Study (FHS) has conducted seminal research defining cardiovascular disease (CVD) risk factors and fundamentally shaping public health guidelines for CVD prevention over the past five decades. The success of the Original Cohort, initiated in 1948, paved the way for further epidemiological research in preventive cardiology. Due to the keen observations suggesting the role of shared familial factors in the development of CVD, in 1971 the FHS began enrolling the second generation cohort, comprising the children of the Original Cohort and the spouses of the children. In 2002, the third generation cohort, comprising the grandchildren of the Original Cohort, was initiated to additionally explore genetic contributions to CVD in greater depth. Additionally, because of the predominance of White individuals of European descent in the three generations of FHS participants noted above, the Heart Study enrolled the OMNI1 and OMNI2 cohorts in 1994 and 2003, respectively, aimed to reflect the current greater racial and ethnic diversity of the town of Framingham. All FHS cohorts have been examined approximately every 2-4 years since the initiation of the study. At these periodic Heart Study examinations, we obtain a medical history and perform a cardiovascular-focused physical examination, 12-lead electrocardiography, blood and urine samples testing and other cardiovascular imaging studies reflecting subclinical disease burden. The FHS has continually evolved along the cutting edge of cardiovascular science and epidemiological research since its inception. Participant studies now additionally include study of cardiovascular imaging, serum and urine biomarkers, genetics/genomics, proteomics, metabolomics and social networks. Numerous ancillary studies have been established, expanding the phenotypes to encompass multiple organ systems including the lungs, brain, bone and fat depots, among others. Whereas the FHS was originally conceived and designed to study the epidemiology of cardiovascular disease, it has evolved over the years with staggering expanded breadth and depth that have far greater implications in the study of the epidemiology of a wide spectrum of human diseases. The FHS welcomes research collaborations using existing or new collection of data. Detailed information regarding the procedures for research application submission and review is available at [http://www.framinghamheartstudy.org/researchers/index.php].

Journal ArticleDOI
TL;DR: This paper gives the most comprehensive description of published methodology for sample size calculation for cluster randomized trials and provides an important resource for those designing these trials.
Abstract: Background: The use of cluster randomized trials (CRTs) is increasing, along with the variety in their design and analysis. The simplest approach for their sample size calculation is to calculate the sample size assuming individual randomization and inflate this by a design effect to account for randomization by cluster. The assumptions of a simple design effect may not always be met; alternative or more complicated approaches are required. Methods: We summarise a wide range of sample size methods available for cluster randomized trials. For those familiar with sample size calculations for individually randomized trials but with less experience in the clustered case, this manuscript provides formulae for a wide range of scenarios with associated explanation and recommendations. For those with more experience, comprehensive summaries are provided that allow quick identification of methods for a given design, outcome and analysis method. Results: We present first those methods applicable to the simplest two-arm, parallel group, completely randomized design followed by methods that incorporate deviations from this design such as: variability in cluster sizes; attrition; non-compliance; or the inclusion of baseline covariates or repeated measures. The paper concludes with methods for alternative designs. Conclusions: There is a large amount of methodology available for sample size calculations in CRTs. This paper gives the most comprehensive description of published methodology for sample size calculation and provides an important resource for those designing these trials.
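
The simplest approach mentioned in the abstract above (inflate an individually randomized sample size by a design effect) can be sketched as follows. The sketch assumes equal cluster sizes and the standard design effect 1 + (m − 1) × ICC, where m is the cluster size and ICC the intracluster correlation coefficient; the function names and input values are hypothetical.

```python
import math


def design_effect(cluster_size, icc):
    """Standard design effect for equal cluster sizes: 1 + (m - 1) * ICC."""
    return 1.0 + (cluster_size - 1) * icc


def clusters_per_arm(n_individual_per_arm, cluster_size, icc):
    """Clusters needed per arm after inflating an individually randomized n."""
    inflated_n = n_individual_per_arm * design_effect(cluster_size, icc)
    return math.ceil(inflated_n / cluster_size)


# Hypothetical inputs: 200 per arm under individual randomization,
# clusters of 20 participants, ICC of 0.05.
print(design_effect(20, 0.05))
print(clusters_per_arm(200, 20, 0.05))
```

As the abstract notes, this simple inflation may not hold when cluster sizes vary or when there is attrition; those cases call for the more elaborate methods the paper summarises.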

Journal ArticleDOI
TL;DR: These Mendelian randomization methods can be used to estimate direct and indirect causal effects in a mediation setting, and have potential for the investigation of more complex networks between multiple interrelated exposures and disease outcomes.
Abstract: Background: Mendelian randomization uses genetic variants, assumed to be instrumental variables for a particular exposure, to estimate the causal effect of that exposure on an outcome. If the instrumental variable criteria are satisfied, the resulting estimator is consistent even in the presence of unmeasured confounding and reverse causation. Methods: We extend the Mendelian randomization paradigm to investigate more complex networks of relationships between variables, in particular where some of the effect of an exposure on the outcome may operate through an intermediate variable (a mediator). If instrumental variables for the exposure and mediator are available, direct and indirect effects of the exposure on the outcome can be estimated, for example using either a regression-based method or structural equation models. The direction of effect between the exposure and a possible mediator can also be assessed. Methods are illustrated in an applied example considering causal relationships between body mass index, C-reactive protein and uric acid. Results: These estimators are consistent in the presence of unmeasured confounding if, in addition to the instrumental variable assumptions, the effects of both the exposure on the mediator and the mediator on the outcome are homogeneous across individuals and linear without interactions. Nevertheless, a simulation study demonstrates that even considerable heterogeneity in these effects does not lead to bias in the estimates. Conclusions: These methods can be used to estimate direct and indirect causal effects in a mediation setting, and have potential for the investigation of more complex networks between multiple interrelated exposures and disease outcomes.

Journal ArticleDOI
TL;DR: The data suggest that both maternal obesity and, to a larger degree, underweight affect the neonatal epigenome via an intrauterine mechanism, but weight gain during pregnancy has little effect.
Abstract: Background: Evidence suggests that in utero exposure to undernutrition and overnutrition might affect adiposity in later life. Epigenetic modification is suggested as a plausible mediating mechanism. Methods: We used multivariable linear regression and a negative control design to examine offspring epigenome-wide DNA methylation in relation to maternal and offspring adiposity in 1018 participants. Results: Compared with neonatal offspring of normal weight mothers, 28 and 1621 CpG sites were differentially methylated in offspring of obese and underweight mothers, respectively [false discovery rate (FDR)-corrected P-value < 0.05], with no overlap in the sites that maternal obesity and underweight relate to. A positive association, where higher methylation is associated with a body mass index (BMI) outside the normal range, was seen at 78.6% of the sites associated with obesity and 87.9% of the sites associated with underweight. Associations of maternal obesity with offspring methylation were stronger than associations of paternal obesity, supporting an intrauterine mechanism. There were no consistent associations of gestational weight gain with offspring DNA methylation. In general, sites that were hypermethylated in association with maternal obesity or hypomethylated in association with maternal underweight tended to be positively associated with offspring adiposity, and sites hypomethylated in association with maternal obesity or hypermethylated in association with maternal underweight tended to be inversely associated with offspring

Journal ArticleDOI
TL;DR: Although the focus remains on factors affecting the health and well-being of women and their access to and use of health services across urban, rural and remote areas of Australia, the study has now been considerably expanded by linkage to other health data sets.
Abstract: In 1996 the Australian Longitudinal Study on Women's Health recruited a nationally representative sample of more than 40,000 women in three age cohorts, born in 1973-78, 1946-51 and 1921-26. At least six waves of 3-yearly surveys have been completed. Although the focus remains on factors affecting the health and well-being of women and their access to and use of health services across urban, rural and remote areas of Australia, the study has now been considerably expanded by linkage to other health data sets. For most women who have ever participated in the study, linked records are now available for: government-subsidized non-hospital services (e.g. all general practitioner visits); pharmaceutical prescriptions filled; national death index, including codes for multiple causes of death; aged care assessments and services; cancer registries; and, for most states and territories, hospital admissions and perinatal data. Additionally, a large cohort of women born in 1989-95 have been recruited. The data are available to approved collaborators, with more than 780 researchers using the data so far. Full details of the study materials and data access procedures are available at [http://www.alswh.org.au/].

Journal ArticleDOI
TL;DR: Most MR studies either use the genotype as a proxy for exposure without further estimation or perform an IV analysis, and the discussion of underlying assumptions and reporting of statistical methods for IV analysis are frequently insufficient.
Abstract: Background Mendelian randomization (MR) studies investigate the effect of genetic variation in levels of an exposure on an outcome, thereby using genetic variation as an instrumental variable (IV). We provide a meta-epidemiological overview of the methodological approaches used in MR studies, and evaluate the discussion of MR assumptions and reporting of statistical methods. Methods We searched PubMed, Medline, Embase and Web of Science for MR studies up to December 2013. We assessed (i) the MR approach used; (ii) whether the plausibility of MR assumptions was discussed; and (iii) whether the statistical methods used were reported adequately. Results Of 99 studies using data from one study population, 32 used genetic information as a proxy for the exposure without further estimation, 44 performed a formal IV analysis, 7 compared the observed with the expected genotype-outcome association, and 1 used both the latter two approaches. The 80 studies using data from multiple study populations used many different approaches to combine the data; 52 of these studies used some form of IV analysis; 44% of studies discussed the plausibility of all three MR assumptions in their study. Statistical methods used for IV analysis were insufficiently described in 14% of studies. Conclusions Most MR studies either use the genotype as a proxy for exposure without further estimation or perform an IV analysis. The discussion of underlying assumptions and reporting of statistical methods for IV analysis are frequently insufficient. Studies using data from multiple study populations are further complicated by the combination of data or estimates. We provide a checklist for the reporting of MR studies.

Journal ArticleDOI
TL;DR: A common bias structure (leading to 'live-birth bias') that arises from studying the effects of prenatal exposure to environmental factors on long-term health outcomes among live births only in pregnancy cohorts is illustrated and the need to identify the determinants of pregnancy loss is highlighted.
Abstract: Only 60–70% of fertilized eggs may result in a live birth, and very early fetal loss mainly goes unnoticed. Outcomes that can only be ascertained in live-born children will be missing for those who do not survive till birth. In this article, we illustrate a common bias structure (leading to ‘live-birth bias’) that arises from studying the effects of prenatal exposure to environmental factors on long-term health outcomes among live births only in pregnancy cohorts. To illustrate this we used prenatal exposure to perfluoroalkyl substances (PFAS) and attention-deficit/hyperactivity disorder (ADHD) in school-aged children as an example. PFAS are persistent organic pollutants that may impact human fecundity and be toxic for neurodevelopment. We simulated several hypothetical scenarios based on characteristics from the Danish National Birth Cohort and found that a weak inverse association may appear even if PFAS do not cause ADHD but have a considerable effect on fetal survival. The magnitude of the negative bias was generally small, and adjusting for common causes of the outcome and fetal loss can reduce the bias. Our example highlights the need to identify the determinants of pregnancy loss and the importance of quantifying bias arising from conditioning on live birth in observational studies.

Journal ArticleDOI
TL;DR: Power estimates for array-based DNA methylation EWAS under case-control and disease-discordant MZ twin designs are provided, and multiple factors that impact on EWAS power are explored.
Abstract: Background: Epigenome-wide association scans (EWAS) are under way for many complex human traits, but EWAS power has not been fully assessed. We investigate power of EWAS to detect differential methylation using case-control and disease-discordant monozygotic (MZ) twin designs with genome-wide DNA methylation arrays. Methods and Results: We performed simulations to estimate power under the case-control and discordant MZ twin EWAS study designs, under a range of epigenetic risk effect sizes and conditions. For example, to detect a 10% mean methylation difference between affected and unaffected subjects at a genome-wide significance threshold of P = 1 × 10^-6, 98 MZ twin pairs were required to reach 80% EWAS power, and 112 cases and 112 controls were needed in the case-control design. We also estimated the minimum sample size required to reach 80% EWAS power under both study designs. Our analyses highlighted several factors that significantly influenced EWAS power, including sample size, epigenetic risk effect size, the variance of DNA methylation at the locus of interest and the correlation in DNA methylation patterns within the twin sample. Conclusions: We provide power estimates for array-based DNA methylation EWAS under case-control and disease-discordant MZ twin designs, and explore multiple factors that impact on EWAS power. Our results can help guide EWAS experimental design and interpretation for future epigenetic studies.

Journal ArticleDOI
TL;DR: Since the IJE published a cohort profile on the TRAILS in 2008, the participants have matured from adolescents into young adults, and the focus shifted from parents and school to entry into the labour market and family formation, including offspring.
Abstract: TRAILS consists of a population cohort (N=2230) and a clinical cohort (N=543), both of which were followed from about age 11 years onwards. To date, the population cohort has been assessed five times over a period of 11 years, with retention rates ranging between 80% and 96%. The clinical cohort has been assessed four times over a period of 8 years, with retention rates ranging between 77% and 85%. Since the IJE published a cohort profile on the TRAILS in 2008, the participants have matured from adolescents into young adults. The focus shifted from parents and school to entry into the labour market and family formation, including offspring. Furthermore, psychiatric diagnostic interviews were administered, the database was linked to a Psychiatric Case Registry, and the availability of genome-wide SNP variations opened the door to genome-wide association studies regarding a wide range of (endo)phenotypes. With some delay, TRAILS data are available to researchers outside the TRAILS consortium without costs; access can be obtained by submitting a publication proposal (see www.trails.nl).

Journal ArticleDOI
TL;DR: Maternal smoking during pregnancy was associated with cord blood methylation differences; functional network analysis suggested a role in activating the immune system, and functional enrichment analysis pointed towards activation of cell-mediated immunity.
Abstract: Background: We examined whether the effect of maternal smoking during pregnancy on birthweight of the offspring was mediated by smoking-induced changes to DNA methylation in cord blood. Methods: First, we used cord blood of 129 Dutch children exposed to maternal smoking vs 126 unexposed to maternal and paternal smoking (53% male) participating in the GECKO Drenthe birth cohort. DNA methylation was measured using the Illumina HumanMethylation450 Beadchip. We performed an epigenome-wide association study for the association between maternal smoking and methylation followed by a mediation analysis of the top signals [false-discovery rate (FDR) < 0.05]. We adjusted both analyses for maternal age, education, pre-pregnancy BMI, offspring’s sex, gestational age and white blood cell composition. Secondly, in 175 exposed and 1248 unexposed newborns from two independent birth cohorts, we replicated and meta-analysed results of eight cytosine-phosphate-guanine (CpG) sites in the GFI1 gene, which showed the most robust mediation. Finally, we performed functional network and enrichment analysis. Results: We found 35 differentially methylated CpGs (FDR < 0.05) in newborns exposed vs unexposed to smoking, of which 23 survived Bonferroni correction (P < 1 × 10⁻⁷). These 23 CpGs mapped to eight genes: AHRR, GFI1, MYO1G, CYP1A1, NEUROG1, CNTNAP2, FRMD4A and LRP5. We observed partial confirmation as three of the eight CpGs in GFI1 replicated. These CpGs partly mediated the effect of maternal smoking on birthweight (Sobel

Journal ArticleDOI
TL;DR: Intra- and inter-laboratory technical variation severely limits the usefulness of data pooling and excludes sharing of reference ranges between laboratories; the authors propose establishing a common set of physical telomere length standards to improve comparability of telomere length estimates between laboratories.
Abstract: Background: Telomere length is a putative biomarker of ageing, morbidity and mortality. Its application is hampered by lack of widely applicable reference ranges and uncertainty regarding the present limits of measurement reproducibility within and between laboratories. Methods: We instigated an international collaborative study of telomere length assessment: 10 different laboratories, employing 3 different techniques [Southern blotting, single telomere length analysis (STELA) and real-time quantitative PCR (qPCR)] performed two rounds of fully blinded measurements on 10 human DNA samples per round to enable unbiased assessment of intra- and inter-batch variation between laboratories and techniques. Results: Absolute results from different laboratories differed widely and could thus not be compared directly, but rankings of relative telomere lengths were highly correlated (correlation coefficients of 0.63-0.99). Intra-technique correlations were similar for Southern blotting and qPCR and were stronger than inter-technique ones. However, inter-laboratory coefficients of variation (CVs) averaged about 10% for Southern blotting and STELA and more than 20% for qPCR. This difference was compensated for by a higher dynamic range for the qPCR method as shown by equal variance after z-scoring. Technical variation per laboratory, measured as median of intra- and inter-batch CVs, ranged from 1.4% to 9.5%, with differences between laboratories only marginally significant (P = 0.06). Gel-based and PCR-based techniques were not different in accuracy. Conclusions: Intra- and inter-laboratory technical variation severely limits the usefulness of data pooling and excludes sharing of reference ranges between laboratories. We propose to establish a common set of physical telomere length standards to improve comparability of telomere length estimates between laboratories.

Journal ArticleDOI
TL;DR: This is the accepted manuscript and the final version of the manuscript is available at http://ije.oxfordjournals.org/content/44/2/379.full.
Abstract: This is the accepted manuscript. The final version is available at http://ije.oxfordjournals.org/content/44/2/379.full.

Journal ArticleDOI
TL;DR: High coffee intake was associated observationally with low risk of obesity, metabolic syndrome and type 2 diabetes, and with related components thereof, but there was no genetic evidence to support corresponding causal relationships.
Abstract: Background: Coffee is one of the most widely consumed beverages. We tested the hypothesis that genetically high coffee intake is associated with low risk of obesity, metabolic syndrome and type 2 diabetes, and with related components thereof. Methods: We included 93179 individuals from two large general population cohorts in a Mendelian randomization study. We tested first whether high coffee intake is associated with low risk of obesity, metabolic syndrome and type 2 diabetes, and with related components thereof, in observational analyses; second, whether five genetic variants near the CYP1A1, CYP1A2 and AHR genes are associated with coffee intake; and third, whether the genetic variants are associated with obesity, metabolic syndrome and type 2 diabetes, and with related components thereof. Finally, we tested the genetic association with type 2 diabetes in a meta-analysis including up to 78021 additional individuals from the DIAGRAM consortium. Results: Observationally, high coffee intake was associated with low risk of obesity, metabolic syndrome and type 2 diabetes. Further, high coffee intake was associated with high body mass index, waist circumference, weight, height, systolic/diastolic blood pressure, triglycerides and total cholesterol and with low high-density lipoprotein cholesterol, but not with glucose levels. In genetic analyses, 9‐10 vs 0‐3 coffee-intake alleles were associated with 29% higher coffee intake. However, genetically derived high coffee intake was not associated convincingly with obesity, metabolic syndrome, type 2 diabetes, body mass index, waist circumference, weight, height, systolic/diastolic blood pressure, triglycerides, total cholesterol, high-density lipoprotein cholesterol or glucose levels. 
Per-allele meta-analysed odds ratios for type 2 diabetes were 1.01 (0.98‐1.04) for AHR rs4410790, 0.98 (0.95‐1.01) for AHR rs6968865, 1.01 (0.99‐1.03) for CYP1A1/2 rs2470893, 1.01 (0.98‐1.03) for CYP1A1/2 rs2472297 and 0.98 (0.95‐1.01) for CYP1A1 rs2472299.

Journal ArticleDOI
TL;DR: The aim of the TARGet Kids! cohort is to link early life exposures to health problems including obesity, micronutrient deficiencies and developmental problems, to improve the health of Canadians by optimizing growth and developmental trajectories through preventive interventions in early childhood.
Abstract: The Applied Research Group for Kids (TARGet Kids!) is an ongoing open longitudinal cohort study enrolling healthy children (from birth to 5 years of age) and following them into adolescence. The aim of the TARGet Kids! cohort is to link early life exposures to health problems including obesity, micronutrient deficiencies and developmental problems. The overarching goal is to improve the health of Canadians by optimizing growth and developmental trajectories through preventive interventions in early childhood. TARGet Kids!, the only child health research network embedded in primary care practices in Canada, leverages the unique relationship between children and families and their trusted primary care practitioners, with whom they have at least seven health supervision visits in the first 5 years of life. Children are enrolled during regularly scheduled well-child visits. To date, we have enrolled 5062 children. In addition to demographic information, we collect physical measurements (e.g. height, weight), lifestyle factors (nutrition, screen time and physical activity), child behaviour and developmental screening and a blood sample (providing measures of cardiometabolic, iron and vitamin D status, and trace metals). All data are collected at each well-child visit: twice a year until age 2 and every year until age 10. Information can be found at: http://www.targetkids.ca/contact-us/.

Journal ArticleDOI
TL;DR: The new E4N complementary cohort (Epidemiology 4 kNowledge), which comprises the children and grandchildren of the E3N cohort as well as the children's fathers, will allow researchers to investigate key life periods during which exposures to environmental factors most strongly influence the later disease risk.
Abstract: The E3N (Etude Epidemiologique aupres de femmes de la Mutuelle Generale de l'Education Nationale) cohort was initiated in 1990 to investigate the risk factors associated with cancer and other major non-communicable diseases in women. The participants were insured through a national health system that primarily covered teachers, and were enrolled from 1990 after returning baseline self-administered questionnaires and providing informed consent. The cohort comprised nearly 100,000 women with baseline ages ranging from 40 to 65 years. Follow-up questionnaires were sent approximately every 2-3 years after the baseline and addressed general and lifestyle characteristics together with medical events (cancer, cardiovascular diseases, diabetes, depression, fractures and asthma, among others). The follow-up questionnaire response rate remained stable at approximately 80%. A biological material bank was generated and included blood samples collected from 25,000 women and saliva samples from an additional 47,000 women. Ageing among the E3N cohort provided the opportunity to investigate factors related to age-related diseases and conditions as well as disease survival. The new E4N complementary cohort (Epidemiology 4 kNowledge), which comprises the children and grandchildren of the E3N cohort as well as the children's fathers, will allow researchers to investigate key life periods during which exposures to environmental factors most strongly influence the later disease risk. The E3N and E4N cohort data will be used to investigate diseases and risk factors through a transgenerational approach. Requests for collaborations are welcome, particularly those in conjunction with rare diseases.

Journal ArticleDOI
TL;DR: Joint models should be preferred for simultaneous analyses of repeated measurement and survival data, especially when the former is measured with error and the association between the underlying error-free measurement process and the hazard for survival is of scientific interest.
Abstract: Background: The term ‘joint modelling’ is used in the statistical literature to refer to methods for simultaneously analysing longitudinal measurement outcomes, also called repeated measurement data, and time-to-event outcomes, also called survival data. A typical example from nephrology is a study in which the data from each participant consist of repeated estimated glomerular filtration rate (eGFR) measurements and time to initiation of renal replacement therapy (RRT). Joint models typically combine linear mixed effects models for repeated measurements and Cox models for censored survival outcomes. Our aim in this paper is to present an introductory tutorial on joint modelling methods, with a case study in nephrology. Methods: We describe the development of the joint modelling framework and compare the results with those obtained by the more widely used approaches of conducting separate analyses of the repeated measurements and survival times based on a linear mixed effects model and a Cox model, respectively. Our case study concerns a data set from the Chronic Renal Insufficiency Standards Implementation Study (CRISIS). We also provide details of our open-source software implementation to allow others to replicate and/or modify our analysis. Results: The results for the conventional linear mixed effects model and the longitudinal component of the joint models were found to be similar. However, there were considerable differences between the results for the Cox model with time-varying covariate and the time-to-event component of the joint model. For example, the relationship between kidney function as measured by eGFR and the hazard for initiation of RRT was significantly underestimated by the Cox model that treats eGFR as a time-varying covariate, because the Cox model does not take measurement error in eGFR into account.
Conclusions: Joint models should be preferred for simultaneous analyses of repeated measurement and survival data, especially when the former is measured with error and the association between the underlying error-free measurement process and the hazard for survival is of scientific interest.

Journal ArticleDOI
TL;DR: The profile of the 1982 Pelotas Birth Cohort Study is updated; in contrast to the previous home interviews, in this wave all participants were invited to visit the research clinic to be interviewed and examined.
Abstract: In this manuscript, we update the profile of the 1982 Pelotas Birth Cohort Study. In 1982, 5914 live births whose families lived in the urban area of Pelotas were enrolled in the cohort. In 2012‐13, we tried to locate the whole original cohort; 3701 participants were interviewed who, added to the 325 known deaths, represented a follow-up rate of 68.1%. In contrast to the previous home interviews, in this wave all participants were invited to visit the research clinic to be interviewed and examined. The visit was carried out at a mean age of 30.2 years and mainly focused on four categories of outcomes: (i) mental health; (ii) body composition; (iii) precursors of complex chronic diseases; and (iv) human capital. Requests for collaboration by outside researchers are welcome.

Journal ArticleDOI
TL;DR: In this paper, the effect of body mass index (BMI) on risk of cardiovascular diseases was investigated using Mendelian randomization (MR) methods, and the results indicated a strong association between BMI and incident coronary heart disease (CHD), heart failure, and ischaemic stroke.
Abstract: Background: Adiposity, as indicated by body mass index (BMI), has been associated with risk of cardiovascular diseases in epidemiological studies. We aimed to investigate if these associations are causal, using Mendelian randomization (MR) methods. Methods: The associations of BMI with cardiovascular outcomes [coronary heart disease (CHD), heart failure and ischaemic stroke], and associations of a genetic score (32 BMI single nucleotide polymorphisms) with BMI and cardiovascular outcomes were examined in up to 22 193 individuals with 3062 incident cardiovascular events from nine prospective follow-up studies within the ENGAGE consortium. We used random-effects meta-analysis in an MR framework to provide causal estimates of the effect of adiposity on cardiovascular outcomes. Results: There was a strong association between BMI and incident CHD (HR = 1.20 per SD-increase of BMI, 95% CI, 1.12-1.28, P = 1.9 × 10⁻⁷), heart failure (HR = 1.47, 95% CI, 1.35-1.60, P = 9 × 10⁻¹⁹) and ischaemic stroke (HR = 1.15, 95% CI, 1.06-1.24, P = 0.0008) in observational analyses. The genetic score was robustly associated with BMI (β = 0.030 SD-increase of BMI per additional allele, 95% CI, 0.028-0.033, P = 3 × 10⁻¹⁰⁷). Analyses indicated a causal effect of adiposity on development of heart failure (HR = 1.93 per SD-increase of BMI, 95% CI, 1.12-3.30, P = 0.017) and ischaemic stroke (HR = 1.83, 95% CI, 1.05-3.20, P = 0.034). Additional cross-sectional analyses using both ENGAGE and CARDIoGRAMplusC4D data showed a causal effect of adiposity on CHD. Conclusions: Using MR methods, we provide support for the hypothesis that adiposity causes CHD, heart failure and, previously not demonstrated, ischaemic stroke.
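
One common way to turn score-exposure and score-outcome associations into a causal estimate is the ratio (Wald) estimator. The sketch below shows it with a first-order delta-method standard error, assuming the two association estimates are independent. It is an illustration only: the abstract does not state which estimator the authors applied within their random-effects meta-analysis, and the function names and all numeric inputs are hypothetical.

```python
from math import exp, sqrt


def wald_ratio(beta_gx, se_gx, beta_gy, se_gy):
    """Ratio estimate of the exposure's effect on the outcome, with its SE.

    beta_gx: score-exposure association (e.g. SD of BMI per allele)
    beta_gy: score-outcome association (e.g. log hazard ratio per allele)
    The SE uses a first-order delta method and assumes independent estimates.
    """
    estimate = beta_gy / beta_gx
    variance = se_gy**2 / beta_gx**2 + (beta_gy**2 * se_gx**2) / beta_gx**4
    return estimate, sqrt(variance)


def as_hazard_ratio(log_hr, se, z=1.96):
    """Convert a log hazard ratio and SE into an HR with an approximate 95% CI."""
    return exp(log_hr), exp(log_hr - z * se), exp(log_hr + z * se)


if __name__ == "__main__":
    # Hypothetical inputs, not values taken from the study.
    log_hr, se = wald_ratio(beta_gx=0.03, se_gx=0.001, beta_gy=0.015, se_gy=0.005)
    print(as_hazard_ratio(log_hr, se))
```

Because the score-exposure association sits in the denominator, a small `beta_gx` inflates the standard error, which is consistent with the wide intervals typically seen for MR estimates compared with observational ones.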

Journal ArticleDOI
TL;DR: Data suggest that epigenetic differences in paternal sperm may contribute to autism risk in offspring, and provide evidence that directionally consistent, potentially related epigenetic mechanisms may be operating in the cerebellum of individuals with autism.
Abstract: Background: Epigenetic mechanisms such as altered DNA methylation have been suggested to play a role in autism, beginning with the classical association of Prader-Willi syndrome, an imprinting disorder, with autistic features. Objectives: Here we tested for the relationship of paternal sperm DNA methylation with autism risk in offspring, examining an enriched-risk cohort of fathers of autistic children. Methods: We examined genome-wide DNA methylation (DNAm) in paternal semen biosamples obtained from an autism spectrum disorder (ASD) enriched-risk pregnancy cohort, the Early Autism Risk Longitudinal Investigation (EARLI) cohort, to estimate associations between sperm DNAm and prospective ASD development, using a 12-month ASD symptoms assessment, the Autism Observation Scale for Infants (AOSI). We analysed methylation data from 44 sperm samples run on the CHARM 3.0 array, which contains over 4 million probes (over 7 million CpG sites), including 30 samples also run on the Illumina Infinium HumanMethylation450 (450K) BeadChip platform (~485 000 CpG sites). We also examined associated regions in an independent sample of postmortem human brain ASD and control samples for which Illumina 450K DNA methylation data were available. Results: Using region-based statistical approaches, we identified 193 differentially methylated regions (DMRs) in paternal sperm with a family-wise empirical P-value

Journal ArticleDOI
TL;DR: The NUHDSS was the first urban-based longitudinal health and demographic surveillance platform in sub-Saharan Africa and has provided a robust platform for nesting several studies examining the challenges of rapid urbanization in SSA and associated health and poverty dynamics.
Abstract: The Nairobi Urban Health and Demographic Surveillance System (NUHDSS) was the first urban-based longitudinal health and demographic surveillance platform in sub-Saharan Africa (SSA). The NUHDSS was established in 2002 to provide a platform to investigate the long-term social, economic and health consequences of urban residence, and to serve as a primary research tool for intervention and impact evaluation studies focusing on the needs of the urban poor in SSA. Since its inception, the NUHDSS has successfully followed every year a population of about 65,000 individuals in 24,000 households in two slum communities (Korogocho and Viwandani) in Nairobi, Kenya. Data collected include key demographic and health information (births, deaths including verbal autopsy, in- and out-migration, immunization) and other information that characterizes living conditions in the slums (livelihood opportunities, household amenities and possessions, type of housing etc.). In addition to the routine data, it has provided a robust platform for nesting several studies examining the challenges of rapid urbanization in SSA and associated health and poverty dynamics. NUHDSS data are shared through internal and external collaborations, in accordance with the Centre's guidelines for publications, data sharing.