Showing papers by "Margaret R. Karagas" published in 2013


Journal ArticleDOI
TL;DR: Constrained projection was used to obtain predictions of the proportions of lymphocytes, monocytes and granulocytes for each of the study samples based on their DNA methylation signatures, and these results were robust to the number of leukocyte differentially methylated regions (L-DMRs) used for CP prediction.
Abstract: The potential influence of underlying differences in relative leukocyte distributions in studies involving blood-based profiling of DNA methylation is well recognized and has prompted development of a set of statistical methods for inferring changes in the distribution of white blood cells using DNA methylation signatures. However, the extent to which this methodology can accurately predict cell-type proportions based on blood-derived DNA methylation data in a large-scale epigenome-wide association study (EWAS) has yet to be examined. We used publicly available data deposited in the Gene Expression Omnibus (GEO) database (accession number GSE37008), which consisted of both blood-derived epigenome-wide DNA methylation data assayed using the Illumina Infinium HumanMethylation27 BeadArray and complete blood cell (CBC) counts among a community cohort of 94 non-diseased individuals. Constrained projection (CP) was used to obtain predictions of the proportions of lymphocytes, monocytes and granulocytes for each...

219 citations


Journal ArticleDOI
TL;DR: As millions worldwide are exposed to arsenic and evidence continues to support a role for in utero arsenic exposure in the development of a range of later life diseases, there is a need for more prospective studies examining arsenic's relation to early indicators of disease and at lower exposure levels.

178 citations


Journal ArticleDOI
TL;DR: In utero exposure to low levels of arsenic may affect the epigenome, and an association between urinary inorganic arsenic concentration and the estimated proportion of CD8+ T lymphocytes is found.
Abstract: Background: There is increasing epidemiologic evidence that arsenic exposure in utero, even at low levels found throughout much of the world, is associated with adverse reproductive outcomes and may contribute to long-term health effects. Animal models, in vitro studies, and human cancer data suggest that arsenic may induce epigenetic alterations, specifically by altering patterns of DNA methylation. Objectives: In this study we aimed to identify differences in DNA methylation in cord blood samples of infants with in utero, low-level arsenic exposure. Methods: DNA methylation of cord-blood derived DNA from 134 infants involved in a prospective birth cohort in New Hampshire was profiled using the Illumina Infinium Methylation450K array. In utero arsenic exposure was estimated using maternal urine samples collected at 24–28 weeks gestation. We used a novel cell mixture deconvolution methodology for examining the association between inferred white blood cell mixtures in infant cord blood and in utero arsenic exposure; we also examined the association between methylation at individual CpG loci and arsenic exposure levels. Results: We found an association between urinary inorganic arsenic concentration and the estimated proportion of CD8+ T lymphocytes (1.18; 95% CI: 0.12, 2.23). Among the top 100 CpG loci with the lowest p-values based on their association with urinary arsenic levels, there was a statistically significant enrichment of these loci in CpG islands (p = 0.009). Of those in CpG islands (n = 44), most (75%) exhibited higher methylation levels in the highest exposed group compared with the lowest exposed group. Also, several CpG loci exhibited a linear dose-dependent relationship between methylation and arsenic exposure. Conclusions: Our findings suggest that in utero exposure to low levels of arsenic may affect the epigenome. Long-term follow-up is planned to determine whether the observed changes are associated with health outcomes.

175 citations


Journal ArticleDOI
TL;DR: A review of issues encountered in the processing and analysis of array-based DNA methylation data and a summary of the advantages of recent approaches proposed for handling those issues are provided, focusing on approaches publicly available in open-source environments such as R and Bioconductor.
Abstract: The promise of epigenome-wide association studies and cancer-specific somatic DNA methylation changes in improving our understanding of cancer, coupled with the decreasing cost and increasing coverage of DNA methylation microarrays, has brought about a surge in the use of these technologies. Here, we aim to provide both a review of issues encountered in the processing and analysis of array-based DNA methylation data and a summary of the advantages of recent approaches proposed for handling those issues, focusing on approaches publicly available in open-source environments such as R and Bioconductor. We hope that the processing tools and analysis flowchart described herein will facilitate researchers to effectively use these powerful DNA methylation array-based platforms, thereby advancing our understanding of human health and disease.

172 citations


Journal ArticleDOI
TL;DR: Initial evidence is provided that in utero arsenic (As) exposure may be related to infant infection and infection severity, offering insight into the early-life impacts of fetal As exposure.

123 citations


Journal ArticleDOI
TL;DR: The potential impact of eliminating smoking on the number of bladder cancer cases prevented is larger for individuals at higher than lower genetic risk, which could have implications for targeted prevention strategies.
Abstract: Bladder cancer results from the combined effects of environmental and genetic factors, smoking being the strongest risk factor. Evaluating absolute risks resulting from the joint effects of smoking and genetic factors is critical to assess the public health relevance of genetic information. Analyses included up to 3,942 cases and 5,680 controls of European background in seven studies. We tested for multiplicative and additive interactions between smoking and 12 susceptibility loci, individually and combined as a polygenic risk score (PRS). Thirty-year absolute risks and risk differences by levels of the PRS were estimated for U.S. males aged 50 years. Six of 12 variants showed significant additive gene-environment interactions, most notably NAT2 (P = 7 × 10⁻⁴) and UGT1A6 (P = 8 × 10⁻⁴). The 30-year absolute risk of bladder cancer in U.S. males was 6.2% for all current smokers. This risk ranged from 2.9% for current smokers in the lowest quartile of the PRS to 9.9% for current smokers in the upper quartile. Risk difference estimates indicated that 8,200 cases would be prevented if elimination of smoking occurred in 100,000 men in the upper PRS quartile compared with 2,000 cases prevented by a similar effort in the lowest PRS quartile (P(additive) = 1 × 10⁻⁴). Thus, the potential impact of eliminating smoking on the number of bladder cancer cases prevented is larger for individuals at higher than lower genetic risk. Our findings could have implications for targeted prevention strategies. However, other smoking-related diseases, as well as practical and ethical considerations, need to be considered before any recommendations could be made.
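The prevented-case counts in the abstract above can be read as an absolute risk difference multiplied by the number of men at risk. The sketch below only illustrates that arithmetic; the function names are hypothetical, and it assumes the reported counts are a plain product of a 30-year risk difference and cohort size, which the abstract does not spell out.

```python
def cases_prevented(risk_difference: float, population: int) -> float:
    """Expected cases prevented = absolute risk difference x people at risk."""
    if not 0.0 <= risk_difference <= 1.0:
        raise ValueError("risk_difference must be a proportion between 0 and 1")
    return risk_difference * population


def implied_risk_difference(prevented: float, population: int) -> float:
    """Invert the relation: the risk difference implied by a prevented-case count."""
    if population <= 0:
        raise ValueError("population must be positive")
    return prevented / population


# Figures quoted in the abstract: cases prevented per 100,000 men.
POPULATION = 100_000
upper_quartile_rd = implied_risk_difference(8_200, POPULATION)
lower_quartile_rd = implied_risk_difference(2_000, POPULATION)

# Ratio of the two effects on the additive (risk difference) scale.
additive_ratio = upper_quartile_rd / lower_quartile_rd
```

Under that assumption, the implied 30-year risk differences are 8.2 percentage points in the upper PRS quartile and 2.0 percentage points in the lowest, so the same intervention prevents roughly four times as many cases in the higher-risk group.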

105 citations


Journal ArticleDOI
TL;DR: An assessment of the association between a history of photosensitizing medication use and non-melanoma skin cancer found an increased risk of basal cell carcinoma with tetracycline use and of squamous cell carcinoma with diuretic use, suggesting that appropriate counseling regarding sun exposure may reduce skin cancer in patients exposed to these medications.

100 citations


Journal ArticleDOI
TL;DR: The applicability of Bayesian networks (BNs) for discovering relations between genes, environment, and disease is reviewed and a variety of algorithms for learning the structure of a network from observational data are described.
Abstract: We review the applicability of Bayesian networks (BNs) for discovering relations between genes, environment, and disease. By translating probabilistic dependencies among variables into graphical models and vice versa, BNs provide a comprehensible and modular framework for representing complex systems. We first describe the Bayesian network approach and its applicability to understanding the genetic and environmental basis of disease. We then describe a variety of algorithms for learning the structure of a network from observational data. Because of their relevance to real-world applications, the topics of missing data and causal interpretation are emphasized. The BN approach is then exemplified through application to data from a population-based study of bladder cancer in New Hampshire, USA. For didactical purposes, we intentionally keep this example simple. When applied to complete data records, we find only minor differences in the performance and results of different algorithms. Subsequent incorporation of partial records through application of the EM algorithm gives us greater power to detect relations. Allowing for network structures that depart from a strict causal interpretation also enhances our ability to discover complex associations including gene-gene (epistasis) and gene-environment interactions. While BNs are already powerful tools for the genetic dissection of disease and generation of prognostic models, there remain some conceptual and computational challenges. These include the proper handling of continuous variables and unmeasured factors, the explicit incorporation of prior knowledge, and the evaluation and communication of the robustness of substantive conclusions to alternative assumptions and data manifestations.

86 citations


Journal ArticleDOI
TL;DR: The expression of AQP9 was identified as a potential fetal biomarker for arsenic exposure and a positive association between the placental expression of phospholipase ENPP2 and infant birth weight was identified.
Abstract: Epidemiologic studies and animal models suggest that in utero arsenic exposure affects fetal health, with a negative association between maternal arsenic ingestion and infant birth weight often observed. However, the molecular mechanisms for this association remain elusive. In the present study, we aimed to increase our understanding of the impact of low-dose arsenic exposure on fetal health by identifying possible arsenic-associated fetal tissue biomarkers in a cohort of pregnant women exposed to arsenic at low levels. Arsenic concentrations were determined from the urine samples of a cohort of 133 pregnant women from New Hampshire. Placental tissue samples collected from enrollees were homogenized and profiled for gene expression across a panel of candidate genes, including known arsenic regulated targets and genes involved in arsenic transport, metabolism, or disease susceptibility. Multivariable adjusted linear regression models were used to examine the relationship of candidate gene expression with arsenic exposure or with birth weight of the baby. Placental expression of the arsenic transporter AQP9 was positively associated with maternal urinary arsenic levels during pregnancy (coefficient estimate: 0.25; 95% confidence interval: 0.05 – 0.45). Placental expression of AQP9 related to expression of the phospholipase ENPP2 which was positively associated with infant birth weight (coefficient estimate: 0.28; 95% CI: 0.09 – 0.47). A structural equation model indicated that these genes may mediate arsenic’s effect on infant birth weight (coefficient estimate: -0.009; 95% confidence interval: -0.032 – -0.001; 10,000 replications for bootstrapping). We identified the expression of AQP9 as a potential fetal biomarker for arsenic exposure. Further, we identified a positive association between the placental expression of phospholipase ENPP2 and infant birth weight. These findings suggest a path by which arsenic may affect birth outcomes.

79 citations


Journal ArticleDOI
TL;DR: It is suggested that arsenic exposure at levels common in the United States relates to SCC and that arsenic metabolism ability does not modify the association.
Abstract: Background: Chronic high arsenic exposure is associated with squamous cell carcinoma (SCC) of the skin, and inorganic arsenic (iAs) metabolites may play an important role in this association. Howev...

66 citations


Journal ArticleDOI
TL;DR: The findings support the hypothesis that an LCS approach can offer greater insight into complex patterns of association; the methodology appears to be well suited to the dissection of disease heterogeneity, a key component in the advancement of personalized medicine.

Journal ArticleDOI
TL;DR: Overall, the association between cutaneous HPVs and skin cancers appears to be specific to SCC and to genus beta HPVs in a general US population.
Abstract: Human papillomavirus (HPV) infection is common worldwide and, in immunodeficient populations, may contribute to the pathogenesis of keratinocyte cancers, particularly squamous cell carcinomas (SCC). However, their role in SCC in the general population is less clear. We conducted a comprehensive analysis to investigate the independent effects of seropositivity for cutaneous alpha, beta and gamma HPV types on risk of SCC, and a meta-analysis of the available literature. In a population-based case-control study from New Hampshire, USA (n = 1,408), histologically confirmed SCC cases and controls were tested for L1 antibodies to alpha, beta and gamma cutaneous HPV types 2–5, 7–10, 15, 17, 20, 23, 24, 27b, 36, 38, 48–50, 57, 65, 75–77, 88, 92, 95, 96, 101, 103 and 107 using multiplex serology. An increasing risk of SCC with number of beta HPVs to which an individual tested positive was observed even among those seronegative for gamma types (p for trend = 0.016) with an odds ratio of 1.95 (95% confidence interval (CI) = 1.07–3.56) for four or more beta types positive. In a meta-analysis of six case-control studies, increased SCC risks in relation to beta HPV seropositivity were found across studies (meta odds ratio = 1.45, CI = 1.27–1.66). While the prevalence of gamma HPVs assayed was somewhat higher among SCC cases than controls, the association was only weakly evident among those seronegative for beta HPVs. Overall, the association between cutaneous HPVs and skin cancers appears to be specific to SCC and to genus beta HPVs in a general US population.
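Odds ratios such as those quoted above are usually analyzed on the log scale, where a 95% confidence interval is symmetric around the point estimate. The sketch below recovers the standard error of the log odds ratio from a reported interval. It assumes the intervals are Wald-type intervals on the log scale, which the abstract does not state, and the function names are illustrative only.

```python
import math


def log_or_standard_error(lower: float, upper: float, z_crit: float = 1.96) -> float:
    """Standard error of log(OR) implied by a two-sided CI (assumed Wald-type)."""
    if not 0 < lower < upper:
        raise ValueError("need 0 < lower < upper")
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)


def wald_z(odds_ratio: float, lower: float, upper: float) -> float:
    """z statistic for H0: OR = 1, using the standard error implied by the CI."""
    return math.log(odds_ratio) / log_or_standard_error(lower, upper)


def ci_midpoint_on_log_scale(lower: float, upper: float) -> float:
    """Geometric mean of the CI bounds; should sit close to the reported OR."""
    return math.sqrt(lower * upper)


# Estimates quoted in the abstract.
se_four_plus_beta = log_or_standard_error(1.07, 3.56)   # OR 1.95 for >= 4 beta types
z_four_plus_beta = wald_z(1.95, 1.07, 3.56)
se_meta = log_or_standard_error(1.27, 1.66)             # meta odds ratio 1.45
center_check = ci_midpoint_on_log_scale(1.07, 3.56)     # compare with 1.95
```

The much smaller standard error for the meta odds ratio reflects the pooling of six studies. The z statistic here is not the abstract's "p for trend", which comes from a different test.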

Journal ArticleDOI
TL;DR: Diet can be an important contributor to total arsenic exposure in U.S. populations regardless of arsenic concentrations in drinking water, and dietary exposure to arsenic in the US warrants consideration as a potential health risk.
Abstract: Background: Limited data exist on the contribution of dietary sources of arsenic to an individual’s total exposure, particularly in populations with exposure via drinking water. Here, the association between diet and toenail arsenic concentrations (a long-term biomarker of exposure) was evaluated for individuals with measured household tap water arsenic. Foods known to be high in arsenic, including rice and seafood, were of particular interest.

Journal ArticleDOI
TL;DR: CART and random forest models extracted decision rules and accurately predicted an expert's exposure decisions for the majority of jobs, and identified questionnaire response patterns that would require further expert review if the rules were applied to other jobs in the same or different study.
Abstract: Objectives: Evaluating occupational exposures in population-based case-control studies often requires exposure assessors to review each study participant's reported occupational information job-by-job to derive exposure estimates. Although such assessments likely have underlying decision rules, they usually lack transparency, are time consuming and have uncertain reliability and validity. We aimed to identify the underlying rules to enable documentation, review and future use of these expert-based exposure decisions. Methods: Classification and regression trees (CART, predictions from a single tree) and random forests (predictions from many trees) were used to identify the underlying rules from the questionnaire responses and an expert's exposure assignments for occupational diesel exhaust exposure for several metrics: binary exposure probability and ordinal exposure probability, intensity and frequency. Data were split into training (n=10,488 jobs), testing (n=2,247) and validation (n=2,248) datasets. Results: The CART and random forest models' predictions agreed with 92-94% of the expert's binary probability assignments. For ordinal probability, intensity and frequency metrics, the two models extracted decision rules more successfully for unexposed and highly exposed jobs (86-90% and 57-85%, respectively) than for low or medium exposed jobs (7-71%). Conclusions: CART and random forest models extracted decision rules and accurately predicted an expert's exposure decisions for the majority of jobs, and identified questionnaire response patterns that would require further expert review if the rules were applied to other jobs in the same or different study. This approach makes the exposure assessment process in case-control studies more transparent, and creates a mechanism to efficiently replicate exposure decisions in future studies.
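To make the idea of extracting a decision rule concrete, the sketch below shows the core step a CART-style tree repeats at each node: choosing the questionnaire answer that best separates the expert's labels by Gini impurity. It is a minimal illustration, not the study's implementation; the question names, answers and labels are hypothetical toy values, and a real tree would recurse on each side and a random forest would repeat this over many resampled trees.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_binary_split(rows, labels):
    """Find the rule `row[question] == answer` with the lowest weighted Gini.

    rows   -- list of dicts mapping a question name to a categorical answer
    labels -- the expert's assignment for each row
    Returns (question, answer, weighted_gini), or None if no split is possible.
    """
    n = len(rows)
    best = None
    for question in sorted({q for row in rows for q in row}):
        for answer in sorted({row.get(question) for row in rows}, key=str):
            left = [y for row, y in zip(rows, labels) if row.get(question) == answer]
            right = [y for row, y in zip(rows, labels) if row.get(question) != answer]
            if not left or not right:
                continue  # a split must send at least one job to each side
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (question, answer, score)
    return best


# Hypothetical toy questionnaire responses and expert labels (not study data).
toy_rows = [
    {"drives_diesel_vehicle": "yes", "works_indoors": "no"},
    {"drives_diesel_vehicle": "yes", "works_indoors": "yes"},
    {"drives_diesel_vehicle": "no", "works_indoors": "no"},
    {"drives_diesel_vehicle": "no", "works_indoors": "yes"},
    {"drives_diesel_vehicle": "yes", "works_indoors": "no"},
    {"drives_diesel_vehicle": "no", "works_indoors": "yes"},
]
toy_labels = ["exposed", "exposed", "unexposed", "unexposed", "exposed", "unexposed"]

toy_rule = best_binary_split(toy_rows, toy_labels)
```

In the toy data the selected rule splits on the hypothetical `drives_diesel_vehicle` question, because that answer alone reproduces every label. The training, testing and validation sizes reported in the abstract correspond to roughly a 70/15/15 split of the jobs.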

Journal ArticleDOI
TL;DR: It is suggested that SLC14A1 could be a unique urea transporter in the bladder that has the ability to influence urine concentration and that this mechanism might explain the increased bladder cancer susceptibility associated with rs10775480.
Abstract: Genome-wide association studies (GWAS) identified associations between markers within the solute carrier family 14 (urea transporter), member 1 (SLC14A1) gene and risk of bladder cancer. SLC14A1 defines the Kidd blood groups in erythrocytes and is also involved in concentration of the urine in the kidney. We evaluated the association between a representative genetic variant (rs10775480) of SLC14A1 and urine concentration, as measured by urinary specific gravity (USG), in a subset of 275 population-based controls enrolled in the New England Bladder Cancer Study. Overnight urine samples were collected, and USG was measured using refractometry. Analysis of covariance was used to estimate adjusted least square means for USG in relation to rs10775480. We also examined the mRNA expression of both urea transporters, SLC14A1 and SLC14A2, in a panel of human tissues. USG was decreased with each copy of the rs10775480 risk T allele (p-trend = 0.011), with a significant difference observed for CC vs. TT genotypes (Tukey p-value = 0.024). RNA-sequencing in the bladder tissue showed high expression of SLC14A1 and the absence of SLC14A2, while both transporters were expressed in the kidney. We suggest that the molecular phenotype of this GWAS finding is the genotype-specific biological activity of SLC14A1 in the bladder tissue. Our data suggest that SLC14A1 could be a unique urea transporter in the bladder that has the ability to influence urine concentration and that this mechanism might explain the increased bladder cancer susceptibility associated with rs10775480.

Journal ArticleDOI
TL;DR: Regular use of nonaspirin, nonselective NSAIDs was associated with reduced bladder cancer risk, with a statistically significant inverse trend in risk with duration of use, mainly by ibuprofen, and a previously unrecognized risk associated with use of COX‐2 inhibitors was observed.
Abstract: A few epidemiologic studies have found that use of nonsteroidal anti-inflammatory drugs (NSAIDs) is associated with reduced risk of bladder cancer. However, the effects of specific NSAID use and individual variability in risk have not been well studied. We examined the association between NSAIDs use and bladder cancer risk, and its modification by 39 candidate genes related to NSAID metabolism. A population-based case-control study was conducted in northern New England, enrolling 1,171 newly diagnosed cases and 1,418 controls. Regular use of nonaspirin, nonselective NSAIDs was associated with reduced bladder cancer risk, with a statistically significant inverse trend in risk with duration of use (ORs of 1.0, 0.8, 0.6 and 0.6 for <5, 5-9, 10-19 and 20+ years, respectively; p(trend) = 0.015). This association was driven mainly by ibuprofen; significant inverse trends in risk with increasing duration and dose of ibuprofen were observed (p(trend) = 0.009 and 0.054, respectively). The reduced risk from ibuprofen use was limited to individuals carrying the T allele of a single nucleotide polymorphism (rs4646450) compared to those who did not use ibuprofen and did not carry the T allele in the CYP3A locus, providing new evidence that this association might be modified by polymorphisms in genes that metabolize ibuprofen. Significant positive trends in risk with increasing duration and cumulative dose of selective cyclooxygenase (COX-2) inhibitors were observed. Our results are consistent with those from previous studies linking use of NSAIDs, particularly ibuprofen, with reduced risk. We observed a previously unrecognized risk associated with use of COX-2 inhibitors, which merits further evaluation.

Journal ArticleDOI
TL;DR: A U.S. study of American Indians from the Strong Heart Study reports that participants with a urinary arsenic concentration greater than 15.7 μg/g creatinine were 1.65, 1.71, and 3.03 times more likely to die of CVD, coronary heart disease, and stroke, respectively, than their counterparts with urinary arsenic levels less than 5.8 μg/g creatinine.
Abstract: Mounting evidence suggests that exposure to chemicals and other environmental substances, such as ambient urban air particles, cadmium, lead, and inorganic arsenic, could have a profound effect on cardiovascular disease (CVD) risk. Inorganic arsenic, a known carcinogen, occurs naturally in groundwater, exposing millions of people in the United States and worldwide. Epidemiologic studies in villages of southwestern Taiwan with high levels of arsenic in groundwater (median, 780 μg/L) provided early evidence of a dose–response relationship of water arsenic concentrations at less than 300 μg/L, 300 to 590 μg/L, and greater than 590 μg/L with CVD mortality (1). Arsenic exposure has also been related to increased mortality from acute myocardial infarction in Chile, where arsenic concentrations in drinking water increased from 90 to 870 μg/L between 1958 and 1970 (2). Recent prospective studies from Bangladesh reported a dose–response relationship between arsenic exposure and CVD mortality at well water arsenic concentration of 10 to 300 μg/L (3, 4) or urinary creatinine–adjusted arsenic concentration of 106 to 641 μg/g creatinine (3), although estimates for lower exposure levels were less precise. In this issue, Moon and colleagues (5) report results from a U.S. study of American Indians from the Strong Heart Study. Study participants with a urinary arsenic concentration greater than 15.7 μg/g creatinine were 1.65, 1.71, and 3.03 times more likely to die of CVD, coronary heart disease, and stroke, respectively, than their counterparts with urinary arsenic levels less than 5.8 μg/g creatinine (5). The risk for incident CVD associated with urinary arsenic, although lower than that for CVD mortality, was also elevated.

Book ChapterDOI
03 Apr 2013
TL;DR: A hybrid algorithm, MIN-guided RF (MINGRF), is proposed that overlays the neighborhood structure of MIN onto the growth of trees; compared with standard RF on a bladder cancer dataset, MINGRF produces trees with better accuracy at a smaller computational cost.
Abstract: Genome-wide association studies (GWAS) have become a powerful and affordable tool to study the genetic variations associated with common human diseases. However, only few of the loci found are associated with a moderate or large increase in disease risk and therefore using GWAS findings to study the underlying biological mechanisms remains a challenge. One possible cause for the "missing heritability" is the gene-gene interactions or epistasis. Several methods have been developed and among them Random Forest (RF) is a popular one. RF has been successfully applied in many studies. However, it is also known to rely on marginal main effects. Meanwhile, networks have become a popular approach for characterizing the space of pairwise interactions systematically, which can be informative for classification problems. In this study, we compared the findings of Mutual Information Network (MIN) to that of RF and observed that the variables identified by the two methods overlap with differences. To integrate advantages of MIN into RF, we proposed a hybrid algorithm, MIN-guided RF (MINGRF), which overlays the neighborhood structure of MIN onto the growth of trees. After comparing MINGRF to the standard RF on a bladder cancer dataset, we conclude that MINGRF produces trees with a better accuracy at a smaller computational cost.
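The quantity underlying a mutual information network is the mutual information between pairs of discrete variables, for example a genotype and case status. The sketch below computes a plain plug-in estimate in bits. It is an assumption-laden illustration: the abstract does not say which estimator, thresholds or edge definitions MIN uses, and the function name and toy inputs are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), count in count_xy.items():
        p_joint = count / n
        p_independent = (count_x[x] / n) * (count_y[y] / n)
        mi += p_joint * math.log2(p_joint / p_independent)
    return mi


# Hypothetical toy vectors: a variable that fully determines the outcome,
# and one that is statistically independent of it.
mi_dependent = mutual_information([0, 1, 0, 1], [0, 1, 0, 1])
mi_independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

A value near zero indicates no detectable dependence, while larger values indicate that knowing one variable reduces uncertainty about the other. With small samples, plug-in estimates are biased upward, so a network built this way would normally need some form of thresholding or permutation testing.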