Showing papers by "Marc Jan Bonder" published in 2014

Open Access

Journal Article

Genotype harmonizer: automatic strand alignment and format conversion for genotype data integration

Patrick Deelen¹, Marc Jan Bonder¹, K. Joeri van der Velde¹, Harm-Jan Westra¹, Erwin Winder¹, Dennis Hendriksen¹, Lude Franke¹, Morris A. Swertz¹

University Medical Center Groningen¹

11 Dec 2014-BMC Research Notes

TL;DR: Genotype Harmonizer is a command-line tool to harmonize genetic datasets by automatically solving issues concerning genomic strand and file format and can be easily integrated as a step in routine meta-analysis and imputation pipelines.

Abstract: To gain statistical power or to allow fine mapping, researchers typically want to pool data before meta-analyses or genotype imputation. However, the necessary harmonization of genetic datasets is currently error-prone because of the many different file formats and a lack of clarity about which genomic strand is used as reference. Genotype Harmonizer (GH) is a command-line tool to harmonize genetic datasets by automatically solving issues concerning genomic strand and file format. GH solves the unknown strand issue by aligning ambiguous A/T and G/C SNPs to a specified reference, using linkage disequilibrium patterns without prior knowledge of the strands used. GH supports many common GWAS/NGS genotype formats, including PLINK, binary PLINK, VCF, SHAPEIT2 and Oxford GEN. GH is implemented in Java, and a large part of its functionality can also be used through the Java 'Genotype-IO' API. All software is open source under the LGPLv3 license and available from http://www.molgenis.org/systemsgenetics. GH can be used to harmonize genetic datasets across different file formats and can be easily integrated as a step in routine meta-analysis and imputation pipelines.
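The strand-alignment idea in this abstract can be illustrated with a minimal Python sketch. It is not the Genotype Harmonizer implementation or its API: the function names, the dosage coding (count of the first listed allele per sample) and the simple majority vote over neighbouring SNPs are all assumptions made for illustration. The sketch relies on one observation: if an A/T or G/C SNP is reported on the opposite strand, its dosages are mirrored, so its correlation with nearby, already aligned SNPs changes sign relative to the reference.

```python
# Hypothetical helper names; this is not the Genotype Harmonizer API.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_ambiguous(allele1, allele2):
    """A/T and G/C SNPs read the same on both strands, so alleles alone cannot reveal the strand."""
    return COMPLEMENT[allele1] == allele2


def flip(alleles):
    """Translate a tuple of alleles to the opposite strand."""
    return tuple(COMPLEMENT[a] for a in alleles)


def pearson(x, y):
    """Pearson correlation of two equal-length dosage vectors (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5 if var_x and var_y else 0.0


def needs_flip(study_snp, study_neighbours, ref_snp, ref_neighbours):
    """Vote over neighbouring SNPs: flip when the LD sign mostly disagrees with the reference.

    Each neighbour list holds dosage vectors for the same SNPs, in the same order,
    measured in the study samples and in the reference samples respectively.
    """
    votes = 0
    for study_nb, ref_nb in zip(study_neighbours, ref_neighbours):
        product = pearson(study_snp, study_nb) * pearson(ref_snp, ref_nb)
        if product > 0:
            votes += 1   # same LD direction: consistent with the reference strand
        elif product < 0:
            votes -= 1   # opposite LD direction: evidence for a strand swap
    return votes < 0


# Toy data: one ambiguous SNP and one already aligned neighbour.
ref_snp, ref_nb = [0, 1, 2, 1, 0, 2], [0, 1, 2, 2, 0, 1]
study_nb = [2, 0, 1, 2, 1, 0]
study_snp_same_strand = [2, 0, 1, 1, 2, 0]
study_snp_other_strand = [2 - d for d in study_snp_same_strand]  # mirrored dosages
```

In this toy setup `needs_flip` flags only the mirrored dosage vector. A real tool would also need thresholds for weak LD and a minimum number of informative neighbours, which this sketch leaves out.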

121 citations

Journal Article

Genetic and epigenetic regulation of gene expression in fetal and adult human livers

Marc Jan Bonder¹, Silva Kasela², Mart Kals², Riin Tamm², Kaie Lokk², Isabel Barragan³, Wim A. Buurman⁴, Patrick Deelen¹, Jan Greve, Maxim Ivanov³, Sander S. Rensen⁴, Jana V. van Vliet-Ostaptchouk¹, Marcel G. M. Wolfs¹, Jingyuan Fu¹, Marten H. Hofker¹, Cisca Wijmenga¹, Alexandra Zhernakova¹, Magnus Ingelman-Sundberg³, Lude Franke¹, Lili Milani²

University Medical Center Groningen¹, University of Tartu², Karolinska Institutet³, Maastricht University⁴

04 Oct 2014-BMC Genomics

TL;DR: The authors' analyses generated a comprehensive resource of factors involved in the regulation of hepatic gene expression and allowed them to estimate the proportion of variation in gene expression that could be attributed to genetic and epigenetic variation, both crucial to understanding differences in drug response and the etiology of liver diseases.

Abstract: The liver plays a central role in the maintenance of homeostasis and health in general. However, there is substantial inter-individual variation in hepatic gene expression, and although numerous genetic factors have been identified, less is known about the epigenetic factors. By analyzing the methylomes and transcriptomes of 14 fetal and 181 adult livers, we identified 657 differentially methylated genes with adult-specific expression; these genes were enriched for transcription factor binding sites of HNF1A and HNF4A. We also identified 1,000 genes specific to fetal liver, which were enriched for GATA1, STAT5A, STAT5B and YY1 binding sites. We saw strong liver-specific effects of single nucleotide polymorphisms on both methylation levels (28,447 unique CpG sites (meQTL)) and gene expression levels (526 unique genes (eQTL)), at a false discovery rate (FDR) < 0.05. Of the 526 unique eQTL-associated genes, 293 correlated significantly not only with genetic variation but also with methylation levels. The tissue specificity of these associations was analyzed in muscle, subcutaneous adipose tissue and visceral adipose tissue. We observed that meQTL were more stable between tissues than eQTL, and we found very strong tissue specificity for the identified associations between CpG methylation and gene expression. Our analyses generated a comprehensive resource of factors involved in the regulation of hepatic gene expression and allowed us to estimate the proportion of variation in gene expression that could be attributed to genetic and epigenetic variation, both crucial to understanding differences in drug response and the etiology of liver diseases.

120 citations

Journal Article

Breast Cancer Subtype Specific Classifiers of Response to Neoadjuvant Chemotherapy Do Not Outperform Classifiers Trained on All Subtypes

Jorma J. de Ronde¹, Marc Jan Bonder¹, Esther H. Lips¹, Sjoerd Rodenhuis¹, Lodewyk F. A. Wessels²

Netherlands Cancer Institute¹, Delft University of Technology²

18 Feb 2014-PLOS ONE

TL;DR: Because which type of predictor performs better (subtype-specific or generic) depends on the specific context, it is highly recommended to evaluate both specific and generic predictors when attempting to predict treatment response in breast cancer.

Abstract: Introduction: Despite continuous efforts, not a single predictor of breast cancer chemotherapy resistance has made it into the clinic yet. However, it has become clear in recent years that breast cancer is a collection of molecularly distinct diseases. With ever-increasing amounts of breast cancer data becoming available, we set out to study whether gene-expression-based predictors of chemotherapy resistance that are specific for breast cancer subtypes can improve upon the performance of generic predictors. Methods: We trained predictors of resistance that were specific for a subtype and generic predictors that were not specific for a particular subtype, i.e. trained on all subtypes simultaneously. Through a rigorous double-loop cross-validation, we compared the performance of these two types of predictors on the different subtypes on a large set of tumors all profiled on the same expression platform (n = 394). We evaluated predictors based on either mRNA gene expression or clinical features. Results: For HER2+, ER− breast cancer, the subtype-specific predictor based on clinical features outperformed the generic, non-specific predictor. This can be explained by the fact that the generic predictor included HER2 and ER status, features that are predictive over the whole set but not within this subtype. In all other scenarios the generic predictors outperformed the subtype-specific predictors or showed equal performance. Conclusions: Since it depends on the specific context which type of predictor (subtype-specific or generic) performed better, it is highly recommended to evaluate both specific and generic predictors when attempting to predict treatment response in breast cancer.
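The "double-loop cross-validation" named in the Methods is commonly called nested cross-validation. A minimal Python sketch of that general scheme follows. It is not the authors' pipeline: the 1-D k-nearest-neighbour classifier, the contiguous unshuffled folds, the fold counts and all function names are assumptions chosen to keep the example self-contained. The inner loop picks a hyperparameter using training data only, and the outer loop scores the tuned model on samples that played no part in that choice.

```python
def k_folds(n, k):
    """Split indices 0..n-1 into k contiguous folds (toy; real use would shuffle or stratify)."""
    return [list(range(i * n // k, (i + 1) * n // k)) for i in range(k)]


def split(data, fold):
    """Return (train, test) where test holds the samples whose indices are in fold."""
    held_out = set(fold)
    train = [d for i, d in enumerate(data) if i not in held_out]
    test = [data[i] for i in fold]
    return train, test


def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x; samples are (feature, label) pairs."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return int(2 * sum(label for _, label in nearest) > len(nearest))


def accuracy(train, test, k):
    return sum(knn_predict(train, x, k) == y for x, y in test) / len(test)


def inner_cv_score(train, k, n_inner):
    """Inner loop: mean accuracy of hyperparameter k, estimated within the training set only."""
    scores = [accuracy(*split(train, fold), k) for fold in k_folds(len(train), n_inner)]
    return sum(scores) / len(scores)


def nested_cv(data, param_grid, n_outer=3, n_inner=2):
    """Outer loop: tune on each training part, then score once on the untouched outer fold."""
    outer_scores = []
    for fold in k_folds(len(data), n_outer):
        train, test = split(data, fold)
        best_k = max(param_grid, key=lambda k: inner_cv_score(train, k, n_inner))
        outer_scores.append(accuracy(train, test, best_k))
    return sum(outer_scores) / len(outer_scores)


# Toy data: two well-separated classes, interleaved so every fold contains both.
toy = [(x, label) for i in range(6) for x, label in ((i, 0), (20 + i, 1))]
estimate = nested_cv(toy, param_grid=[1, 3])
```

Comparing a subtype-specific with a generic predictor in this scheme would mean calling `nested_cv` once on a single subtype's samples and once with training folds drawn from all subtypes, while scoring both on the same subtype's held-out samples.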

13 citations

Posted Content

Calling genotypes from public RNA-sequencing data enables identification of genetic variants that affect gene-expression levels

Patrick Deelen¹, Daria V. Zhernakova¹, Mark de Haan¹, Marijke R. van der Sijde¹, Marc Jan Bonder¹, Juha Karjalainen¹, K. Joeri van der Velde¹, Kristin M. Abbott¹, Jingyuan Fu¹, Cisca Wijmenga¹, Richard J. Sinke¹, Morris A. Swertz¹, Lude Franke¹

University of Groningen¹

01 Aug 2014-bioRxiv

TL;DR: Given the exponential growth of the number of publicly available RNA-seq samples, this approach is expected to become relevant for studying tissue-specific effects of rare pathogenic genetic variants.

Abstract: Given the increasing number of RNA-seq samples in the public domain, we studied to what extent expression quantitative trait loci (eQTLs) and allele-specific expression (ASE) can be identified in public RNA-seq data while also deriving the genotypes from the RNA-seq reads. In total, 4,978 human RNA-seq runs, representing many different tissues and cell types, passed quality control. Even though these data originated from many different laboratories, samples reflecting the same cell type clustered together, suggesting that technical biases due to different sequencing protocols were limited. We derived genotypes from the RNA-seq reads and imputed non-coding variants. In a joint analysis of 1,262 samples combined, we identified cis-eQTL effects for 8,034 unique genes. Additionally, we observed strong ASE effects for 34 rare pathogenic variants, corroborating previously observed effects on the corresponding protein levels. Given the exponential growth of the number of publicly available RNA-seq samples, we expect this approach will become relevant for studying tissue-specific effects of rare pathogenic genetic variants.

4 citations