Showing papers by "Chris Sander" published in 2014


Journal ArticleDOI
Adam J. Bass, Vesteinn Thorsson, Ilya Shmulevich, Sheila Reynolds + 254 more (32 institutions)
11 Sep 2014-Nature
TL;DR: A comprehensive molecular evaluation of 295 primary gastric adenocarcinomas as part of The Cancer Genome Atlas (TCGA) project is described and a molecular classification dividing gastric cancer into four subtypes is proposed.
Abstract: Gastric cancer was the world’s third leading cause of cancer mortality in 2012, responsible for 723,000 deaths [1]. The vast majority of gastric cancers are adenocarcinomas, which can be further subdivided into intestinal and diffuse types according to the Lauren classification [2]. An alternative system, proposed by the World Health Organization, divides gastric cancer into papillary, tubular, mucinous (colloid) and poorly cohesive carcinomas [3]. These classification systems have little clinical utility, making the development of robust classifiers that can guide patient therapy an urgent priority. The majority of gastric cancers are associated with infectious agents, including the bacterium Helicobacter pylori [4] and Epstein–Barr virus (EBV). The distribution of histological subtypes of gastric cancer and the frequencies of H. pylori and EBV-associated gastric cancer vary across the globe [5]. A small minority of gastric cancer cases are associated with germline mutation in E-cadherin (CDH1) [6] or mismatch repair genes [7] (Lynch syndrome), whereas sporadic mismatch repair-deficient gastric cancers have epigenetic silencing of MLH1 in the context of a CpG island methylator phenotype (CIMP) [8]. Molecular profiling of gastric cancer has been performed using gene expression or DNA sequencing [9–12], but has not led to a clear biologic classification scheme. The goals of this study by The Cancer Genome Atlas (TCGA) were to develop a robust molecular classification of gastric cancer and to identify dysregulated pathways and candidate drivers of distinct classes of gastric cancer.

4,583 citations


Journal ArticleDOI
01 Jan 2014-Nature
TL;DR: In this paper, the authors report molecular profiling of 230 resected lung adenocarcinomas using messenger RNA, microRNA and DNA sequencing integrated with copy number, methylation and proteomic analyses.
Abstract: Adenocarcinoma of the lung is the leading cause of cancer death worldwide. Here we report molecular profiling of 230 resected lung adenocarcinomas using messenger RNA, microRNA and DNA sequencing integrated with copy number, methylation and proteomic analyses. High rates of somatic mutation were seen (mean 8.9 mutations per megabase). Eighteen genes were statistically significantly mutated, including RIT1 activating mutations and newly described loss-of-function MGA mutations which are mutually exclusive with focal MYC amplification. EGFR mutations were more frequent in female patients, whereas mutations in RBM10 were more common in males. Aberrations in NF1, MET, ERBB2 and RIT1 occurred in 13% of cases and were enriched in samples otherwise lacking an activated oncogene, suggesting a driver role for these events in certain tumours. DNA and mRNA sequence from the same tumour highlighted splicing alterations driven by somatic genomic changes, including exon 14 skipping in MET mRNA in 4% of cases. MAPK and PI(3)K pathway activity, when measured at the protein level, was explained by known mutations in only a fraction of cases, suggesting additional, unexplained mechanisms of pathway activation. These data establish a foundation for classification and further investigations of lung adenocarcinoma molecular pathogenesis.

4,104 citations


Journal ArticleDOI
John N. Weinstein, Rehan Akbani, Bradley M. Broom, Wenyi Wang + 293 more (30 institutions)
01 Jan 2014-Nature
TL;DR: Chromatin regulatory genes were more frequently mutated in urothelial carcinoma than in any other common cancer studied so far, indicating the future possibility of targeted therapy for chromatin abnormalities.
Abstract: Urothelial carcinoma of the bladder is a common malignancy that causes approximately 150,000 deaths per year worldwide. To date, no molecularly targeted agents have been approved for the disease. As part of The Cancer Genome Atlas project, we report here an integrated analysis of 131 urothelial carcinomas to provide a comprehensive landscape of molecular alterations. There were statistically significant recurrent mutations in 32 genes, including multiple genes involved in cell…

2,257 citations


Journal ArticleDOI
Nishant Agrawal, Rehan Akbani, B. Arman Aksoy, Adrian Ally + 239 more (1 institution)
23 Oct 2014-Cell
TL;DR: The genomic landscape of 496 PTCs is described and a reclassification of thyroid cancers into molecular subtypes that better reflect their underlying signaling and differentiation properties is proposed, which has the potential to improve their pathological classification and better inform the management of the disease.

2,096 citations


Journal ArticleDOI
TL;DR: The goal has always been to develop a system optimized to meet the demands of experimentalists not highly experienced in bioinformatics, and the PredictProtein results are presented as both text and a series of intuitive, interactive and visually appealing figures.
Abstract: PredictProtein is a meta-service for sequence analysis that has been predicting structural and functional features of proteins since 1992. Queried with a protein sequence it returns: multiple sequence alignments, predicted aspects of structure (secondary structure, solvent accessibility, transmembrane helices (TMSEG) and strands, coiled-coil regions, disulfide bonds and disordered regions) and function. The service incorporates analysis methods for the identification of functional regions (ConSurf), homology-based inference of Gene Ontology terms (metastudent), comprehensive subcellular localization prediction (LocTree3), protein–protein binding sites (ISIS2), protein–polynucleotide binding sites (SomeNA) and predictions of the effect of point mutations (non-synonymous SNPs) on protein function (SNAP2). Our goal has always been to develop a system optimized to meet the demands of experimentalists not highly experienced in bioinformatics. To this end, the PredictProtein results are presented as both text and a series of intuitive, interactive and visually appealing figures. The web server and sources are available at http://ppopen.rostlab.org.

577 citations


Journal ArticleDOI
25 Sep 2014-eLife
TL;DR: Analysis of correlated evolutionary sequence changes across proteins identifies residues that are close in space with sufficient accuracy to determine the three-dimensional structure of the protein complexes, and predicts protein–protein contacts in 32 complexes of unknown structure.
Abstract: DNA is often referred to as the ‘blueprint of life’, as this molecule contains the instructions that are required to build a living organism from a single cell. But these instructions largely play out through the proteins that DNA encodes; and most proteins do not work alone. Instead they come together in different combinations, or complexes, and a single protein may participate in many complexes with different activities. Proteins are so small that it is difficult to get clear information about what they look like. Visualizing protein complexes is even harder. Most protein–protein interactions remain poorly understood, even in the best-studied organisms such as humans, yeast, and bacteria. Proteins are made from smaller molecules, called amino acids, strung together one after the other. The order in which different amino acids are arranged in a protein determines the protein’s shape and ultimately its function. Like DNA, protein sequences can change over time. Sometimes, the sequence of one protein changes in a way that prevents it binding to another protein. If these two proteins must work together for an organism to survive, the second protein will often develop a compensating change that allows the protein–protein complex to reform. Identifying pairs of changes in the sequences of pairs of proteins suggests that the two proteins interact and gives some information about how the proteins fit together. Different species can have copies of the same proteins that have slightly different sequences. Since the DNA sequences from many different organisms are already known, there are now many opportunities to find sites in pairs of proteins that have evolved together, or co-evolved, over time. To find sites that seem to have co-evolved, Hopf et al. used a computer program based on an approach from statistical physics to look at pairs of proteins that were already known to form complexes. 
Co-evolving sites were found in over 300 pairs of proteins; including 76 where the structure of the complex was already known. When sites that were predicted to be co-evolving were then mapped to these known complex structures, the co-evolving sites were remarkably close to the true protein–protein contacts. This indicates that the information from the co-evolved sequences is sufficient to show how two proteins fit together. Hopf et al. then turned their attention to 82 pairs of proteins that were thought to interact, but where a structure was unavailable. For 32 of these pairs, structures of the entire complex could be predicted, showing how the two proteins might interact. Furthermore, when other researchers subsequently worked out the structure of one of these complexes, the prediction was a good match to the solved complex structure. The machinery of life is largely made up of proteins, which must interact in ever-changing but precise ways. The new methods developed by Hopf et al. provide a new way to discover and investigate the details of these interactions.

497 citations


Journal ArticleDOI
TL;DR: New frequency- and sequence-based approaches are used to comprehensively scan the genome for noncoding mutations with potential regulatory impact and identify recurrent mutations in regulatory elements upstream of PLEKHS1, WDR74 and SDHD, as well as previously identified mutations in the TERT promoter.
Abstract: Cancer primarily develops because of somatic alterations in the genome. Advances in sequencing have enabled large-scale sequencing studies across many tumor types, emphasizing the discovery of alterations in protein-coding genes. However, the protein-coding exome comprises less than 2% of the human genome. Here we analyze the complete genome sequences of 863 human tumors from The Cancer Genome Atlas and other sources to systematically identify noncoding regions that are recurrently mutated in cancer. We use new frequency- and sequence-based approaches to comprehensively scan the genome for noncoding mutations with potential regulatory impact. These methods identify recurrent mutations in regulatory elements upstream of PLEKHS1, WDR74 and SDHD, as well as previously identified mutations in the TERT promoter. SDHD promoter mutations are frequent in melanoma and are associated with reduced gene expression and poor prognosis. The non-protein-coding cancer genome remains widely unexplored, and our findings represent a step toward targeting the entire genome for clinical purposes.

477 citations


Journal ArticleDOI
TL;DR: A cohort of 279 head and neck cancers is profiled with next generation RNA and DNA sequencing to provide insight into the mechanisms by which HPV interacts with the human genome beyond expression of viral oncoproteins, and the results suggest that specific integration events are an integral component of viral oncogenesis.
Abstract: Previous studies have established that a subset of head and neck tumors contains human papillomavirus (HPV) sequences and that HPV-driven head and neck cancers display distinct biological and clinical features. HPV is known to drive cancer by the actions of the E6 and E7 oncoproteins, but the molecular architecture of HPV infection and its interaction with the host genome in head and neck cancers have not been comprehensively described. We profiled a cohort of 279 head and neck cancers with next generation RNA and DNA sequencing and show that 35 (12.5%) tumors displayed evidence of high-risk HPV types 16, 33, or 35. Twenty-five cases had integration of the viral genome into one or more locations in the human genome with statistical enrichment for genic regions. Integrations had a marked impact on the human genome and were associated with alterations in DNA copy number, mRNA transcript abundance and splicing, and both inter- and intrachromosomal rearrangements. Many of these events involved genes with documented roles in cancer. Cancers with integrated vs. nonintegrated HPV displayed different patterns of DNA methylation and both human and viral gene expressions. Together, these data provide insight into the mechanisms by which HPV interacts with the human genome beyond expression of viral oncoproteins and suggest that specific integration events are an integral component of viral oncogenesis.

311 citations


Journal ArticleDOI
TL;DR: It is demonstrated that copy number alteration (CNA) burden, a measure of the fraction of a tumor genome that is copy number altered, is prognostic for recurrence and metastasis and can be measured in diagnostic needle biopsies using low-input whole-genome sequencing, setting the stage for studies of prognostic impact in conservatively treated cohorts.
Abstract: Primary prostate cancer is the most common malignancy in men but has highly variable outcomes, highlighting the need for biomarkers to determine which patients can be managed conservatively. Few large prostate oncogenome resources currently exist that combine the molecular and clinical outcome data necessary to discover prognostic biomarkers. Previously, we found an association between relapse and the pattern of DNA copy number alteration (CNA) in 168 primary tumors, raising the possibility of CNA as a prognostic biomarker. Here we examine this question by profiling an additional 104 primary prostate cancers and updating the initial 168 patient cohort with long-term clinical outcome. We find that CNA burden across the genome, defined as the percentage of the tumor genome affected by CNA, was associated with biochemical recurrence and metastasis after surgery in these two cohorts, independent of the prostate-specific antigen biomarker or Gleason grade, a major existing histopathological prognostic variable in prostate cancer. Moreover, CNA burden was associated with biochemical recurrence in intermediate-risk Gleason 7 prostate cancers, independent of prostate-specific antigen or nomogram score. We further demonstrate that CNA burden can be measured in diagnostic needle biopsies using low-input whole-genome sequencing, setting the stage for studies of prognostic impact in conservatively treated cohorts.
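
The CNA burden defined in this abstract, the percentage of the tumor genome affected by copy number alteration, can be illustrated with a minimal sketch. The segment representation, the function name `cna_burden`, and the 0.2 log2-ratio cutoff are assumptions chosen for illustration; they are not the thresholds or data format used in the study.

```python
from typing import Iterable, Tuple

# Hypothetical segment format: (start, end, log2_copy_ratio) with half-open
# coordinates; segments are assumed not to overlap.
Segment = Tuple[int, int, float]


def cna_burden(segments: Iterable[Segment], genome_length: int,
               log2_threshold: float = 0.2) -> float:
    """Return the percentage of the genome covered by altered segments.

    A segment counts as altered when |log2 ratio| >= log2_threshold
    (an illustrative cutoff, not one taken from the paper).
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    altered = sum(end - start
                  for start, end, log2_ratio in segments
                  if abs(log2_ratio) >= log2_threshold)
    return 100.0 * altered / genome_length


if __name__ == "__main__":
    toy = [(0, 100, 0.0), (100, 150, 0.5), (150, 200, -0.4), (200, 1000, 0.05)]
    # 100 of the 1000 toy bases lie in altered segments.
    print(cna_burden(toy, genome_length=1000))
```

Because the result is a single percentage per tumor, it can be compared across samples regardless of which specific regions are gained or lost.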

297 citations


Journal ArticleDOI
S. Chatrchyan, Vardan Khachatryan, Albert M. Sirunyan, Armen Tumasyan + 2230 more (144 institutions)
TL;DR: The observed (expected) upper limit on the invisible branching fraction, 0.58 (0.44) at 95% confidence level, is interpreted in terms of a Higgs-portal model of dark matter interactions.
Abstract: A search for invisible decays of Higgs bosons is performed using the vector boson fusion and associated ZH production modes. In the ZH mode, the Z boson is required to decay to a pair of charged leptons or a $b\bar{b}$ quark pair. The searches use the 8 TeV pp collision dataset collected by the CMS detector at the LHC, corresponding to an integrated luminosity of up to 19.7 inverse femtobarns. Certain channels include data from 7 TeV collisions corresponding to an integrated luminosity of 4.9 inverse femtobarns. The searches are sensitive to non-standard-model invisible decays of the recently observed Higgs boson, as well as additional Higgs bosons with similar production modes and large invisible branching fractions. In all channels, the observed data are consistent with the expected standard model backgrounds. Limits are set on the production cross section times invisible branching fraction, as a function of the Higgs boson mass, for the vector boson fusion and ZH production modes. By combining all channels, and assuming standard model Higgs boson cross sections and acceptances, the observed (expected) upper limit on the invisible branching fraction at $m_H$=125 GeV is found to be 0.58 (0.44) at 95% confidence level. We interpret this limit in terms of a Higgs-portal model of dark matter interactions.

246 citations


Journal ArticleDOI
TL;DR: Tumors with somatic mutations in the proofreading exonuclease domain of DNA polymerase epsilon (POLE-exo*) exhibit a novel mutator phenotype, with markedly elevated TCT→TAT and TCG→TTG mutations and overall mutation frequencies often exceeding 100 mutations/Mb.
Abstract: Tumors with somatic mutations in the proofreading exonuclease domain of DNA polymerase epsilon (POLE-exo*) exhibit a novel mutator phenotype, with markedly elevated TCT→TAT and TCG→TTG mutations and overall mutation frequencies often exceeding 100 mutations/Mb. Here, we identify POLE-exo* tumors in numerous cancers and classify them into two groups, A and B, according to their mutational properties. Group A mutants are found only in POLE, whereas Group B mutants are found in POLE and POLD1 and appear to be nonfunctional. In Group A, cell-free polymerase assays confirm that mutations in the exonuclease domain result in high mutation frequencies with a preference for C→A mutation. We describe the patterns of amino acid substitutions caused by POLE-exo* and compare them to other tumor types. The nucleotide preference of POLE-exo* leads to increased frequencies of recurrent nonsense mutations in key tumor suppressors such as TP53, ATM, and PIK3R1. We further demonstrate that strand-specific mutation patterns arise from some of these POLE-exo* mutants during genome duplication. This is the first direct proof of leading strand-specific replication by human POLE, which has only been demonstrated in yeast so far. Taken together, the extremely high mutation frequency and strand specificity of mutations provide a unique identifier of eukaryotic origins of replication.
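
Two quantities in this abstract, the overall mutation frequency in mutations/Mb and the share of mutations in a given trinucleotide context such as TCT→TAT, are simple ratios. The sketch below shows the arithmetic under stated assumptions: the function names and the `(reference_trinucleotide, mutated_trinucleotide)` input format are hypothetical, and strand symmetry (collapsing reverse complements) is ignored for brevity.

```python
from typing import Iterable, Tuple


def mutations_per_mb(n_mutations: int, covered_bases: int) -> float:
    """Mutation frequency expressed as mutations per megabase of covered sequence."""
    if covered_bases <= 0:
        raise ValueError("covered_bases must be positive")
    return n_mutations / (covered_bases / 1_000_000)


def context_fraction(mutations: Iterable[Tuple[str, str]],
                     ref: str, alt: str) -> float:
    """Fraction of mutations whose trinucleotide change is exactly ref -> alt."""
    calls = list(mutations)
    if not calls:
        return 0.0
    hits = sum(1 for r, a in calls if (r, a) == (ref, alt))
    return hits / len(calls)


if __name__ == "__main__":
    # Toy numbers: 3,000 mutations over 30 Mb of covered sequence.
    print(mutations_per_mb(3000, 30_000_000))
    calls = [("TCT", "TAT"), ("TCT", "TAT"), ("TCG", "TTG"), ("ACA", "AAA")]
    print(context_fraction(calls, "TCT", "TAT"))
```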

Journal ArticleDOI
TL;DR: Analysis of outlier cases can facilitate identification of potential biomarkers for targeted agents, and two genes are implicated as candidates for further study in this class of drugs.
Abstract: Purpose: Rapalogs are allosteric mTOR inhibitors and approved agents for advanced kidney cancer. Reports of clonal heterogeneity in this disease challenge the concept of targeted monotherapy, yet a small subset of patients derives extended benefit. Our aim was to analyze such outliers and explore the genomic background of extreme rapalog sensitivity in the context of intratumor heterogeneity. Experimental Design: We analyzed archived tumor tissue of 5 patients with renal cell carcinoma, who previously achieved durable disease control with rapalogs (median duration, 28 months). DNA was extracted from spatially separate areas of primary tumors and metastases. Custom target capture and ultradeep sequencing was used to identify alterations across 230 target genes. Whole-exome sequence analysis was added to investigate genes beyond this original target list. Results: Five long-term responders contributed 14 specimens to explore clonal heterogeneity. Genomic alterations with activating effect on mTOR signaling were detected in 11 of 14 specimens, offering plausible explanation for exceptional treatment response through alterations in two genes (TSC1 and MTOR). In two subjects, distinct yet functionally convergent alterations activated the mTOR pathway in spatially separate sites. In 1 patient, concurrent genomic events occurred in two separate pathway components across different tumor regions. Conclusions: Analysis of outlier cases can facilitate identification of potential biomarkers for targeted agents, and we implicate two genes as candidates for further study in this class of drugs. The previously reported phenomenon of clonal convergence can occur within a targetable pathway which might have implications for biomarker development beyond this disease and this class of agents. Clin Cancer Res; 20(7); 1955–64. ©2014 AACR .

Journal ArticleDOI
TL;DR: This work shows that the TM-scores of top-ranked models are improved by on average 33% using PconsFold compared with the original version of EVfold and using Rosetta instead of CNS does not significantly improve global model accuracy, but the chemistry of models generated with Rosetta is improved.
Abstract: Motivation: Recently it has been shown that the quality of protein contact prediction from evolutionary information can be improved significantly if direct and indirect information is separated. Given sufficiently large protein families, the contact predictions contain sufficient information to predict the structure of many protein families. However, since the first studies, contact prediction methods have improved. Here, we ask how much the final models are improved if improved contact predictions are used. Results: In a small benchmark of 15 proteins, we show that the TM-scores of top-ranked models are improved by on average 33% using PconsFold compared with the original version of EVfold. In a larger benchmark, we find that the quality is improved by 15–30% when using PconsC in comparison with earlier contact prediction methods. Further, using Rosetta instead of CNS does not significantly improve global model accuracy, but the chemistry of models generated with Rosetta is improved. Availability: PconsFold is a fully automated pipeline for ab initio protein structure prediction based on evolutionary information. PconsFold is based on PconsC contact prediction and uses the Rosetta folding protocol. Due to its modularity, the contact prediction tool can be easily exchanged. The source code of PconsFold is available on GitHub at https://www.github.com/ElofssonLab/pcons-fold under the MIT license. PconsC is available from http://c.pcons.net/. Contact: arne@bioinfo.se Supplementary information: Supplementary data are available at Bioinformatics online.
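
The TM-score used as the accuracy measure in this abstract can be sketched for a single, fixed superposition. The standard definition sums 1 / (1 + (d_i / d0)^2) over aligned residue pairs and divides by the target length, with d0 = 1.24 (L - 15)^(1/3) - 1.8. The full score is the maximum over superpositions, which this sketch does not search for; the 0.5 Å floor on d0 for short chains and the function names are assumptions made here for illustration.

```python
from typing import Sequence


def tm_d0(target_length: int) -> float:
    """Length-dependent distance scale d0 in angstroms.

    The 0.5 floor for short chains is an assumption of this sketch.
    """
    if target_length <= 15:
        return 0.5
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score_fixed(distances: Sequence[float], target_length: int) -> float:
    """TM-score for one given superposition.

    distances holds the C-alpha distance (angstroms) of each aligned residue
    pair; unaligned target residues contribute nothing to the sum.
    """
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    d0 = tm_d0(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


if __name__ == "__main__":
    # A perfect model: every one of 100 residues aligned at distance zero.
    print(tm_score_fixed([0.0] * 100, 100))
```

Normalizing by target length rather than by the number of aligned pairs is what makes scores comparable between models that cover different fractions of the protein.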



Journal ArticleDOI
TL;DR: Genetic profiling will in the future provide a promising basis for network pharmacology of epistatic vulnerabilities as a promising therapeutic strategy in cancer, and up to 44% of vulnerabilities can be targeted with at least one Food and Drug Administration-approved drug.
Abstract: Motivation: Somatic homozygous deletions of chromosomal regions in cancer, while not necessarily oncogenic, may lead to therapeutic vulnerabilities specific to cancer cells compared with normal cells. A recently reported example is the loss of one of the two isoenzymes in glioblastoma cancer cells such that the use of a specific inhibitor selectively inhibited growth of the cancer cells, which had become fully dependent on the second isoenzyme. We have now made use of the unprecedented conjunction of large-scale cancer genomics profiling of tumor samples in The Cancer Genome Atlas (TCGA) and of tumor-derived cell lines in the Cancer Cell Line Encyclopedia, as well as the availability of integrated pathway information systems, such as Pathway Commons, to systematically search for a comprehensive set of such epistatic vulnerabilities. Results: Based on homozygous deletions affecting metabolic enzymes in 16 TCGA cancer studies and 972 cancer cell lines, we identified 4104 candidate metabolic vulnerabilities present in 1019 tumor samples and 482 cell lines. Up to 44% of these vulnerabilities can be targeted with at least one Food and Drug Administration-approved drug. We suggest focused experiments to test these vulnerabilities and clinical trials based on personalized genomic profiles of those that pass preclinical filters. We conclude that genomic profiling will in the future provide a promising basis for network pharmacology of epistatic vulnerabilities as a promising therapeutic strategy.

Journal ArticleDOI
TL;DR: The recently released version 2 of ChiBE can search for neighborhoods, paths between molecules, and common regulators/targets of molecules, on large integrated cellular networks in the Pathway Commons database as well as in local BioPAX models.
Abstract: Dynamic visual exploration of detailed pathway information can help researchers digest and interpret complex mechanisms and genomic datasets. ChiBE is a free, open-source software tool for visualizing, querying, and analyzing human biological pathways in BioPAX format. The recently released version 2 can search for neighborhoods, paths between molecules, and common regulators/targets of molecules, on large integrated cellular networks in the Pathway Commons database as well as in local BioPAX models. Resulting networks can be automatically laid out for visualization using a graphically rich, process-centric notation. Profiling data from the cBioPortal for Cancer Genomics and expression data from the Gene Expression Omnibus can be overlaid on these networks. ChiBE’s new capabilities are organized around a genomics-oriented workflow and offer a unique comprehensive pathway analysis solution for genomics researchers. The software is freely available at http://code.google.com/p/chibe .

Posted ContentDOI
06 May 2014-bioRxiv
TL;DR: In this paper, the authors present a new generalized method showing that patterns of evolutionary sequence changes across proteins reflect residues that are close in space, and with sufficient accuracy to determine the three-dimensional structure of the protein complexes.
Abstract: High-throughput experiments in bacteria and eukaryotic cells have identified tens of thousands of possible interactions between proteins. This genome-wide view of the protein interaction universe is coarse-grained, whilst fine-grained detail of macromolecular interactions critically depends on lower throughput, labor-intensive experiments. Computational approaches using measures of residue co-evolution across proteins show promise, but have been limited to specific interactions. Here we present a new generalized method showing that patterns of evolutionary sequence changes across proteins reflect residues that are close in space, and with sufficient accuracy to determine the three-dimensional structure of the protein complexes. We demonstrate that the inferred evolutionary coupling scores distinguish between interacting and non-interacting proteins and support the accurate prediction of residue interactions. To illustrate the utility of the method, we predict unknown 3D interactions between subunits of ATP synthase and find results consistent with detailed experimental data. We expect that the method can be generalized to genome-wide interaction predictions at residue resolution.

Journal ArticleDOI
TL;DR: CDK4 activation/RB phosphorylation occurs in 50% of indolent but high-risk follicular lymphomas and implies susceptibility to dual CDK4 and BCL2 inhibition.
Abstract: Loss of cell cycle controls is a hallmark of cancer and has a well-established role in aggressive B cell malignancies. However, the role of such lesions in indolent follicular lymphoma (FL) is unclear and individual lesions have been observed with low frequency. By analyzing genomic data from two large cohorts of indolent FLs, we identify a pattern of mutually exclusive (P = 0.003) genomic lesions that impair the retinoblastoma (RB) pathway in nearly 50% of FLs. These alterations include homozygous and heterozygous deletions of the p16/CDKN2a/b (7%) and RB1 (12%) loci, and more frequent gains of chromosome 12 that include CDK4 (29%). These aberrations are associated with high-risk disease by the FL prognostic index (FLIPI), and studies in a murine FL model confirm their pathogenic role in indolent FL. Increased CDK4 kinase activity toward RB1 is readily measured in tumor samples and indicates an opportunity for CDK4 inhibition. We find that dual CDK4 and BCL2 inhibitor treatment is safe and effective against available models of FL. In summary, frequent RB pathway lesions in indolent, high-risk FLs indicate an untapped therapeutic opportunity.

Posted ContentDOI
02 Oct 2014-bioRxiv
TL;DR: A novel method is developed for the identification of sets of mutually exclusive gene alterations in a signaling network that scans the groups of genes with a common downstream effect, and detects multiple previously unreported alterations that show significant mutual exclusivity and are likely to be driver events.
Abstract: Recent cancer genome studies have identified numerous genomic alterations in cancer genomes. It is hypothesized that only a fraction of these genomic alterations drive the progression of cancer -- often called driver mutations. Current sample sizes for cancer studies, often in the hundreds, are sufficient to detect pivotal drivers solely based on their high frequency of alterations. In cases where the alterations for a single function are distributed among multiple genes of a common pathway, however, single gene alteration frequencies might not be statistically significant. In such cases, we expect to observe that most samples are altered in only one of those alternative genes because additional alterations would not convey an additional selective advantage to the tumor. This leads to a mutual exclusion pattern of alterations, that can be exploited to identify these groups. We developed a novel method for the identification of sets of mutually exclusive gene alterations in a signaling network. We scan the groups of genes with a common downstream effect, using a mutual exclusivity criterion that makes sure that each gene in the group significantly contributes to the mutual exclusivity pattern. We have tested the method on all available TCGA cancer genomics datasets, and detected multiple previously unreported alterations that show significant mutual exclusivity and are likely to be driver events.
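
The mutual exclusion pattern described in this abstract can be illustrated for the simplest case of two genes. The sketch below is not the authors' network-based method; it swaps in a plain one-sided hypergeometric test, which asks how likely it is to see so few co-altered samples if the two genes were altered independently. The function name and the sample-ID sets are hypothetical.

```python
from math import comb
from typing import Set


def mutual_exclusivity_pvalue(altered_a: Set[int], altered_b: Set[int],
                              n_samples: int) -> float:
    """One-sided hypergeometric p-value for low overlap between two genes.

    Returns P(overlap <= observed) when len(altered_b) samples are drawn at
    random from n_samples, of which len(altered_a) carry an alteration in A.
    Small values indicate fewer co-occurrences than expected by chance.
    """
    a, b = len(altered_a), len(altered_b)
    if a > n_samples or b > n_samples:
        raise ValueError("more altered samples than samples")
    observed = len(altered_a & altered_b)
    total = comb(n_samples, b)
    tail = sum(comb(a, i) * comb(n_samples - a, b - i)
               for i in range(observed + 1))
    return tail / total


if __name__ == "__main__":
    # Toy cohort of 10 samples: gene A altered in samples 0-4, gene B in 5-9.
    gene_a = {0, 1, 2, 3, 4}
    gene_b = {5, 6, 7, 8, 9}
    print(mutual_exclusivity_pvalue(gene_a, gene_b, n_samples=10))
```

A pairwise test like this does not check that every gene in a larger group contributes to the pattern, which is the criterion the abstract emphasizes for gene sets.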

Journal ArticleDOI
10 Jul 2014-PLOS ONE
TL;DR: The results of this large study suggest that ATM may be a novel locus associated with the risk of multiple subtypes of NHL and suggests the presence of biologically relevant variants that correlate with the observed association signals.
Abstract: Molecular and genetic evidence suggests that DNA repair pathways may contribute to lymphoma susceptibility. Several studies have examined the association of DNA repair genes with lymphoma risk, but the findings from these reports have been inconsistent. Here we provide the results of a focused analysis of genetic variation in DNA repair genes and their association with the risk of non-Hodgkin's lymphoma (NHL). With a population of 1,297 NHL cases and 1,946 controls, we have performed a two-stage case/control association analysis of 446 single nucleotide polymorphisms (SNPs) tagging the genetic variation in 81 DNA repair genes. We found the most significant association with NHL risk in the ATM locus for rs227060 (OR = 1.27, 95% CI: 1.13–1.43, p = 6.77×10⁻⁵), which remained significant after adjustment for multiple testing. In a subtype-specific analysis, associations were also observed for the ATM locus among both diffuse large B-cell lymphomas (DLBCL) and small lymphocytic lymphomas (SLL); however, there was no association observed among follicular lymphomas (FL). In addition, our study provides suggestive evidence of an interaction between SNPs in MRE11A and NBS1 associated with NHL risk (OR = 0.51, 95% CI: 0.34–0.77, p = 0.0002). Finally, an imputation analysis using the 1,000 Genomes Project data combined with a functional prediction analysis revealed the presence of biologically relevant variants that correlate with the observed association signals. While the findings generated here warrant independent validation, the results of our large study suggest that ATM may be a novel locus associated with the risk of multiple subtypes of NHL.
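
For readers unfamiliar with the odds ratios and confidence intervals reported in this abstract, the textbook calculation from a 2×2 case/control table is sketched below. This is the generic Wald interval on the log odds ratio, not necessarily the model the study fitted, and the counts in the example are invented for illustration.

```python
from math import exp, log, sqrt
from typing import Tuple


def odds_ratio_ci(exposed_cases: int, unexposed_cases: int,
                  exposed_controls: int, unexposed_controls: int,
                  z: float = 1.959964) -> Tuple[float, float, float]:
    """Return (odds ratio, lower bound, upper bound) for a 2x2 table.

    Uses the Wald interval on the log scale; z = 1.959964 gives a 95% interval.
    All four counts must be positive.
    """
    counts = (exposed_cases, unexposed_cases, exposed_controls, unexposed_controls)
    if min(counts) <= 0:
        raise ValueError("all four cell counts must be positive")
    a, b, c, d = counts
    odds_ratio = (a * d) / (b * c)
    standard_error = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * standard_error)
    upper = exp(log(odds_ratio) + z * standard_error)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, not data from the study.
    print(odds_ratio_ci(20, 10, 10, 20))
```

An interval that excludes 1 corresponds to an association that is nominally significant at the matching level, before any adjustment for multiple testing.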

Journal ArticleDOI
TL;DR: An open source and extensible framework for defining and searching graph patterns in BioPAX models is developed and it is shown that a pattern search in public pathway data can identify a substantial amount of signaling relations that do not exist in signaling databases.
Abstract: Motivation: BioPAX is a standard language for representing complex cellular processes, including metabolic networks, signal transduction and gene regulation. Owing to the inherent complexity of a BioPAX model, searching for a specific type of subnetwork can be non-trivial and difficult. Results: We developed an open source and extensible framework for defining and searching graph patterns in BioPAX models. We demonstrate its use with a sample pattern that captures directed signaling relations between proteins. We provide search results for the pattern obtained from the Pathway Commons database and compare these results with the current data in signaling databases SPIKE and SignaLink. Results show that a pattern search in public pathway data can identify a substantial amount of signaling relations that do not exist in signaling databases. Availability: BioPAX-pattern software was developed in Java. Source code and documentation are freely available at http://code.google.com/p/biopax-pattern under Lesser GNU Public License. Contact: patternsearch@cbio.mskcc.org Supplementary information: Supplementary data are available at Bioinformatics online.

Journal ArticleDOI
TL;DR: This work maps the recurrent genomic alterations, such as somatic mutations and focal DNA copy-number alterations, onto individual tumor samples as tumor-specific event calls to facilitate the identification of altered processes and pathways.

Journal ArticleDOI
12 Dec 2014-PLOS ONE
TL;DR: Adoption of the method increases agreement between technical and biological replicates of various tumor and cell-line derived samples, and several slides that had previously been rejected because their coefficient of variation (CV) exceeded 15% are rescued by reducing the CV below this threshold.
Abstract: Reverse phase protein arrays (RPPA) are an efficient, high-throughput, cost-effective method for the quantification of specific proteins in complex biological samples. The quality of RPPA data may be affected by various sources of error. One of these, spatial variation, is caused by uneven exposure of different parts of an RPPA slide to the reagents used in protein detection. We present a method for the determination and correction of systematic spatial variation in RPPA slides using positive control spots printed on each slide. The method uses a simple bi-linear interpolation technique to obtain a surface representing the spatial variation occurring across the dimensions of a slide. This surface is used to calculate correction factors that can normalize the relative protein concentrations of the samples on each slide. The adoption of the method results in increased agreement between technical and biological replicates of various tumor and cell-line derived samples. Further, in data from a study of the melanoma cell-line SKMEL-133, several slides that had previously been rejected because they had a coefficient of variation (CV) greater than 15% are rescued by reduction of CV below this threshold in each case. The method is implemented in the R statistical programming language. It is compatible with MicroVigene and SuperCurve, packages commonly used in RPPA data analysis. The method is made available, along with suggestions for implementation, at http://bitbucket.org/rppa_preprocess/rppa_preprocess/src.
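The interpolation and correction steps described in this abstract can be sketched as follows. The published method is implemented in R; this Python version is only an illustration under stated assumptions: positive control spots lie on a regular grid, and the correction factor is the mean control signal divided by the interpolated local surface. The function names and the example grid are hypothetical.

```python
from bisect import bisect_right


def bilinear(xs, ys, values, x, y):
    """Bilinearly interpolate values[i][j], given at (xs[i], ys[j]), to (x, y).

    Points outside the grid are extrapolated linearly from the nearest edge cell.
    """
    # Find the grid cell containing (x, y), clamped to the outermost cells.
    i = min(max(bisect_right(xs, x) - 1, 0), len(xs) - 2)
    j = min(max(bisect_right(ys, y) - 1, 0), len(ys) - 2)
    tx = (x - xs[i]) / (xs[i + 1] - xs[i])
    ty = (y - ys[j]) / (ys[j + 1] - ys[j])
    # Interpolate along x on both y-edges of the cell, then along y.
    at_y_low = values[i][j] * (1 - tx) + values[i + 1][j] * tx
    at_y_high = values[i][j + 1] * (1 - tx) + values[i + 1][j + 1] * tx
    return at_y_low * (1 - ty) + at_y_high * ty


def correct_spot(intensity, x, y, xs, ys, controls):
    """Scale a sample intensity by (mean control signal) / (local surface value)."""
    flat = [v for row in controls for v in row]
    reference = sum(flat) / len(flat)
    return intensity * reference / bilinear(xs, ys, controls, x, y)


# Hypothetical slide: control signal doubles from the x = 0 edge to the x = 10 edge.
xs, ys = [0.0, 10.0], [0.0, 10.0]
controls = [[1.0, 1.0], [2.0, 2.0]]
corrected = correct_spot(3.0, 10.0, 0.0, xs, ys, controls)
```

In this example a spot at the strongly stained edge is scaled down, which is the intended effect of a spatial correction; how the real method derives its factors from the surface may differ.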

Journal ArticleDOI
S. Chatrchyan1, Vardan Khachatryan1, Albert M. Sirunyan1, A. Tumasyan1  +3953 moreInstitutions (145)
TL;DR: A study of color coherence effects in pp collisions at a center-of-mass energy of 7 TeV is presented, using events that contain at least three jets and in which the two jets with the largest transverse momentum exhibit a back-to-back topology.
Abstract: A study of color coherence effects in pp collisions at a center-of-mass energy of 7 TeV is presented. The data used in the analysis were collected in 2010 with the CMS detector at the LHC and correspond to an integrated luminosity of 36 inverse picobarns. Events are selected that contain at least three jets and where the two jets with the largest transverse momentum exhibit a back-to-back topology. The measured angular correlation between the second- and third-leading jet is shown to be sensitive to color coherence effects, and is compared to the predictions of Monte Carlo models with various implementations of color coherence. None of the models describe the data satisfactorily.

Journal ArticleDOI
TL;DR: PconsFold is a fully automated pipeline for ab initio protein structure prediction based on evolutionary information; it builds on PconsC contact prediction and uses the Rosetta folding protocol.
Abstract: Motivation: Recently it has been shown that the quality of protein contact prediction from evolutionary information can be improved significantly if direct and indirect information is separated. Given sufficiently large protein families, the contact predictions contain sufficient information to predict the structure of many protein families. However, since the first studies, contact prediction methods have improved. Here, we ask how much the final models are improved if improved contact predictions are used. Results: In a small benchmark of 15 proteins, we show that the TMscores of top-ranked models are improved by on average 33% using PconsFold compared with the original version of EVfold. In a larger benchmark, we find that the quality is improved by 15–30% when using PconsC in comparison with earlier contact prediction methods. Further, using Rosetta instead of CNS does not significantly improve global model accuracy, but the chemistry of models generated with Rosetta is improved. Availability: PconsFold is a fully automated pipeline for ab initio protein structure prediction based on evolutionary information. PconsFold is based on PconsC contact prediction and uses the Rosetta folding protocol. Due to its modularity, the contact prediction tool can be easily exchanged. The source code of PconsFold is available on GitHub at https://www.github.com/ElofssonLab/pcons-fold under the MIT license. PconsC is available from http://c.pcons.net/. Contact: arne@bioinfo.se Supplementary information: Supplementary data are available at Bioinformatics online.

Posted ContentDOI
12 Sep 2014-bioRxiv
TL;DR: Analysis of the downstream effects of the resulting 5p/3p imbalance shows a statistically significant effect on the expression of mRNAs targeted by major conserved miRNA families, which appears to contribute to oncogenesis.
Abstract: Mutations in the RNase IIIb domain of DICER1 are known to disrupt processing of 5p-strand pre-miRNAs, and these mutations have previously been associated with cancer. Using data from the Cancer Genome Atlas project, we show that these mutations are recurrent across four cancer types and that a previously uncharacterized recurrent mutation in the adjacent RNase IIIa domain also disrupts 5p-strand miRNA processing. Analysis of the downstream effects of the resulting 5p/3p imbalance shows a statistically significant effect on the expression of mRNAs targeted by major conserved miRNA families. In summary, these mutations in DICER1 lead to an imbalance in miRNA strands, which has an effect on mRNA transcript levels that appears to contribute to oncogenesis.

S. Chatrchyan1, Vardan Khachatryan1, Albert M. Sirunyan1, A. Tumasyan1  +3953 moreInstitutions (145)
01 Jun 2014
TL;DR: A study of color coherence effects in pp collisions at a center-of-mass energy of 7 TeV is presented, using events that contain at least three jets and in which the two jets with the largest transverse momentum exhibit a back-to-back topology.
Abstract: A study of color coherence effects in pp collisions at a center-of-mass energy of 7 TeV is presented. The data used in the analysis were collected in 2010 with the CMS detector at the LHC and correspond to an integrated luminosity of 36 inverse picobarns. Events are selected that contain at least three jets and where the two jets with the largest transverse momentum exhibit a back-to-back topology. The measured angular correlation between the second- and third-leading jet is shown to be sensitive to color coherence effects, and is compared to the predictions of Monte Carlo models with various implementations of color coherence. None of the models describe the data satisfactorily.


01 Jan 2014
TL;DR: It is shown that analysis of correlated evolutionary sequence changes across proteins identifies residues that are close in space with sufficient accuracy to determine the three-dimensional structure of the protein complexes.