Journal Article

Circular binary segmentation for the analysis of array-based DNA copy number data.

01 Oct 2004-Biostatistics (Oxford University Press)-Vol. 5, Iss: 4, pp 557-572
TL;DR: A modification of binary segmentation, called circular binary segmentation, is developed to translate noisy intensity measurements into regions of equal DNA copy number.
Abstract: DNA sequence copy number is the number of copies of DNA at a region of a genome. Cancer progression often involves alterations in DNA copy number. Newly developed microarray technologies enable simultaneous measurement of copy number at thousands of sites in a genome. We have developed a modification of binary segmentation, which we call circular binary segmentation, to translate noisy intensity measurements into regions of equal copy number. The method is evaluated by simulation and is demonstrated on cell line data with known copy number alterations and on a breast cancer cell line data set.
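
The abstract does not spell out the test statistic, so the following is only a minimal sketch under stated assumptions. It assumes the commonly described formulation in which the ordered measurements are joined into a circle, the mean of an arc is compared with the mean of the remainder, and significance is judged against a permutation reference distribution. The function names (`max_circular_stat`, `find_change`), the unscaled statistic, and the defaults `n_perm` and `alpha` are illustrative choices, not taken from the paper, and the recursive splitting of each resulting segment is omitted.

```python
import math
import random


def max_circular_stat(x):
    """Return (max |Z|, i, j) over arcs x[i:j], excluding the full circle.

    Z compares the arc mean with the mean of the remaining points and is
    scaled by sqrt(1/k + 1/(n - k)), where k is the arc length.
    """
    n = len(x)
    prefix = [0.0]
    for v in x:
        prefix.append(prefix[-1] + v)
    total = prefix[n]
    best = (0.0, 0, n)
    for i in range(n):
        for j in range(i + 1, n + 1):
            k = j - i
            if k == n:
                continue  # the whole circle has no complement
            inside = prefix[j] - prefix[i]
            diff = inside / k - (total - inside) / (n - k)
            z = abs(diff) / math.sqrt(1.0 / k + 1.0 / (n - k))
            if z > best[0]:
                best = (z, i, j)
    return best


def find_change(x, n_perm=200, alpha=0.01, seed=0):
    """Return the arc (i, j) if the permutation p-value is below alpha, else None."""
    rng = random.Random(seed)
    z_obs, i, j = max_circular_stat(x)
    shuffled = list(x)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if max_circular_stat(shuffled)[0] >= z_obs:
            exceed += 1
    p_value = (exceed + 1) / (n_perm + 1)
    return (i, j) if p_value < alpha else None
```

For a profile with one raised stretch, for example twenty values at 0, ten at 1, and twenty at 0, `find_change` returns the boundaries of the raised stretch; a full segmentation would then repeat the search inside each resulting segment.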


Citations
Journal Article
TL;DR: A robust gene expression-based molecular classification of GBM into Proneural, Neural, Classical, and Mesenchymal subtypes is described and multidimensional genomic data is integrated to establish patterns of somatic mutations and DNA copy number.

5,764 citations


Cites methods from "Circular binary segmentation for th..."

  • ...Briefly, the circular binary segmentation algorithm (Olshen et al., 2004) was used to estimate raw copy number for genomic segments....

    [...]

Journal Article
Adam J. Bass, Vesteinn Thorsson, Ilya Shmulevich, Sheila Reynolds, and 254 more authors (32 institutions)
11 Sep 2014-Nature
TL;DR: A comprehensive molecular evaluation of 295 primary gastric adenocarcinomas as part of The Cancer Genome Atlas (TCGA) project is described and a molecular classification dividing gastric cancer into four subtypes is proposed.
Abstract: Gastric cancer was the world’s third leading cause of cancer mortality in 2012, responsible for 723,000 deaths. The vast majority of gastric cancers are adenocarcinomas, which can be further subdivided into intestinal and diffuse types according to the Lauren classification. An alternative system, proposed by the World Health Organization, divides gastric cancer into papillary, tubular, mucinous (colloid) and poorly cohesive carcinomas. These classification systems have little clinical utility, making the development of robust classifiers that can guide patient therapy an urgent priority. The majority of gastric cancers are associated with infectious agents, including the bacterium Helicobacter pylori and Epstein–Barr virus (EBV). The distribution of histological subtypes of gastric cancer and the frequencies of H. pylori and EBV associated gastric cancer vary across the globe. A small minority of gastric cancer cases are associated with germline mutation in E-cadherin (CDH1) or mismatch repair genes (Lynch syndrome), whereas sporadic mismatch repair-deficient gastric cancers have epigenetic silencing of MLH1 in the context of a CpG island methylator phenotype (CIMP). Molecular profiling of gastric cancer has been performed using gene expression or DNA sequencing, but has not led to a clear biologic classification scheme. The goals of this study by The Cancer Genome Atlas (TCGA) were to develop a robust molecular classification of gastric cancer and to identify dysregulated pathways and candidate drivers of distinct classes of gastric cancer.

4,583 citations

01 Jan 2010
TL;DR: Building on The Cancer Genome Atlas Network's catalog of recurrent genomic abnormalities in glioblastoma multiforme (GBM), the authors describe a robust gene expression-based molecular classification of GBM into Proneural, Neural, Classical, and Mesenchymal subtypes.
Abstract: The Cancer Genome Atlas Network recently cataloged recurrent genomic abnormalities in glioblastoma multiforme (GBM). We describe a robust gene expression-based molecular classification of GBM into Proneural, Neural, Classical, and Mesenchymal subtypes and integrate multidimensional genomic data to establish patterns of somatic mutations and DNA copy number. Aberrations and gene expression of EGFR, NF1, and PDGFRA/IDH1 each define the Classical, Mesenchymal, and Proneural subtypes, respectively. Gene signatures of normal brain cell types show a strong relationship between subtypes and different neural lineages. Additionally, response to aggressive therapy differs by subtype, with the greatest benefit in the Classical subtype and no benefit in the Proneural subtype. We provide a framework that unifies transcriptomic and genomic dimensions for GBM molecular stratification with important implications for future studies.

4,464 citations

Journal Article
Rameen Beroukhim, Craig H. Mermel, Dale Porter, Guo Wei, Soumya Raychaudhuri, Jerry Donovan, Jordi Barretina, Jesse S. Boehm, Jennifer Dobson, Mitsuyoshi Urashima, Kevin T. Mc Henry, Reid M. Pinchback, Azra H. Ligon, Yoon Jae Cho, Leila Haery, Heidi Greulich, Michael R. Reich, Wendy Winckler, Michael S. Lawrence, Barbara A. Weir, Kumiko E. Tanaka, Derek Y. Chiang, Adam J. Bass, Alice Loo, Carter Hoffman, John R. Prensner, Ted Liefeld, Qing Gao, Derek Yecies, Sabina Signoretti, Elizabeth A. Maher, Frederic J. Kaye, Hidefumi Sasaki, Joel E. Tepper, Jonathan A. Fletcher, Josep Tabernero, José Baselga, Ming-Sound Tsao, Francesca Demichelis, Mark A. Rubin, Pasi A. Jänne, Mark J. Daly, Carmelo Nucera, Ross L. Levine, Benjamin L. Ebert, Stacey Gabriel, Anil K. Rustgi, Cristina R. Antonescu, Marc Ladanyi, Anthony Letai, Levi A. Garraway, Massimo Loda, David G. Beer, Lawrence D. True, Aikou Okamoto, Scott L. Pomeroy, Samuel Singer, Todd R. Golub, Eric S. Lander, Gad Getz, William R. Sellers, Matthew Meyerson
18 Feb 2010-Nature
TL;DR: It is demonstrated that cancer cells containing amplifications surrounding the MCL1 and BCL2L1 anti-apoptotic genes depend on the expression of these genes for survival, and a large majority of SCNAs identified in individual cancer types are present in several cancer types.
Abstract: A powerful way to discover key genes with causal roles in oncogenesis is to identify genomic regions that undergo frequent alteration in human cancers. Here we present high-resolution analyses of somatic copy-number alterations (SCNAs) from 3,131 cancer specimens, belonging largely to 26 histological types. We identify 158 regions of focal SCNA that are altered at significant frequency across several cancer types, of which 122 cannot be explained by the presence of a known cancer target gene located within these regions. Several gene families are enriched among these regions of focal SCNA, including the BCL2 family of apoptosis regulators and the NF-κB pathway. We show that cancer cells containing amplifications surrounding the MCL1 and BCL2L1 anti-apoptotic genes depend on the expression of these genes for survival. Finally, we demonstrate that a large majority of SCNAs identified in individual cancer types are present in several cancer types.

3,375 citations

Journal Article
Peter S. Hammerman, Doug Voet, Michael S. Lawrence, Douglas Voet, and 342 more authors (32 institutions)
27 Sep 2012-Nature
TL;DR: It is shown that the tumour type is characterized by complex genomic alterations, with a mean of 360 exonic mutations, 165 genomic rearrangements, and 323 segments of copy number alteration per tumour.
Abstract: Lung squamous cell carcinoma is a common type of lung cancer, causing approximately 400,000 deaths per year worldwide. Genomic alterations in squamous cell lung cancers have not been comprehensively characterized, and no molecularly targeted agents have been specifically developed for its treatment. As part of The Cancer Genome Atlas, here we profile 178 lung squamous cell carcinomas to provide a comprehensive landscape of genomic and epigenomic alterations. We show that the tumour type is characterized by complex genomic alterations, with a mean of 360 exonic mutations, 165 genomic rearrangements, and 323 segments of copy number alteration per tumour. We find statistically recurrent mutations in 11 genes, including mutation of TP53 in nearly all specimens. Previously unreported loss-of-function mutations are seen in the HLA-A class I major histocompatibility gene. Significantly altered pathways included NFE2L2 and KEAP1 in 34%, squamous differentiation genes in 44%, phosphatidylinositol-3-OH kinase pathway genes in 47%, and CDKN2A and RB1 in 72% of tumours. We identified a potential therapeutic target in most tumours, offering new avenues of investigation for the treatment of squamous cell lung cancers.

3,356 citations

References
Journal Article
TL;DR: A different approach to problems of multiple significance testing is presented: controlling the expected proportion of falsely rejected hypotheses, the false discovery rate, which is equivalent to the FWER when all hypotheses are true but is smaller otherwise.
Abstract: The common approach to the multiplicity problem calls for controlling the familywise error rate (FWER). This approach, though, has faults, and we point out a few. A different approach to problems of multiple significance testing is presented. It calls for controlling the expected proportion of falsely rejected hypotheses, the false discovery rate. This error rate is equivalent to the FWER when all hypotheses are true but is smaller otherwise. Therefore, in problems where the control of the false discovery rate rather than that of the FWER is desired, there is potential for a gain in power. A simple sequential Bonferroni-type procedure is proved to control the false discovery rate for independent test statistics, and a simulation study shows that the gain in power is substantial. The use of the new procedure and the appropriateness of the criterion are illustrated with examples.
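
The abstract names a sequential Bonferroni-type procedure without listing its steps. The sketch below assumes the step-up rule usually attributed to this paper: sort the m p-values, find the largest rank k whose p-value is at most k·q/m, and reject the k smallest. The function name `benjamini_hochberg` and the default level `q=0.05` are illustrative assumptions.

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected."""
    m = len(pvalues)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda idx: pvalues[idx])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: remember the LARGEST rank that meets its threshold.
        if pvalues[idx] <= rank * q / m:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected
```

Because the rule keeps the largest qualifying rank, a small p-value that misses its own threshold is still rejected when a larger one further down the list meets its threshold.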

83,420 citations

Book
01 Jan 1983
TL;DR: This monograph focuses on the methodology used to construct tree-structured rules, covering the use of trees as a data analysis method and, in a more mathematical framework, proving some of their fundamental properties.
Abstract: The methodology used to construct tree structured rules is the focus of this monograph. Unlike many other statistical procedures, which moved from pencil and paper to calculators, this text's use of trees was unthinkable before computers. Both the practical and theoretical sides have been developed in the authors' study of tree methods. Classification and Regression Trees reflects these two sides, covering the use of trees as a data analysis method, and in a more mathematical framework, proving some of their fundamental properties.

14,825 citations

Journal Article
William S. Cleveland
TL;DR: Robust locally weighted regression is a method for smoothing a scatterplot in which the fitted value at $x_k$ is the value of a polynomial fit to the data using weighted least squares, where the weight for $(x_i, y_i)$ is large if $x_i$ is close to $x_k$ and small if it is not.
Abstract: The visual information on a scatterplot can be greatly enhanced, with little additional cost, by computing and plotting smoothed points. Robust locally weighted regression is a method for smoothing a scatterplot, $(x_i, y_i)$, $i = 1, \ldots, n$, in which the fitted value at $x_k$ is the value of a polynomial fit to the data using weighted least squares, where the weight for $(x_i, y_i)$ is large if $x_i$ is close to $x_k$ and small if it is not. A robust fitting procedure is used that guards against deviant points distorting the smoothed points. Visual, computational, and statistical issues of robust locally weighted regression are discussed. Several examples, including data on lead intoxication, are used to illustrate the methodology.
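
The weighting idea in this abstract can be illustrated with a minimal sketch of a single locally weighted linear fit. It assumes a tricube weight function and a bandwidth set by the distance to the k-th nearest point; neither choice is stated in the abstract. The robustness iterations that guard against deviant points are deliberately left out, and the names `tricube`, `local_linear_fit`, and `frac` are hypothetical.

```python
import math


def tricube(u):
    """Tricube kernel: near 1 close to 0, falling to 0 once |u| >= 1."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def local_linear_fit(x, y, x0, frac=0.5):
    """Fitted value at x0 from a weighted least-squares line through (x, y)."""
    n = len(x)
    k = min(n, max(3, math.ceil(frac * n)))  # neighbours that set the bandwidth
    h = sorted(abs(xi - x0) for xi in x)[k - 1]
    if h == 0.0:
        h = 1.0  # the k nearest points all sit exactly at x0
    w = [tricube((xi - x0) / h) for xi in x]
    sw = sum(w)
    if sw == 0.0:
        return sum(y) / n  # degenerate case: fall back to the plain mean
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    if sxx == 0.0:
        return my  # no spread in x among weighted points: use weighted mean
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    return my + (sxy / sxx) * (x0 - mx)
```

Evaluating `local_linear_fit` at each observed x gives the smoothed points; a robust version would recompute the fit with weights shrunk for points that have large residuals.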

10,225 citations

Journal Article
TL;DR: This article proposes normalization methods that are based on robust local regression and account for intensity and spatial dependence in dye biases for different types of cDNA microarray experiments.
Abstract: There are many sources of systematic variation in cDNA microarray experiments which affect the measured gene expression levels (e.g. differences in labeling efficiency between the two fluorescent dyes). The term normalization refers to the process of removing such variation. A constant adjustment is often used to force the distribution of the intensity log ratios to have a median of zero for each slide. However, such global normalization approaches are not adequate in situations where dye biases can depend on spot overall intensity and/or spatial location within the array. This article proposes normalization methods that are based on robust local regression and account for intensity and spatial dependence in dye biases for different types of cDNA microarray experiments. The selection of appropriate controls for normalization is discussed and a novel set of controls (microarray sample pool, MSP) is introduced to aid in intensity-dependent normalization. Lastly, to allow for comparisons of expression levels across slides, a robust method based on maximum likelihood estimation is proposed to adjust for scale differences among slides.
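
The constant adjustment that this abstract describes as the common baseline can be sketched in a few lines: subtract each slide's median log ratio so the slide's median becomes zero. The function name `median_center` is hypothetical, and the intensity-dependent and spatial normalization that the article itself proposes is not shown here.

```python
import statistics


def median_center(log_ratios):
    """Shift one slide's log ratios so that their median is zero."""
    m = statistics.median(log_ratios)
    return [r - m for r in log_ratios]
```

This shifts every spot on a slide by the same amount, which is why the abstract argues it is inadequate when dye bias varies with spot intensity or position on the array.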

3,605 citations
