
Gene expression profiling predicts clinical outcome of breast cancer

TL;DR: DNA microarray analysis on primary breast tumours of 117 young patients is used and supervised classification is applied to identify a gene expression signature strongly predictive of a short interval to distant metastases (‘poor prognosis’ signature) in patients without tumour cells in local lymph nodes at diagnosis, providing a strategy to select patients who would benefit from adjuvant therapy.
Abstract: Breast cancer patients with the same stage of disease can have markedly different treatment responses and overall outcome. The strongest predictors for metastases (for example, lymph node status and histological grade) fail to classify accurately breast tumours according to their clinical behaviour. Chemotherapy or hormonal therapy reduces the risk of distant metastases by approximately one-third; however, 70-80% of patients receiving this treatment would have survived without it. None of the signatures of breast cancer gene expression reported to date allow for patient-tailored therapy strategies. Here we used DNA microarray analysis on primary breast tumours of 117 young patients, and applied supervised classification to identify a gene expression signature strongly predictive of a short interval to distant metastases ('poor prognosis' signature) in patients without tumour cells in local lymph nodes at diagnosis (lymph node negative). In addition, we established a signature that identifies tumours of BRCA1 carriers. The poor prognosis signature consists of genes regulating cell cycle, invasion, metastasis and angiogenesis. This gene expression profile will outperform all currently used clinical parameters in predicting disease outcome. Our findings provide a strategy to select patients who would benefit from adjuvant therapy.


letters to nature
530 NATURE | VOL 415 | 31 JANUARY 2002 | www.nature.com
Gene expression profiling predicts clinical outcome of breast cancer

Laura J. van 't Veer*†, Hongyue Dai†‡, Marc J. van de Vijver*†, Yudong D. He‡, Augustinus A. M. Hart*, Mao Mao‡, Hans L. Peterse*, Karin van der Kooy*, Matthew J. Marton‡, Anke T. Witteveen*, George J. Schreiber‡, Ron M. Kerkhoven*, Chris Roberts‡, Peter S. Linsley‡, René Bernards* & Stephen H. Friend‡

* Divisions of Diagnostic Oncology, Radiotherapy and Molecular Carcinogenesis and Center for Biomedical Genetics, The Netherlands Cancer Institute, 121 Plesmanlaan, 1066 CX Amsterdam, The Netherlands
‡ Rosetta Inpharmatics, 12040 115th Avenue NE, Kirkland, Washington 98034, USA
† These authors contributed equally to this work
Breast cancer patients with the same stage of disease can have markedly different treatment responses and overall outcome. The strongest predictors for metastases (for example, lymph node status and histological grade) fail to classify accurately breast tumours according to their clinical behaviour¹⁻³. Chemotherapy or hormonal therapy reduces the risk of distant metastases by approximately one-third; however, 70–80% of patients receiving this treatment would have survived without it⁴,⁵. None of the signatures of breast cancer gene expression reported to date⁶⁻¹² allow for patient-tailored therapy strategies. Here we used DNA microarray analysis on primary breast tumours of 117 young patients, and applied supervised classification to identify a gene expression signature strongly predictive of a short interval to distant metastases ('poor prognosis' signature) in patients without tumour cells in local lymph nodes at diagnosis (lymph node negative). In addition, we established a signature that identifies tumours of BRCA1 carriers. The poor prognosis signature consists of genes regulating cell cycle, invasion, metastasis and angiogenesis. This gene expression profile will outperform all currently used clinical parameters in predicting disease outcome. Our findings provide a strategy to select patients who would benefit from adjuvant therapy.
We selected 98 primary breast cancers: 34 from patients who developed distant metastases within 5 years, 44 from patients who continued to be disease-free after a period of at least 5 years, 18 from patients with BRCA1 germline mutations, and 2 from BRCA2 carriers. All 'sporadic' patients were lymph node negative, and under 55 years of age at diagnosis. From each patient, 5 μg total RNA was isolated from snap-frozen tumour material and used to derive complementary RNA (cRNA). A reference cRNA pool was made by pooling equal amounts of cRNA from each of the sporadic carcinomas. Two hybridizations were carried out for each tumour using a fluorescent dye reversal technique on microarrays containing approximately 25,000 human genes synthesized by inkjet technology¹³. Fluorescence intensities of scanned images were quantified, normalized and corrected to yield the transcript abundance of a gene as an intensity ratio with respect to that of the signal of the reference pool¹⁴. Some 5,000 genes were significantly regulated across the group of samples (that is, at least a twofold difference and a P-value of less than 0.01 in more than five tumours).
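The gene filter described above can be illustrated with a minimal sketch. It assumes that log10 expression ratios and per-measurement P-values are already available as nested lists; the function name `significant_genes`, its parameters and the toy data are hypothetical, and the error model that produces the P-values is not covered here.

```python
import math

def significant_genes(log_ratios, p_values, fold=2.0, p_cut=0.01, min_tumours=5):
    """Return indices of genes regulated in more than `min_tumours` tumours.

    log_ratios[g][t] is the log10 expression ratio of gene g in tumour t
    versus the reference pool; p_values[g][t] is the matching P-value.
    A measurement counts when it shows at least a `fold` difference in
    either direction and has P < `p_cut`.
    """
    log_fold = math.log10(fold)
    keep = []
    for g, (ratios, ps) in enumerate(zip(log_ratios, p_values)):
        hits = sum(1 for r, p in zip(ratios, ps) if abs(r) >= log_fold and p < p_cut)
        if hits > min_tumours:
            keep.append(g)
    return keep

# Toy data (invented): three genes measured in eight tumours.
log_ratios = [
    [0.5, 0.6, -0.4, 0.7, 0.5, -0.6, 0.1, 0.0],  # gene 0: regulated in six tumours
    [0.5, 0.1, 0.0, -0.1, 0.2, 0.0, 0.1, 0.0],   # gene 1: regulated in one tumour
    [0.5, 0.6, 0.4, 0.7, 0.5, 0.6, 0.1, 0.0],    # gene 2: large ratios, weak P-values
]
p_values = [[0.001] * 8, [0.001] * 8, [0.2] * 8]

selected = significant_genes(log_ratios, p_values)
print(selected)
```

Only gene 0 passes in this toy example, because it is the only gene with more than five measurements that satisfy both the fold-change and the P-value criterion.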
An unsupervised, hierarchical clustering algorithm allowed us to cluster the 98 tumours on the basis of their similarities measured over these approximately 5,000 significant genes. Similarly, the ~5,000 genes were clustered on the basis of their similarities measured over the group of 98 tumours (Fig. 1a). In the dendrograms shown in Fig. 1a (left and top), the length and the subdivision of the branches display the relatedness of the breast tumours (left) and the expression of the genes (top). Two distinct groups of tumours are the dominant feature in this two-dimensional display (top and bottom of plot, representing 62 and 36 tumours, respectively), suggesting that the tumours can be divided into two types on the basis of this set of ~5,000 significant genes. Notably, in the upper group only 34% of the sporadic patients were from the group who developed distant metastases within 5 years, whereas in the lower group 70% of the sporadic patients had progressive disease (Fig. 1b). Thus, using unsupervised clustering we can already, to some extent, distinguish between 'good prognosis' and 'poor prognosis' tumours.
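As an illustration of how an unsupervised hierarchical clustering can split tumours into two dominant groups, here is a minimal sketch. The distance (one minus the Pearson correlation) and the average linkage are assumptions made for this example; the text above does not state which metric or linkage was used. The function names and the toy profiles are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def two_clusters(profiles):
    """Agglomerative average-linkage clustering, stopped at two clusters."""
    clusters = [[i] for i in range(len(profiles))]

    def dist(i, j):
        return 1.0 - pearson(profiles[i], profiles[j])

    while len(clusters) > 2:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average pairwise distance between members of the two clusters.
                d = sum(dist(i, j) for i in clusters[a] for j in clusters[b])
                d /= len(clusters[a]) * len(clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Toy log-ratio profiles (invented): five tumours measured over four genes.
profiles = [
    [0.5, 0.4, -0.3, -0.6],
    [0.6, 0.3, -0.4, -0.5],
    [-0.5, -0.4, 0.3, 0.6],
    [-0.6, -0.2, 0.4, 0.5],
    [0.4, 0.5, -0.2, -0.4],
]
groups = two_clusters(profiles)
print(groups)
```

Stopping the merging at two clusters mirrors reading off the top-level split of a dendrogram; a full implementation would record every merge so that the whole tree can be drawn.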
To gain insight into the genes of the dominant expression signatures, we associated them with histopathological data; for example, oestrogen receptor (ER)-α expression as determined by immunohistochemical (IHC) staining (Fig. 1b). Out of 39 IHC-stained tumours negative for ER-α expression (ER negative), 34 clustered together in the bottom branch of the tumour dendrogram. In the enlargement shown in Fig. 1c, a group of downregulated genes is represented containing both the ER-α gene (ESR1) and genes that are apparently co-regulated with ER, some of which are known ER target genes. A second dominant gene cluster is associated with lymphocytic infiltrate and includes several genes expressed primarily by B and T cells (Fig. 1d).
Sixteen out of eighteen tumours of BRCA1 carriers are found in the bottom branch intermingled with sporadic tumours. This is consistent with the idea that most BRCA1 mutant tumours are ER negative and manifest a higher amount of lymphocytic infiltrate¹⁵. The two tumours of BRCA2 carriers are part of the upper cluster of tumours and do not show similarity with BRCA1 tumours. Neither high histological grade nor angioinvasion is a specific feature of either of the clusters (Fig. 1b). We conclude that unsupervised clustering detects two subgroups of breast cancers, which differ in ER status and lymphocytic infiltration. A similar conclusion has also been reported previously⁷,¹⁶.
The 78 sporadic lymph-node-negative patients were selected specifically to search for a prognostic signature in their gene expression profiles. Forty-four patients remained free of disease
© 2002 Macmillan Magazines Ltd

[Figure 1 image. a, Clustering of ~5,000 significant genes (columns) and clustering of 98 breast tumours (rows); colour bar: log10(expression ratio), from −0.6 to 0.6. b, Clinical data columns: BRCA1, ER, Grade 3, Lymphocytic infiltrate, Angioinvasion, Metastases. c, d, Gene labels in extracted order: Contig 37571RC, KIAA0882, CA12, ESR1, GATA3, MYB, P28, FLJ20262, AL133619, Contig 56390RC, CELSR1, Contig 58301RC, UGCG, AL049265, BCL2, EMAP-2, HSU79303, Contig 51994RC, Contig 237RC, Contig 47045RC, XBP1, HNF3A, VAV3, Contig 54295RC, AL133074, Contig 53968RC, Contig 49342RC, ZFP103, AL110139, FLJ12538, ERBB3, FBP1, Contig 50297RC, FLJ20273, AL080192, TCEB1L, D5S346, AL137761, TEGT, Contig 41887RC, Contig 27915RC, Contig 14390RC, POU2AF1, PIM2, LOC51237, LOC57823, LOC57823, AJ249377, X93006, U96394, X79782, AF063725, IGLL1, IGL@, IGL@, AJ225092, IGKV3D-15, AF103458, AJ225093, Contig 10268RC, Contig 44195RC, AF058075, IGL@, IGKC, TLX3, Contig 42547, Contig 20907RC, ICAP-1A, FLJ20340, AF103530, MTR1, CD19, CD19, IGHM, VPREB3, BM040, KIAA0167, TRD@, IRF5, Contig 50634RC.]
Figure 1 Unsupervised two-dimensional cluster analysis of 98 breast tumours. a, Two-dimensional presentation of transcript ratios for 98 breast tumours. There were 4,968 significant genes across the group. Each row represents a tumour and each column a single gene. As shown in the colour bar, red indicates upregulation, green downregulation, black no change, and grey no data available. The yellow line marks the subdivision into two dominant tumour clusters. b, Selected clinical data for the 98 patients in a: BRCA1 germline mutation carrier (or sporadic patient), ER expression, tumour grade 3 (versus grade 1 and 2), lymphocytic infiltrate, angioinvasion, and metastasis status. White indicates positive, black negative and grey denotes tumours derived from BRCA1 germline carriers who were excluded from the metastasis evaluation. The cluster below the yellow line consists of 36 tumours, of which 34 are ER negative (total 39 ER-negative) and 16 are carriers of the BRCA1 mutation (total 18). c, Enlarged portion from a containing a group of genes that co-regulate with the ER-α gene (ESR1). Each gene is labelled by its gene name or accession number from GenBank. Contig ESTs ending with RC are reverse-complementary of the named contig EST. d, Enlarged portion from a containing a group of co-regulated genes that are the molecular reflection of extensive lymphocytic infiltrate, and comprise a set of genes expressed in T and B cells. (Gene annotation as in c.)
after their initial diagnosis for an interval of at least 5 years (good prognosis group, mean follow-up of 8.7 years), and 34 patients had developed distant metastases within 5 years (poor prognosis group, mean time to metastases 2.5 years) (Fig. 2a). To identify reliably good and poor prognostic tumours, we used a powerful three-step supervised classification method, similar to those used previously⁸,¹⁷,¹⁸. In brief, approximately 5,000 genes (significantly regulated in more than 3 tumours out of 78) were selected from the 25,000 genes on the microarray. The correlation coefficient of the expression for each gene with disease outcome was calculated and 231 genes were found to be significantly associated with disease outcome (correlation coefficient < −0.3 or > 0.3). In the second step, these 231 genes were rank-ordered on the basis of the magnitude of the correlation coefficient. Third, the number of genes in the 'prognosis classifier' was optimized by sequentially adding subsets of 5 genes from the top of this rank-ordered list and
[Figure 2 image. a, Flow diagram: sporadic breast tumours (patients <55 years, tumour size <5 cm, lymph node negative (LN0)) divided into 'No distant metastases >5 years' and 'Distant metastases <5 years', leading to the prognosis reporter genes. b, c, Axes: Tumours (rows); Correlation to average good prognosis profile; Metastases. Gene labels between b and c, in extracted order: AL080059, Contig 63649RC, LOC51203, Contig 46218RC, Contig 38288RC, AA555029RC, Contig 28552RC, FLT1, MMP9, DC13, EXT1, AL137718, PK428, HEC, ECT2, GMPS, Contig 32185RC, UCH37, Contig 35251RC, KIAA1067, GNAZ, SERF1A, OXCT, ORC6L, L2DTL, PRC1, AF052162, COL4A2, KIAA0175, RAB6B, Contig 55725RC, DCK, CENPA, SM20, MCM6, AKAP2, Contig 56457RC, RFC4, DKFZP564D0462, SLC2A3, MP1, Contig 40831RC, Contig 24252RC, FLJ11190, Contig 51464RC, IGFBP5, IGFBP5, CCNE2, ESM1, Contig 20217RC, NMU, LOC57110, Contig 63102RC, PECI, AP2B1, CFFM4, PECI, TGFB3, Contig 46223RC, Contig 55377RC, HSA250839, GSTM3, BBC3, CEGP1, Contig 48328RC, WISP1, ALDH4, KIAA1442, Contig 32125RC, FGF18.]
Figure 2 Supervised classification on prognosis signatures. a, Use of prognostic reporter genes to identify optimally two types of disease outcome from 78 sporadic breast tumours into a poor prognosis and good prognosis group (for patient data see Supplementary Information Table S1). b, Expression data matrix of 70 prognostic marker genes from tumours of 78 breast cancer patients (left panel). Each row represents a tumour and each column a gene, whose name is labelled between b and c. Genes are ordered according to their correlation coefficient with the two prognostic groups. Tumours are ordered by the correlation to the average profile of the good prognosis group (middle panel). Solid line, prognostic classifier with optimal accuracy; dashed line, with optimized sensitivity. Above the dashed line patients have a good prognosis signature, below the dashed line the prognosis signature is poor. The metastasis status for each patient is shown in the right panel: white indicates patients who developed distant metastases within 5 years after the primary diagnosis; black indicates patients who continued to be disease-free for at least 5 years. c, Same as for b, but the expression data matrix is for tumours of 19 additional breast cancer patients using the same 70 optimal prognostic marker genes. Thresholds in the classifier (solid and dashed line) are the same as b. (See Fig. 1 for colour scheme.)
evaluating its power for correct classification using the 'leave-one-out' method for cross-validation (see Supplementary Information). Classification was made on the basis of the correlations of the expression profile of the 'leave-one-out' sample with the mean expression levels of the remaining samples from the good and the poor prognosis patients, respectively. The accuracy improved until the optimal number of marker genes was reached (70 genes).
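The ranking and leave-one-out steps described above can be sketched as follows. This is a simplified illustration and not the authors' implementation (the details are in the Supplementary Information, which is not reproduced here): genes are ranked once on all samples by the magnitude of their correlation with outcome, and each left-out sample is assigned to the group whose mean profile it correlates with more strongly. The function names and the toy data are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def rank_genes(expr, outcome):
    """Rank gene indices by |correlation with outcome| (1 = poor, 0 = good)."""
    n_genes = len(expr[0])
    corr = [pearson([row[g] for row in expr], outcome) for g in range(n_genes)]
    return sorted(range(n_genes), key=lambda g: -abs(corr[g]))

def mean_profile(rows):
    """Column-wise mean of a list of expression rows."""
    return [sum(col) / len(col) for col in zip(*rows)]

def loo_accuracy(expr, outcome, genes):
    """Leave-one-out accuracy of a nearest-mean-profile classifier."""
    correct = 0
    for i, row in enumerate(expr):
        rest = [(r, o) for j, (r, o) in enumerate(zip(expr, outcome)) if j != i]
        good = mean_profile([[r[g] for g in genes] for r, o in rest if o == 0])
        poor = mean_profile([[r[g] for g in genes] for r, o in rest if o == 1])
        sample = [row[g] for g in genes]
        predicted = 0 if pearson(sample, good) > pearson(sample, poor) else 1
        correct += predicted == outcome[i]
    return correct / len(expr)

# Toy data (invented): six tumours, four genes; genes 0 and 1 track outcome.
expr = [
    [-0.4, 0.5, 0.1, -0.1],
    [-0.5, 0.4, -0.2, 0.2],
    [-0.3, 0.6, 0.0, 0.1],
    [0.5, -0.4, 0.1, 0.0],
    [0.4, -0.5, -0.1, -0.2],
    [0.6, -0.3, 0.2, 0.1],
]
outcome = [0, 0, 0, 1, 1, 1]

ranked = rank_genes(expr, outcome)
accuracy = loo_accuracy(expr, outcome, ranked[:2])
print(ranked[:2], accuracy)
```

To mimic the optimization of the classifier size, `loo_accuracy` could be evaluated for the top 5, 10, 15, ... genes of `ranked`, keeping the size at which accuracy stops improving.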
The expression pattern of the 70 genes in the 78 samples is shown in the colour plot of Fig. 2b (left panel), where tumours were ordered by rank according to their correlation coefficients with the average good prognosis profile (Fig. 2b, middle panel). The classifier predicted correctly the actual outcome of disease for 65 out of the 78 patients (83%), with respectively 5 poor prognosis and 8 good prognosis patients assigned to the opposite category (Fig. 2b, threshold 'optimal accuracy', solid line). However, for the selection of patients eligible for adjuvant systemic therapy, a lower number of poor prognosis patients assigned to the good prognosis category should be attained. For this purpose, we set a threshold that resulted in misclassification of no more than 10% of the poor prognosis patients (3 patients out of 34 of the poor prognosis group). This optimized sensitivity threshold resulted in a total of 15 misclassifications: 3 poor prognosis tumours were classified as good prognosis, and 12 good prognosis tumours were classified as poor prognosis (Fig. 2b, dashed line). We classified tumours having a gene expression profile with a correlation coefficient above the 'optimized sensitivity' threshold (dashed line) as a good prognosis
Figure 3 Supervised classification on ER and BRCA1 signatures. a, Outline of a two-level classification system: 98 breast tumours are first classified into an ER-positive group and an ER-negative group, which is further divided into BRCA1 mutation and sporadic tumours. b, Expression data matrix of the 98 sporadic tumours across 550 optimal ER reporter genes. The contrasting patterns discriminate between tumours with an ER-negative signature (below solid line) and an ER-positive signature (above solid line). The reporter genes were ordered on the basis of their level of contribution to the classifiers. Tumours are arranged according to the leave-one-out correlation coefficients to the average signatures of the classifier. The ER status, as determined by IHC and microarray, is indicated in the two right panels. c, Expression data matrix of 38 ER-negative tumours defined by the ER classifier over the 100 optimal BRCA1 reporter genes. The degree of the patterns divides the tumours in the ER-negative group into two subgroups: BRCA1-like and sporadic-like. Patients above the solid line are characterized by a BRCA1 signature. The classification for each tumour was based on the leave-one-out procedure. The BRCA1 germline mutation status is indicated in the right panel (white indicates mutation). (See Fig. 1 for colour scheme.)
signature, and below this threshold as a poor prognosis signature.
Even small primary tumours without lymph node metastases can
display the poor prognosis signature, indicating that they are
already programmed for this metastatic phenotype.
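One way to locate an 'optimized sensitivity' threshold of the kind described above is sketched below. The procedure is an assumption made for illustration (the text does not say how the threshold was found): it picks the lowest cut-off on the correlation with the average good prognosis profile that lets no more than a given fraction of poor prognosis tumours fall above it. The function names and the toy values are hypothetical.

```python
import math

def sensitivity_threshold(corr_to_good, is_poor, max_missed_fraction=0.10):
    """Lowest threshold that misses at most the given fraction of poor cases.

    Tumours with a correlation strictly above the returned value are called
    'good prognosis'; all others are called 'poor prognosis'.
    """
    poor = sorted((c for c, p in zip(corr_to_good, is_poor) if p), reverse=True)
    allowed = math.floor(max_missed_fraction * len(poor))
    return poor[allowed]

def classify(corr_to_good, threshold):
    """Label each tumour by comparing its correlation with the threshold."""
    return ["good" if c > threshold else "poor" for c in corr_to_good]

# Toy data (invented): four disease-free and four metastatic patients.
corr = [0.8, 0.6, 0.5, 0.3, 0.45, 0.2, -0.1, -0.3]
is_poor = [False, False, False, False, True, True, True, True]

threshold = sensitivity_threshold(corr, is_poor, max_missed_fraction=0.25)
calls = classify(corr, threshold)
missed = sum(1 for call, p in zip(calls, is_poor) if p and call == "good")
print(threshold, missed)
```

With 34 poor prognosis patients and a 10% limit, `math.floor(0.10 * 34)` allows 3 missed patients, which matches the 3 out of 34 quoted in the text.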
The functional annotation for the genes provides insight into the underlying biological mechanism leading to rapid metastases. Genes involved in cell cycle, invasion and metastasis, angiogenesis, and signal transduction are significantly upregulated in the poor prognosis signature (for example cyclin E2, MCM6, metalloproteinases MMP9 and MP1, RAB6B, PK428, ESM1, and the VEGF receptor FLT1; see Fig. 2b). If we evaluate all 231 prognostic reporter genes, more genes belonging to these functional categories become apparent (for example, RAD21, cyclin B2, PCTAIRE, CDC25B, CENPF, VEGF, PGK1, MAD2, CKS2, BUB1) (for a complete list, see Supplementary Information Table S2).
Many clinical studies have correlated alterations in expression of individual genes with breast cancer disease outcome, often with contradictory results. Examples include cyclin D1, ER-α, UPA, PAI-1, HER2/neu and c-myc¹⁹⁻²². Surprisingly, none of these genes are present in our set of 70 marker genes. This could be due to the fact that here we determine gene expression at the level of transcription, whereas most previous studies measured protein levels. However, it is more likely that these genes in isolation have only limited predictive power, which highlights the need for an approach based on many genes.
To validate the prognosis classifier, an additional independent set of primary tumours from 19 young, lymph-node-negative breast cancer patients was selected. This group consisted of 7 patients who remained metastasis free for at least five years, and 12 patients who developed distant metastases within five years. The disease outcome was predicted by the 70-gene classifier and resulted in 2 out of 19 incorrect classifications using both the optimal accuracy threshold (Fig. 2c, solid line) and the optimized sensitivity threshold (Fig. 2c, dashed line). Thus, the classifier showed a comparable performance on the validation set of 19 independent sporadic tumours and confirmed the predictive power and robustness of prognosis classification using the 70 optimal marker genes (Fisher's exact test for association P = 0.0018).
The prediction of the classifier presented in Fig. 2b would indicate that women under 55 years of age who are diagnosed with lymph-node-negative breast cancer that has a poor prognosis signature have a 28-fold odds ratio (OR) (95% confidence interval, CI 7–107, P = 1.0 × 10⁻⁸) to develop a distant metastasis within 5 years compared with those that have the good prognosis signature (see Methods for odds ratio definition). This estimate, however, is based on the same series of patients that the classifier was derived from, and therefore this odds ratio represents an upper limit. A performance cross-validation procedure, in which the leave-one-out sample is not involved in selecting the prognosis reporter genes and the number of reporter genes is not optimized, results in an odds ratio of 15 for a short interval to metastases (95% CI 4–56, P = 4.1 × 10⁻⁶) (see Supplementary Information). This cross-validated predictive value of our classifier is superior to the currently available clinical and histopathological prognostic factors: high grade (odds ratio, OR = 6.4 (95% CI 2.1–19), P = 0.0008), tumour size greater than 2 cm (OR = 4.4 (95% CI 1.7–11), P = 0.0028), angioinvasion (OR = 4.2 (95% CI 1.5–12), P = 0.01), age ≤40 (OR = 3.7 (95% CI 1.3–11), P = 0.02), and ER negative (OR = 2.4 (95% CI 0.9–6.6), P = 0.13). Furthermore, the evaluation of the cross-validated classifier in a multivariate model that includes all classical prognostic factors indicates that it is an independent factor in predicting outcome of disease (logistic regression OR = 18 (3.3–94), P-value of likelihood ratio test 1.4 × 10⁻⁴). Studying a large and unselected cohort of breast cancer patients is required to provide a more accurate estimate of the metastatic risk associated with the prognosis signature.
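The 28-fold odds ratio quoted above can be checked from the counts reported for the optimized sensitivity threshold: 31 of 34 poor prognosis patients and 12 of 44 good prognosis patients carried the poor prognosis signature. The sketch below uses the standard cross-product odds ratio with a log-scale (Woolf) 95% confidence interval; that choice of interval is an assumption, since the Methods section defining the odds ratio is not part of this excerpt, and the function name `odds_ratio_ci` is hypothetical.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf confidence interval for a 2x2 table.

    a: poor signature, metastasis      b: poor signature, disease free
    c: good signature, metastasis      d: good signature, disease free
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Counts from the text: 3 of 34 metastatic patients had a good signature,
# 12 of 44 disease-free patients had a poor signature.
odds_ratio, lower, upper = odds_ratio_ci(a=31, b=12, c=3, d=32)
print(round(odds_ratio), round(lower), round(upper))
```

Under these assumptions the rounded values come out as 28, 7 and 107, in line with the reported OR of 28 (95% CI 7–107).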
Unsupervised cluster analysis distinguishes between ER-positive and ER-negative tumours (Fig. 1a). To investigate the expression patterns associated with the immunohistochemical staining of ER and to explore the differences between the sporadic and BRCA1 tumours that fall into the ER-negative cluster (Fig. 1a), a supervised two-layer classification was performed (Fig. 3a). Figure 3b shows that 550 genes optimally report the dominant pattern associated with ER status, including genes such as keratin 18, BCL2, ERBB3 and ERBB4 (see Supplementary Information Table S3). The leave-one-out analysis shows that only two ER-positive and three ER-negative tumours (as determined by IHC) were classified in the opposite gene expression group (95% correct classification, Fig. 3b, middle panel). However, in all five discordant cases, the abundance of ER messenger RNA measured by the microarray agrees with the classification (Fig. 3b, right panel). An ER status reporter signature was also determined by others using a similar classification method⁸, and their ER signature gene set overlaps with ours (21 out of their 50 ER status reporter genes are present in our set of 550 ER reporters). Our observation in the unsupervised analysis that ER clustering has predictive power for prognosis is also valid for the ER supervised classification, although it does not reach the level of significance of the prognosis classifier (ER signature prediction for prognosis, OR = 3.7 (95% CI 1.3–11), P = 0.02; data not shown).
Figure 3c shows the leave-one-out classification of the 38 ER-negative tumours into sporadic cases and BRCA1-associated cases based on an optimal set of 100 genes. This set is enriched in lymphocyte-specific genes (see Supplementary Information Table S4). The classification into sporadic and BRCA1 tumours was caused mainly by the differences in levels of gene expression (amplitude), in concordance with recent findings that BRCA1 mediates ligand-independent transcriptional repression of the ER²³ (95% accuracy, 2/38 misclassified, Fig. 3c). The one sporadic tumour that was classified as a BRCA1 tumour was shown to contain methylation of the BRCA1 promoter, indicating an epigenetic modification of BRCA1²⁴ (data not shown). Notably, the discordant BRCA1 tumour is from a patient where the germline mutation has only altered the last 29 amino acids of the BRCA1 protein (BRCA1 mutation 5,622del62), which abolishes transcriptional activation by BRCA1²⁵. One previous study defined a gene expression signature associated with BRCA1 germline mutations using a panel of seven tumours²⁶; however, the study was unable to appreciate the overlap in signatures between the ER-negative and BRCA1 tumours. Furthermore, the nine BRCA1 status reporter genes²⁶ were not present in our set of 100 optimal reporter genes. The two-layer cluster analysis that we have used and the larger number of tumours we analysed may account for these differences.
Our results indicate that breast cancer prognosis can already be derived from the gene expression profile of the primary tumour. Recent consensus conferences on treatment of breast cancer in Europe and the USA (St. Gallen² and NIH consensus³) have developed guidelines for the eligibility of adjuvant chemotherapy based on histological and clinical characteristics. Following these
Table 1 Breast cancer patients eligible for adjuvant systemic therapy

                      Patient group
Consensus             Total patient group    Metastatic disease    Disease free
                      (n = 78)               at 5 yr (n = 34)      at 5 yr (n = 44)
St Gallen             64/78 (82%)            33/34 (97%)           31/44 (70%)
NIH                   72/78 (92%)            32/34 (94%)           40/44 (91%)
Prognosis profile*    43/78 (55%)            31/34 (91%)           12/44 (27%) (18/44 (41%)†)

The conventional consensus criteria are: tumour ≥2 cm, ER negative, grade 3, patient <35 yr (either one of these criteria; St Gallen consensus); tumour >1 cm (NIH consensus).
* Number of tumours having a poor prognosis signature using our microarray profile, defined by the optimized sensitivity threshold in the 70-gene classifier (see Fig. 2b).
† Number of tumours with a poor prognosis signature in the group of disease-free patients, when the cross-validated classifier is applied.

Citations
Journal ArticleDOI
04 Oct 2012-Nature
TL;DR: The ability to integrate information across platforms provided key insights into previously defined gene expression subtypes and demonstrated the existence of four main breast cancer classes when combining data from five platforms, each of which shows significant molecular heterogeneity.
Abstract: We analysed primary breast cancers by genomic DNA copy number arrays, DNA methylation, exome sequencing, messenger RNA arrays, microRNA sequencing and reverse-phase protein arrays. Our ability to integrate information across platforms provided key insights into previously defined gene expression subtypes and demonstrated the existence of four main breast cancer classes when combining data from five platforms, each of which shows significant molecular heterogeneity. Somatic mutations in only three genes (TP53, PIK3CA and GATA3) occurred at >10% incidence across all breast cancers; however, there were numerous subtype-associated and novel gene mutations including the enrichment of specific mutations in GATA3, PIK3CA and MAP3K1 with the luminal A subtype. We identified two novel protein-expression-defined subgroups, possibly produced by stromal/microenvironmental elements, and integrated analyses identified specific signalling pathways dominant in each molecular subtype including a HER2/phosphorylated HER2/EGFR/phosphorylated EGFR signature within the HER2-enriched expression subtype. Comparison of basal-like breast tumours with high-grade serous ovarian tumours showed many molecular commonalities, indicating a related aetiology and similar therapeutic opportunities. The biological finding of the four main breast cancer subtypes caused by different subsets of genetic and epigenetic abnormalities raises the hypothesis that much of the clinically observable plasticity and heterogeneity occurs within, and not across, these major biological subtypes of breast cancer.

9,355 citations


Cites background or methods from "Gene expression profiling predicts ..."

  • ...We identified: (1) BRCA1 inactivation; (2) RB1 loss and cyclin E1 amplification; (3) high expression of AKT3; (4) MYC amplification and high expression; and (5) a high frequency of TP53 mutations (Fig....

    [...]

  • ...Technology platforms used include: (1) gene expression DNA microarrays(52); (2) DNA methylation arrays; (3) miRNA sequencing; (4) Affymetrix SNP...

    [...]

Journal ArticleDOI
Jean Paul Thiery
TL;DR: Epithelial–mesenchymal transition provides a new basis for understanding the progression of carcinoma towards dedifferentiated and more malignant states.
Abstract: Without epithelial–mesenchymal transitions, in which polarized epithelial cells are converted into motile cells, multicellular organisms would be incapable of getting past the blastula stage of embryonic development. However, this important developmental programme has a more sinister role in tumour progression. Epithelial–mesenchymal transition provides a new basis for understanding the progression of carcinoma towards dedifferentiated and more malignant states.

6,362 citations

Journal ArticleDOI
TL;DR: The gene-expression profile studied is a more powerful predictor of the outcome of disease in young patients with breast cancer than standard systems based on clinical and histologic criteria.
Abstract: Background A more accurate means of prognostication in breast cancer will improve the selection of patients for adjuvant systemic therapy. Methods Using microarray analysis to evaluate our previously established 70-gene prognosis profile, we classified a series of 295 consecutive patients with primary breast carcinomas as having a gene-expression signature associated with either a poor prognosis or a good prognosis. All patients had stage I or II breast cancer and were younger than 53 years old; 151 had lymph-node–negative disease, and 144 had lymph-node–positive disease. We evaluated the predictive power of the prognosis profile using univariable and multivariable statistical analyses. Results Among the 295 patients, 180 had a poor-prognosis signature and 115 had a good-prognosis signature, and the mean (±SE) overall 10-year survival rates were 54.6±4.4 percent and 94.5±2.6 percent, respectively. At 10 years, the probability of remaining free of distant metastases was 50.6±4.5 percent in the group with a...

5,902 citations

Journal ArticleDOI
TL;DR: The recurrence score has been validated as quantifying the likelihood of distant recurrence in tamoxifen-treated patients with node-negative, estrogen-receptor-positive breast cancer and could be used as a continuous function to predict distant recurrence in individual patients.
Abstract: Background The likelihood of distant recurrence in patients with breast cancer who have no involved lymph nodes and estrogen-receptor–positive tumors is poorly defined by clinical and histopathological measures. Methods We tested whether the results of a reverse-transcriptase–polymerase-chain-reaction (RT-PCR) assay of 21 prospectively selected genes in paraffin-embedded tumor tissue would correlate with the likelihood of distant recurrence in patients with node-negative, tamoxifen-treated breast cancer who were enrolled in the National Surgical Adjuvant Breast and Bowel Project clinical trial B-14. The levels of expression of 16 cancer-related genes and 5 reference genes were used in a prospectively defined algorithm to calculate a recurrence score and to determine a risk group (low, intermediate, or high) for each patient. Results Adequate RT-PCR profiles were obtained in 668 of 675 tumor blocks. The proportions of patients categorized as having a low, intermediate, or high risk by the RT-PCR assay were 51, 22, and 27 percent, respectively. The Kaplan–Meier estimates of the rates of distant recurrence at 10 years in the low-risk, intermediate-risk, and high-risk groups were 6.8 percent (95 percent confidence interval, 4.0 to 9.6), 14.3 percent (95 percent confidence interval, 8.3 to 20.3), and 30.5 percent (95 percent confidence interval, 23.6 to 37.4). The rate in the low-risk group was significantly lower than that in the high-risk group (P<0.001). In a multivariate Cox model, the recurrence score provided significant predictive power that was independent of age and tumor size (P<0.001). The recurrence score was also predictive of overall survival (P<0.001) and could be used as a continuous function to predict distant recurrence in individual patients. Conclusions The recurrence score has been validated as quantifying the likelihood of distant recurrence in tamoxifen-treated patients with node-negative, estrogen-receptor–positive breast cancer.

5,685 citations

Journal ArticleDOI
TL;DR: This review discusses patterns of DNA methylation and chromatin structure in neoplasia and the molecular alterations that might cause them and/or underlie altered gene expression in cancer.
Abstract: Patterns of DNA methylation and chromatin structure are profoundly altered in neoplasia and include genome-wide losses of, and regional gains in, DNA methylation. The recent explosion in our knowledge of how chromatin organization modulates gene transcription has further highlighted the importance of epigenetic mechanisms in the initiation and progression of human cancer. These epigenetic changes -- in particular, aberrant promoter hypermethylation that is associated with inappropriate gene silencing -- affect virtually every step in tumour progression. In this review, we discuss these epigenetic events and the molecular alterations that might cause them and/or underlie altered gene expression in cancer.

5,492 citations

References
Journal ArticleDOI
17 Aug 2000-Nature
TL;DR: Variation in gene expression patterns in a set of 65 surgical specimens of human breast tumours from 42 different individuals was characterized using complementary DNA microarrays representing 8,102 human genes, providing a distinctive molecular portrait of each tumour.
Abstract: Human breast tumours are diverse in their natural history and in their responsiveness to treatments. Variation in transcriptional programs accounts for much of the biological diversity of human cells and tumours. In each cell, signal transduction and regulatory systems transduce information from the cell's identity to its environmental status, thereby controlling the level of expression of every gene in the genome. Here we have characterized variation in gene expression patterns in a set of 65 surgical specimens of human breast tumours from 42 different individuals, using complementary DNA microarrays representing 8,102 human genes. These patterns provided a distinctive molecular portrait of each tumour. Twenty of the tumours were sampled twice, before and after a 16-week course of doxorubicin chemotherapy, and two tumours were paired with a lymph node metastasis from the same patient. Gene expression patterns in two tumour samples from the same individual were almost always more similar to each other than either was to any other sample. Sets of co-expressed genes were identified for which variation in messenger RNA levels could be related to specific features of physiological variation. The tumours could be classified into subtypes distinguished by pervasive differences in their gene expression patterns.

14,768 citations

Journal ArticleDOI
TL;DR: Survival analyses on a subcohort of patients with locally advanced breast cancer uniformly treated in a prospective study showed significantly different outcomes for the patients belonging to the various groups, including a poor prognosis for the basal-like subtype and a significant difference in outcome for the two estrogen receptor-positive groups.
Abstract: The purpose of this study was to classify breast carcinomas based on variations in gene expression patterns derived from cDNA microarrays and to correlate tumor characteristics to clinical outcome. A total of 85 cDNA microarray experiments representing 78 cancers, three fibroadenomas, and four normal breast tissues were analyzed by hierarchical clustering. As reported previously, the cancers could be classified into a basal epithelial-like group, an ERBB2-overexpressing group and a normal breast-like group based on variations in gene expression. A novel finding was that the previously characterized luminal epithelial/estrogen receptor-positive group could be divided into at least two subgroups, each with a distinctive expression profile. These subtypes proved to be reasonably robust by clustering using two different gene sets: first, a set of 456 cDNA clones previously selected to reflect intrinsic properties of the tumors and, second, a gene set that highly correlated with patient outcome. Survival analyses on a subcohort of patients with locally advanced breast cancer uniformly treated in a prospective study showed significantly different outcomes for the patients belonging to the various groups, including a poor prognosis for the basal-like subtype and a significant difference in outcome for the two estrogen receptor-positive groups.

10,791 citations

Journal ArticleDOI
TL;DR: The absolute improvement in recurrence was greater during the first 5 years, whereas the improvement in survival grew steadily larger throughout the first 10 years, and these benefits appeared to be largely irrespective of age, menopausal status, daily tamoxifen dose, and of whether chemotherapy had been given to both groups.

3,701 citations

Journal Article
TL;DR: There have been many randomised trials of adjuvant tamoxifen among women with early breast cancer, and an updated overview of their results is presented in this paper, which approximately doubles the amount of evidence from trials of about 5 years of tamoxifen and, taking all trials together, on events occurring more than 5 years after randomisation.

3,447 citations

Journal Article
TL;DR: The age-specific benefits of polychemotherapy appeared to be largely irrespective of menopausal status at presentation, oestrogen receptor status of the primary tumour, and of whether adjuvant tamoxifen had been given.

2,945 citations
