Molecular portraits of human breast tumours

TL;DR: Variation in gene expression patterns in a set of 65 surgical specimens of human breast tumours from 42 different individuals was characterized using complementary DNA microarrays representing 8,102 human genes, providing a distinctive molecular portrait of each tumour.
Abstract: Human breast tumours are diverse in their natural history and in their responsiveness to treatments. Variation in transcriptional programs accounts for much of the biological diversity of human cells and tumours. In each cell, signal transduction and regulatory systems transduce information from the cell's identity to its environmental status, thereby controlling the level of expression of every gene in the genome. Here we have characterized variation in gene expression patterns in a set of 65 surgical specimens of human breast tumours from 42 different individuals, using complementary DNA microarrays representing 8,102 human genes. These patterns provided a distinctive molecular portrait of each tumour. Twenty of the tumours were sampled twice, before and after a 16-week course of doxorubicin chemotherapy, and two tumours were paired with a lymph node metastasis from the same patient. Gene expression patterns in two tumour samples from the same individual were almost always more similar to each other than either was to any other sample. Sets of co-expressed genes were identified for which variation in messenger RNA levels could be related to specific features of physiological variation. The tumours could be classified into subtypes distinguished by pervasive differences in their gene expression patterns.

Charles M. Perou, Therese Sørlie, Michael B. Eisen, Matt van de Rijn,
Stefanie S. Jeffrey, Christian A. Rees, Jonathan R. Pollack, Douglas T. Ross,
Hilde Johnsen, Lars A. Akslen, Øystein Fluge, Alexander Pergamenschikov,
Cheryl Williams, Shirley X. Zhu, Per E. Lønning, Anne-Lise Børresen-Dale,
Patrick O. Brown & David Botstein

Human breast tumours are diverse in their natural history and in
their responsiveness to treatments (ref. 1). Variation in transcriptional
programs accounts for much of the biological diversity of human
cells and tumours. In each cell, signal transduction and regulatory
systems transduce information from the cell's identity to its
environmental status, thereby controlling the level of expression
of every gene in the genome. Here we have characterized variation
in gene expression patterns in a set of 65 surgical specimens of
human breast tumours from 42 different individuals, using
complementary DNA microarrays representing 8,102 human
genes. These patterns provided a distinctive molecular portrait
of each tumour. Twenty of the tumours were sampled twice,
before and after a 16-week course of doxorubicin chemotherapy,
and two tumours were paired with a lymph node metastasis from
the same patient. Gene expression patterns in two tumour
samples from the same individual were almost always more
similar to each other than either was to any other sample. Sets
of co-expressed genes were identified for which variation in
messenger RNA levels could be related to specific features of
physiological variation. The tumours could be classified into
subtypes distinguished by pervasive differences in their gene
expression patterns.
We proposed that the phenotypic diversity of breast tumours
might be accompanied by a corresponding diversity in gene expression
patterns that we could capture using cDNA microarrays.
Systematic investigation of gene expression patterns in human
breast tumours might then provide the basis for an improved
molecular taxonomy of breast cancers. We analysed gene expression
patterns in grossly dissected normal or malignant human breast
tissues from 42 individuals (36 infiltrating ductal carcinomas, 2
lobular carcinomas, 1 ductal carcinoma in situ, 1 fibroadenoma and
3 normal breast samples). Fluorescently labelled (Cy5) cDNA was
prepared from mRNA from each experimental sample. We prepared
cDNA, labelled using a second distinguishable fluorescent nucleotide
(Cy3), from a pool of mRNAs isolated from 11 different
cultured cell lines (see Supplementary Information Table 1); this
common 'reference' sample provided an internal standard against
which the gene expression of each experimental sample was
compared (refs 2, 3).
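The two-colour design described above, with each Cy5-labelled sample measured against a common Cy3-labelled reference, is usually summarized as a log ratio per array spot. The following is a minimal sketch of that step under stated assumptions, not the authors' pipeline: the function name `log_ratios`, the `min_signal` cut-off and the toy intensities are hypothetical and do not come from the paper.

```python
import numpy as np

def log_ratios(cy5, cy3, min_signal=1.0):
    """Return log2(Cy5 / Cy3) for each spot.

    Spots whose intensity falls below `min_signal` in either channel are
    reported as NaN (an assumed convention for unusable measurements).
    """
    cy5 = np.asarray(cy5, dtype=float)
    cy3 = np.asarray(cy3, dtype=float)
    usable = (cy5 >= min_signal) & (cy3 >= min_signal)
    out = np.full(cy5.shape, np.nan)
    out[usable] = np.log2(cy5[usable] / cy3[usable])
    return out

# Toy intensities for four spots: experimental sample vs. common reference.
sample = [400.0, 100.0, 50.0, 0.0]
reference = [100.0, 100.0, 200.0, 80.0]
ratios = log_ratios(sample, reference)
print(ratios)
```

Because every array shares the same reference channel, ratios computed this way can be compared across samples even though each hybridization is a separate experiment.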
Twenty of the forty breast tumours examined were sampled twice,
as part of a larger study on locally advanced breast cancers (T3/T4
and/or N2 tumours; see ref. 4). After an open surgical biopsy to
obtain the 'before' sample, each of these patients was treated with
doxorubicin for an average of 16 weeks (range 12–23), followed by
resection of the remaining tumour. In addition, primary tumours
from two patients were also paired with a lymph node metastasis
from the same patient. To help interpret the variation in expression
patterns seen in the tumour samples, we also characterized 17
cultured cell lines (with one cell line cultured under three different
conditions), which provided models for many of the cell types
encountered in these tissue samples. In total, we analysed 84 cDNA
microarray experiments (see Supplementary Information, Table 2;
the primary data tables can be obtained at
http://genome-www.stanford.edu/molecularportraits/).

[Figure 1 image, panels a–j: cluster diagram. Column labels are the names of the experimental samples (tumours such as NORWAY 100-BE and NORWAY 100-AF, STANFORD 38-P and STANFORD 38-LN; NORMAL 1–3; cultured cell lines such as HMEC, HUVEC, MCF7 and T47D). Row labels in panels c–j are gene names. Colour scale: 1:1 to more than 16-fold above or below the median.]
A hierarchical clustering method was used to group genes on the
basis of similarity in the pattern with which their expression varied
over all samples (ref. 5). The same clustering method was used to group the
experimental samples (cell lines and tissues separately) on the basis
of similarity in their patterns of expression. We focus first on a set of
1,753 genes (about 22% of the 8,102 genes analysed), whose
transcripts varied in abundance by at least fourfold from their
median abundance in this sample set in at least three of the samples
(Fig. 1; see Supplementary Information Fig. 4 for the complete
cluster diagram).
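The selection rule stated above (transcripts varying by at least fourfold from their median abundance in at least three samples) can be sketched as a simple filter. This is an illustration only, assuming the data are held as log2 ratios in a genes-by-samples matrix; the function name `select_variable_genes` and the toy matrix are hypothetical.

```python
import numpy as np

def select_variable_genes(log2_ratios, fold=4.0, min_samples=3):
    """Boolean mask over genes (rows).

    A gene passes if its value differs from that gene's own median by at
    least `fold` (in either direction) in at least `min_samples` samples.
    """
    log2_ratios = np.asarray(log2_ratios, dtype=float)
    centred = log2_ratios - np.median(log2_ratios, axis=1, keepdims=True)
    hits = np.abs(centred) >= np.log2(fold)
    return hits.sum(axis=1) >= min_samples

# Toy matrix: 4 genes x 7 samples, values are log2 ratios.
toy = np.array([
    [3.0, 3.0, 3.0, 0.0, 0.0, 0.0, 0.0],     # 8-fold up in three samples
    [0.0, 0.0, 0.0, 0.0, 0.0, -2.5, -2.5],   # only two samples deviate
    [0.5, -0.5, 0.2, 0.0, 0.1, -0.1, 0.0],   # never reaches fourfold
    [-3.0, 2.2, 0.0, 0.0, 0.0, 0.0, 2.0],    # three samples, mixed direction
])
mask = select_variable_genes(toy)
print(mask)
```

Working in log2 units turns the fourfold threshold into an absolute difference of 2 from the median, which treats increases and decreases symmetrically.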
Three striking features of the gene expression patterns of these
tumours are evident in Fig. 1. First, the tumours show great
variation in their patterns of gene expression. Second, this variation
is multidimensional; that is, many different sets of genes show
mainly independent patterns of variation. Third, these patterns
have a pervasive order reflecting relationships among the genes,
relationships among the tumours and connections between specific
genes and specific tumours.
The hierarchical clustering algorithm organizes the experimental
samples only on the basis of overall similarity in their gene
expression patterns; these relationships are summarized in a dendrogram
(Fig. 1a), in which the pattern and length of the branches
reflect the relatedness of the samples (ref. 5). Fifteen of the twenty before
and after doxorubicin pairs (red dendrogram branches), and both
primary tumour/lymph node metastasis pairs (light blue branches)
were clustered together on terminal branches in the dendrogram;
that is, despite an interval of 16 weeks, independent surgical
procedures and cytotoxic chemotherapy, independent samples
taken from the same tumour were in most cases recognizably
more similar to each other than either was to any of the other
samples. In three instances (Norway 47, 61 and 101), the 'after'
chemotherapy specimens clustered in a branch of the dendrogram
that also contained the three normal breast samples; we know from
the clinical data that these tumours were 3 of the 20 tumours that
were classified as doxorubicin 'responders' (data not shown). An
analysis of the relationship between gene expression and correlations
with clinical data will be reported elsewhere (T.S. et al.,
manuscript in preparation).
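The notion of two samples being "clustered together on terminal branches" can be made concrete: in an agglomerative clustering, such a pair is one whose two members are merged directly with each other before either joins any other sample. The sketch below assumes average linkage on correlation distance, which the text here does not confirm, and uses SciPy rather than the authors' software; the function name `terminal_pairs`, the sample labels and the toy profiles are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

def terminal_pairs(profiles, labels):
    """Return the set of sample pairs that are merged directly with each other.

    `profiles` is a samples x genes matrix. In a SciPy linkage matrix, a
    merge whose two children both have index < n joins two original samples.
    """
    z = linkage(np.asarray(profiles, dtype=float),
                method="average", metric="correlation")
    n = len(labels)
    pairs = set()
    for left, right, _dist, _size in z:
        left, right = int(left), int(right)
        if left < n and right < n:
            pairs.add(frozenset((labels[left], labels[right])))
    return pairs

# Toy data: two before/after pairs and one unpaired sample, four genes each.
labels = ["A-BE", "A-AF", "B-BE", "B-AF", "C"]
profiles = [
    [2.0, 1.0, -1.0, -2.0],
    [2.1, 0.9, -1.2, -1.8],
    [-2.0, -1.0, 1.0, 2.0],
    [-1.9, -1.1, 0.8, 2.2],
    [1.0, -2.0, 2.0, -1.0],
]
print(terminal_pairs(profiles, labels))
```

Counting how many known pairs appear in the returned set gives the kind of tally reported in the text (for example, fifteen of twenty pairs).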
The 'molecular portraits' revealed in the patterns of gene expression
not only uncovered similarities and differences among the
tumours, but in many cases pointed to a biological interpretation.
Variation in growth rate, in the activity of specific signalling pathways,
and in the cellular composition of the tumours were all
reflected in the corresponding variation in the expression of specific
subsets of genes. The largest distinct cluster of genes within the
1,753-gene cluster diagram was the 'proliferation cluster' (Supplementary
Information Fig. 5), which is a group of genes whose levels
of expression correlate with cellular proliferation rates (refs 3, 6). Expression
of this cluster of genes varied widely among the tumour samples,
and was generally well correlated with the mitotic index. As one
might expect, this cluster also included the genes encoding two
widely used immunohistochemical markers of cell proliferation
(Ki-67 and PCNA).
Several groups of co-expressed genes provided views of the
activities of specific signalling and/or regulatory systems. A large
cluster of genes regulated by the interferon pathway (including
STAT1) showed substantial variation in expression among the
tumours, as was previously observed in a smaller set of breast
tumours (ref. 6). Variation in expression of the oestrogen receptor-α gene
(ER) correlated well with the direct clinical measurement of the ER
protein levels in the tumours (Supplementary Information Table 3;
concordance in 36/38 samples), and paralleled variation in the
expression of a larger group of genes that included three other
transcription factors (GATA-binding protein 3 (refs 7, 8), X-box
binding protein 1 and hepatocyte nuclear factor 3α). HER2/neu,
also known as Erb-B2, is overexpressed in 20–30% of all breast
tumours, usually associated with DNA amplification of the Erb-B2
locus (refs 9, 10). Notably, most of the other genes contained within the
Erb-B2 cluster were located in this same region of chromosome 17,
and were also amplified on the genomic DNA level (ref. 10; and
J.R.P., unpublished data). Finally, a cluster of genes that included
c-Fos and JunB co-varied in expression among the tumour specimens.
We have found that this subset of genes is characteristically induced
by prolonged handling of the samples after surgical resection
(M.v.d.R. and C.M.P., unpublished data).

Figure 1 Variation in expression of 1,753 genes in 84 experimental samples. Data are
presented in a matrix format: each row represents a single gene, and each column an
experimental sample. In each sample, the ratio of the abundance of transcripts of each
gene to the median abundance of the gene's transcript among all the cell lines (left panel),
or to its median abundance across all tissue samples (right panel), is represented by the
colour of the corresponding cell in the matrix. Green squares, transcript levels below the
median; black squares, transcript levels equal to the median; red squares, transcript
levels greater than the median; grey squares, technically inadequate or missing data.
Colour saturation reflects the magnitude of the ratio relative to the median for each set of
samples (see scale, bottom left; and Supplementary Information Fig. 4). a, Dendrogram
representing similarities in the expression patterns between experimental samples. All
'before and after' chemotherapy pairs that were clustered on terminal branches are
highlighted in red; the two primary tumour/lymph node metastasis pairs in light blue; the
three clustered normal breast samples in light green. Branches representing the four
breast luminal epithelial cell lines are shown in dark blue; breast basal epithelial cell lines
in orange, the endothelial cell lines in dark yellow, the mesenchymal-like cell lines in dark
green, and the lymphocyte-derived cell lines in brown. b, Scaled-down representation of
the 1,753-gene cluster diagram; coloured bars to the right identify the locations of the
inserts displayed in c–j. c, Endothelial cell gene expression cluster; d, stromal/fibroblast
cluster; e, breast basal epithelial cluster; f, B-cell cluster; g, adipose-enriched/normal
breast; h, macrophage; i, T-cell; j, breast luminal epithelial cell.

Figure 2 Breast tissue immunohistochemistry. a, Normal mammary duct using antibodies
against the basal keratins 5/6. b, Normal mammary duct using antibodies against the
luminal keratins 8/18 (adjacent tissue sections were used in a and b). c, Tumour
Stanford 16 using antibodies against keratins 8/18. d, Tumour New York 3 using
antibodies against keratins 5/6.

Figure 3 Cluster analysis using the 'intrinsic' gene subset. Two large branches were
apparent in the dendrogram, and within these large branches were smaller branches for
which common biological themes could be inferred. Branches are coloured accordingly:
basal-like, orange; Erb-B2+, pink; normal-breast-like, light green; and luminal
epithelial/ER+, dark blue. a, Experimental sample associated cluster dendrogram. Small
black bars beneath the dendrogram identify the 17 pairs that were matched by this
hierarchical clustering; larger green bars identify the positions of the three pairs that
were not matched by the clustering. b, Scaled-down representation of the intrinsic cluster
diagram (see Supplementary Information Fig. 6). c, Luminal epithelial/ER gene cluster.
d, Erb-B2 overexpression cluster. e, Basal epithelial cell associated cluster containing
keratins 5 and 17. f, A second basal epithelial-cell-enriched gene cluster.

Human breast tumours are histologically complex tissues, containing
a variety of cell types in addition to the carcinoma cells (ref. 11). In
analysing the gene expression patterns in solid human tumours, we
used two lines of reasoning to infer the lineage of the cells that
accounted for the apparently cell-type-specific expression of particular
clustered groups of genes. First, such clusters included genes
whose expression patterns have been previously characterized and
that consistently pointed to a specific cell type. Second, these
inferences were often corroborated by comparable expression of
the same cluster in one or more of the cultured cell lines. Thus, eight
independent clusters of genes appeared to reflect variation in
specific cell types present within the tumours (Fig. 1c–j).
(1) Endothelial cells: a cluster of genes characteristically expressed
by endothelial cells, including CD34, CD31 and von Willebrand
factor, was also strongly expressed in the two endothelial cell lines
HUVEC and HMVEC (Fig. 1c). (2) Stromal cells: a previously
characterized cluster of genes that included several isoforms of
collagen showed significant variation in expression among samples
(Fig. 1d; refs 3, 6). (3) Adipose-enriched/normal breast cells: a cluster of
genes including fatty-acid binding protein 4 and PPARγ may
represent the presence of adipose cells (Fig. 1g). (4) B lymphocytes:
variation in expression of a cluster of genes that were highly
expressed in the multiple myeloma-derived cell line RPMI-8226,
including many immunoglobulin genes, appears to represent variable
B-cell infiltration (Fig. 1f). (5) T lymphocytes: a cluster of genes
including CD3δ and two subunits of the T-cell receptor was highly
expressed in the T-cell leukaemia-derived cell line MOLT-4 and
probably indicates T-cell infiltrates (Fig. 1i). (6) Macrophages: a
cluster of genes that appeared to be markers of macrophage/
monocytes included CD68, acid phosphatase 5, chitinase and
lysozyme (Fig. 1h).
Two distinct types of epithelial cell are found in the human
mammary gland: basal (and/or myoepithelial) cells and luminal
epithelial cells (refs 11, 12). These two cell types are conveniently distinguished
immunohistochemically; basal epithelial cells can be
stained with antibodies to keratin 5/6 (Fig. 2a), whereas luminal
epithelial cells stain with antibodies against keratins 8/18 (Fig. 2b).
Many genes were expressed by one of these two cell lineages, but not
by the other (Fig. 1e and j). The gene expression cluster characteristic
of basal epithelial cells included keratin 5, keratin 17, integrin-β4
and laminin (Fig. 1e; ref. 11). The gene expression cluster characteristic
of the luminal cells was anchored by the previously noted
cluster of transcription factors that included ER (Fig. 1j).
One goal of this study was to develop a system for classifying
tumours on the basis of their gene expression patterns. The subset of
genes shown in Fig. 1 was not necessarily optimal for this purpose,
as the choice of genes whose expression levels provided the basis for
the ordering of the tumour samples determined which phenotypic
relationships among the tumours were reflected in the clustering
patterns. We therefore selected an alternative subset of genes to use
as the basis for a new clustering analysis.
The rationale behind this alternative gene subset was that specific
features of a gene expression pattern that are to be used to classify
tumours should be similar in any sample taken from the same
tumour, and they should vary among different tumours. The 22
paired samples provided a unique opportunity for a deliberate and
systematic search for such genes. From the genes whose expression
was well measured in the 65 tissue samples, we selected a subset of
496 genes (termed the 'intrinsic' gene subset) that consisted of genes
with significantly greater variation in expression between different
tumours than between paired samples from the same tumour (see
Supplementary Information). When variation in expression of this
set of genes was used to order the tissue samples (Fig. 3; and
Supplementary Information Fig. 6), 17 of the 20 'before and after'
doxorubicin pairs were grouped together as were both of the tumour/
lymph node metastasis pairs. Qualitatively similar sample clustering
patterns were obtained when a second gene subset that focused on
genes expressed by epithelial cell types, and which had only 25%
overlap with the intrinsic gene subset, was used (data not shown).
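The rationale for the intrinsic gene subset (large variation between different tumours, small variation between paired samples of the same tumour) can be illustrated with a per-gene variance ratio. The exact statistic the authors used is given in their Supplementary Information and is not reproduced here; the sketch below is one plausible formulation, and the function name `intrinsic_scores`, the `eps` guard and the toy values are hypothetical.

```python
import numpy as np

def intrinsic_scores(first, second, eps=1e-12):
    """Per-gene ratio of between-tumour to within-pair variance.

    `first` and `second` are genes x tumours matrices holding the two
    samples of each pair (for example before and after treatment).
    """
    first = np.asarray(first, dtype=float)
    second = np.asarray(second, dtype=float)
    pair_mean = (first + second) / 2.0
    # Sample variance of two values a, b is (a - b)^2 / 2; average it over pairs.
    within = np.mean((first - second) ** 2 / 2.0, axis=1)
    # Variance of the pair means across tumours.
    between = np.var(pair_mean, axis=1, ddof=1)
    return between / (within + eps)

# Toy data: 2 genes x 3 paired tumours (log2 ratios).
before = np.array([[2.0, -2.0, 0.5],    # gene 0: stable within each pair
                   [1.0, -1.0, 0.0]])   # gene 1: flips between paired samples
after = np.array([[2.2, -1.8, 0.4],
                  [-1.0, 1.0, 0.2]])
scores = intrinsic_scores(before, after)
print(scores)
```

Genes with a high score behave like gene 0 in the toy data: they distinguish tumours from one another while staying nearly constant across repeated samples of the same tumour. Ranking by such a score and keeping the top genes would yield a subset in the spirit of the one described.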
The division of the tissue samples into two subgroups was a
striking feature of the intrinsic gene subset cluster analysis (Fig. 3a).
As a test of the robustness of this division, we applied the 'weighted
voting' method (ref. 13). This algorithm recapitulated the sorting of the
tissue samples between these two subgroups for all but 1 of the 65
samples (data not shown). It is important to note, however, that there
is extensive residual variation in expression patterns within each of
these two broad subgroups. Indeed, many of the finer subdivisions
probably have important biological properties (see below).
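A two-class weighted voting classifier can be sketched as follows. The text does not spell out the formulation, so this example assumes the signal-to-noise version commonly associated with that name: each gene receives a weight equal to the difference of the class means divided by the sum of the class standard deviations, and votes according to how far a new sample lies from the midpoint of the two class means. All function names and the toy data are hypothetical, and no gene selection or cross-validation is shown.

```python
import numpy as np

def train_weighted_voting(x, y):
    """Fit per-gene weights and decision boundaries.

    `x` is a samples x genes matrix and `y` holds class labels 0 or 1.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    a, b = x[y == 0], x[y == 1]
    mu_a, mu_b = a.mean(axis=0), b.mean(axis=0)
    sd_a, sd_b = a.std(axis=0, ddof=1), b.std(axis=0, ddof=1)
    weights = (mu_a - mu_b) / (sd_a + sd_b)   # signal-to-noise per gene
    boundaries = (mu_a + mu_b) / 2.0          # midpoint between class means
    return weights, boundaries

def predict(x, weights, boundaries):
    """Sum the weighted votes; a positive total means class 0, otherwise class 1."""
    votes = weights * (np.asarray(x, dtype=float) - boundaries)
    return np.where(votes.sum(axis=1) > 0, 0, 1)

# Toy data: six samples, two genes, two well-separated classes.
x_train = [[2.0, -1.0], [1.8, -1.2], [2.2, -0.8],
           [-2.0, 1.0], [-1.8, 1.1], [-2.1, 0.9]]
y_train = [0, 0, 0, 1, 1, 1]
w, b = train_weighted_voting(x_train, y_train)
print(predict(x_train, w, b))
print(predict([[1.5, -0.5]], w, b))
```

To test robustness in the way the text describes, one would typically hold each sample out in turn, train on the rest, and check whether the held-out sample is assigned to the subgroup found by clustering; that loop is omitted here for brevity.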
The two dendrogram branches in Fig. 3 largely separate the
tumour samples into those that were clinically described as ER
positive (blue) and those that were ER negative (other colours). The
tumours in the ER+ group were characterized by the relatively high
expression of many genes expressed by breast luminal cells (Fig. 3c).
This connection was further corroborated using immunohistochemical
analysis and antibodies against the luminal cell keratins
8/18 (Fig. 2c). With one exception, none of the tumours in this
group expressed Erb-B2 at high levels (Fig. 3d).
Many of the genes characteristic of breast basal epithelial cells
were also highly expressed in a group of six clustered tumours
(Fig. 3e). To corroborate the 'basal-like' characteristics of these
tumours, we carried out immunohistochemistry using antibodies
against the breast basal cell keratins 5/6 and 17. All six of these
tumours showed staining for either keratins 5/6 or 17 or both
(Fig. 2d). Notably, these six tumours also failed to express ER and
most of the other genes that were usually co-expressed with it
(Fig. 3c). Breast tumours that stain positive for basal keratins have
been described (refs 14–16), and may account for 3–15% of all
breast tumours (refs 15, 17–19); in this study, the incidence was 15% (6/40).
As mentioned above, overexpression of the Erb-B2 oncogene was
associated with the high expression of a specific subset of genes. We
identified a cluster of tumours that was partially characterized by
the high level of expression of this subset of genes (Fig. 3d). These
tumours also showed low levels of expression of ER (refs 20, 21) and of
almost all of the other genes associated with ER expression, a trait
they share with the basal-like tumours.
Several tumour samples and the single fibroadenoma tested
(Fig. 3, light green) were clustered with a group of samples that
also contained the three normal breast specimens (Fig. 3a). The
'normal breast' gene expression pattern is typified by the high
expression of genes characteristic of basal epithelial cells and
adipose cells, and the low expression of genes characteristic of
luminal epithelial cells.
The number of clearly different molecular phenotypes observed
among the breast tumours suggests that we are far from having a
complete picture of the diversity of breast tumours. When hundreds
(instead of tens) of breast tumours have been characterized, a more
defined tumour classification is likely, and statistically significant
relationships with clinical parameters should be uncovered. We
were, however, able to identify four groups of samples that might
be related to different molecular features of mammary epithelial
biology (that is, ER+/luminal-like, basal-like, Erb-B2+ and normal
breast). An important implication of this study is that the clinical
designation of 'oestrogen receptor negative' breast carcinoma
encompasses at least two biologically distinct subtypes of tumours
(basal-like and Erb-B2 positive), which may need to be treated as
distinct diseases.
A striking conclusion from these data concerns the stability,


9,355 citations

Journal ArticleDOI
01 Nov 2001-Nature
TL;DR: Stem cell biology has come of age: Unequivocal proof that stem cells exist in the haematopoietic system has given way to the prospective isolation of several tissue-specific stem and progenitor cells, the initial delineation of their properties and expressed genetic programmes, and the beginnings of their utility in regenerative medicine.
Abstract: Stem cell biology has come of age. Unequivocal proof that stem cells exist in the haematopoietic system has given way to the prospective isolation of several tissue-specific stem and progenitor cells, the initial delineation of their properties and expressed genetic programmes, and the beginnings of their utility in regenerative medicine. Perhaps the most important and useful property of stem cells is that of self-renewal. Through this property, striking parallels can be found between stem cells and cancer cells: tumours may often originate from the transformation of normal stem cells, similar signalling pathways may regulate self-renewal in stem cells and cancer cells, and cancer cells may include 'cancer stem cells' - rare cells with indefinite potential for self-renewal that drive tumorigenesis.

8,999 citations

References
Journal ArticleDOI
TL;DR: A system of cluster analysis for genome-wide expression data from DNA microarray hybridization is described that uses standard statistical algorithms to arrange genes according to similarity in pattern of gene expression, finding in the budding yeast Saccharomyces cerevisiae that clustering gene expression data groups together efficiently genes of known similar function.
Abstract: A system of cluster analysis for genome-wide expression data from DNA microarray hybridization is described that uses standard statistical algorithms to arrange genes according to similarity in pattern of gene expression. The output is displayed graphically, conveying the clustering and the underlying expression data simultaneously in a form intuitive for biologists. We have found in the budding yeast Saccharomyces cerevisiae that clustering gene expression data groups together efficiently genes of known similar function, and we find a similar tendency in human data. Thus patterns seen in genome-wide expression experiments can be interpreted as indications of the status of cellular processes. Also, coexpression of genes of known function with poorly characterized or novel genes may provide a simple means of gaining leads to the functions of many genes for which information is not available currently.

16,371 citations

Journal ArticleDOI
15 Oct 1999-Science
TL;DR: A generic approach to cancer classification based on gene expression monitoring by DNA microarrays is described and applied to human acute leukemias as a test case and suggests a general strategy for discovering and predicting cancer classes for other types of cancer, independent of previous biological knowledge.
Abstract: Although cancer classification has improved over the past 30 years, there has been no general approach for identifying new cancer classes (class discovery) or for assigning tumors to known classes (class prediction). Here, a generic approach to cancer classification based on gene expression monitoring by DNA microarrays is described and applied to human acute leukemias as a test case. A class discovery procedure automatically discovered the distinction between acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL) without previous knowledge of these classes. An automatically derived class predictor was able to determine the class of new leukemia cases. The results demonstrate the feasibility of cancer classification based solely on gene expression monitoring and suggest a general strategy for discovering and predicting cancer classes for other types of cancer, independent of previous biological knowledge.

12,530 citations

Journal ArticleDOI
03 Feb 2000-Nature
TL;DR: It is shown that there is diversity in gene expression among the tumours of DLBCL patients, apparently reflecting the variation in tumour proliferation rate, host response and differentiation state of the tumour.
Abstract: Diffuse large B-cell lymphoma (DLBCL), the most common subtype of non-Hodgkin's lymphoma, is clinically heterogeneous: 40% of patients respond well to current therapy and have prolonged survival, whereas the remainder succumb to the disease. We proposed that this variability in natural history reflects unrecognized molecular heterogeneity in the tumours. Using DNA microarrays, we have conducted a systematic characterization of gene expression in B-cell malignancies. Here we show that there is diversity in gene expression among the tumours of DLBCL patients, apparently reflecting the variation in tumour proliferation rate, host response and differentiation state of the tumour. We identified two molecularly distinct forms of DLBCL which had gene expression patterns indicative of different stages of B-cell differentiation. One type expressed genes characteristic of germinal centre B cells ('germinal centre B-like DLBCL'); the second type expressed genes normally induced during in vitro activation of peripheral blood B cells ('activated B-like DLBCL'). Patients with germinal centre B-like DLBCL had a significantly better overall survival than those with activated B-like DLBCL. The molecular classification of tumours on the basis of gene expression can thus identify previously undetected and clinically significant subtypes of cancer.

9,493 citations

Journal ArticleDOI
24 Oct 1997-Science
TL;DR: DNA microarrays containing virtually every gene of Saccharomyces cerevisiae were used to carry out a comprehensive investigation of the temporal program of gene expression accompanying the metabolic shift from fermentation to respiration, and the expression patterns of many previously uncharacterized genes provided clues to their possible functions.
Abstract: DNA microarrays containing virtually every gene of Saccharomyces cerevisiae were used to carry out a comprehensive investigation of the temporal program of gene expression accompanying the metabolic shift from fermentation to respiration. The expression profiles observed for genes with known metabolic functions pointed to features of the metabolic reprogramming that occur during the diauxic shift, and the expression patterns of many previously uncharacterized genes provided clues to their possible functions. The same DNA microarrays were also used to identify genes whose expression was affected by deletion of the transcriptional co-repressor TUP1 or overexpression of the transcriptional activator YAP1. These results demonstrate the feasibility and utility of this approach to genomewide exploration of gene expression patterns.

4,792 citations


"Molecular portraits of human breast..." refers methods in this paper

  • ...Methods Most of the techniques used in this work have been described elsewher...

Journal ArticleDOI

2,538 citations

Frequently Asked Questions (13)
Q1. What is the common cause of damage in light elements?

In heavy elements this usually gives rise to X-ray fluorescence, whereas in light elements the falling electron is more likely to give up its energy to another electron, which is then ejected in the Auger effect.

Two distinct types of epithelial cell are found in the human mammary gland: basal (and/or myoepithelial) cells and luminal epithelial cells11,12. 

Estimations of radiation damage as a function of photon energy, pulse length, integrated pulse intensity and sample size show that experiments using very high X-ray dose rates and ultrashort exposures may provide useful structural information before radiation damage destroys the sample. 

Analyses of the dynamics of damage formation3–5 suggest that the conventional damage barrier (about 200 X-ray photons per Å² with X-rays of 12 keV energy or 1 Å wavelength2) may be extended at very high dose rates and very short exposure times.

The 'normal breast' gene expression pattern is typified by the high expression of genes characteristic of basal epithelial cells and adipose cells, and the low expression of genes characteristic of luminal epithelial cells.

Keratin expression in human mammary epithelial cells cultured from normal and malignant tissue: relation to in vivo phenotypes and influence of medium.

Fifteen of the twenty before and after doxorubicin pairs (red dendrogram branches), and both primary tumour/lymph node metastasis pairs (light blue branches) were clustered together on terminal branches in the dendrogram; that is, despite an interval of 16 weeks, independent surgical procedures and cytotoxic chemotherapy, independent samples taken from the same tumour were in most cases recognizably more similar to each other than either was to any of the other samples. 

The authors focus first on a set of 1,753 genes (about 22% of the 8,102 genes analysed), whose transcripts varied in abundance by at least fourfold from their median abundance in this sample set in at least three of the samples (Fig. 1; see Supplementary Information Fig. 4 for the complete cluster diagram).
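The selection rule described above (at least fourfold deviation from a gene's own median in at least three samples) can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: it assumes the expression values are already log2-transformed, so that fourfold corresponds to 2 log2 units, and the function name `filter_variable_genes` and the toy gene names and values are hypothetical.

```python
import math
from statistics import median


def filter_variable_genes(log2_values, min_fold=4.0, min_samples=3):
    """Return the genes that deviate from their own median by at least
    `min_fold` (in either direction) in at least `min_samples` samples.

    log2_values: dict mapping gene name -> list of log2 expression values,
                 one value per sample (assumed input format).
    """
    threshold = math.log2(min_fold)  # fourfold == 2.0 on a log2 scale
    selected = []
    for gene, values in log2_values.items():
        centre = median(values)
        # Count samples whose value is far enough from the gene's median.
        n_far = sum(1 for v in values if abs(v - centre) >= threshold)
        if n_far >= min_samples:
            selected.append(gene)
    return selected


# Hypothetical toy data: seven samples per gene.
toy = {
    "gene_a": [0.0, 0.1, -0.2, 2.5, 3.0, -2.2, 0.0],  # 3 samples beyond fourfold
    "gene_b": [0.0, 0.3, -0.4, 0.5, 1.5, 0.0, 0.1],   # never beyond fourfold
}
print(filter_variable_genes(toy))  # ['gene_a']
```

Applied to the real data, a filter of this kind would be expected to keep roughly the 1,753 of 8,102 genes quoted above, although the exact count depends on normalisation details not given here.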

(1) Endothelial cells: a cluster of genes characteristically expressed by endothelial cells, including CD34, CD31 and von Willebrand factor, were also strongly expressed in the two endothelial cell lines HUVEC and HMVEC (Fig. 1c).

(6) Macrophages: a cluster of genes that appeared to be markers of macrophages/monocytes included CD68, acid phosphatase 5, chitinase and lysozyme (Fig. 1h).

Many of the genes characteristic of breast basal epithelial cells were also highly expressed in a group of six clustered tumours (Fig. 3e). 

In three instances (Norway 47, 61 and 101), the 'after' chemotherapy specimens clustered in a branch of the dendrogram that also contained the three normal breast samples; the authors know from the clinical data that these tumours were 3 of the 20 tumours that were classified as doxorubicin 'responders' (data not shown).

The hierarchical clustering algorithm organizes the experimental samples only on the basis of overall similarity in their gene expression patterns; these relationships are summarized in a dendrogram (Fig. 1a), in which the pattern and length of the branches reflects the relatedness of the samples5.
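As a rough illustration of how samples can be grouped purely by overall similarity, the sketch below performs agglomerative clustering and records the order of merges, which is the information a dendrogram displays. The choice of average linkage and of 1 minus the Pearson correlation as the distance is an assumption made for this example; the text above does not state which linkage or metric was used. The sample names and values are hypothetical.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (must not be constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def average_linkage(profiles):
    """Cluster samples bottom-up; return the merges in the order they occur.

    profiles: dict mapping sample name -> list of expression values.
    Each merge is (members_a, members_b, distance), distance = 1 - correlation
    averaged over all cross-cluster sample pairs.
    """
    dist = {
        frozenset((a, b)): 1 - pearson(profiles[a], profiles[b])
        for a, b in combinations(profiles, 2)
    }

    def avg(c1, c2):
        return mean(dist[frozenset((a, b))] for a in c1 for b in c2)

    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        # Merge the two clusters that are closest on average.
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: avg(*pair))
        merges.append((c1, c2, avg(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical toy data: two tumours, each sampled before and after treatment.
toy_profiles = {
    "T1_before": [2.0, 1.0, -1.0, -2.0],
    "T1_after": [1.8, 1.2, -0.9, -2.1],
    "T2_before": [-2.0, -1.0, 1.5, 1.5],
    "T2_after": [-1.7, -1.2, 1.0, 1.9],
}
for left, right, _ in average_linkage(toy_profiles):
    print(left, "+", right)
```

In this toy input each before/after pair merges first, on a terminal branch, before the two tumours are joined; the short branch between paired samples is what signals their relatedness in a dendrogram.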