Serum Metabolomic Profiles Identify ER-Positive Early Breast Cancer Patients at Increased Risk of Disease Recurrence in a Multicenter Population

Abstract
Purpose: Detecting signals of micrometastatic disease in patients with early breast cancer (EBC) could improve risk stratification and allow better tailoring of adjuvant therapies. We previously showed that postoperative serum metabolomic profiles were predictive of relapse in a single-center cohort of estrogen receptor (ER)-negative EBC patients. Here, we investigated this further using preoperative serum samples from ER-positive, premenopausal women with EBC who were enrolled in an international phase III trial.
Experimental Design: Proton nuclear magnetic resonance (NMR) spectroscopy of 590 EBC samples (319 with relapse or ≥6 years clinical follow-up) and 109 metastatic breast cancer (MBC) samples was performed. A Random Forest (RF) classification model was built using a training set of 85 EBC and all MBC samples. The model was then applied to a test set of 234 EBC samples, and a risk of recurrence score was generated on the basis of the likelihood of the sample being misclassified as metastatic.
Results: In the training set, the RF model separated EBC from MBC with a discrimination accuracy of 84.9%. In the test set, the RF recurrence risk score correlated with relapse, with an AUC of 0.747 in ROC analysis. Accuracy was maximized at 71.3% (sensitivity, 70.8%; specificity, 71.4%). The model performed independently of age, tumor size, grade, HER2 status and nodal status, and also of Adjuvant! Online risk of relapse score.
Conclusions: In a multicenter group of EBC patients, we developed a model based on preoperative serum metabolomic profiles that was prognostic for disease recurrence, independent of traditional clinicopathologic risk factors. Clin Cancer Res; 23(6); 1422-31. ©2017 AACR.


Personalized Medicine and Imaging
Christopher D. Hart¹, Alessia Vignoli², Leonardo Tenori²,³, Gemma Leonora Uy⁴, Ta Van To⁵, Clement Adebamowo⁶, Syed Mozammel Hossain⁷, Laura Biganzoli¹, Emanuela Risi¹, Richard R. Love⁸, Claudio Luchinat²,⁹, and Angelo Di Leo¹
Introduction
In the treatment of early breast cancer (EBC), risk stratification based on prognostic features is critical to decisions about the appropriate adjuvant strategy, in particular whether or not chemotherapy is warranted. Molecular profiling of the primary tumor has improved on traditional clinicopathologic risk stratification, yet still a significant proportion of "high risk" patients do not relapse and may receive chemotherapy unnecessarily (1–3). In addition to focusing on the characteristics of the primary cancer, an improved method to detect the actual presence of micrometastatic disease would help to identify those who might benefit from adjuvant therapies and those who may not.
Metabolomics is the study of metabolites (small molecules) in blood, tissue, or other biological samples, where the presence and relative concentrations of these molecules can be used as evidence of cellular processes and functions. Given that cancer cells can have significantly altered metabolism, the pattern of metabolites produced can yield a "signature" that may indicate the cancer's presence or behavior (4). Importantly, and in contrast to gene expression profiling as a risk stratifier, this is a signal that originates directly or indirectly from micrometastatic disease, rather than one derived from features of the primary tumor. Furthermore, the surrounding stroma and immune response may also contribute to an altered metabolomic profile, thus offering combined information on residual tumor and host response. A major challenge in metabolomics is detecting this signature against the dynamic sea of metabolic data from normal cellular function.
Several groups including our own have identified a metastatic "signature" in patients with advanced breast cancer, using nuclear magnetic resonance (NMR) spectra or mass spectrometry to analyze the metabolites in biological samples, primarily serum (5–7). We compared the NMR spectra of serum from a group of EBC patients and a group of metastatic breast cancer (MBC) patients and identified a metastatic signature that could differentiate the two groups (5).
¹"Sandro Pitigliani" Medical Oncology Department, Hospital of Prato, Istituto Toscano Tumori, Prato, Italy.
²Magnetic Resonance Center (CERM), University of Florence, Sesto Fiorentino, Italy.
³FiorGen Foundation, Sesto Fiorentino, Italy.
⁴Philippine General Hospital, Manila, Philippines.
⁵Hospital K, Hanoi, Vietnam.
⁶University College Hospital, Ibadan, Nigeria.
⁷Khulna Medical College and Hospital, Khulna, Bangladesh.
⁸The International Breast Cancer Research Foundation, Madison, Wisconsin.
⁹Department of Chemistry, University of Florence, Sesto Fiorentino, Italy.
Note: Supplementary data for this article are available at Clinical Cancer
Research Online (http://clincancerres.aacrjournals.org/).
C.D. Hart and A. Vignoli contributed equally to this article.
R.R. Love, C. Luchinat, and A. Di Leo share senior authorship.
Corresponding Author: Angelo Di Leo, Hospital of Prato, Via Suor Niccolina 20,
Prato 59100, Italy. Phone: 3905-7480-2520; Fax: 39 05-7480-2903; E-mail:
angelo.dileo@uslcentro.toscana.it
doi: 10.1158/1078-0432.CCR-16-1153
©2017 American Association for Cancer Research.
Clinical Cancer Research
Clin Cancer Res; 23(6) March 15, 2017
Downloaded from http://aacrjournals.org/clincancerres/article-pdf/23/6/1422/2300688/1422.pdf by guest on 26 August 2022

From there, we hypothesized that EBC patients with micrometastatic disease may also have features of the metastatic signature in their metabolomic profile, whereas those with no micrometastatic disease would not, and that this signature would predict for relapse. This hypothesis was tested in a follow-up study using serum from a biobank of estrogen receptor–negative (ER−) patients from the Memorial Sloan Kettering Cancer Center (New York, NY), for whom clinical outcome (relapse at 5 years) was known (8). A model was built in which EBC patients were assigned a metabolomic risk score [Random Forest (RF) risk score], which was a function of the likelihood that they would be misclassified as metastatic based on their serum NMR spectra. Again, we were able to demonstrate that EBC and MBC profiles differed, but importantly, we also demonstrated that the RF risk score could predict relapse, independent of traditional clinicopathologic risk factors, in this single-center group of ER− EBC women.
In this current study, we aimed to test the RF risk score again as a predictor of relapse in a large group of premenopausal EBC patients with ER-positive (ER+) disease taking part in a multicenter adjuvant trial.
Patients and Methods
This retrospective study was a collaborative project among the International Breast Cancer Research Foundation, the University of Florence Magnetic Resonance Centre (Florence, Italy), and the Sandro Pitigliani Medical Oncology Department, Hospital of Prato (Prato, Italy). The study protocol received ethics approval from the ethics committee of the Hospital of Prato.
Patient selection
Serum samples for analysis were obtained from a bank of blood samples that had been collected during a phase III adjuvant breast cancer clinical trial (NCT00201851; ref. 9) and a parallel phase III MBC clinical trial (NCT00293540; ref. 10) conducted at centers across South East Asia. Both trials were run by the International Breast Cancer Research Foundation.
In the adjuvant trial, 740 premenopausal women with stage II–IIIB hormone receptor (HR)–positive breast cancer received surgical oophorectomy at the time of breast cancer surgery (mastectomy), followed by tamoxifen for 5 years, to investigate the hypothesis that surgery performed during the luteal phase of the menstrual cycle would be associated with better outcomes. At the time of enrollment, 231 patients were estimated to be in the luteal phase and were scheduled for immediate surgery; 509 patients were estimated not to be in the luteal phase and were randomized to receive either immediate surgery or surgery scheduled to occur in the predicted mid-luteal phase (9). Blood samples were collected preoperatively in fasted patients on the day of surgery. Frozen sera were initially stored at local sites and then shipped frozen to the United States. Subsequently, specimens were shipped still frozen to Italy. No patients were recorded as diabetic. The trial was designed to follow patients for recurrence for at least 6 years, and deidentified clinical outcome data were made available for the purposes of this study. The study was approved at individual participating institutions in the Philippines, Vietnam, and Morocco and/or by supervising Institutional Review Boards for these institutions and at the lead investigator's American institutions. The consent processes addressed the use of samples for future research studies.
In the metastatic trial, premenopausal patients with ER+ MBC were randomized to undergo oophorectomy surgery as palliative endocrine therapy in either the follicular or the luteal phase of the menstrual cycle, followed by tamoxifen (10). Blood samples were collected preoperatively from fasted patients on the day of surgery. Frozen sera were initially stored at local sites and then shipped frozen to the United States. Subsequently, specimens were shipped still frozen to Italy. Diabetic status of patients was not recorded.
NMR sample preparation
Frozen serum samples were thawed at room temperature and
shaken before use and then were prepared according to standard
operating procedures (11).
A total of 300 µL of sodium phosphate buffer (70 mmol/L Na₂HPO₄; 20% (v/v) ²H₂O; 0.025% (v/v) NaN₃; 0.8% (w/v) sodium trimethylsilyl [2,2,3,3-²H₄]propionate; pH 7.4) was added to 300 µL of each serum sample, and the mixture was homogenized by vortexing for 30 seconds. A total of 450 µL of this mixture was transferred into a 4.25-mm NMR tube (Bruker BioSpin srl) for the analysis.
NMR analysis
Monodimensional ¹H NMR spectra for all samples were acquired using a Bruker 600 MHz spectrometer (Bruker BioSpin) operating at 600.13 MHz proton Larmor frequency and equipped with a 5-mm CPTCI ¹H-¹³C-³¹P and ²H-decoupling cryoprobe, including a z-axis gradient coil, an automatic tuning-matching, and an automatic sample changer. A BTO 2000 thermocouple served for temperature stabilization at the level of approximately 0.1 K at the sample. Before measurement, samples were kept for at least 3 minutes inside the NMR probehead for temperature equilibration (310 K for serum samples).
According to standard practice (12, 13), three monodimensional ¹H NMR spectra with different pulse sequences were acquired for each serum sample, allowing the selective detection of different molecular components:
Translational Relevance
Adjuvant chemotherapy in early breast cancer improves survival by targeting micrometastatic disease. Because of difficulties in detecting such a disease in patients, there is a tendency to overtreat, meaning that many patients receive chemotherapy unnecessarily, with substantial morbidity. We hypothesize that the combined altered cellular behavior of micrometastatic disease, supporting stroma and host response, results in a unique, detectable pattern of metabolites (metabolomic profile) similar to that seen in advanced disease and that it correlates with relapse. Here, using serum taken from premenopausal women enrolled in two phase III trials, and using nuclear magnetic resonance spectroscopy, we show that patients with metabolomic profiles more resembling the metastatic profile have a higher rate of relapse. Metabolomics thus has the potential to identify patients with micrometastatic disease, improve risk stratification, and reduce overprescription of chemotherapy.

(i) a standard nuclear Overhauser effect spectroscopy pulse sequence NOESY 1Dpresat (noesygppr1d.comp; Bruker BioSpin) using 64 scans, 98,304 data points, a spectral width of 18,028 Hz, an acquisition time of 2.7 seconds, a relaxation delay of 4 seconds, and a mixing time of 0.1 second was applied to obtain a spectrum in which both signals of metabolites and high molecular weight macromolecules (lipids and lipoproteins) are visible.
(ii) a standard spin echo Carr–Purcell–Meiboom–Gill (CPMG; ref. 14; cpmgpr1d.comp; Bruker BioSpin) pulse sequence with 64 scans, 73,728 data points, a spectral width of 12,019 Hz, and a relaxation delay of 4 seconds was used for the selective observation of low molecular weight metabolites, suppressing signals arising from macromolecules.
(iii) a standard diffusion-edited (ledbgppr2s1d.comp; Bruker BioSpin; ref. 15) pulse sequence, using 64 scans, 98,304 data points, a spectral width of 18,028 Hz, and a relaxation delay of 4 seconds was applied to suppress metabolite signals.
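As a consistency check on the acquisition parameters listed above, the acquisition time can be recomputed from the number of data points and the spectral width. This minimal sketch assumes the common convention in which the point count includes both real and imaginary points, so that acquisition time = data points / (2 × spectral width); the function name is illustrative and not taken from the study.

```python
def acquisition_time_s(data_points: int, spectral_width_hz: float) -> float:
    """Acquisition time in seconds, assuming AQ = TD / (2 * SW)."""
    return data_points / (2.0 * spectral_width_hz)


# Parameters as listed for the NOESY 1Dpresat (and diffusion-edited) spectra.
noesy_aq = acquisition_time_s(98_304, 18_028)
# Parameters as listed for the CPMG spectra.
cpmg_aq = acquisition_time_s(73_728, 12_019)

print(f"NOESY: {noesy_aq:.2f} s, CPMG: {cpmg_aq:.2f} s")
# prints "NOESY: 2.73 s, CPMG: 3.07 s"
```

Under this assumption the NOESY value agrees with the stated acquisition time of 2.7 seconds. The CPMG figure is derived here only; the text does not report it.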
Spectral processing
Free induction decays were multiplied by an exponential func-
tion equivalent to a 1.0-Hz line-broadening factor before applying
Fourier transformation. Transformed spectra were automatically
corrected for phase and baseline distortions and calibrated
(anomeric glucose doublet at 5.24 ppm) using TopSpin 3.2
(Bruker Biospin srl). Each 1D spectrum in the range 0.2 to
10.00 ppm was segmented into 0.02-ppm chemical shift bins,
and the corresponding spectral areas were integrated using AMIX
software (version 3.8.4, Bruker BioSpin). Binning is a means to
reduce the number of total variables and to compensate for small
shifts in the signals, making the analysis more robust and repro-
ducible (16, 17). Regions between 4.5 and 6.5 ppm containing
residual water signal were removed, and the dimension of the
system was reduced to 391 bins. The total spectral area was
calculated on the remaining bins, and total area normalization
was carried out on the data prior to pattern recognition.
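The binning and total area normalization described above can be sketched as follows. This is an illustration under stated assumptions, not the AMIX procedure: bin areas are approximated by summing intensities on a uniform grid, a bin is dropped when its center falls inside the excluded region, and the spectrum is synthetic. All function and variable names are hypothetical.

```python
import random


def bin_and_normalize(ppm, intensity, lo=0.2, hi=10.0, width=0.02,
                      exclude=(4.5, 6.5)):
    """Sum intensities into fixed-width bins, drop the excluded region,
    and scale the remaining bins so that they sum to 1."""
    n_bins = round((hi - lo) / width)
    bins = [0.0] * n_bins
    for x, y in zip(ppm, intensity):
        if lo <= x < hi:
            k = min(int((x - lo) / width), n_bins - 1)
            bins[k] += y
    centers = [lo + (k + 0.5) * width for k in range(n_bins)]
    kept = [b for b, c in zip(bins, centers)
            if not (exclude[0] <= c <= exclude[1])]
    total = sum(kept)
    return [b / total for b in kept]


# Synthetic spectrum on a uniform grid (illustrative data only).
random.seed(0)
ppm = [0.2 + i * 0.001 for i in range(9800)]
intensity = [random.random() for _ in ppm]
profile = bin_and_normalize(ppm, intensity)
print(len(profile), round(sum(profile), 6))
```

With these edge conventions the sketch keeps 390 bins, one fewer than the 391 reported, so the original processing evidently treated the region boundaries slightly differently.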
Statistical analysis
Statistical analyses were planned prior to specimen retrieval, based on those performed in the previous study, including minimum number of samples required (8). All data analyses were performed using R (18). Principal component analysis (PCA) was used first as an unsupervised exploratory analysis to assess the presence of any clusters or outliers.
To confirm that serum metabolomic profiles can be used to distinguish patients with MBC from those with early disease, an RF classifier (19) was built to separate early and metastatic patients. For the initial model, the group of EBC patients who had relapsed or had minimum 5 years clinical follow-up was randomly split into two groups, to form a training set and a validation set, as in the previous study (8). Briefly, the RF classifier uses data from the metastatic and training set to build an ensemble of decision trees, where each tree contains a random sample of the original data, with only a small number of variables (bins) at each decision node, used to predict whether a sample is early or metastatic. For early patients, a score was created that expresses the extent to which the serum metabolomic profile appears to be metastatic, designated as the "RF risk score." For each patient, three "RF risk scores" were derived using the three types of spectra (NOESY1D, CPMG, and diffusion-edited spectra). For all calculations, the R package "randomForest" (20) was used to grow a forest of 1,000 trees, using the default settings.
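The idea of a risk score defined as the extent to which an ensemble classifies an early sample as metastatic can be illustrated with a toy example. The sketch below is not the randomForest package and not the study's model: it swaps in a small bagged ensemble of one-split trees (stumps), each trained on a bootstrap sample with a random subset of features, and reports the fraction of trees voting "metastatic". The data are synthetic and every name is hypothetical.

```python
import random


def train_stump(X, y, features):
    """Pick the (feature, threshold, sign) among `features` with best accuracy."""
    best = None
    for j in features:
        for row in X:
            t = row[j]
            for sign in (1, -1):
                pred = [1 if sign * (r[j] - t) > 0 else 0 for r in X]
                acc = sum(p == lab for p, lab in zip(pred, y)) / len(y)
                if best is None or acc > best[0]:
                    best = (acc, j, t, sign)
    return best[1:]


def grow_forest(X, y, n_trees=100, n_features=2, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]           # bootstrap sample
        feats = rng.sample(range(len(X[0])), n_features)   # random feature subset
        forest.append(train_stump([X[i] for i in idx],
                                  [y[i] for i in idx], feats))
    return forest


def risk_score(forest, row):
    """Fraction of trees that vote 'metastatic' (label 1) for this sample."""
    votes = sum(1 if sign * (row[j] - t) > 0 else 0 for j, t, sign in forest)
    return votes / len(forest)


# Synthetic "bins": early samples centered at 0, metastatic samples at 1.5.
rng = random.Random(1)
early = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(20)]
metastatic = [[rng.gauss(1.5, 1.0) for _ in range(4)] for _ in range(20)]
forest = grow_forest(early + metastatic, [0] * 20 + [1] * 20)

print(risk_score(forest, [0.0] * 4))   # an "early-like" profile
print(risk_score(forest, [1.5] * 4))   # a "metastatic-like" profile
```

A real random forest grows full trees and samples candidate variables at every node, so this toy only conveys how a vote fraction between 0 and 1 arises.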
The next step was to test the hypothesis that a metastatic metabolomic signature in early disease would be predictive of relapse and that higher RF risk scores would correlate with higher risk of developing a relapse. Using ROC analysis, the performance of the RF risk score was compared with actual breast cancer outcome. A prognostic model was created using the CPMG RF risk score, which had the best performance in the training set. To delineate high risk of relapse, a cutoff for the RF risk score was calculated in the training set that optimized accuracy, sensitivity, and specificity, and the performance of the model was subsequently tested in the validation set.
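The ROC analysis and cutoff selection can be sketched as follows. The example assumes that the cutoff is chosen by maximizing overall accuracy alone, which is a simplification of the stated criterion (accuracy, sensitivity, and specificity together), and it uses a small invented set of scores rather than study data. Function names are illustrative.

```python
def roc_auc(scores, relapsed):
    """AUC as the probability that a relapsed patient outscores a
    non-relapsed one (ties count one half)."""
    pos = [s for s, r in zip(scores, relapsed) if r]
    neg = [s for s, r in zip(scores, relapsed) if not r]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_cutoff(scores, relapsed):
    """Return (cutoff, accuracy, sensitivity, specificity) maximizing accuracy,
    calling a patient high risk when score >= cutoff."""
    n_pos = sum(relapsed)
    n_neg = len(relapsed) - n_pos
    best = None
    for c in sorted(set(scores)):
        tp = sum(1 for s, r in zip(scores, relapsed) if r and s >= c)
        tn = sum(1 for s, r in zip(scores, relapsed) if not r and s < c)
        acc = (tp + tn) / len(scores)
        if best is None or acc > best[1]:
            best = (c, acc, tp / n_pos, tn / n_neg)
    return best


# Invented risk scores and outcomes (1 = relapsed), for illustration only.
scores = [0.05, 0.10, 0.15, 0.20, 0.22, 0.30, 0.35, 0.40]
relapsed = [0, 0, 1, 0, 0, 1, 1, 1]

auc = roc_auc(scores, relapsed)
cutoff, accuracy, sensitivity, specificity = best_cutoff(scores, relapsed)
print(auc, cutoff, accuracy, sensitivity, specificity)
# prints "0.875 0.3 0.875 0.75 1.0"
```

In practice the cutoff would be fixed on the training set and then applied unchanged to the validation set, as the text describes.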
Multivariate analysis of the impact of provenance of the sample was achieved using unsupervised PCA of the spectra. When this impact was found to be significant, the model for relapse prediction was redesigned:
(i) We hypothesized that samples from different clinical sites had been collected or stored following different operating procedures (e.g., longer periods from collection to sera separation and freezing, or different freezing temperatures), and that this may be reflected in the metabolomic spectra. As reported in the literature (11), lactate (coupled with pyruvate and glucose) is the most sensitive marker for sample degradation. To overcome this influence, we removed the bins related to lactate from the data matrix.
(ii) The nonrelapsing patients included in the analysis were restricted to those with a minimum follow-up of 6 years, as HR+ breast cancer has a relatively steady relapse rate for at least 10 years.
(iii) Finally, we chose to include in the training set only women who had not developed a recurrence, to reduce the likelihood of confounding factors due to the presence of patients with micrometastases in the model. Thus, ROC analysis could only be carried out on the subsequent test set of relapsed and nonrelapsed patients.
Assessment of confounding factors (e.g., age, tumor size, nodal status, etc.) within the spectra was performed by using the multivariate RF classifier analysis to determine whether spectra could be predictive of each factor. The independent prognostic capacity of the redesigned RF risk score model was evaluated in a multivariate analysis controlling for standard prognostic features, which also included an Adjuvant! Online (AoL) risk of relapse score. The AoL score was calculated for 10-year risk of relapse assuming no adjuvant therapy and was used as a surrogate combined clinicopathologic risk.
For the analysis of individual metabolites, the spectral regions related to 22 metabolites were assigned in the ¹H CPMG NMR profiles by using matching routines of AMIX 3.8.4 (Bruker BioSpin) in combination with the BBIOREFCODE (Bruker BioSpin) and the Human Metabolome Database (21). The spectral regions were fitted and integrated to obtain the concentration in arbitrary units, and these data were used to compare metabolite concentrations between EBC and MBC patients. Wilcoxon signed-rank test (22) was chosen to perform the analysis on the biological asymptotic assumption that the metabolite concentrations are not normally distributed, and FDR correction was applied using the Benjamini–Hochberg method (23). P < 0.05 was deemed significant. Because of the method used to generate spectra, NMR profiles could not be used to measure individual lipid concentrations, nor metabolites in very small concentrations, such as acylcarnitines.
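The Benjamini–Hochberg adjustment mentioned above can be sketched in a few lines. The raw P values below are invented for illustration and are not results from the study; the function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


raw = [0.001, 0.008, 0.039, 0.041, 0.60]   # illustrative values only
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]
print([round(p, 5) for p in adj], significant)
```

In this invented example the third and fourth tests have raw P values below 0.05 but adjusted values of 0.05125, so only the first two would be called significant after correction.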

Results
Patients
Serum samples from 675 women with EBC and 125 with MBC were received. Of these, 101 samples were deemed nonevaluable for technical reasons (plasma instead of serum, inadequate amount of serum, hemolysis, and insufficient clinical information), leaving 590 EBC and 109 MBC samples suitable for NMR spectroscopy to build metabolomic profiles. Baseline characteristics are reported in Table 1.
Provenance of samples
EBC samples came from 5 centers in the Philippines and 2 centers in Vietnam; MBC samples came from 5 centers in the Philippines, 3 centers in Bangladesh, and one in Nigeria (Table 2). Notably, no MBC samples came from Vietnam, and only 24 came from Philippine General Hospital in Manila, yet these centers contributed the majority of EBC samples, representing significant imbalance.
Discrimination between EBC and MBC patients
Using the RF classifier for supervised analysis, the metabolomic profiles of 590 EBC and 109 MBC patients were classified, and show significant differential clustering, with near-complete separation of the two groups (Fig. 1). Clustering was achieved by the CPMG, NOESY1D, and diffusion spectra.
As in the previous studies (5, 8), the clustering provided by the CPMG spectra shows the highest accuracy for predicting early or metastatic status, with accuracy of 90.3% [95% confidence interval (CI), 90.2%–90.4%], compared with 86.8% (95% CI, 86.7%–86.8%) for NOESY1D, and 84.4% (95% CI, 84.3%–84.5%) for diffusion edited. Only results for CPMG spectra will be reported from here on.
Relapse prediction by RF score
A metabolomic RF risk score for each EBC sample was generated on the basis of the probability that the NMR spectrum would be classified as metastatic. The initial model was built using the same parameters as in the previous study, using CPMG spectra and only including EBC samples from patients who either relapsed or were relapse free with a minimum of 5 years clinical follow-up data (total 443). The training set consisted of 68 relapsed and 41 nonrelapsed EBC patients chosen at random and all 109 metastatic patients. The validation set consisted of the remaining 124 relapsed and 210 nonrelapsed EBC patients. The AUC obtained for the training set was 0.644, and the accuracy of the RF risk score was maximized using a threshold of 0.18, which yielded sensitivity of 61.3% (95% CI, 60.3%–62.2%), specificity of 61.0% (95% CI, 60.6%–61.3%), and overall accuracy for predicting likelihood of relapse of 61.1% (95% CI, 60.6%–61.6%; Supplementary Fig. S1A). The model was then applied to the validation set, using the RF risk score threshold of 0.18, achieving a sensitivity, specificity, and predictive accuracy of 71.7%, 46.7%, and 62.4%, respectively, and an AUC of 0.631 (Supplementary Fig. S1B).
In view of the low AUC results, investigation of the effect of
provenance (collection center) and length of follow-up was
carried out.
Exploratory unsupervised PCA of the CPMG spectra showed marked differentiation among the different centers of collection (Supplementary Fig. S2A), with the spectral region of lactate resulting in the most relevant discrimination in the first two principal components. Lactate concentrations, calculated in arbitrary units from the spectra, differed significantly between EBC and MBC patients (Table 3), demonstrating the key role of lactate in both discrimination of EBC and MBC and in the identification of treatment centers. This finding was consistent with our hypothesis regarding differences in storage and handling between treatment centers in our samples.
Relapse prediction by RF score: optimized model
To overcome the influence of lactate, we removed the bins related to this metabolite from the data matrix. The PCA score plot (Supplementary Fig. S2B) calculated using this reduced data matrix shows greatly reduced dispersion of the data points. This observation is confirmed by calculating the generalized variance (24) of the first three PCA components. This value (calculated as the determinant of the covariance matrix) represents the volume of the ellipsoid containing the data. Using the complete data matrix, we obtain a generalized variance of 16.8, whereas for the reduced data matrix, the generalized variance is 11.8, illustrating that removal of the bins corresponding to lactate indeed reduced spreading of the data, thus reducing the location effect.
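The generalized variance used above, that is, the determinant of the covariance matrix of the first three principal component scores, can be computed as in the sketch below. The score vectors are invented to show that a tighter cloud gives a smaller value; they are not the study's PCA scores, and the sample covariance with an n − 1 denominator is an assumption.

```python
def covariance_matrix(rows):
    """Sample covariance matrix (n - 1 denominator) of a list of observations."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[sum((r[a] - means[a]) * (r[b] - means[b]) for r in rows) / (n - 1)
             for b in range(d)] for a in range(d)]


def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def generalized_variance(scores):
    """Determinant of the covariance matrix of three-component scores."""
    return det3(covariance_matrix(scores))


# Invented three-component scores: a wide cloud and the same cloud shrunk by half.
wide = [[2, 0, 0], [-2, 0, 0], [0, 2, 0], [0, -2, 0], [0, 0, 2], [0, 0, -2]]
tight = [[v / 2 for v in row] for row in wide]

gv_wide = generalized_variance(wide)
gv_tight = generalized_variance(tight)
print(round(gv_wide, 3), round(gv_tight, 3))
# prints "4.096 0.064"
```

Halving every coordinate divides the generalized variance by 2⁶ = 64, because each of the three variances shrinks by a factor of four.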
The EBC cohort was restricted to those with relapse or minimum 6 years follow-up, which reduced the sample size to 319. In this new model, the training set consisted of 85 early patients without relapse (randomly selected) and all 109 metastatic patients. The test set contained 192 early patients that suffered relapse and the remaining 42 relapse-free early patients.
Using the CPMG NMR spectra, the RF classifier discriminated EBC from MBC patients in the training set with sensitivity, specificity, and predictive accuracy of 90.0% (95% CI, 89.7%–90.3%), 84.9% (95% CI, 84.7%–85.1%), and 87.1% (95% CI, 86.9%–87.3%), respectively (Fig. 2A). This new model was then applied to the test set to assess ability to predict relapse, attaining an AUC of 0.747. The accuracy of the RF risk score was maximized using a threshold of 0.235, which yielded sensitivity of 70.8%, specificity of 71.4%, and overall accuracy for predicting likelihood of relapse of 71.3% (Fig. 2B). AUC scores for NOESY1D and diffusion-editing spectra were inferior, at AUC 0.706 and 0.617, respectively.
The AUC score calculated on the RF score was assessed for significance against the null hypothesis of no prediction accuracy in the data, by means of 10,000 randomized class permutation tests. The estimated AUC score obtained after randomization is 0.531 (95% CI, 0.530–0.531), demonstrating the significance of our result (AUC, 0.747; P = 1.63 × 10⁻²⁰) despite the problems encountered.
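A class permutation test of this kind can be sketched as follows: the outcome labels are shuffled many times, the AUC is recomputed for each shuffle, and the observed AUC is compared with the resulting null distribution. The scores are invented, and the empirical P value with a +1 correction is an assumption of this sketch; the text does not state how its P value was derived.

```python
import random


def roc_auc(scores, labels):
    """AUC as the probability that a positive outscores a negative
    (ties count one half)."""
    pos = [s for s, r in zip(scores, labels) if r]
    neg = [s for s, r in zip(scores, labels) if not r]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_test_auc(scores, labels, n_perm=10_000, seed=0):
    """Return (observed AUC, mean AUC under shuffled labels, empirical P)."""
    rng = random.Random(seed)
    observed = roc_auc(scores, labels)
    shuffled = list(labels)
    null_aucs = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null_aucs.append(roc_auc(scores, shuffled))
    exceed = sum(a >= observed for a in null_aucs)
    p_value = (exceed + 1) / (n_perm + 1)
    return observed, sum(null_aucs) / n_perm, p_value


# Invented integer scores (arbitrary units): relapsed patients score higher
# on average, with substantial overlap.
pos_scores = [20 + 2 * i for i in range(15)]
neg_scores = [10 + 2 * i for i in range(15)]
scores = pos_scores + neg_scores
labels = [1] * 15 + [0] * 15

observed, null_mean, p_value = permutation_test_auc(scores, labels)
print(round(observed, 3), round(null_mean, 3), p_value)
```

The mean AUC under shuffled labels sits near 0.5, which is the reference against which the observed value is judged.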
Comparison with known prognostic factors
The known prognostic factors age, tumor size (0–2 cm, 2.1–5 cm, >5 cm), nodal status (0, 1–3, >3), histologic grade, and HER2 overexpression were compared with the CPMG RF risk score, calculated on the optimized set, in univariate and multivariate regression analyses (Table 4). We also compared the RF risk score with the 10-year risk of recurrence as calculated by AoL in a separate multivariate analysis. In all cases, the RF risk score maintained independent prognostic value.
Similarly, using RF classification to predict individual prognostic features based on the CPMG NMR spectra, none of these features could be meaningfully discriminated (Supplementary Fig. S3). Only the tumor size showed a weak concordance with the CPMG RF risk score (coefficient of correlation = 0.18; P value corrected with Bonferroni = 0.02).
Metabolite analysis
NMR spectra were analyzed to identify which metabolites were contributing to discrimination of MBC and EBC profiles. In the combined multicenter populations (Table 3), compared with EBC patients, patients with MBC are characterized by higher serum levels (adjusted P < 0.05) of citrate, choline, acetate, formate, lactate, glutamate, 3-hydroxybutyrate, phenylalanine, glycine, leucine, alanine, proline, tyrosine, isoleucine, creatine, creatinine, and methionine and lower serum levels (adjusted P < 0.05) of glucose and glutamine. In single-center analysis (Supplementary Table S1), citrate, formate,
Table 1. Patients and tumor characteristics for EBC and MBC cohorts, including populations restricted to include only relapsed patients or those with clinical follow-up greater than 5 or 6 years

| Characteristic | EBC all | EBC relapsed or follow-up ≥5 years | EBC relapsed or follow-up ≥6 years | MBC |
|---|---|---|---|---|
| Number | 590 | 443 | 319 | 109 |
| Age, mean (range) | 42 (29–50) | 42 (29–50) | 42 (29–50) | 39 (22–53) |
| Tumor size, n (%) | | | | |
| <2 cm | 35 (5.9) | 23 (5.2) | 11 (3.5) | |
| 2–5 cm | 396 (67.1) | 285 (64.3) | 203 (63.6) | |
| >5 cm | 159 (27) | 135 (30.5) | 105 (32.9) | |
| Grade, n (%) | | | | |
| I | 74 (13) | 63 (14) | 46 (14) | |
| II | 300 (51) | 224 (51) | 162 (51) | |
| III | 115 (19) | 89 (20) | 73 (23) | |
| Unknown | 101 (17) | 67 (15) | 38 (12) | |
| Lymph node status, n (%) | | | | |
| 0 | 248 (42) | 166 (37.5) | 106 (33) | |
| 1–3 | 157 (27) | 121 (27.5) | 83 (26) | |
| >3 | 185 (31) | 156 (35) | 130 (41) | |
| HER2, n (%) | | | | |
| Positive | 108 (18) | 90 (20.5) | 76 (24) | |
| Negative | 388 (66) | 298 (67) | 210 (66) | |
| Unknown | 94 (16) | 55 (12.5) | 33 (10) | |
| ER, n (%) | | | | |
| Positive | 552 (93.6) | 410 (92.6) | 297 (93) | |
| Negative | 37 (6.3) | 32 (7.2) | 22 (7) | |
| Unknown | 1 (0.2) | 1 (0.2) | 0 (0) | |
| PR, n (%) | | | | |
| Positive | 545 (92.4) | 405 (91.4) | 291 (91) | |
| Negative | 44 (7.4) | 37 (8.4) | 28 (9) | |
| Unknown | 1 (0.2) | 1 (0.2) | 0 (0) | |
| Treatment arm, n (%) | | | | |
| A | 186 (31.5) | 142 (32.0) | 106 (33.2) | |
| B | 216 (36.6) | 158 (35.7) | 111 (34.8) | |
| C | 188 (31.9) | 143 (32.3) | 102 (32.0) | |
| Dominant metastatic site, n (%) | | | | |
| Soft tissue | | | | 79 (72.5) |
| Bone | | | | 17 (15.6) |
| Viscera | | | | 13 (11.9) |
| Prior systemic treatment, n (%) | | | | |
| No | | | | 69 (63.3) |
| Yes | | | | 40 (36.7) |

NOTE: Treatment arm A: not in luteal phase at the time of trial entry, randomized to luteal phase surgery; treatment arm B: not in luteal phase at the time of trial entry, randomized to immediate, non-luteal phase surgery; and treatment arm C: in luteal phase at the time of trial entry, immediate surgery in luteal phase.
Abbreviation: PR, progesterone receptor.
Table 2. Distribution of EBC and MBC samples by treatment center

| Country, city - center | Samples, n | EBC samples, n | MBC samples, n |
|---|---|---|---|
| Vietnam, Hanoi - Hospital K | 228 | 228 | |
| Vietnam, Danang - Danang General | 14 | 14 | |
| Philippines, Manila - PGH | 302 | 278 | 24 |
| Philippines, Cebu - Vicente Sotto Hospital | 39 | 26 | 13 |
| Philippines, Manila - Santo Tomas Hospital | 9 | 3 | 6 |
| Philippines, Manila - Rizal | 20 | 15 | 5 |
| Philippines, Manila - East Avenue | 29 | 26 | 3 |
| Nigeria, Ibadan - University College Hospital | 8 | | 8 |
| Bangladesh, Dhaka - Dhaka Medical College | 15 | | 15 |
| Bangladesh, Khulna - Khulna Medical College | 28 | | 28 |
| Bangladesh, Dhaka - BSMMU | 7 | | 7 |
| Total | 699 | 590 | 109 |

Abbreviations: BSMMU, Bangabandhu Sheikh Mujib Medical University; PGH, Philippine General Hospital.

References
18. R: A Language for Data Analysis and Graphics.
19. Random Forests.
20. Classification and Regression by randomForest.
22. Individual Comparisons by Ranking Methods.
23. Controlling the false discovery rate: a practical and powerful approach to multiple testing.