Table 2 Discrimination of the logistic regression and machine learning models with and without quantitative features

From: Predicting pathologic complete response to neoadjuvant chemotherapy in breast cancer using a machine learning approach

AUC (95% CI)

| Subtype | Logistic regression [a], basic model [b] | Logistic regression [a], quantitative model [c] | Machine learning, basic model [b] | Machine learning, quantitative model [c] |
|---|---|---|---|---|
| Overall [d] (n = 16730) | 0.739 (0.731–0.747) | 0.781 (0.774–0.788) | 0.746 (0.738–0.753) | 0.785 (0.778–0.792) |
| HR+/HER2- (n = 5481) | 0.749 (0.732–0.767) | 0.811 (0.795–0.827) | 0.756 (0.739–0.773) | 0.817 (0.802–0.832) |
| HR+/HER2+ (n = 4043) | 0.612 (0.595–0.630) | 0.744 (0.729–0.760) | 0.623 (0.605–0.640) | 0.751 (0.736–0.766) |
| HR-/HER2+ (n = 1787) | 0.558 (0.530–0.586) | 0.616 (0.588–0.644) | 0.603 (0.576–0.631) | 0.640 (0.613–0.668) |
| TNBC (n = 5419) | 0.649 (0.634–0.663) | 0.654 (0.639–0.668) | 0.647 (0.632–0.662) | 0.654 (0.639–0.669) |

  1. Abbreviations: AUC, area under the receiver operating characteristic curve; CI, confidence interval; HR, hormone receptor; HER2, human epidermal growth factor receptor 2; TNBC, triple-negative breast cancer
  2. [a] Details of the logistic regression models can be found in eTable 13
  3. [b] The basic models included the basic features (i.e., age at diagnosis, clinical T and N stages, histology types, tumor grades and comorbidity index)
  4. [c] The quantitative models included both the basic features (i.e., age at diagnosis, clinical T and N stages, histology types, tumor grades and comorbidity index) and the quantitative features (i.e., ER%, PR%, HER2 IHC categories, HER2/CEP17 ratios and Ki-67 scores)
  5. [d] The AUC of each model was estimated in the 30% hold-out validation set, both overall and within each breast cancer subtype. The 95% CIs of the AUCs were calculated using the ‘pROC’ package in R
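
For readers who want to see how an AUC with a 95% CI of the kind reported in this table can be obtained on a hold-out set, the following is a minimal Python sketch. It is not the authors' code: the table's CIs came from the ‘pROC’ package in R, and the footnote does not say which of that package's CI methods was used. The sketch swaps in a percentile bootstrap, and the data are synthetic. The function names `auc` and `bootstrap_auc_ci`, the sample size, and the number of resamples are all assumptions made for illustration.

```python
import random


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied pairs count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=300, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (an assumed method, chosen
    for simplicity; not necessarily what the table's CIs are based on)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        # Resample cases with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that contain only one class.
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Synthetic hold-out set: about 30% positives, with scores shifted
# upward for positive cases. These are NOT data from the study.
rng = random.Random(42)
labels = [1 if rng.random() < 0.3 else 0 for _ in range(150)]
scores = [rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]

demo_auc = auc(labels, scores)
ci_lo, ci_hi = bootstrap_auc_ci(labels, scores, n_boot=300, seed=1)
print(f"AUC {demo_auc:.3f} ({ci_lo:.3f}–{ci_hi:.3f})")
```

The pairwise AUC computation is quadratic in the number of cases, which is fine for a small illustration but would be slow on a validation set of the size used in the table; a rank-based formulation would be the usual choice at that scale.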