Machine learning-based models for the prediction of breast cancer recurrence risk

© 2023. The Author(s).

Breast cancer is the most common malignancy diagnosed in women worldwide. The prevalence and incidence of breast cancer are increasing every year; therefore, early diagnosis together with suitable relapse detection is an important strategy for improving prognosis. This study aimed to compare different machine learning algorithms and select the best model for predicting breast cancer recurrence. The prediction model was developed using eleven machine learning (ML) algorithms: logistic regression (LR), random forest (RF), support vector classification (SVC), extreme gradient boosting (XGBoost), gradient boosting decision tree (GBDT), decision tree, multilayer perceptron (MLP), linear discriminant analysis (LDA), adaptive boosting (AdaBoost), Gaussian naive Bayes (GaussianNB), and light gradient boosting machine (LightGBM). The area under the curve (AUC), accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and F1 score were used to evaluate the performance of the prognostic model. Based on performance, the optimal ML algorithm was selected, and feature importance was ranked by SHapley Additive exPlanations (SHAP) values. Compared with the other 10 algorithms, the AdaBoost algorithm had the best performance in predicting breast cancer recurrence and was adopted to establish the prediction model. Moreover, CA125, CEA, Fbg, and tumor diameter were found to be the most important features in our dataset for predicting breast cancer recurrence. More importantly, our study is the first to use the SHAP method to make an AdaBoost-based breast cancer recurrence prediction model more interpretable for clinicians. The AdaBoost algorithm offers a clinical decision support model and successfully identifies the recurrence of breast cancer.
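The evaluation metrics named in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made for illustration only, and no data from the study is reproduced.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]):
    """Return (tp, fp, tn, fn) for binary labels, where 1 = recurrence."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def safe_div(num: float, den: float) -> float:
    """Divide, returning NaN when the denominator is zero."""
    return num / den if den else float("nan")


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity, PPV, NPV and F1 from hard predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "sensitivity": safe_div(tp, tp + fn),   # true positive rate (recall)
        "specificity": safe_div(tn, tn + fp),   # true negative rate
        "ppv": safe_div(tp, tp + fp),           # precision
        "npv": safe_div(tn, tn + fn),
        "f1": safe_div(2 * tp, 2 * tp + fp + fn),
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (the Mann-Whitney formulation)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = binary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
print(f"auc: {auc:.4f}")
```

Comparing candidate algorithms then amounts to computing these values for each model's predictions on the same held-out cases. AUC is threshold-independent, while the other six metrics depend on the chosen cut-off.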

Media type:

E-article

Year of publication:

2023

Published:

2023

Contained in:

BMC medical informatics and decision making - 23(2023), 1, dated 29 Nov., page 276

Language:

English

Contributors:

Zuo, Duo [Author]
Yang, Lexin [Author]
Jin, Yu [Author]
Qi, Huan [Author]
Liu, Yahui [Author]
Ren, Li [Author]

Links:

Full text

Subjects:

Artificial intelligence
Breast cancer
Disease recurrence
Journal Article
Machine learning
Prediction model
Research Support, Non-U.S. Gov't

Notes:

Date Completed 01.12.2023

Date Revised 11.01.2024

published: Electronic

Citation Status MEDLINE

doi:

10.1186/s12911-023-02377-z

funding:

Funding institution / Project title:

PPN (catalog ID):

NLM365222119