Using Machine Learning to Leverage Biomarker Change and Predict Colorectal Cancer Recurrence
PURPOSE: The risk of colorectal cancer (CRC) recurrence after primary treatment varies across individuals and over time. Using patients' most up-to-date information, including carcinoembryonic antigen (CEA) biomarker profiles, to predict risk could improve personalized decision making.
METHODS: We used electronic health record data from an integrated health system on a cohort of patients diagnosed with American Joint Committee on Cancer stage I-III CRC between 2008 and 2013 (N = 3,970) and monitored until recurrence or end of follow-up. We addressed missingness in recurrence outcomes and longitudinal CEA measures, and engineered CEA features using current and past biomarker values for inclusion in a risk prediction model. We used a discrete time Superlearner model to evaluate various algorithms for predicting recurrence. We evaluated the time-varying discrimination and calibration of the algorithms and assessed the role of individual predictors.
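As a rough illustration of the feature engineering described above, the sketch below computes a change in log CEA over a 6-month window from one patient's irregularly timed measurements. The abstract does not specify the log base, how values were aligned to time points, or how missing values were handled, so the natural log, the last-observation-carried-forward rule, and the function names here are all assumptions made for illustration, not the authors' method.

```python
import math
from bisect import bisect_right


def latest_value_at(times, values, t):
    """Return the most recent value measured at or before time t, else None.

    `times` must be sorted in ascending order (months since diagnosis).
    """
    i = bisect_right(times, t)
    return values[i - 1] if i > 0 else None


def log_cea_change(times, values, t, window=6.0):
    """Change in log CEA between time t and time t - window (in months).

    Assumptions (not stated in the source): natural log, and
    last-observation-carried-forward to pick the value at each time.
    CEA values must be strictly positive.
    """
    now = latest_value_at(times, values, t)
    past = latest_value_at(times, values, t - window)
    if now is None or past is None:
        return None  # not enough history to form the feature
    return math.log(now) - math.log(past)


# Hypothetical measurements for one patient (months, CEA values).
times = [0.0, 3.0, 6.0, 12.0]
cea = [2.0, 2.0, 2.5, 4.0]
change = log_cea_change(times, cea, 12.0)  # log(4.0) - log(2.5), about 0.47
```

Returning `None` when no earlier measurement exists keeps the missing-history case explicit instead of silently imputing a value.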
RESULTS: Recurrence was documented in 448 (11.3%) patients. XGBoost with depth = 1 (XGB-D1) predicted recurrence substantially better than all other algorithms at all time points, with AUC ranging from 0.87 (95% CI, 0.86 to 0.88) at 6 months to 0.94 (95% CI, 0.92 to 0.96) at 54 months. The only variable used by XGB-D1 was 6-month change in log CEA. Predicted 1-year risk of recurrence was nearly zero for patients whose log CEA did not increase in the last 6 months, between 12.2% and 34.1% for patients whose log CEA increased between 0.10 and 0.40, and 43.6% for those with a log CEA increase >0.40. Compared with XGB, penalized regression approaches (lasso, ridge, and elastic net) performed poorly, with AUCs ranging from 0.58 to 0.69.
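The risk bands reported above can be written as a small lookup, shown here only to make the thresholds concrete. The function name is hypothetical, the treatment of the boundary values 0.10 and 0.40 as inclusive is an assumption, and the abstract gives no figure for increases between 0 and 0.10, so that range returns `None` rather than a guessed value. This is a restatement of the published summary, not the fitted XGB-D1 model.

```python
def reported_risk_band(delta_log_cea):
    """Map a 6-month change in log CEA to the 1-year recurrence risk
    described in the abstract, as a text label.

    Returns None where the abstract reports no figure.
    Boundary handling (inclusive 0.10 and 0.40) is an assumption.
    """
    if delta_log_cea <= 0:
        return "nearly zero"        # log CEA did not increase
    if delta_log_cea > 0.40:
        return "43.6%"              # increase greater than 0.40
    if delta_log_cea >= 0.10:
        return "12.2% to 34.1%"     # increase between 0.10 and 0.40
    return None                     # 0 < increase < 0.10: not reported


for delta in (-0.2, 0.05, 0.25, 0.5):
    print(delta, reported_risk_band(delta))
```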
CONCLUSION: A flexible, machine learning approach that incorporated longitudinal CEA information yielded a simple and high-performing model for predicting recurrence on the basis of 6-month change in log CEA.
Media type: E-article
Year of publication: 2023
Published: 2023
Contained in: JCO clinical cancer informatics, volume 7 (2023), 10 Sept., page e2300066
Language: English
Contributors: Rodriguez, Patricia J [author]
Notes: Date Completed 16.11.2023; Date Revised 29.11.2023; published: Print; Citation Status MEDLINE
DOI: 10.1200/CCI.23.00066
PPN (catalog ID): NLM364549327
LEADER | 01000naa a22002652 4500 | ||
---|---|---|---|
001 | NLM364549327 | ||
003 | DE-627 | ||
005 | 20231226095652.0 | ||
007 | cr uuu---uuuuu | ||
008 | 231226s2023 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1200/CCI.23.00066 |2 doi | |
028 | 5 | 2 | |a pubmed24n1215.xml |
035 | |a (DE-627)NLM364549327 | ||
035 | |a (NLM)37963310 | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
100 | 1 | |a Rodriguez, Patricia J |e verfasserin |4 aut | |
245 | 1 | 0 | |a Using Machine Learning to Leverage Biomarker Change and Predict Colorectal Cancer Recurrence |
264 | 1 | |c 2023 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
500 | |a Date Completed 16.11.2023 | ||
500 | |a Date Revised 29.11.2023 | ||
500 | |a published: Print | ||
500 | |a Citation Status MEDLINE | ||
650 | 4 | |a Journal Article | |
650 | 7 | |a Carcinoembryonic Antigen |2 NLM | |
700 | 1 | |a Heagerty, Patrick J |e verfasserin |4 aut | |
700 | 1 | |a Clark, Samantha |e verfasserin |4 aut | |
700 | 1 | |a Khor, Sara |e verfasserin |4 aut | |
700 | 1 | |a Chen, Yilin |e verfasserin |4 aut | |
700 | 1 | |a Haupt, Eric |e verfasserin |4 aut | |
700 | 1 | |a Hahn, Erin E |e verfasserin |4 aut | |
700 | 1 | |a Shankaran, Veena |e verfasserin |4 aut | |
700 | 1 | |a Bansal, Aasthaa |e verfasserin |4 aut | |
773 | 0 | 8 | |i Enthalten in |t JCO clinical cancer informatics |d 2017 |g 7(2023) vom: 10. Sept., Seite e2300066 |w (DE-627)NLM275406369 |x 2473-4276 |7 nnns |
773 | 1 | 8 | |g volume:7 |g year:2023 |g day:10 |g month:09 |g pages:e2300066 |
856 | 4 | 0 | |u http://dx.doi.org/10.1200/CCI.23.00066 |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a GBV_NLM | ||
951 | |a AR | ||
952 | |d 7 |j 2023 |b 10 |c 09 |h e2300066 |