Improving prediction of survival and progression in metastatic non-small cell lung cancer following immunotherapy through machine learning of circulating tumor DNA dynamics

Abstract
Objectives: To use modern machine-learning approaches to enhance and automate feature extraction from longitudinal ctDNA data and to improve the prediction of survival and disease progression, risk stratification, and treatment strategies for patients with first-line (1L) NSCLC.
Methods: Using IMpower150 trial data on untreated metastatic non-small cell lung cancer patients treated with atezolizumab and chemotherapy, we developed a machine-learning algorithm to extract predictive features from ctDNA kinetics, improving survival and progression prediction. We analyzed kinetic data from 17 ctDNA summary markers, including cell-free DNA concentration, allele frequency, tumor molecules in plasma, and mutation counts. Our machine-learning workflow (FPCRF) involved functional principal component analysis (FPCA) for automated feature extraction, random forest and bagging ensemble algorithms for feature selection, standard PCA for dimension reduction, and Cox proportional-hazards regression for survival analysis. The dataset was divided into training and test cohorts in the same way as in a previous study.
Results: A total of 398 patients with ctDNA data (206 in training, 192 in validation) were analyzed. Our machine-learning models automated feature extraction and performed well in predicting overall survival (OS) and progression-free survival (PFS) at different landmarks. In identical train-test cohorts, our models outperformed existing ones built on handcrafted ctDNA features, raising the OS c-index to 0.72 and 0.71 from 0.67 and 0.63 for C3D1 and C4D1, respectively, and substantially improving the PFS c-index to ∼0.65 from the previous 0.54–0.58, a 12–20% increase. Our model enhanced risk stratification for NSCLC patients, achieving clear OS and PFS separation (e.g., on C3D1, OS HR: 2.65 [95%CI: 1.78–3.95, P < 0.001] for high vs. intermediate risk, 2.06 [95%CI: 1.29–3.29, P = 0.002] for intermediate vs. low risk; and PFS HR: 2.04 [95%CI: 1.41–2.94, P < 0.001], 1.56 [95%CI: 1.07–2.27, P = 0.02]). Distinct patterns of ctDNA kinetic characteristics (e.g., baseline ctDNA markers, depth of ctDNA responses, and timing of ctDNA clearance) were revealed across the risk groups. Rapid and complete ctDNA clearance appears essential for long-term clinical benefit.
Conclusions: Our machine-learning approach offers a novel tool for analyzing ctDNA kinetics, extracting critical features from longitudinal data, improving our understanding of the link between ctDNA kinetics and progression/mortality risks, and optimizing personalized immunotherapies for 1L NSCLC.
Research in context
Evidence before this study: The longitudinal dynamics of ctDNA are showing promise as a biomarker for treatment outcomes and monitoring. However, despite recent advances in machine learning, very few applications of machine learning-based approaches have been reported for analyzing longitudinal ctDNA data, improving the prediction of clinical outcomes, and refining risk stratification. We searched PubMed on Oct 8, 2023 for peer-reviewed, English-language journal and conference articles using the terms (“ctDNA”) AND (“deep learning” OR “machine learning” OR “artificial intelligence”). Fifty-nine (59) search results were found. After a systematic review of these search results, we found only 4 research studies in which longitudinal ctDNA dynamic/kinetic data were analyzed using machine-learning models to predict patient outcomes. These studies focused on building models from handcrafted features of ctDNA dynamics, such as on-treatment ctDNA levels and early ctDNA changes and clearance. So far, no studies have utilized machine- or deep-learning models to extract features from longitudinal ctDNA dynamics to inform and predict cancer patient outcomes.
Added value of this study: We developed a machine-learning algorithm to predict survival and disease progression using ctDNA data from the IMpower150 trial on untreated metastatic non-small cell lung cancer patients receiving atezolizumab and chemotherapy. Our machine-learning models automatically extract informative features from longitudinal ctDNA dynamics, outperforming existing models based on handcrafted features in predicting overall survival and progression-free survival at various time points. They improved risk stratification and identified crucial ctDNA kinetic characteristics in 1L NSCLC, revealing the importance of rapid and complete ctDNA clearance for long-term clinical benefit.
Implications of all the available evidence: Machine-learning models can automatically extract prognostic features from longitudinal ctDNA dynamic trajectories, enable refined risk stratification and prediction of clinical outcomes, and thereby enhance the utility of ctDNA data in clinical patient care and personalized treatment.
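
The c-index values quoted in the abstract summarize how well a model ranks patients by risk: among comparable patient pairs, it is the fraction in which the patient who had the event earlier was also assigned the higher predicted risk. The sketch below is a minimal pure-Python illustration that assumes Harrell's pairwise definition; the record does not state which estimator the authors used. The function name `concordance_index`, its parameters, and the toy data are hypothetical, and pairs with tied event times are simply skipped for brevity.

```python
def concordance_index(times, events, risks):
    """Harrell-style c-index (illustrative sketch, not the authors' code).

    times  -- follow-up time per patient
    events -- 1 if the event (death/progression) was observed, 0 if censored
    risks  -- predicted risk score per patient (higher = worse prognosis)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        # A pair is comparable only if the earlier time is an observed event.
        if not events[i]:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # earlier event had higher risk
                elif risks[i] == risks[j]:
                    concordant += 0.5   # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up for illustration): four patients, one censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.6, 0.7, 0.1]
print(concordance_index(toy_times, toy_events, toy_risks))  # 0.8
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported improvement from 0.54–0.58 to ∼0.65 for PFS should be read.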

Media type:

Preprint

Year of publication:

2023

Published:

2023

Contained in:

bioRxiv.org (2023), 28 Oct.

Language:

English

Contributors:

Ding, Haolun [Author]
Yuan, Min [Author]
Yang, Yaning [Author]
Xu, Xu Steven [Author]

Links:

Full text [free of charge]

Subjects:

570
Biology

doi:

10.1101/2023.10.24.23297462

PPN (catalog ID):

XBI041318501