Lung Cancer Prediction Using Electronic Claims Records : A Transformer-Based Approach

Electronic claims records (ECRs) are large-scale, longitudinal collections of individuals' medical service-seeking actions. Compared to in-hospital medical records (EMRs), ECRs are more standardized and span multiple sites. Recently, studies have shown promising results in modeling claims data for a wide range of medical applications. However, few of them address exclusion criteria in cohort selection for extracting new incident cases without prior signs, and they often lack emphasis on predicting cancer in its early stages. In this work, we aim to design a lung cancer prediction framework using ECRs, with a rigorous exclusion design and a state-of-the-art sequence-based transformer. Furthermore, this work presents one of the first results of applying a disease prediction model to the entire population of Taiwan. The results show over 2.1 predictive power, 5 average positive predictive value (PPV), and 0.668 area under curve (AUC) for all-stage lung cancer, and around 2.0 predictive power, 1 average PPV, and 0.645 AUC for early-stage lung cancer in our dataset. Sub-cohort analysis could funnel a selective, high-precision group into prioritized clinical examination. Onset analysis validates the effect of our exclusion criteria. This work presents comprehensive analyses of lung cancer prediction, and the proposed approach can serve as a state-of-the-art disease risk prediction framework for claims data.

Media type:

E-article

Year of publication:

2023

Published:

2023

Contained in:

IEEE journal of biomedical and health informatics - 27(2023), no. 12 of 12 Dec., pages 6062-6073

Language:

English

Contributors:

Chen, Huan-Yu [Author]
Wang, Hui-Min [Author]
Lin, Ching-Heng [Author]
Yang, Rob [Author]
Lee, Chi-Chun [Author]

Subjects:

Journal Article

Notes:

Date Completed 06.12.2023

Date Revised 06.12.2023

published: Print-Electronic

Citation Status MEDLINE

doi:

10.1109/JBHI.2023.3324191

PPN (catalog ID):

NLM363173501