Development and validation of algorithms to build an electronic health record based cohort of patients with systemic sclerosis

Copyright: © 2023 Tukpah et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

OBJECTIVES: To evaluate methods of identifying patients with systemic sclerosis (SSc) using International Classification of Diseases, Tenth Revision (ICD-10) codes (M34*), electronic health record (EHR) databases, and organ involvement keywords that result in a validated cohort composed of true cases with high disease burden.

METHODS: We retrospectively studied patients in a healthcare system who were likely to have SSc. Using structured EHR data from January 2016 to June 2021, we identified 955 adult patients with M34* documented 2 or more times during the study period. A random subset of 100 patients was selected to validate the ICD-10 code for its positive predictive value (PPV). The dataset was then divided into training and validation sets for unstructured text processing (UTP) search algorithms, two of which were created using keywords for Raynaud's syndrome and esophageal involvement/symptoms.
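The selection and keyword steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the record layout (pairs of patient ID and ICD-10 code), the function names, and the keyword patterns are all assumptions, since the abstract does not list the actual keywords or data schema.

```python
import re
from collections import Counter

def patients_with_repeated_code(encounters, prefix="M34", min_count=2):
    """Return IDs of patients whose records carry a code starting with
    `prefix` at least `min_count` times.

    `encounters` is a hypothetical iterable of (patient_id, icd10_code) pairs.
    """
    counts = Counter(
        pid for pid, code in encounters if code.upper().startswith(prefix)
    )
    return {pid for pid, n in counts.items() if n >= min_count}

# Illustrative keyword patterns only; the study's real keyword lists are not
# given in the abstract.
RAYNAUD_PATTERN = re.compile(r"\braynaud", re.IGNORECASE)
ESOPHAGEAL_PATTERN = re.compile(
    r"\b(?:esophag\w*|dysphagia|reflux|gerd)\b", re.IGNORECASE
)

def keyword_flags(note_text):
    """Flag whether a free-text note mentions each manifestation."""
    return {
        "raynaud": bool(RAYNAUD_PATTERN.search(note_text)),
        "esophageal": bool(ESOPHAGEAL_PATTERN.search(note_text)),
    }
```

A real pipeline would also need to handle negation ("no Raynaud's") and spelling variants such as "oesophageal", which this sketch ignores.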

RESULTS: Among 955 patients, the average age was 60 years. Most patients (84%) were female; 75% of patients were White, and 5.2% were Black. Approximately 175 patients per year had the code newly documented. Overall, 24% had an ICD-10 code for esophageal disease and 13.4% had one for pulmonary hypertension. The baseline PPV was 78%, which improved to 84% with UTP, identifying 788 patients likely to have SSc. After the ICD-10 code was placed, 63% of patients had a rheumatology office visit. Patients identified by the UTP search algorithm were more likely than those identified by the ICD codes alone to have increased healthcare utilization (ICD-10 codes 4 or more times: 84.1% vs 61.7%, p < .001), organ involvement (pulmonary hypertension: 12.7% vs 6%, p = .011), and medication use (mycophenolate use: 28.7% vs 11.4%, p < .001).
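For readers unfamiliar with the metric, PPV is the share of flagged patients who are confirmed true cases. The sketch below assumes the 78% baseline corresponds to 78 confirmed cases among the 100 randomly reviewed patients; the abstract reports only the percentage, so the counts and the function name are illustrative.

```python
def positive_predictive_value(true_positives, false_positives):
    """PPV = confirmed cases / all patients flagged by the algorithm."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("PPV is undefined when no patients were flagged")
    return true_positives / flagged

# Assumed counts: 78 confirmed and 22 unconfirmed among 100 reviewed charts.
baseline_ppv = positive_predictive_value(78, 22)
print(f"Baseline PPV: {baseline_ppv:.0%}")
```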

CONCLUSION: EHRs can be used to identify patients with SSc. Using unstructured text processing keyword searches for SSc clinical manifestations improved the PPV relative to ICD-10 codes alone and identified a group of patients most likely to have SSc and increased healthcare needs.

Media type:

E-article

Year of publication:

2023

Published:

2023

Contained in:

To the parent record - volume:18

Contained in:

PloS one - 18(2023), 4, dated: 14., page e0283775

Language:

English

Contributors:

Tukpah, Ann-Marcia C [Author]
Rose, Jonathan A [Author]
Seger, Diane L [Author]
Dellaripa, Paul F [Author]
Hunninghake, Gary M [Author]
Bates, David W [Author]

Links:

Full text

Subjects:

Journal Article
Research Support, N.I.H., Extramural
UT0S826Z60
Uridine Triphosphate

Notes:

Date Completed 17.04.2023

Date Revised 18.04.2023

published: Electronic-eCollection

Citation Status MEDLINE

doi:

10.1371/journal.pone.0283775

PPN (catalog ID):

NLM355576430