Novel multiclass classification machine learning approach for the early-stage classification of systemic autoimmune rheumatic diseases

© Author(s) (or their employer(s)) 2024. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ.

OBJECTIVE: Systemic autoimmune rheumatic diseases (SARDs) encompass a diverse group of complex conditions with overlapping clinical features, making accurate diagnosis challenging. This study aims to develop a multiclass machine learning (ML) model for early-stage SARDs classification using accessible laboratory indicators.

METHODS: A total of 925 patients with SARDs were included, categorised into systemic lupus erythematosus (SLE), Sjögren's syndrome (SS) and inflammatory myositis (IM). Clinical characteristics and laboratory markers were collected, and nine key indicators, including anti-dsDNA, anti-SS-A60, anti-Sm/nRNP, antichromatin, anti-dsDNA (indirect immunofluorescence assay), haemoglobin (Hb), platelet, neutrophil percentage and cytoplasmic patterns (AC-19, AC-20), were selected for model building. Various ML algorithms were used to construct a tripartite classification ML model.

RESULTS: Patients were divided into two cohorts; cohort 1 was used to construct a tripartite classification model. Among the models assessed, the random forest (RF) model demonstrated superior performance in distinguishing SLE, IM and SS (area under the curve=0.953, 0.903 and 0.836; accuracy=0.892, 0.869 and 0.857; sensitivity=0.890, 0.868 and 0.795; specificity=0.910, 0.836 and 0.748; positive predictive value=0.922, 0.727 and 0.663; and negative predictive value=0.854, 0.915 and 0.879, respectively). The RF model excelled in classifying SLE (precision=0.930, recall=0.985, F1 score=0.957). For IM and SS, the RF model achieved precision=0.793 and 0.950, recall=0.920 and 0.679, and F1 score=0.852 and 0.792, respectively. Cohort 2 served as an external validation set, on which the model achieved an overall accuracy of 87.3%. Individual classification performances for SLE, SS and IM were excellent, with precision, recall and F1 scores specified. SHapley Additive exPlanations (SHAP) analysis highlighted significant contributions from antibody profiles.
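
The F1 scores reported above for cohort 1 can be checked against the stated precision and recall values, since F1 is their harmonic mean. The following is a minimal sketch, not the authors' code; the helper name `f1_score` and the dictionary layout are assumptions made for illustration, and only the numeric values come from the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (hypothetical helper)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) per class, as stated in the abstract for cohort 1.
reported = {
    "SLE": (0.930, 0.985, 0.957),
    "IM": (0.793, 0.920, 0.852),
    "SS": (0.950, 0.679, 0.792),
}

# Recompute F1 from precision and recall, rounded to three decimals like the abstract.
recomputed = {
    name: round(f1_score(p, r), 3) for name, (p, r, _) in reported.items()
}

for name, (_, _, f1) in reported.items():
    print(f"{name}: reported F1={f1:.3f}, recomputed F1={recomputed[name]:.3f}")
```

Under this check, the recomputed values agree with the reported F1 scores to three decimal places for all three classes.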

CONCLUSION: This pioneering multiclass ML model, using basic laboratory indicators, enhances clinical feasibility and demonstrates promising potential for SARDs classification. The collaboration of clinical expertise and ML offers a nuanced approach to SARDs classification, with potential for enhanced patient care.

Media type:

E-article

Year of publication:

2024

Published:

2024

Contained in:

Lupus science & medicine - 11(2024), issue 1, 31 Jan.

Language:

English

Contributors:

Wang, Yun [Author]
Wei, Wei [Author]
Ouyang, Renren [Author]
Chen, Rujia [Author]
Wang, Ting [Author]
Yuan, Xu [Author]
Wang, Feng [Author]
Hou, Hongyan [Author]
Wu, Shiji [Author]

Subjects:

Antibodies, Antinuclear
Autoimmune Diseases
Autoimmunity
Journal Article
Lupus Erythematosus, Systemic

Notes:

Date Completed 05.02.2024

Date Revised 05.02.2024

published: Electronic

Citation Status MEDLINE

doi:

10.1136/lupus-2023-001125

PPN (catalogue ID):

NLM36791493X