Clinical features-based machine learning models to separate sexually transmitted infections from other skin diagnoses
Copyright © 2024 The Author(s). Published by Elsevier Ltd. All rights reserved.
INTRODUCTION: Many sexual health services are overwhelmed and cannot cater for all the individuals who present with sexually transmitted infections (STIs). Digital health software that separates STIs from non-STIs could improve the efficiency of clinical services. We developed and evaluated a machine learning model that predicts whether patients have an STI based on their clinical features.
METHODS: We manually extracted 25 demographic and clinical features from 1315 clinical records in the electronic health record system at Melbourne Sexual Health Center. We examined 16 machine learning models to predict a binary outcome of an STI or a non-STI diagnosis. We evaluated the models' performance with the area under the ROC curve (AUC), accuracy and F1-scores.
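As a rough illustration of the three evaluation measures named in the methods, the following minimal sketch computes AUC, accuracy and F1-score in plain Python. The labels, scores and the 0.5 decision threshold are toy assumptions for illustration only; they are not taken from the study, and the abstract does not state which software the authors used for evaluation.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as half a win)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data (assumption): 1 = STI, 0 = non-STI; scores are model outputs.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.7, 0.4, 0.5, 0.1, 0.8, 0.3]
# Hypothetical 0.5 threshold to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

auc_value = roc_auc(y_true, scores)
acc_value = accuracy(y_true, y_pred)
f1_value = f1_score(y_true, y_pred)
```

AUC is computed from the raw scores and does not depend on a threshold, whereas accuracy and F1-score depend on where the scores are cut into predicted classes.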
RESULTS: Our study included 1315 consultations, of which 36.8% (484/1315) were diagnosed with STIs and 63.2% (831/1315) had non-STI conditions. The study population predominantly consisted of heterosexual men (49.5%, 651/1315), followed by gay, bisexual and other men who have sex with men (GBMSM) (25.7%), women (21.6%) and unknown gender (3.2%). The median age was 31 years (interquartile range (IQR) 26-39). The top 5 performing models were CatBoost (AUC 0.912), Random Forest (AUC 0.917), LightGBM (AUC 0.907), Gradient Boosting (AUC 0.905) and XGBoost (AUC 0.900). The best model, CatBoost, achieved an accuracy of 0.837, sensitivity of 0.776, specificity of 0.831, precision of 0.782 and F1-score of 0.778. The most important features were lesion duration, type of skin lesions, age, gender, history of skin disorders, number of lesions, dysuria duration, anorectal pain and itchiness.
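For readers less familiar with the threshold-based measures reported above, the sketch below shows how sensitivity, specificity, precision, accuracy and F1-score follow from the four cells of a binary confusion matrix. The function name `confusion_metrics` and the example counts are hypothetical and chosen only for illustration; the abstract does not report the study's confusion matrix.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Derive common binary-classification measures from confusion counts.

    tp: STI cases predicted as STI        fn: STI cases predicted as non-STI
    tn: non-STI cases predicted non-STI   fp: non-STI cases predicted as STI
    """
    sensitivity = tp / (tp + fn)   # recall for the STI class
    specificity = tn / (tn + fp)   # recall for the non-STI class
    precision = tp / (tp + fp)     # positive predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts (assumption), not the study's results.
metrics = confusion_metrics(tp=80, fp=20, tn=160, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are each computed within one true class, while precision and accuracy also depend on how common STIs are in the evaluated sample.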
CONCLUSIONS: Our best model demonstrates reasonable performance in distinguishing STIs from non-STIs. However, to be clinically useful, more detailed information, such as clinical images, may be required to reach sufficient accuracy.
Media type: | E-article |
---|---|
Year of publication: | 2024 |
Published: | 2024 |
Contained in: | The Journal of infection - 88(2024), 4, 10 Apr., page 106128 |
Language: | English |
Contributors: | Soe, Nyi Nyi [author] |
Subjects: | Electronic health records |
Notes: | Date Completed 03.04.2024; Date Revised 03.04.2024; published: Print-Electronic; Citation Status MEDLINE |
doi: | 10.1016/j.jinf.2024.106128 |
PPN (catalog ID): | NLM369426819 |

LEADER | 01000caa a22002652 4500 | ||
---|---|---|---|
001 | NLM369426819 | ||
003 | DE-627 | ||
005 | 20240403235616.0 | ||
007 | cr uuu---uuuuu | ||
008 | 240308s2024 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1016/j.jinf.2024.106128 |2 doi | |
028 | 5 | 2 | |a pubmed24n1363.xml |
035 | |a (DE-627)NLM369426819 | ||
035 | |a (NLM)38452934 | ||
035 | |a (PII)S0163-4453(24)00062-8 | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
100 | 1 | |a Soe, Nyi Nyi |e verfasserin |4 aut | |
245 | 1 | 0 | |a Clinical features-based machine learning models to separate sexually transmitted infections from other skin diagnoses |
264 | 1 | |c 2024 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
500 | |a Date Completed 03.04.2024 | ||
500 | |a Date Revised 03.04.2024 | ||
500 | |a published: Print-Electronic | ||
500 | |a Citation Status MEDLINE | ||
520 | |a Copyright © 2024 The Author(s). Published by Elsevier Ltd. All rights reserved. | ||
520 | |a INTRODUCTION: Many sexual health services are overwhelmed and cannot cater for all the individuals who present with sexually transmitted infections (STIs). Digital health software that separates STIs from non-STIs could improve the efficiency of clinical services. We developed and evaluated a machine learning model that predicts whether patients have an STI based on their clinical features | ||
520 | |a METHODS: We manually extracted 25 demographic and clinical features from 1315 clinical records in the electronic health record system at Melbourne Sexual Health Center. We examined 16 machine learning models to predict a binary outcome of an STI or a non-STI diagnosis. We evaluated the models' performance with the area under the ROC curve (AUC), accuracy and F1-scores | ||
520 | |a RESULTS: Our study included 1315 consultations, of which 36.8% (484/1315) were diagnosed with STIs and 63.2% (831/1315) had non-STI conditions. The study population predominantly consisted of heterosexual men (49.5%, 651/1315), followed by gay, bisexual and other men who have sex with men (GBMSM) (25.7%), women (21.6%) and unknown gender (3.2%). The median age was 31 years (interquartile range (IQR) 26-39). The top 5 performing models were CatBoost (AUC 0.912), Random Forest (AUC 0.917), LightGBM (AUC 0.907), Gradient Boosting (AUC 0.905) and XGBoost (AUC 0.900). The best model, CatBoost, achieved an accuracy of 0.837, sensitivity of 0.776, specificity of 0.831, precision of 0.782 and F1-score of 0.778. The most important features were lesion duration, type of skin lesions, age, gender, history of skin disorders, number of lesions, dysuria duration, anorectal pain and itchiness | ||
520 | |a CONCLUSIONS: Our best model demonstrates reasonable performance in distinguishing STIs from non-STIs. However, to be clinically useful, more detailed information, such as clinical images, may be required to reach sufficient accuracy | ||
650 | 4 | |a Journal Article | |
650 | 4 | |a Electronic health records | |
650 | 4 | |a Machine learning | |
650 | 4 | |a Sexually transmitted infections | |
700 | 1 | |a Latt, Phyu Mon |e verfasserin |4 aut | |
700 | 1 | |a Yu, Zhen |e verfasserin |4 aut | |
700 | 1 | |a Lee, David |e verfasserin |4 aut | |
700 | 1 | |a Kim, Cham-Mill |e verfasserin |4 aut | |
700 | 1 | |a Tran, Daniel |e verfasserin |4 aut | |
700 | 1 | |a Ong, Jason J |e verfasserin |4 aut | |
700 | 1 | |a Ge, Zongyuan |e verfasserin |4 aut | |
700 | 1 | |a Fairley, Christopher K |e verfasserin |4 aut | |
700 | 1 | |a Zhang, Lei |e verfasserin |4 aut | |
773 | 0 | 8 | |i Enthalten in |t The Journal of infection |d 1982 |g 88(2024), 4 vom: 10. Apr., Seite 106128 |w (DE-627)NLM012791822 |x 1532-2742 |7 nnns |
773 | 1 | 8 | |g volume:88 |g year:2024 |g number:4 |g day:10 |g month:04 |g pages:106128 |
856 | 4 | 0 | |u http://dx.doi.org/10.1016/j.jinf.2024.106128 |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a GBV_NLM | ||
951 | |a AR | ||
952 | |d 88 |j 2024 |e 4 |b 10 |c 04 |h 106128 |