Are Generative Pretrained Transformer 4 Responses to Developmental Dysplasia of the Hip Clinical Scenarios Universal? An International Review
Copyright © 2024 Wolters Kluwer Health, Inc. All rights reserved.
OBJECTIVE: There is increasing interest in applying artificial intelligence chatbots like generative pretrained transformer 4 (GPT-4) in the medical field. This study aimed to explore the universality of GPT-4 responses to simulated clinical scenarios of developmental dysplasia of the hip (DDH) across diverse global settings.
METHODS: Seventeen international experts with more than 15 years of experience in pediatric orthopaedics were selected for the evaluation panel. Eight simulated DDH clinical scenarios were created, covering 4 key areas: (1) initial evaluation and diagnosis, (2) initial examination and treatment, (3) nursing care and follow-up, and (4) prognosis and rehabilitation planning. Each scenario was completed independently in a new GPT-4 session. Interrater reliability was assessed using Fleiss kappa, and the quality, relevance, and applicability of GPT-4 responses were analyzed using median scores and interquartile ranges. Following scoring, experts met in Zoom sessions to generate Regional Consensus Assessment Scores, which were intended to represent a consistent regional assessment of the use of GPT-4 in pediatric orthopaedic care.
RESULTS: GPT-4's responses to the 8 clinical DDH scenarios received performance scores ranging from 44.3% to 98.9% of the 88-point maximum. The Fleiss kappa statistic of 0.113 (P = 0.001) indicated low agreement among experts in their ratings. When assessing the responses' quality, relevance, and applicability, the median scores were 3, with interquartile ranges of 3 to 4, 3 to 4, and 2 to 3, respectively. Significant differences were noted in the prognosis and rehabilitation domain scores (P < 0.05 for all). Regional consensus scores were 75 for Africa, 74 for Asia, 73 for India, 80 for Europe, and 65 for North America, with the Kruskal-Wallis test highlighting significant disparities between these regions (P = 0.034).
CONCLUSIONS: This study demonstrates the promise of GPT-4 in pediatric orthopaedic care, particularly in supporting preliminary DDH assessments and guiding treatment strategies for specialist care. However, effective integration of GPT-4 into clinical practice will require adaptation to specific regional health care contexts, highlighting the importance of a nuanced approach to health technology adaptation.
LEVEL OF EVIDENCE: Level IV.
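The METHODS section names Fleiss kappa as the interrater reliability measure but does not describe how it was computed. The following is a minimal sketch of the standard Fleiss kappa formula, not the authors' analysis code. The function name `fleiss_kappa`, the 5-point rating scale, and the rating counts in the demo are all hypothetical and are not taken from the study.

```python
from typing import List


def fleiss_kappa(counts: List[List[int]]) -> float:
    """Fleiss kappa for a subjects-by-categories table of rating counts.

    counts[i][j] is the number of raters who placed subject i in category j.
    Every subject must be rated by the same number of raters (at least 2).
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("at least two raters are required")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every subject needs the same number of ratings")
    n_categories = len(counts[0])

    # Share of all ratings that fell into each category.
    total = n_subjects * n_raters
    p_j = [sum(row[j] for row in counts) / total for j in range(n_categories)]

    # Observed agreement per subject, then averaged over subjects.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects

    # Agreement expected by chance from the category shares.
    p_e = sum(p * p for p in p_j)
    if p_e == 1.0:
        # All ratings sit in one category; kappa is undefined by the formula.
        # Returning 1.0 here is a convention chosen for this sketch.
        return 1.0
    return (p_bar - p_e) / (1.0 - p_e)


if __name__ == "__main__":
    # Hypothetical data: 3 responses, 17 raters, scores 1 to 5 (columns).
    demo = [
        [0, 2, 8, 6, 1],
        [1, 3, 9, 4, 0],
        [0, 1, 5, 8, 3],
    ]
    print(f"Fleiss kappa: {fleiss_kappa(demo):.3f}")
```

Values near 0 mean the raters agree little more than chance would predict, which is how the reported 0.113 is interpreted in RESULTS; a value of 1 means complete agreement.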
Media type: | E-article |
---|---|
Year of publication: | 2024 |
Published: | 2024 |
Contained in: | Journal of pediatric orthopedics - (2024), 10 Apr. |
Language: | English |
Contributors: | Luo, Shaoting [author] |
Notes: | Date Revised 10.04.2024; published: Print-Electronic; Citation Status: Publisher |
doi: | 10.1097/BPO.0000000000002682 |
PPN (catalog ID): | NLM370865294 |
LEADER | 01000naa a22002652 4500 | ||
---|---|---|---|
001 | NLM370865294 | ||
003 | DE-627 | ||
005 | 20240410233553.0 | ||
007 | cr uuu---uuuuu | ||
008 | 240410s2024 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1097/BPO.0000000000002682 |2 doi | |
028 | 5 | 2 | |a pubmed24n1371.xml |
035 | |a (DE-627)NLM370865294 | ||
035 | |a (NLM)38597198 | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
100 | 1 | |a Luo, Shaoting |e verfasserin |4 aut | |
245 | 1 | 0 | |a Are Generative Pretrained Transformer 4 Responses to Developmental Dysplasia of the Hip Clinical Scenarios Universal? An International Review |
264 | 1 | |c 2024 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
500 | |a Date Revised 10.04.2024 | ||
500 | |a published: Print-Electronic | ||
500 | |a Citation Status Publisher | ||
520 | |a Copyright © 2024 Wolters Kluwer Health, Inc. All rights reserved. | ||
520 | |a OBJECTIVE: There is increasing interest in applying artificial intelligence chatbots like generative pretrained transformer 4 (GPT-4) in the medical field. This study aimed to explore the universality of GPT-4 responses to simulated clinical scenarios of developmental dysplasia of the hip (DDH) across diverse global settings | ||
520 | |a METHODS: Seventeen international experts with more than 15 years of experience in pediatric orthopaedics were selected for the evaluation panel. Eight simulated DDH clinical scenarios were created, covering 4 key areas: (1) initial evaluation and diagnosis, (2) initial examination and treatment, (3) nursing care and follow-up, and (4) prognosis and rehabilitation planning. Each scenario was completed independently in a new GPT-4 session. Interrater reliability was assessed using Fleiss kappa, and the quality, relevance, and applicability of GPT-4 responses were analyzed using median scores and interquartile ranges. Following scoring, experts met in ZOOM sessions to generate Regional Consensus Assessment Scores, which were intended to represent a consistent regional assessment of the use of the GPT-4 in pediatric orthopaedic care | ||
520 | |a RESULTS: GPT-4's responses to the 8 clinical DDH scenarios received performance scores ranging from 44.3% to 98.9% of the 88-point maximum. The Fleiss kappa statistic of 0.113 (P = 0.001) indicated low agreement among experts in their ratings. When assessing the responses' quality, relevance, and applicability, the median scores were 3, with interquartile ranges of 3 to 4, 3 to 4, and 2 to 3, respectively. Significant differences were noted in the prognosis and rehabilitation domain scores (P < 0.05 for all). Regional consensus scores were 75 for Africa, 74 for Asia, 73 for India, 80 for Europe, and 65 for North America, with the Kruskal-Wallis test highlighting significant disparities between these regions (P = 0.034) | ||
520 | |a CONCLUSIONS: This study demonstrates the promise of GPT-4 in pediatric orthopaedic care, particularly in supporting preliminary DDH assessments and guiding treatment strategies for specialist care. However, effective integration of GPT-4 into clinical practice will require adaptation to specific regional health care contexts, highlighting the importance of a nuanced approach to health technology adaptation | ||
520 | |a LEVEL OF EVIDENCE: Level IV | ||
650 | 4 | |a Journal Article | |
700 | 1 | |a Canavese, Federico |e verfasserin |4 aut | |
700 | 1 | |a Aroojis, Alaric |e verfasserin |4 aut | |
700 | 1 | |a Andreacchio, Antonio |e verfasserin |4 aut | |
700 | 1 | |a Anticevic, Darko |e verfasserin |4 aut | |
700 | 1 | |a Bouchard, Maryse |e verfasserin |4 aut | |
700 | 1 | |a Castaneda, Pablo |e verfasserin |4 aut | |
700 | 1 | |a De Rosa, Vincenzo |e verfasserin |4 aut | |
700 | 1 | |a Fiogbe, Michel Armand |e verfasserin |4 aut | |
700 | 1 | |a Frick, Steven L |e verfasserin |4 aut | |
700 | 1 | |a Hui, James H |e verfasserin |4 aut | |
700 | 1 | |a Johari, Ashok N |e verfasserin |4 aut | |
700 | 1 | |a Loro, Antonio |e verfasserin |4 aut | |
700 | 1 | |a Lyu, Xuemin |e verfasserin |4 aut | |
700 | 1 | |a Matsushita, Masaki |e verfasserin |4 aut | |
700 | 1 | |a Omeroglu, Hakan |e verfasserin |4 aut | |
700 | 1 | |a Roye, David P |e verfasserin |4 aut | |
700 | 1 | |a Shah, Maulin M |e verfasserin |4 aut | |
700 | 1 | |a Yong, Bicheng |e verfasserin |4 aut | |
700 | 1 | |a Li, Lianyong |e verfasserin |4 aut | |
773 | 0 | 8 | |i Enthalten in |t Journal of pediatric orthopedics |d 1993 |g (2024) vom: 10. Apr. |w (DE-627)NLM013240641 |x 1539-2570 |7 nnns |
773 | 1 | 8 | |g year:2024 |g day:10 |g month:04 |
856 | 4 | 0 | |u http://dx.doi.org/10.1097/BPO.0000000000002682 |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a GBV_NLM | ||
951 | |a AR | ||
952 | |j 2024 |b 10 |c 04 |