Evaluating the Artificial Intelligence Performance Growth in Ophthalmic Knowledge

Copyright © 2023, Jiao et al.

OBJECTIVE: We aim to compare the capabilities of Chat Generative Pre-Trained Transformer (ChatGPT)-3.5 and ChatGPT-4.0 (OpenAI, San Francisco, CA, USA) in addressing multiple-choice ophthalmic case challenges.

METHODS AND ANALYSIS: The accuracy of both models was compared across ophthalmology subspecialties using multiple-choice ophthalmic clinical cases from the American Academy of Ophthalmology (AAO) "Diagnose This" questions. Additional analyses examined image content, question difficulty, the character length of the models' responses, and the models' alignment with responses from human respondents. The χ² test, Fisher's exact test, Student's t-test, and one-way analysis of variance (ANOVA) were conducted where appropriate, with p<0.05 considered significant.
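
As an illustration of the kind of comparison named above, the following is a minimal sketch of a two-sided Fisher's exact test on a 2x2 table of correct versus incorrect answers, written with the Python standard library only. The function name and the counts in the usage line are hypothetical: this record does not report per-subspecialty question counts, so the table below is an assumed example, not the authors' data or code.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are the two models; columns are correct / incorrect answers.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of every table that is no more likely than the
    # observed one (small tolerance guards against floating-point ties).
    return sum(
        p for p in (prob(x) for x in range(lo, hi + 1))
        if p <= p_obs * (1 + 1e-9)
    )


# Hypothetical counts: model A answers 8 of 8 correctly, model B 3 of 8.
p_value = fisher_exact_two_sided(8, 0, 3, 5)
print(f"p = {p_value:.4f}")
```

Fisher's exact test is usually preferred over the χ² test when expected cell counts are small, which is common when questions are split by subspecialty.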

RESULTS: GPT-4.0 significantly outperformed GPT-3.5 (75% versus 46%, p<0.01), with the most noticeable improvement in neuro-ophthalmology (100% versus 38%, p=0.03). While both models struggled with uveitis and refractive questions, GPT-4.0 excelled in other areas, such as pediatric questions (82%). In image-related questions, GPT-4.0 also displayed superior accuracy that trended toward significance (73% versus 46%, p=0.07). GPT-4.0 performed better with easier questions (93.8% (least difficult) versus 76.2% (middle) versus 53.3% (most), p=0.03) and generated more concise answers than GPT-3.5 (651.7±342.9 versus 1,112.9±328.8 characters, p<0.01). Moreover, GPT-4.0's answers were more in line with those of AAO respondents (57.3% versus 41.4%, p<0.01), showing a strong correlation between its accuracy and the proportion of AAO respondents who selected GPT-4.0's answer (ρ=0.713, p<0.01).

CONCLUSION AND RELEVANCE: Our study demonstrated that GPT-4.0 significantly outperforms GPT-3.5 in addressing ophthalmic case challenges, especially in neuro-ophthalmology, with improved accuracy even in image-related questions. These findings underscore the potential of advancing artificial intelligence (AI) models in enhancing ophthalmic diagnostics and medical education.

Media type:

E-article

Year of publication:

2023

Published:

2023

Contained in:

Cureus - 15(2023), 9, dated 30 Sept., page e45700

Language:

English

Contributors:

Jiao, Cheng [Author]
Edupuganti, Neel R [Author]
Patel, Parth A [Author]
Bui, Tommy [Author]
Sheth, Veeral [Author]

Subjects:

Artificial intelligence
ChatGPT
Journal Article
Medical education
Natural language processing models
Ophthalmology

Notes:

Date Revised 31.10.2023

published: Electronic-eCollection

Citation Status PubMed-not-MEDLINE

doi:

10.7759/cureus.45700

PPN (catalog ID):

NLM363607188