Evaluation of ChatGPT and Google Bard Using Prompt Engineering in Cancer Screening Algorithms

Copyright © 2023 The Association of University Radiologists. Published by Elsevier Inc. All rights reserved.

Large language models (LLMs) such as ChatGPT and Bard have emerged as powerful tools in medicine, showcasing strong results in tasks such as radiology report translations and research paper drafting. While their implementation in clinical practice holds promise, their response accuracy remains variable. This study aimed to evaluate the accuracy of ChatGPT and Bard in clinical decision-making based on the American College of Radiology Appropriateness Criteria for various cancers. Both LLMs were evaluated in terms of their responses to open-ended (OE) and select-all-that-apply (SATA) prompts. Furthermore, the study incorporated prompt engineering (PE) techniques to enhance the accuracy of LLM outputs. The results revealed similar performances between ChatGPT and Bard on OE prompts, with ChatGPT exhibiting marginally higher accuracy in SATA scenarios. The introduction of PE also marginally improved LLM outputs in OE prompts but did not enhance SATA responses. The results highlight the potential of LLMs in aiding clinical decision-making processes, especially when guided by optimally engineered prompts. Future studies in diverse clinical situations are imperative to better understand the impact of LLMs in radiology.

Media type:

E-article

Year of publication:

2023

Published:

2023


Contained in:

Academic Radiology - (2023), 15 Dec.

Language:

English

Contributors:

Nguyen, Daniel [Author]
Swanson, Daniel [Author]
Newbury, Alex [Author]
Kim, Young H [Author]

Links:

Full text

Subjects:

Journal Article

Notes:

Date revised: 16.12.2023

Published: Print-Electronic

Citation status: Publisher

DOI:

10.1016/j.acra.2023.11.002

Funding:

Funding institution / project title:

PPN (catalog ID):

NLM365946672