Evaluation of ChatGPT and Google Bard Using Prompt Engineering in Cancer Screening Algorithms
Copyright © 2023 The Association of University Radiologists. Published by Elsevier Inc. All rights reserved.
Large language models (LLMs) such as ChatGPT and Bard have emerged as powerful tools in medicine, showcasing strong results in tasks such as radiology report translations and research paper drafting. While their implementation in clinical practice holds promise, their response accuracy remains variable. This study aimed to evaluate the accuracy of ChatGPT and Bard in clinical decision-making based on the American College of Radiology Appropriateness Criteria for various cancers. Both LLMs were evaluated in terms of their responses to open-ended (OE) and select-all-that-apply (SATA) prompts. Furthermore, the study incorporated prompt engineering (PE) techniques to enhance the accuracy of LLM outputs. The results revealed similar performance between ChatGPT and Bard on OE prompts, with ChatGPT exhibiting marginally higher accuracy in SATA scenarios. The introduction of PE also marginally improved LLM outputs in OE prompts but did not enhance SATA responses. The results highlight the potential of LLMs in aiding clinical decision-making processes, especially when guided by optimally engineered prompts. Future studies in diverse clinical situations are imperative to better understand the impact of LLMs in radiology.
Media type: E-Article
Year of publication: 2023
Published: 2023
Contained in: Link to complete record - year:2023
Contained in: Academic radiology - (2023), 15 Dec.
Language: English
Contributors: Nguyen, Daniel [author]
Notes: Date Revised 16.12.2023; published: Print-Electronic; Citation Status: Publisher
DOI: 10.1016/j.acra.2023.11.002
PPN (catalog ID): NLM365946672
LEADER 01000naa a22002652 4500
001 NLM365946672
003 DE-627
005 20231227134154.0
007 cr uuu---uuuuu
008 231227s2023 xx |||||o 00| ||eng c
024 7 |a 10.1016/j.acra.2023.11.002 |2 doi
028 5 2 |a pubmed24n1231.xml
035 |a (DE-627)NLM365946672
035 |a (NLM)38103973
035 |a (PII)S1076-6332(23)00618-9
040 |a DE-627 |b ger |c DE-627 |e rakwb
041 |a eng
100 1 |a Nguyen, Daniel |e verfasserin |4 aut
245 1 0 |a Evaluation of ChatGPT and Google Bard Using Prompt Engineering in Cancer Screening Algorithms
264 1 |c 2023
336 |a Text |b txt |2 rdacontent
337 |a Computermedien |b c |2 rdamedia
338 |a Online-Ressource |b cr |2 rdacarrier
500 |a Date Revised 16.12.2023
500 |a published: Print-Electronic
500 |a Citation Status Publisher
520 |a Copyright © 2023 The Association of University Radiologists. Published by Elsevier Inc. All rights reserved.
520 |a Large language models (LLMs) such as ChatGPT and Bard have emerged as powerful tools in medicine, showcasing strong results in tasks such as radiology report translations and research paper drafting. While their implementation in clinical practice holds promise, their response accuracy remains variable. This study aimed to evaluate the accuracy of ChatGPT and Bard in clinical decision-making based on the American College of Radiology Appropriateness Criteria for various cancers. Both LLMs were evaluated in terms of their responses to open-ended (OE) and select-all-that-apply (SATA) prompts. Furthermore, the study incorporated prompt engineering (PE) techniques to enhance the accuracy of LLM outputs. The results revealed similar performances between ChatGPT and Bard on OE prompts, with ChatGPT exhibiting marginally higher accuracy in SATA scenarios. The introduction of PE also marginally improved LLM outputs in OE prompts but did not enhance SATA responses. The results highlight the potential of LLMs in aiding clinical decision-making processes, especially when guided by optimally engineered prompts. Future studies in diverse clinical situations are imperative to better understand the impact of LLMs in radiology
650 4 |a Journal Article
700 1 |a Swanson, Daniel |e verfasserin |4 aut
700 1 |a Newbury, Alex |e verfasserin |4 aut
700 1 |a Kim, Young H |e verfasserin |4 aut
773 0 8 |i Enthalten in |t Academic radiology |d 1995 |g (2023) vom: 15. Dez. |w (DE-627)NLM087676818 |x 1878-4046 |7 nnns
773 1 8 |g year:2023 |g day:15 |g month:12
856 4 0 |u http://dx.doi.org/10.1016/j.acra.2023.11.002 |3 Volltext
912 |a GBV_USEFLAG_A
912 |a GBV_NLM
951 |a AR
952 |j 2023 |b 15 |c 12