Adapted large language models can outperform medical experts in clinical text summarization
© 2024. The Author(s), under exclusive licence to Springer Nature America, Inc.
Analyzing vast textual data and summarizing key information from electronic health records imposes a substantial burden on how clinicians allocate their time. Although large language models (LLMs) have shown promise in natural language processing (NLP) tasks, their effectiveness on a diverse range of clinical summarization tasks remains unproven. Here we applied adaptation methods to eight LLMs, spanning four distinct clinical summarization tasks: radiology reports, patient questions, progress notes and doctor-patient dialogue. Quantitative assessments with syntactic, semantic and conceptual NLP metrics reveal trade-offs between models and adaptation methods. A clinical reader study with 10 physicians evaluated summary completeness, correctness and conciseness; in most cases, summaries from our best-adapted LLMs were deemed either equivalent (45%) or superior (36%) compared with summaries from medical experts. The ensuing safety analysis highlights challenges faced by both LLMs and medical experts, as we connect errors to potential medical harm and categorize types of fabricated information. Our research provides evidence of LLMs outperforming medical experts in clinical text summarization across multiple tasks. This suggests that integrating LLMs into clinical workflows could alleviate documentation burden, allowing clinicians to focus more on patient care.
| Field | Value |
|---|---|
| Media type | E-article |
| Year of publication | 2024 |
| Published | 2024 |
| Contained in | Complete record - volume:30 |
| Contained in | Nature medicine - 30(2024), 4, 22 Apr., pages 1134-1142 |
| Language | English |
| Contributors | Van Veen, Dave [author] |
| Notes | Date Completed 22.04.2024; Date Revised 22.04.2024; published: Print-Electronic; UpdateOf: Res Sq. 2023 Oct 30;:. - PMID 37961377; Citation Status MEDLINE |
| DOI | 10.1038/s41591-024-02855-5 |
| PPN (catalog ID) | NLM369036530 |
LEADER 01000caa a22002652 4500
001    NLM369036530
003    DE-627
005    20240423232152.0
007    cr uuu---uuuuu
008    240229s2024 xx |||||o 00| ||eng c
024 7  |a 10.1038/s41591-024-02855-5 |2 doi
028 52 |a pubmed24n1384.xml
035    |a (DE-627)NLM369036530
035    |a (NLM)38413730
040    |a DE-627 |b ger |c DE-627 |e rakwb
041    |a eng
100 1  |a Van Veen, Dave |e verfasserin |4 aut
245 10 |a Adapted large language models can outperform medical experts in clinical text summarization
264  1 |c 2024
336    |a Text |b txt |2 rdacontent
337    |a Computermedien |b c |2 rdamedia
338    |a Online-Ressource |b cr |2 rdacarrier
500    |a Date Completed 22.04.2024
500    |a Date Revised 22.04.2024
500    |a published: Print-Electronic
500    |a UpdateOf: Res Sq. 2023 Oct 30;:. - PMID 37961377
500    |a Citation Status MEDLINE
520    |a © 2024. The Author(s), under exclusive licence to Springer Nature America, Inc.
520    |a Analyzing vast textual data and summarizing key information from electronic health records imposes a substantial burden on how clinicians allocate their time. Although large language models (LLMs) have shown promise in natural language processing (NLP) tasks, their effectiveness on a diverse range of clinical summarization tasks remains unproven. Here we applied adaptation methods to eight LLMs, spanning four distinct clinical summarization tasks: radiology reports, patient questions, progress notes and doctor-patient dialogue. Quantitative assessments with syntactic, semantic and conceptual NLP metrics reveal trade-offs between models and adaptation methods. A clinical reader study with 10 physicians evaluated summary completeness, correctness and conciseness; in most cases, summaries from our best-adapted LLMs were deemed either equivalent (45%) or superior (36%) compared with summaries from medical experts. The ensuing safety analysis highlights challenges faced by both LLMs and medical experts, as we connect errors to potential medical harm and categorize types of fabricated information. Our research provides evidence of LLMs outperforming medical experts in clinical text summarization across multiple tasks. This suggests that integrating LLMs into clinical workflows could alleviate documentation burden, allowing clinicians to focus more on patient care
650  4 |a Journal Article
700 1  |a Van Uden, Cara |e verfasserin |4 aut
700 1  |a Blankemeier, Louis |e verfasserin |4 aut
700 1  |a Delbrouck, Jean-Benoit |e verfasserin |4 aut
700 1  |a Aali, Asad |e verfasserin |4 aut
700 1  |a Bluethgen, Christian |e verfasserin |4 aut
700 1  |a Pareek, Anuj |e verfasserin |4 aut
700 1  |a Polacin, Malgorzata |e verfasserin |4 aut
700 1  |a Reis, Eduardo Pontes |e verfasserin |4 aut
700 1  |a Seehofnerová, Anna |e verfasserin |4 aut
700 1  |a Rohatgi, Nidhi |e verfasserin |4 aut
700 1  |a Hosamani, Poonam |e verfasserin |4 aut
700 1  |a Collins, William |e verfasserin |4 aut
700 1  |a Ahuja, Neera |e verfasserin |4 aut
700 1  |a Langlotz, Curtis P |e verfasserin |4 aut
700 1  |a Hom, Jason |e verfasserin |4 aut
700 1  |a Gatidis, Sergios |e verfasserin |4 aut
700 1  |a Pauly, John |e verfasserin |4 aut
700 1  |a Chaudhari, Akshay S |e verfasserin |4 aut
773 08 |i Enthalten in |t Nature medicine |d 1995 |g 30(2024), 4 vom: 22. Apr., Seite 1134-1142 |w (DE-627)NLM074659804 |x 1546-170X |7 nnns
773 18 |g volume:30 |g year:2024 |g number:4 |g day:22 |g month:04 |g pages:1134-1142
856 40 |u http://dx.doi.org/10.1038/s41591-024-02855-5 |3 Volltext
912    |a GBV_USEFLAG_A
912    |a GBV_NLM
951    |a AR
952    |d 30 |j 2024 |e 4 |b 22 |c 04 |h 1134-1142