Publication details - Development of a natural language processing model for deriving breast cancer quality indicators : A cross-sectional, multicenter study

Copyright © 2023 Elsevier Masson SAS. All rights reserved.

OBJECTIVES: Medico-administrative data are promising for automating the calculation of Healthcare Quality and Safety Indicators. Nevertheless, not all relevant indicators can be calculated with these data alone. The objectives of this feasibility study were to analyze 1) the availability of data sources and 2) the availability of each indicator's elementary variables, and 3) to apply natural language processing to retrieve such information automatically.

METHOD: We performed a multicenter, cross-sectional, observational feasibility study on the clinical data warehouse of Assistance Publique - Hôpitaux de Paris (AP-HP). We studied the management of breast cancer patients treated at AP-HP between January 2019 and June 2021, and the quality indicators published by the European Society of Breast Cancer Specialists, using claims data from the Programme de Médicalisation du Système d'Information (PMSI) and pathology reports. For each indicator, we calculated the number (%) of patients for whom all necessary data sources were available, and the number (%) of patients for whom all elementary variables were available in the sources and for whom the related Healthcare Quality and Safety Indicator (HQSI) was therefore computable. To extract useful data from the free-text reports, we developed and validated dedicated rule-based algorithms, whose performance was assessed with recall, precision, and F1-score.
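The three metrics named above can be illustrated with a minimal Python sketch. It assumes that each algorithm's output is compared against a manually annotated reference, giving counts of true positives, false positives, and false negatives. The function name and the example counts are hypothetical and do not come from the study.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from confusion counts.

    tp: extracted values that match the reference annotation
    fp: extracted values that do not match the reference
    fn: reference values the algorithm failed to extract
    """
    # Guard against empty denominators (e.g. nothing was extracted at all).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for one elementary variable.
p, r, f1 = precision_recall_f1(tp=80, fp=20, fn=40)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Recall is the same quantity that the results report as sensitivity.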

RESULTS: Of 5,785 female patients diagnosed with breast cancer (60.9 years, IQR [50.0-71.9]), 5,147 (89.0%) had procedures related to breast cancer recorded in the PMSI, of whom 3,732 (72.5%) had at least one surgery. Of the 34 key indicators, 9 could be calculated with the PMSI alone, and 6 more became computable when data from pathology reports were added. Ten elementary variables were needed to calculate the 6 indicators combining the PMSI and pathology reports. The necessary sources were available for 58.8% to 94.6% of patients, depending on the indicator. The extraction algorithms had an average accuracy of 76.5% (min-max [32.7%-93.3%]), an average precision of 77.7% [10.0%-97.4%], and an average sensitivity (recall) of 71.6% [2.8%-100.0%]. Once these algorithms were applied, the variables needed to calculate the indicators were extracted for 2% to 88% of patients, depending on the indicator.
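The patient percentages in the results can be checked with simple arithmetic. The sketch below uses only the counts stated in the abstract, and the helper name `share` is hypothetical. It suggests that the 72.5% surgery figure is expressed relative to the 5,147 patients with PMSI procedures rather than to all 5,785 patients.

```python
def share(numerator: int, denominator: int) -> float:
    """Percentage of numerator in denominator, rounded to one decimal."""
    return round(100 * numerator / denominator, 1)


# Patients with breast-cancer procedures in the PMSI, among all patients.
print(share(5147, 5785))  # 89.0
# Patients with at least one surgery, among those with PMSI procedures.
print(share(3732, 5147))  # 72.5
# The same surgery count relative to the whole cohort, for comparison.
print(share(3732, 5785))  # 64.5
```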

DISCUSSION: The availability of medical reports in the electronic health records, of the elementary variables within the reports, and the performance of the extraction algorithms limit the population for which the indicators can be calculated.

CONCLUSIONS: The automated calculation of quality indicators from electronic health records is a prospect that comes up against many practical obstacles.

Media type:	E-article

Year of publication:	2023
Published:	2023

Contained in:	To the parent record - volume:71
Contained in:	Revue d'epidemiologie et de sante publique - 71(2023), 6, 21 Dec., page 102189

Language:	English

Contributors:	Guével, Etienne [Author]; Priou, Sonia [Author]; Flicoteaux, Rémi [Author]; Lamé, Guillaume [Author]; Bey, Romain [Author]; Tannier, Xavier [Author]; Cohen, Ariel [Author]; Chatellier, Gilles [Author]; Daniel, Christel [Author]; Tournigand, Christophe [Author]; Kempf, Emmanuelle [Author]; AP-HP Cancer Group, a CRAB initiative [Author]

Links:	Full text

Subjects:	Electronic Data Processing; Health Care; Indicateurs de qualité; Journal Article; Multicenter Study; Natural Language Processing; Observational Study; Quality Indicators; Soins de santé; Traitement électronique de données; Traitement du langage naturel

Notes:	Date Completed 22.12.2023; Date Revised 22.12.2023; published: Print-Electronic; Citation Status MEDLINE

doi:	10.1016/j.respe.2023.102189

PPN (catalog ID):	NLM364640928

Internal format


LEADER	01000caa a22002652 4500
001	NLM364640928
003	DE-627
005	20231227135938.0
007	cr uuu---uuuuu
008	231226s2023 xx \|\|\|\|\|o 00\| \|\|eng c
024	7		\|a 10.1016/j.respe.2023.102189 \|2 doi
028	5	2	\|a pubmed24n1235.xml
035			\|a (DE-627)NLM364640928
035			\|a (NLM)37972522
035			\|a (PII)S0398-7620(23)00792-7
040			\|a DE-627 \|b ger \|c DE-627 \|e rakwb
041			\|a eng
100	1		\|a Guével, Etienne \|e verfasserin \|4 aut
245	1	0	\|a Development of a natural language processing model for deriving breast cancer quality indicators : A cross-sectional, multicenter study
264		1	\|c 2023
336			\|a Text \|b txt \|2 rdacontent
337			\|a Computermedien \|b c \|2 rdamedia
338			\|a Online-Ressource \|b cr \|2 rdacarrier
500			\|a Date Completed 22.12.2023
500			\|a Date Revised 22.12.2023
500			\|a published: Print-Electronic
500			\|a Citation Status MEDLINE
520			\|a Copyright © 2023 Elsevier Masson SAS. All rights reserved.
520			\|a OBJECTIVES: Medico-administrative data are promising to automate the calculation of Healthcare Quality and Safety Indicators. Nevertheless, not all relevant indicators can be calculated with this data alone. Our feasibility study objective is to analyze 1) the availability of data sources; 2) the availability of each indicator elementary variables, and 3) to apply natural language processing to automatically retrieve such information
520			\|a METHOD: We performed a multicenter cross-sectional observational feasibility study on the clinical data warehouse of Assistance Publique - Hôpitaux de Paris (AP-HP). We studied the management of breast cancer patients treated at AP-HP between January 2019 and June 2021, and the quality indicators published by the European Society of Breast Cancer Specialist, using claims data from the Programme de Médicalisation du Système d'Information (PMSI) and pathology reports. For each indicator, we calculated the number (%) of patients for whom all necessary data sources were available, and the number (%) of patients for whom all elementary variables were available in the sources, and for whom the related HQSI was computable. To extract useful data from the free text reports, we developed and validated dedicated rule-based algorithms, whose performance metrics were assessed with recall, precision, and f1-score
520			\|a RESULTS: Out of 5785 female patients diagnosed with a breast cancer (60.9 years, IQR [50.0-71.9]), 5,147 (89.0%) had procedures related to breast cancer recorded in the PMSI, and 3732 (72.5%) had at least one surgery. Out of the 34 key indicators, 9 could be calculated with the PMSI alone, and 6 others became so using the data from pathology reports. Ten elementary variables were needed to calculate the 6 indicators combining the PMSI and pathology reports. The necessary sources were available for 58.8% to 94.6% of patients, depending on the indicators. The extraction algorithms developed had an average accuracy of 76.5% (min-max [32.7%-93.3%]), an average precision of 77.7% [10.0%-97.4%] and an average sensitivity of 71.6% [2.8% to 100.0%]. Once these algorithms applied, the variables needed to calculate the indicators were extracted for 2% to 88% of patients, depending on the indicators
520			\|a DISCUSSION: The availability of medical reports in the electronic health records, of the elementary variables within the reports, and the performance of the extraction algorithms limit the population for which the indicators can be calculated
520			\|a CONCLUSIONS: The automated calculation of quality indicators from electronic health records is a prospect that comes up against many practical obstacles
650		4	\|a Journal Article
650		4	\|a Multicenter Study
650		4	\|a Observational Study
650		4	\|a Electronic Data Processing
650		4	\|a Health Care
650		4	\|a Indicateurs de qualité
650		4	\|a Natural Language Processing
650		4	\|a Quality Indicators
650		4	\|a Soins de santé
650		4	\|a Traitement du langage naturel
650		4	\|a Traitement électronique de données
700	1		\|a Priou, Sonia \|e verfasserin \|4 aut
700	1		\|a Flicoteaux, Rémi \|e verfasserin \|4 aut
700	1		\|a Lamé, Guillaume \|e verfasserin \|4 aut
700	1		\|a Bey, Romain \|e verfasserin \|4 aut
700	1		\|a Tannier, Xavier \|e verfasserin \|4 aut
700	1		\|a Cohen, Ariel \|e verfasserin \|4 aut
700	1		\|a Chatellier, Gilles \|e verfasserin \|4 aut
700	1		\|a Daniel, Christel \|e verfasserin \|4 aut
700	1		\|a Tournigand, Christophe \|e verfasserin \|4 aut
700	1		\|a Kempf, Emmanuelle \|e verfasserin \|4 aut
700	0		\|a AP-HP Cancer Group, a CRAB \|e verfasserin \|4 aut
700	0		\|a initiative \|e verfasserin \|4 aut
773	0	8	\|i Enthalten in \|t Revue d'epidemiologie et de sante publique \|d 1993 \|g 71(2023), 6 vom: 21. Dez., Seite 102189 \|w (DE-627)NLM000133043 \|x 0398-7620 \|7 nnns
773	1	8	\|g volume:71 \|g year:2023 \|g number:6 \|g day:21 \|g month:12 \|g pages:102189
856	4	0	\|u http://dx.doi.org/10.1016/j.respe.2023.102189 \|3 Volltext
912			\|a GBV_USEFLAG_A
912			\|a GBV_NLM
951			\|a AR
952			\|d 71 \|j 2023 \|e 6 \|b 21 \|c 12 \|h 102189
