Publication details - Towards Medical Billing Automation: NLP for Outpatient Clinician Note Classification

Towards Medical Billing Automation: NLP for Outpatient Clinician Note Classification

Objectives: Our primary objective was to develop a natural language processing (NLP) approach that accurately predicts outpatient Evaluation and Management (E/M) level of service (LoS) codes using clinicians' notes from a health system electronic health record. A secondary objective was to investigate the impact of clinic note de-identification on document classification performance.

Methods: We used retrospective outpatient office clinic notes from four medical and surgical specialties. Classification models were fine-tuned on the clinic note datasets and stratified by subspecialty. The success criteria for the classification tasks were classification accuracy and F1-scores on internal test data. For the secondary objective, the dataset was de-identified using Named Entity Recognition (NER) to remove protected health information (PHI), and the models were retrained.

Results: The models demonstrated similar predictive performance across specialties, except for internal medicine, which had the lowest classification accuracy across all model architectures. The models trained on the entire note corpus achieved an E/M LoS CPT code classification accuracy of 74.8% (95% CI: 74.1-75.6). However, models trained on the de-identified note corpus showed a markedly lower classification accuracy of 48.2% (95% CI: 47.7-48.6) than those trained on the identified notes.

Conclusion: The study demonstrates the potential of NLP-based document classifiers to accurately predict E/M LoS CPT codes using clinical notes from various medical and procedural specialties. The models' performance suggests that the classification task's complexity merits further investigation. The de-identification experiment showed that removing PHI may negatively impact classifier performance. Further research is needed to validate the performance of our NLP classifiers in different healthcare settings and patient populations and to investigate the potential implications of de-identification on model performance.
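The abstract names classification accuracy and F1-score as success criteria and reports accuracy with a 95% confidence interval. As a minimal sketch of how such figures can be computed, the Python below uses only the standard library. Several things here are assumptions and not taken from the preprint: the record does not say whether F1 was macro- or weighted-averaged (macro is shown), it does not say how the interval was derived (a normal-approximation interval is shown), and the label strings and function names are purely illustrative.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of notes whose predicted code equals the reference code."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def normal_approx_ci(acc, n, z=1.96):
    """Normal-approximation (Wald) interval for a proportion, clipped to [0, 1].

    Assumed method: the source does not state how its interval was computed.
    """
    half_width = z * math.sqrt(acc * (1 - acc) / n)
    return max(0.0, acc - half_width), min(1.0, acc + half_width)


# Illustrative labels only; these are not data from the study.
y_true = ["99213", "99214", "99214", "99215"]
y_pred = ["99213", "99214", "99213", "99215"]

acc = accuracy(y_true, y_pred)
low, high = normal_approx_ci(acc, len(y_true))
print(f"accuracy={acc:.3f}, macro-F1={macro_f1(y_true, y_pred):.3f}, "
      f"95% CI=({low:.3f}, {high:.3f})")
```

With only four toy examples the interval is very wide. Narrow intervals such as those quoted in the abstract require a large test set, and a bootstrap over test notes would be a reasonable alternative to the approximation sketched here.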

Media type:	Preprint

Year of publication:	2023
Published:	2023

Contained in:	bioRxiv.org - (2023), 12 July

Language:	English

Contributors:	Crowson, Matthew G [author]; Alsentzer, Emily [author]; Fiskio, Julie M [author]; Bates, David [author]

Links:	Full text [free of charge]

Subjects:	570 Biology

doi:	10.1101/2023.07.07.23292367

PPN (catalog ID):	XBI040181537

Internal format


LEADER	01000caa a22002652 4500
001	XBI040181537
003	DE-627
005	20231205144652.0
007	cr uuu---uuuuu
008	230713s2023 xx |||||o 00| ||eng c
024	7		|a 10.1101/2023.07.07.23292367 |2 doi
035			|a (DE-627)XBI040181537
035			|a (biorXiv)10.1101/2023.07.07.23292367
040			|a DE-627 |b ger |c DE-627 |e rakwb
041			|a eng
100	1		|a Crowson, Matthew G |e verfasserin |0 (orcid)0000-0001-9950-0985 |4 aut
245	1	0	|a Towards Medical Billing Automation: NLP for Outpatient Clinician Note Classification
264		1	|c 2023
336			|a Text |b txt |2 rdacontent
337			|a Computermedien |b c |2 rdamedia
338			|a Online-Ressource |b cr |2 rdacarrier
520			|a Objectives: Our primary objective was to develop a natural language processing approach that accurately predicts outpatient Evaluation and Management (E/M) level of service (LoS) codes using clinicians notes from a health system electronic health record. A secondary objective was to investigate the impact of clinic note de-identification on document classification performance. Methods: We used retrospective outpatient office clinic notes from four medical and surgical specialties. Classification models were fine-tuned on the clinic notes datasets and stratified by subspecialty. The success criteria for the classification tasks were the classification accuracy and F1-scores on internal test data. For the secondary objective, the dataset was de-identified using Named Entity Recognition (NER) to remove protected health information (PHI), and models were retrained. Results: The models demonstrated similar predictive performance across different specialties, except for internal medicine, which had the lowest classification accuracy across all model architectures. The models trained on the entire note corpus achieved an E/M LoS CPT code classification accuracy of 74.8% (CI 95: 74.1-75.6). However, the de-identified note corpus showed a markedly lower classification accuracy of 48.2% (CI 95: 47.7-48.6) compared to the model trained on the identified notes. Conclusion: The study demonstrates the potential of NLP-based document classifiers to accurately predict E/M LoS CPT codes using clinical notes from various medical and procedural specialties. The models' performance suggests that the classification task's complexity merits further investigation. The de-identification experiment demonstrated that de-identification may negatively impact classifier performance. Further research is needed to validate the performance of our NLP classifiers in different healthcare settings and patient populations and to investigate the potential implications of de-identification on model performance.
650		4	|a Biology |7 (dpeaa)DE-84
650		4	|a 570 |7 (dpeaa)DE-84
700	1		|a Alsentzer, Emily |0 (orcid)0000-0002-5370-1746 |4 aut
700	1		|a Fiskio, Julie M |4 aut
700	1		|a Bates, David |0 (orcid)0000-0001-6268-1540 |4 aut
773	0	8	|i Enthalten in |t bioRxiv.org |g (2023) vom: 12. Juli
773	1	8	|g year:2023 |g day:12 |g month:07
856	4	0	|u http://dx.doi.org/10.1101/2023.07.07.23292367 |z kostenfrei |3 Volltext
912			|a GBV_XBI
951			|a AR
952			|j 2023 |b 12 |c 07
