A New Tool for Holistic Residency Application Review: Using Natural Language Processing of Applicant Experiences to Predict Interview Invitation

Copyright © 2023 by the Association of American Medical Colleges.

PROBLEM: Reviewing residency application narrative components is time intensive and has contributed to nearly half of applications not receiving holistic review. The authors developed a natural language processing (NLP)-based tool to automate review of applicants' narrative experience entries and predict interview invitation.

APPROACH: Experience entries (n = 188,500) were extracted from 6,403 residency applications across 3 application cycles (2017-2019) at 1 internal medicine program, combined at the applicant level, and paired with the interview invitation decision (n = 1,224 invitations). NLP identified important words (or word pairs) with term frequency-inverse document frequency, which were used to predict interview invitation using logistic regression with L1 regularization. Terms remaining in the model were analyzed thematically. Logistic regression models were also built using structured application data and a combination of NLP and structured data. Model performance was evaluated on never-before-seen data using area under the receiver operating characteristic and precision-recall curves (AUROC, AUPRC).

OUTCOMES: The NLP model had an AUROC of 0.80 (vs chance decision of 0.50) and AUPRC of 0.49 (vs chance decision of 0.19), showing moderate predictive strength. Phrases indicating active leadership, research, or work in social justice and health disparities were associated with interview invitation. The model's detection of these key selection factors demonstrated face validity. Adding structured data to the model significantly improved prediction (AUROC 0.92, AUPRC 0.73), as expected given reliance on such metrics for interview invitation.
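The two metrics and their chance baselines can be sketched in plain Python. This is an illustrative sketch, not the authors' evaluation code: the function names are hypothetical, and average precision is used here as one common estimator of AUPRC. The chance-level AUPRC equals the proportion of positive cases; using the counts given in the abstract (1,224 invitations among 6,403 applications) yields a value consistent with the reported 0.19.

```python
def auroc(labels, scores):
    """Probability that a random positive is scored above a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Count pairwise wins; ties count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each positive case."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Chance baselines: 0.5 for AUROC; the positive rate for AUPRC.
chance_auprc = 1224 / 6403  # invitations / applications, from the abstract

# Tiny made-up example (not study data).
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.1]
toy_auroc = auroc(toy_labels, toy_scores)
toy_ap = average_precision(toy_labels, toy_scores)
```

Because invitations are the minority class, AUPRC is read against a much lower baseline than AUROC, which is why 0.49 still indicates prediction well above chance.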

NEXT STEPS: This model represents a first step in using NLP-based artificial intelligence tools to promote holistic residency application review. The authors are assessing the practical utility of using this model to identify applicants screened out using traditional metrics. Generalizability must be determined through model retraining and evaluation at other programs. Work is ongoing to thwart model "gaming," improve prediction, and remove unwanted biases introduced during model training.

Media type:

E-article

Year of publication:

2023

Published:

2023

Contained in:

To the parent record - volume:98

Contained in:

Academic medicine : journal of the Association of American Medical Colleges - 98(2023), 9 of 1 Sept., pages 1018-1021

Language:

English

Contributors:

Mahtani, Arun Umesh [Author]
Reinstein, Ilan [Author]
Marin, Marina [Author]
Burk-Rafel, Jesse [Author]

Links:

Full text

Subjects:

Journal Article

Notes:

Date Completed 23.10.2023

Date Revised 24.10.2023

published: Print-Electronic

Citation Status MEDLINE

doi:

10.1097/ACM.0000000000005210

Funding:

Funding institution / Project title:

PPN (catalog ID):

NLM354457586