Dirichlet process mixture models to impute missing predictor data in counterfactual prediction models: an application to predict optimal type 2 diabetes therapy

© 2023. The Author(s).

BACKGROUND: Handling missing data is a challenge for inference and regression modelling. Missing predictor information is especially problematic when building models for use in clinical practice and making predictions from them.

METHODS: We utilise a flexible Bayesian approach for handling missing predictor information in regression models. This provides practitioners with full posterior predictive distributions for both the missing predictor information (conditional on the observed predictors) and the outcome-of-interest. We apply this approach to a previously proposed counterfactual treatment selection model for type 2 diabetes second-line therapies. Our approach combines a regression model and a Dirichlet process mixture model (DPMM), where the former defines the treatment selection model, and the latter provides a flexible way to model the joint distribution of the predictors.
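To illustrate how a mixture model over the predictors yields a distribution for a missing predictor conditional on the observed ones, here is a minimal sketch in Python. It is not the authors' implementation: it assumes two continuous predictors, a single fixed draw of Gaussian mixture parameters (the hypothetical COMPONENTS below), and only the standard library. In a DPMM the number of components and their parameters would be inferred, typically by MCMC, and this calculation would be repeated for each posterior draw.

```python
import math

# Hypothetical mixture parameters for two predictors (x1, x2).
# In a real DPMM these would be posterior draws, not fixed values.
COMPONENTS = [
    {"weight": 0.5, "mean": (0.0, 0.0), "cov": ((1.0, 0.5), (0.5, 1.0))},
    {"weight": 0.5, "mean": (4.0, 4.0), "cov": ((1.0, 0.5), (0.5, 1.0))},
]


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def conditional_mixture(x1, components):
    """Distribution of the missing x2 given the observed x1.

    Returns a list of (weight, mean, variance) tuples: a univariate
    normal mixture whose weights are the component responsibilities
    given x1.
    """
    unnormalised, conditionals = [], []
    for comp in components:
        m1, m2 = comp["mean"]
        (s11, s12), (_, s22) = comp["cov"]
        # Responsibility of this component, up to a constant.
        unnormalised.append(comp["weight"] * normal_pdf(x1, m1, s11))
        # Standard bivariate normal conditioning formulas.
        cond_mean = m2 + s12 / s11 * (x1 - m1)
        cond_var = s22 - s12 ** 2 / s11
        conditionals.append((cond_mean, cond_var))
    total = sum(unnormalised)
    return [(u / total, m, v) for u, (m, v) in zip(unnormalised, conditionals)]


def mixture_mean_var(cond):
    """Overall mean and variance of a univariate normal mixture."""
    mean = sum(w * m for w, m, _ in cond)
    second_moment = sum(w * (v + m ** 2) for w, m, v in cond)
    return mean, second_moment - mean ** 2


if __name__ == "__main__":
    cond = conditional_mixture(2.0, COMPONENTS)
    print(mixture_mean_var(cond))
```

With x1 observed midway between the two component means, the conditional distribution of x2 is bimodal, so its variance is larger than that of either component alone. This is the kind of additional uncertainty that a single-value imputation would hide.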

RESULTS: We show that DPMMs can model complex relationships between predictor variables and provide a powerful means of fitting models to incomplete data (under missing-completely-at-random and missing-at-random assumptions). Because of the hierarchical model structure, the posterior distributions for the parameters and the conditional average treatment effect estimates automatically reflect the additional uncertainty associated with missing data. We also demonstrate that, when multiple predictors are missing, the DPMM can be used to explore which variable(s), if collected, could provide the most additional information about the likely outcome.

CONCLUSIONS: When developing clinical prediction models, DPMMs offer a flexible way to model complex covariate structures and handle missing predictor information. DPMM-based counterfactual prediction models can also provide additional information to support clinical decision-making, including allowing predictions with appropriate uncertainty to be made for individuals with incomplete predictor data.

Media type:

E-article

Year of publication:

2024

Published:

2024

Contained in:

BMC medical informatics and decision making - 24(2024), 1, dated 8 Jan., page 12

Language:

English

Contributors:

Cardoso, Pedro [Author]
Dennis, John M [Author]
Bowden, Jack [Author]
Shields, Beverley M [Author]
McKinley, Trevelyan J [Author]
the MASTERMIND Consortium [Author]

Subjects:

Bayesian modelling
Dirichlet process mixture model
Journal Article
Precision medicine
Research Support, Non-U.S. Gov't
Treatment selection model
Type 2 diabetes

Notes:

Date Completed 10.01.2024

Date Revised 13.03.2024

published: Electronic

Citation Status MEDLINE

doi:

10.1186/s12911-023-02400-3

PPN (catalog ID):

NLM366820001