Adjusting for indirectly measured confounding using large-scale propensity score
Copyright © 2022 The Author(s). Published by Elsevier Inc. All rights reserved.
Confounding remains one of the major challenges to causal inference with observational data. This problem is paramount in medicine, where we would like to answer causal questions from large observational datasets like electronic health records (EHRs) and administrative claims. Modern medical data typically contain tens of thousands of covariates. Such a large set carries hope that many of the confounders are directly measured, and further hope that others are indirectly measured through their correlation with measured covariates. How can we exploit these large sets of covariates for causal inference? To help answer this question, this paper examines the performance of the large-scale propensity score (LSPS) approach on causal analysis of medical data. We demonstrate that LSPS may adjust for indirectly measured confounders by including tens of thousands of covariates that may be correlated with them. We present conditions under which LSPS removes bias due to indirectly measured confounders, and we show that LSPS may avoid bias when inadvertently adjusting for variables (like colliders) that otherwise can induce bias. We demonstrate the performance of LSPS with both simulated medical data and real medical data.
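The idea of adjusting for an indirectly measured confounder through correlated covariates can be illustrated with a small simulation. The abstract does not say how the propensity model is fit, so this sketch assumes an L1-regularized logistic regression (fit here by plain proximal gradient descent) followed by stratification on the estimated propensity score. The data-generating process, the 50 proxy covariates standing in for "tens of thousands", and all function and variable names are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_l1_logistic(X, t, lam=0.001, n_iter=500):
    """L1-penalized logistic regression via proximal gradient descent.

    Assumed stand-in for a large-scale propensity model; the intercept
    (first weight) is left unpenalized.
    """
    n, p = X.shape
    Xb = np.hstack([np.ones((n, 1)), X])
    # Step size 1/L, where L bounds the curvature of the mean logistic loss.
    step = 4.0 * n / np.linalg.norm(Xb, 2) ** 2
    w = np.zeros(p + 1)
    for _ in range(n_iter):
        grad = Xb.T @ (sigmoid(Xb @ w) - t) / n
        w = w - step * grad
        # Soft-thresholding applies the L1 penalty to the covariate weights.
        w[1:] = np.sign(w[1:]) * np.maximum(np.abs(w[1:]) - step * lam, 0.0)
    return w


def stratified_effect(y, t, ps, n_strata=10):
    """Average the treated-minus-control difference over propensity strata."""
    edges = np.quantile(ps, np.linspace(0.0, 1.0, n_strata + 1))
    strata = np.clip(np.searchsorted(edges, ps, side="right") - 1, 0, n_strata - 1)
    total, weight = 0.0, 0
    for s in range(n_strata):
        in_s = strata == s
        treated, control = in_s & (t == 1), in_s & (t == 0)
        if treated.any() and control.any():
            total += in_s.sum() * (y[treated].mean() - y[control].mean())
            weight += in_s.sum()
    return total / weight


# Hypothetical simulation: u is never given to the propensity model.
rng = np.random.default_rng(0)
n, p, tau = 5000, 50, 1.0
u = rng.normal(size=n)                        # unmeasured confounder
X = u[:, None] + rng.normal(size=(n, p))      # measured covariates correlated with u
X = (X - X.mean(axis=0)) / X.std(axis=0)      # standardize before penalized fitting
t = (rng.random(n) < sigmoid(u)).astype(float)
y = tau * t + 2.0 * u + rng.normal(size=n)    # true treatment effect is tau

w = fit_l1_logistic(X, t)
ps = sigmoid(w[0] + X @ w[1:])
naive = y[t == 1].mean() - y[t == 0].mean()
adjusted = stratified_effect(y, t, ps)
print(f"true effect {tau:.2f}, naive {naive:.2f}, propensity-adjusted {adjusted:.2f}")
```

In this toy setting the naive difference in means absorbs the effect of `u`, while the stratified estimate should land much closer to `tau` because the covariates jointly act as a proxy for `u`. Some residual bias is expected, since the proxies are noisy and stratification is coarse.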
Media type | E-article
---|---
Year of publication | 2022
Published | 2022
Contained in | Journal of biomedical informatics - 134(2022), 1 Oct., page 104204
Language | English
Contributors | Zhang, Linying [author]
Notes | Date Completed 13.10.2022; Date Revised 28.11.2022; published: Print-Electronic; Citation Status MEDLINE
doi | 10.1016/j.jbi.2022.104204
PPN (catalog ID) | NLM346255740
LEADER | 01000naa a22002652 4500 | ||
---|---|---|---|
001 | NLM346255740 | ||
003 | DE-627 | ||
005 | 20231226031003.0 | ||
007 | cr uuu---uuuuu | ||
008 | 231226s2022 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1016/j.jbi.2022.104204 |2 doi | |
028 | 5 | 2 | |a pubmed24n1154.xml |
035 | |a (DE-627)NLM346255740 | ||
035 | |a (NLM)36108816 | ||
035 | |a (PII)S1532-0464(22)00209-X | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
100 | 1 | |a Zhang, Linying |e verfasserin |4 aut | |
245 | 1 | 0 | |a Adjusting for indirectly measured confounding using large-scale propensity score |
264 | 1 | |c 2022 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
500 | |a Date Completed 13.10.2022 | ||
500 | |a Date Revised 28.11.2022 | ||
500 | |a published: Print-Electronic | ||
500 | |a Citation Status MEDLINE | ||
520 | |a Copyright © 2022 The Author(s). Published by Elsevier Inc. All rights reserved. | ||
520 | |a Confounding remains one of the major challenges to causal inference with observational data. This problem is paramount in medicine, where we would like to answer causal questions from large observational datasets like electronic health records (EHRs) and administrative claims. Modern medical data typically contain tens of thousands of covariates. Such a large set carries hope that many of the confounders are directly measured, and further hope that others are indirectly measured through their correlation with measured covariates. How can we exploit these large sets of covariates for causal inference? To help answer this question, this paper examines the performance of the large-scale propensity score (LSPS) approach on causal analysis of medical data. We demonstrate that LSPS may adjust for indirectly measured confounders by including tens of thousands of covariates that may be correlated with them. We present conditions under which LSPS removes bias due to indirectly measured confounders, and we show that LSPS may avoid bias when inadvertently adjusting for variables (like colliders) that otherwise can induce bias. We demonstrate the performance of LSPS with both simulated medical data and real medical data | ||
650 | 4 | |a Journal Article | |
650 | 4 | |a Research Support, Non-U.S. Gov't | |
650 | 4 | |a Research Support, N.I.H., Extramural | |
650 | 4 | |a Research Support, U.S. Gov't, Non-P.H.S. | |
650 | 4 | |a Causal inference | |
650 | 4 | |a Electronic health record | |
650 | 4 | |a Observational study | |
650 | 4 | |a Propensity score | |
650 | 4 | |a Unmeasured confounder | |
700 | 1 | |a Wang, Yixin |e verfasserin |4 aut | |
700 | 1 | |a Schuemie, Martijn J |e verfasserin |4 aut | |
700 | 1 | |a Blei, David M |e verfasserin |4 aut | |
700 | 1 | |a Hripcsak, George |e verfasserin |4 aut | |
773 | 0 | 8 | |i Enthalten in |t Journal of biomedical informatics |d 2001 |g 134(2022) vom: 01. Okt., Seite 104204 |w (DE-627)NLM112821766 |x 1532-0480 |7 nnns |
773 | 1 | 8 | |g volume:134 |g year:2022 |g day:01 |g month:10 |g pages:104204 |
856 | 4 | 0 | |u http://dx.doi.org/10.1016/j.jbi.2022.104204 |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a GBV_NLM | ||
951 | |a AR | ||
952 | |d 134 |j 2022 |b 01 |c 10 |h 104204 |