DICE: Deep Significance Clustering for Outcome-Driven Stratification
Abstract: We present deep significance clustering (DICE), a framework for jointly performing representation learning and clustering for “outcome-driven” stratification. Motivated by practical needs in medicine to risk-stratify patients into subgroups, DICE brings self-supervision to unsupervised tasks to generate cluster membership that may be used to categorize unseen patients by risk level. DICE is driven by a combined objective function and constraint that require a statistically significant association between the outcome and the cluster membership of the learned representations. DICE also performs a neural architecture search to optimize cluster membership and hyper-parameters for model likelihood and classification accuracy. The performance of DICE was evaluated using two datasets with different outcome ratios, extracted from real-world electronic health records of patients treated for coronavirus disease 2019 and for heart failure. The outcomes are in-hospital mortality (15.9%) and discharge home (36.8%), respectively. Results show that, compared to baseline approaches, DICE performs better on clustering as measured by the difference in outcome distribution across clusters, Silhouette score, Calinski-Harabasz index, and Davies-Bouldin index, and on outcome classification as measured by area under the ROC curve.
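The constraint described in the abstract, a statistically significant association between outcome and cluster membership, can be illustrated with a small check. The following is a minimal sketch and not the authors' implementation: it assumes a binary outcome and substitutes a plain pairwise Pearson chi-square test (one degree of freedom, no continuity correction) for whatever test DICE actually uses. The function names `chi2_2x2`, `p_value_df1`, and `pairwise_outcome_tests` are hypothetical.

```python
import math
from itertools import combinations


def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]].

    Rows are two clusters; columns are outcome present / outcome absent.
    No continuity correction is applied (an assumption of this sketch).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 0.0  # degenerate table: no evidence of association
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


def p_value_df1(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def pairwise_outcome_tests(clusters, outcomes):
    """Return {(k1, k2): p-value} for every pair of cluster labels.

    `clusters` holds one cluster label per patient; `outcomes` holds the
    matching binary outcome (truthy = outcome occurred).
    """
    counts = {}
    for k, y in zip(clusters, outcomes):
        pos, neg = counts.get(k, (0, 0))
        counts[k] = (pos + 1, neg) if y else (pos, neg + 1)
    results = {}
    for k1, k2 in combinations(sorted(counts), 2):
        a, b = counts[k1]
        c, d = counts[k2]
        results[(k1, k2)] = p_value_df1(chi2_2x2(a, b, c, d))
    return results


# Hypothetical data: cluster 0 has 5/100 events, cluster 1 has 40/100 events.
example_clusters = [0] * 100 + [1] * 100
example_outcomes = [1] * 5 + [0] * 95 + [1] * 40 + [0] * 60
example_p_values = pairwise_outcome_tests(example_clusters, example_outcomes)
```

Under this sketch, a clustering would satisfy the constraint only if every pairwise p-value falls below a chosen threshold such as 0.05; the threshold and the choice of test are assumptions made for illustration.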

Media type | Preprint |
---|---|
Year of publication | 2022 |
Published | 2022 |
Contained in | bioRxiv.org - (2022), 22 Nov. - year:2022 |
Language | English |
Persons involved | Huang, Yufang [author] |
Links | Full text [free of charge] |
doi | 10.1101/2020.10.04.20204321 |
PPN (catalog ID) | XBI019077394 |

LEADER | 01000caa a22002652 4500 | ||
---|---|---|---|
001 | XBI019077394 | ||
003 | DE-627 | ||
005 | 20230429092201.0 | ||
007 | cr uuu---uuuuu | ||
008 | 201010s2022 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1101/2020.10.04.20204321 |2 doi | |
035 | |a (DE-627)XBI019077394 | ||
035 | |a (biorXiv)10.1101/2020.10.04.20204321 | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
100 | 1 | |a Huang, Yufang |e verfasserin |4 aut | |
245 | 1 | 0 | |a DICE: Deep Significance Clustering for Outcome-Driven Stratification |
264 | 1 | |c 2022 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
520 | |a Abstract We present deep significance clustering (DICE), a framework for jointly performing representation learning and clustering for “outcome-driven” stratification. Motivated by practical needs in medicine to risk-stratify patients into subgroups, DICE brings self-supervision to unsupervised tasks to generate cluster membership that may be used to categorize unseen patients by risk levels. DICE is driven by a combined objective function and constraint which require a statistically significant association between the outcome and cluster membership of learned representations. DICE also performs a neural architecture search to optimize cluster membership and hyper-parameters for model likelihood and classification accuracy. The performance of DICE was evaluated using two datasets with different outcome ratios extracted from real-world electronic health records of patients who were treated for coronavirus disease 2019 and heart failure. Outcomes are defined as in-hospital mortality (15.9%) and discharge home (36.8%), respectively. Results show that DICE has superior performance as measured by the difference in outcome distribution across clusters, Silhouette score, Calinski-Harabasz index, and Davies-Bouldin index for clustering, and Area under the ROC Curve for outcome classification compared to baseline approaches. | ||
650 | 4 | |a Biology |7 (dpeaa)DE-84 | |
650 | 4 | |a 570 |7 (dpeaa)DE-84 | |
700 | 1 | |a Park, Joel C. |e verfasserin |4 aut | |
700 | 1 | |a Axsom, Kelly M. |e verfasserin |4 aut | |
700 | 1 | |a Subramanian, Lakshminarayanan |e verfasserin |4 aut | |
700 | 1 | |a Zhang, Yiye |e verfasserin |4 aut | |
773 | 0 | 8 | |i Enthalten in |t bioRxiv.org |g (2022) vom: 22. Nov. |
773 | 1 | 8 | |g year:2022 |g day:22 |g month:11 |
856 | 4 | 0 | |u http://dx.doi.org/10.1101/2020.10.04.20204321 |z kostenfrei |3 Volltext |
912 | |a GBV_XBI | ||
912 | |a SSG-OLC-PHA | ||
951 | |a AR | ||
952 | |j 2022 |b 22 |c 11 |