DrFuse: Learning Disentangled Representation for Clinical Multi-Modal Fusion with Missing Modality and Modal Inconsistency
The combination of electronic health records (EHR) and medical images is crucial for clinicians in making diagnoses and forecasting prognosis. Strategically fusing these two data modalities has great potential to improve the accuracy of machine learning models in clinical prediction tasks. However, the asynchronous and complementary nature of EHR and medical images presents unique challenges. Missing modalities due to clinical and administrative factors are inevitable in practice, and the significance of each data modality varies depending on the patient and the prediction target, resulting in inconsistent predictions and suboptimal model performance. To address these challenges, we propose DrFuse to achieve effective clinical multi-modal fusion. It tackles the missing modality issue by disentangling the features shared across modalities and those unique within each modality. Furthermore, we address the modal inconsistency issue via a disease-wise attention layer that produces the patient- and disease-wise weighting for each modality to make the final prediction. We validate the proposed method using real-world large-scale datasets, MIMIC-IV and MIMIC-CXR. Experimental results show that the proposed method significantly outperforms the state-of-the-art models. Our implementation is publicly available at https://github.com/dorothy-yao/drfuse.
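The disease-wise attention layer described in the abstract can be illustrated with a minimal sketch: per-disease query vectors score each modality's representation, a softmax turns the scores into patient- and disease-specific modality weights, and each disease's fused representation is the weighted sum. This is an assumption-based sketch in plain Python; the function names, shapes, and dot-product scoring are illustrative assumptions, not the authors' implementation (see the linked repository for the actual code).

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def disease_wise_fusion(feats, queries):
    """Toy disease-wise attention fusion.

    feats:   one feature vector per modality (e.g. EHR, chest X-ray).
    queries: one query vector per disease (learnable in a real model).
    Returns (fused, weights): per-disease fused vectors and the
    per-disease modality weights, which sum to 1 for each disease.
    """
    fused, weights = [], []
    for q in queries:
        # Score each modality for this disease via a dot product (assumption).
        scores = [sum(qi * fi for qi, fi in zip(q, f)) for f in feats]
        w = softmax(scores)
        # Fuse modalities with disease-specific weights.
        fused.append([sum(wm * f[i] for wm, f in zip(w, feats))
                      for i in range(len(q))])
        weights.append(w)
    return fused, weights

# Demo: two modalities, two diseases; each query aligns with one modality,
# so each disease attends more to "its" modality.
feats = [[1.0, 0.0], [0.0, 1.0]]
queries = [[2.0, 0.0], [0.0, 2.0]]
fused, weights = disease_wise_fusion(feats, queries)
```

In a real model the queries would be learned parameters and the fused representation would feed a per-disease classification head; the sketch only shows how one set of modality features can receive different weights for different prediction targets.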
Media type: Preprint
Year of publication: 2024
Published: 2024
Contained in: arXiv.org - (2024), 10 March - year:2024
Language: English
Contributors: Yao, Wenfang [author]
Links: Full text [free of charge]
Subjects: 000
Funding institution / project title:
PPN (catalogue ID): XAR042855675
LEADER 01000caa a22002652 4500
001    XAR042855675
003    DE-627
005    20240313080451.0
007    cr uuu---uuuuu
008    240312s2024 xx |||||o 00| ||eng c
035    |a (DE-627)XAR042855675
035    |a (arXiv)2403.06197
040    |a DE-627 |b ger |c DE-627 |e rakwb
041    |a eng
100 1  |a Yao, Wenfang |e verfasserin |4 aut
245 10 |a DrFuse: Learning Disentangled Representation for Clinical Multi-Modal Fusion with Missing Modality and Modal Inconsistency
264  1 |c 2024
336    |a Text |b txt |2 rdacontent
337    |a Computermedien |b c |2 rdamedia
338    |a Online-Ressource |b cr |2 rdacarrier
520    |a The combination of electronic health records (EHR) and medical images is crucial for clinicians in making diagnoses and forecasting prognosis. Strategically fusing these two data modalities has great potential to improve the accuracy of machine learning models in clinical prediction tasks. However, the asynchronous and complementary nature of EHR and medical images presents unique challenges. Missing modalities due to clinical and administrative factors are inevitable in practice, and the significance of each data modality varies depending on the patient and the prediction target, resulting in inconsistent predictions and suboptimal model performance. To address these challenges, we propose DrFuse to achieve effective clinical multi-modal fusion. It tackles the missing modality issue by disentangling the features shared across modalities and those unique within each modality. Furthermore, we address the modal inconsistency issue via a disease-wise attention layer that produces the patient- and disease-wise weighting for each modality to make the final prediction. We validate the proposed method using real-world large-scale datasets, MIMIC-IV and MIMIC-CXR. Experimental results show that the proposed method significantly outperforms the state-of-the-art models. Our implementation is publicly available at https://github.com/dorothy-yao/drfuse.
650  4 |a Electrical Engineering and Systems Science - Image and Video Processing |7 (dpeaa)DE-84
650  4 |a Computer Science - Computer Vision and Pattern Recognition |7 (dpeaa)DE-84
650  4 |a Computer Science - Machine Learning |7 (dpeaa)DE-84
650  4 |a 620 |7 (dpeaa)DE-84
650  4 |a 000 |7 (dpeaa)DE-84
700 1  |a Yin, Kejing |4 aut
700 1  |a Cheung, William K. |4 aut
700 1  |a Liu, Jia |4 aut
700 1  |a Qin, Jing |4 aut
773 08 |i Enthalten in |t arXiv.org |g (2024) vom: 10. März
773 18 |g year:2024 |g day:10 |g month:03
856 40 |u https://arxiv.org/abs/2403.06197 |z kostenfrei |3 Volltext
912    |a GBV_XAR
951    |a AR
952    |j 2024 |b 10 |c 03