Dual-View Learning Based on Images and Sequences for Molecular Property Prediction

Predicting molecular properties remains a challenging task in drug design and development. Recently, interest in analyzing biological images has grown. Molecular images, as a novel representation, have proven competitive, yet they lack explicit semantic information; conversely, the semantic information in SMILES sequences is explicit but lacks spatial structural detail. In this study, we therefore explore the relationship between these two representations and propose a novel multimodal architecture named ISMol. ISMol uses a cross-attention mechanism to extract informative molecular representations from both images and SMILES strings, and from these predicts molecular properties. Evaluation on 14 small-molecule ADMET datasets shows that ISMol outperforms machine learning (ML) and deep learning (DL) models based on single-modal representations. In addition, we analyze our method through extensive experiments to demonstrate its superiority, interpretability, and generalizability. In summary, ISMol offers a powerful deep learning toolbox for predicting a variety of molecular properties in drug discovery.
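The cross-attention fusion described in the abstract can be sketched in a few lines: queries from one modality (e.g., image-patch embeddings) attend over keys and values from the other (SMILES token embeddings). The following is a minimal NumPy illustration, not the authors' implementation; the dimensions, embeddings, and function names are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values, d_k):
    """Scaled dot-product attention where one modality's tokens (queries)
    attend to the other modality's tokens (keys and values)."""
    Q = queries
    K = V = keys_values
    scores = Q @ K.T / np.sqrt(d_k)     # (n_q, n_kv) similarity scores
    weights = softmax(scores, axis=-1)  # each query's distribution over the other modality
    return weights @ V                  # fused representation, shape (n_q, d_k)

# Hypothetical embeddings: 49 image patches and 32 SMILES tokens, dimension 16.
rng = np.random.default_rng(0)
d = 16
img_patches = rng.normal(size=(49, d))
smiles_tokens = rng.normal(size=(32, d))

# Image patches attend to SMILES tokens; the symmetric direction works the same way.
img_attends_smiles = cross_attention(img_patches, smiles_tokens, d)
print(img_attends_smiles.shape)  # (49, 16)
```

In a full model this fused representation would feed a prediction head for the downstream property; here the sketch only shows the attention step itself.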

Media type:

E-article

Year of publication:

2024

Published:

2024

Contained in:

Link to complete record - volume:28

Contained in:

IEEE journal of biomedical and health informatics - 28(2024), no. 3, 01 March, pages 1564-1574

Language:

English

Contributors:

Zhang, Xiang [Author]
Xiang, Hongxin [Author]
Yang, Xixi [Author]
Dong, Jingxin [Author]
Fu, Xiangzheng [Author]
Zeng, Xiangxiang [Author]
Chen, Haowen [Author]
Li, Keqin [Author]

Links:

Full text

Subjects:

Journal Article

Notes:

Date Completed 07.03.2024

Date Revised 07.03.2024

published: Print-Electronic

Citation Status MEDLINE

DOI:

10.1109/JBHI.2023.3347794

Funding:

Funding institution / project title:

PPN (catalog ID):

NLM366444689