Dual-View Learning Based on Images and Sequences for Molecular Property Prediction
The prediction of molecular properties remains a challenging task in the field of drug design and development. Recently, there has been a growing interest in the analysis of biological images. Molecular images, as a novel representation, have proven to be competitive, yet they lack explicit information and detailed semantic richness. Conversely, semantic information in SMILES sequences is explicit but lacks spatial structural details. Therefore, in this study, we focus on and explore the relationship between these two types of representations, proposing a novel multimodal architecture named ISMol. ISMol relies on a cross-attention mechanism to extract information representations of molecules from both images and SMILES strings, thereby predicting molecular properties. Evaluation results on 14 small molecule ADMET datasets indicate that ISMol outperforms machine learning (ML) and deep learning (DL) models based on single-modal representations. In addition, we analyze our method through a large number of experiments to test the superiority, interpretability and generalizability of the method. In summary, ISMol offers a powerful deep learning toolbox for drug discovery in a variety of molecular properties.
Media type: | E-article
---|---
Year of publication: | 2024
Published: | 2024
Contained in: | IEEE journal of biomedical and health informatics - 28(2024), 3, 1 March, pages 1564-1574
Language: | English
Contributors: | Zhang, Xiang [author]
Notes: | Date Completed 07.03.2024; Date Revised 07.03.2024; published: Print-Electronic; Citation Status MEDLINE
doi: | 10.1109/JBHI.2023.3347794
PPN (catalog ID): | NLM366444689
LEADER | 01000caa a22002652 4500 | ||
---|---|---|---|
001 | NLM366444689 | ||
003 | DE-627 | ||
005 | 20240307232145.0 | ||
007 | cr uuu---uuuuu | ||
008 | 240108s2024 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1109/JBHI.2023.3347794 |2 doi | |
028 | 5 | 2 | |a pubmed24n1319.xml |
035 | |a (DE-627)NLM366444689 | ||
035 | |a (NLM)38153823 | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
100 | 1 | |a Zhang, Xiang |e verfasserin |4 aut | |
245 | 1 | 0 | |a Dual-View Learning Based on Images and Sequences for Molecular Property Prediction |
264 | 1 | |c 2024 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
500 | |a Date Completed 07.03.2024 | ||
500 | |a Date Revised 07.03.2024 | ||
500 | |a published: Print-Electronic | ||
500 | |a Citation Status MEDLINE | ||
520 | |a The prediction of molecular properties remains a challenging task in the field of drug design and development. Recently, there has been a growing interest in the analysis of biological images. Molecular images, as a novel representation, have proven to be competitive, yet they lack explicit information and detailed semantic richness. Conversely, semantic information in SMILES sequences is explicit but lacks spatial structural details. Therefore, in this study, we focus on and explore the relationship between these two types of representations, proposing a novel multimodal architecture named ISMol. ISMol relies on a cross-attention mechanism to extract information representations of molecules from both images and SMILES strings, thereby predicting molecular properties. Evaluation results on 14 small molecule ADMET datasets indicate that ISMol outperforms machine learning (ML) and deep learning (DL) models based on single-modal representations. In addition, we analyze our method through a large number of experiments to test the superiority, interpretability and generalizability of the method. In summary, ISMol offers a powerful deep learning toolbox for drug discovery in a variety of molecular properties | ||
650 | 4 | |a Journal Article | |
700 | 1 | |a Xiang, Hongxin |e verfasserin |4 aut | |
700 | 1 | |a Yang, Xixi |e verfasserin |4 aut | |
700 | 1 | |a Dong, Jingxin |e verfasserin |4 aut | |
700 | 1 | |a Fu, Xiangzheng |e verfasserin |4 aut | |
700 | 1 | |a Zeng, Xiangxiang |e verfasserin |4 aut | |
700 | 1 | |a Chen, Haowen |e verfasserin |4 aut | |
700 | 1 | |a Li, Keqin |e verfasserin |4 aut | |
773 | 0 | 8 | |i Enthalten in |t IEEE journal of biomedical and health informatics |d 2013 |g 28(2024), 3 vom: 01. März, Seite 1564-1574 |w (DE-627)NLM217081614 |x 2168-2208 |7 nnns |
773 | 1 | 8 | |g volume:28 |g year:2024 |g number:3 |g day:01 |g month:03 |g pages:1564-1574 |
856 | 4 | 0 | |u http://dx.doi.org/10.1109/JBHI.2023.3347794 |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a GBV_NLM | ||
951 | |a AR | ||
952 | |d 28 |j 2024 |e 3 |b 01 |c 03 |h 1564-1574 |