Publication details - Leveraging Biomolecule and Natural Language through Multi-Modal Learning: A Survey

Leveraging Biomolecule and Natural Language through Multi-Modal Learning: A Survey

The integration of biomolecular modeling with natural language (BL) has emerged as a promising interdisciplinary area at the intersection of artificial intelligence, chemistry, and biology. This approach leverages the rich, multifaceted descriptions of biomolecules found in textual data sources to deepen our understanding and to enable downstream computational tasks such as biomolecule property prediction. Fusing the nuanced narratives expressed in natural language with the structural and functional specifics of biomolecules captured by molecular modeling techniques opens new avenues for comprehensively representing and analyzing biomolecules. By incorporating the contextual language data that surrounds biomolecules into their modeling, BL aims to capture a holistic view encompassing both the symbolic qualities conveyed through language and quantitative structural characteristics. In this review, we provide an extensive analysis of recent advances in the cross-modeling of biomolecules and natural language. (1) We begin by outlining the technical representations of biomolecules employed, including sequences, 2D graphs, and 3D structures. (2) We then examine in depth the rationale and key objectives underlying effective multi-modal integration of language and molecular data sources. (3) We subsequently survey the practical applications enabled to date in this developing research area. (4) We also compile and summarize the available resources and datasets to facilitate future work. (5) Looking ahead, we identify several promising research directions worthy of further exploration and investment to continue advancing the field. The related resources and contents are kept up to date at https://github.com/QizhiPei/Awesome-Biomolecule-Language-Cross-Modeling.

Media type:	Preprint

Year of publication:	2024
Published:	2024

Contained in:	arXiv.org - (2024), 3 March. To the parent record - year:2024

Language:	English

Contributors:	Pei, Qizhi [author]; Wu, Lijun [author]; Gao, Kaiyuan [author]; Zhu, Jinhua [author]; Wang, Yue [author]; Wang, Zun [author]; Qin, Tao [author]; Yan, Rui [author]

Links:	Full text [free of charge]

Subjects:	000; 570; Computer Science - Artificial Intelligence; Computer Science - Computation and Language; Quantitative Biology - Biomolecules

Funding institution / project title:

PPN (catalog ID):	XCH042773431

Internal format


LEADER	01000naa a22002652 4500
001	XCH042773431
003	DE-627
005	20240306114457.0
007	cr uuu---uuuuu
008	240306s2024 xx \|\|\|\|\|o 00\| \|\|eng c
035			\|a (DE-627)XCH042773431
035			\|a (chemrXiv)2403.01528
040			\|a DE-627 \|b ger \|c DE-627 \|e rakwb
041			\|a eng
100	1		\|a Pei, Qizhi \|e verfasserin \|4 aut
245	1	0	\|a Leveraging Biomolecule and Natural Language through Multi-Modal Learning: A Survey
264		1	\|c 2024
336			\|a Text \|b txt \|2 rdacontent
337			\|a Computermedien \|b c \|2 rdamedia
338			\|a Online-Ressource \|b cr \|2 rdacarrier
520			\|a The integration of biomolecular modeling with natural language (BL) has emerged as a promising interdisciplinary area at the intersection of artificial intelligence, chemistry, and biology. This approach leverages the rich, multifaceted descriptions of biomolecules found in textual data sources to deepen our understanding and to enable downstream computational tasks such as biomolecule property prediction. Fusing the nuanced narratives expressed in natural language with the structural and functional specifics of biomolecules captured by molecular modeling techniques opens new avenues for comprehensively representing and analyzing biomolecules. By incorporating the contextual language data that surrounds biomolecules into their modeling, BL aims to capture a holistic view encompassing both the symbolic qualities conveyed through language and quantitative structural characteristics. In this review, we provide an extensive analysis of recent advances in the cross-modeling of biomolecules and natural language. (1) We begin by outlining the technical representations of biomolecules employed, including sequences, 2D graphs, and 3D structures. (2) We then examine in depth the rationale and key objectives underlying effective multi-modal integration of language and molecular data sources. (3) We subsequently survey the practical applications enabled to date in this developing research area. (4) We also compile and summarize the available resources and datasets to facilitate future work. (5) Looking ahead, we identify several promising research directions worthy of further exploration and investment to continue advancing the field. The related resources and contents are kept up to date at https://github.com/QizhiPei/Awesome-Biomolecule-Language-Cross-Modeling.
650		4	\|a Computer Science - Computation and Language \|7 (dpeaa)DE-84
650		4	\|a Computer Science - Artificial Intelligence \|7 (dpeaa)DE-84
650		4	\|a Quantitative Biology - Biomolecules \|7 (dpeaa)DE-84
650		4	\|a 000 \|7 (dpeaa)DE-84
650		4	\|a 570 \|7 (dpeaa)DE-84
700	1		\|a Wu, Lijun \|4 aut
700	1		\|a Gao, Kaiyuan \|4 aut
700	1		\|a Zhu, Jinhua \|4 aut
700	1		\|a Wang, Yue \|4 aut
700	1		\|a Wang, Zun \|4 aut
700	1		\|a Qin, Tao \|4 aut
700	1		\|a Yan, Rui \|4 aut
773	0	8	\|i Enthalten in \|t arXiv.org \|g (2024) vom: 03. März
773	1	8	\|g year:2024 \|g day:03 \|g month:03
856	4	0	\|u https://arxiv.org/abs/2403.01528 \|z kostenfrei \|3 Volltext
912			\|a GBV_XCH
951			\|a AR
952			\|j 2024 \|b 03 \|c 03
