Integrating Chemistry Knowledge in Large Language Models via Prompt Engineering

This paper presents a study on integrating domain-specific knowledge into prompt engineering to enhance the performance of large language models (LLMs) in scientific domains. A benchmark dataset is curated to encapsulate the physicochemical properties of small molecules, their druggability for pharmacology, and the functional attributes of enzymes and crystal materials, underscoring relevance and applicability across biological and chemical domains. The proposed domain-knowledge embedded prompt engineering method outperforms traditional prompt engineering strategies on various metrics, including capability, accuracy, F1 score, and hallucination drop. The effectiveness of the method is demonstrated through case studies on complex materials, including the MacMillan catalyst, paclitaxel, and lithium cobalt oxide. The results suggest that domain-knowledge prompts can guide LLMs to generate more accurate and relevant responses, highlighting the potential of LLMs as powerful tools for scientific discovery and innovation when equipped with domain-specific prompts. The study also discusses limitations and future directions for domain-specific prompt engineering.
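
To illustrate the core idea described in the abstract, the following is a minimal Python sketch of domain-knowledge-embedded prompting: curated facts are injected into the prompt before it reaches the LLM. The build_prompt helper, the template wording, and the example facts about lithium cobalt oxide (one of the paper's case-study materials) are hypothetical illustrations, not the authors' actual prompts or benchmark data.

# Minimal sketch of domain-knowledge-embedded prompt engineering.
# Template and facts are illustrative assumptions, not the authors' method.

def build_prompt(question: str, domain_facts: list[str]) -> str:
    """Prepend curated domain facts so the LLM grounds its answer in them."""
    knowledge = "\n".join(f"- {fact}" for fact in domain_facts)
    return (
        "You are a chemistry assistant. Ground your answer in the facts "
        "below; state clearly if they are insufficient.\n\n"
        f"Domain knowledge:\n{knowledge}\n\n"
        f"Question: {question}\nAnswer:"
    )

# Hypothetical usage with one of the paper's case-study materials.
facts = [
    "LiCoO2 adopts a layered structure (space group R-3m).",
    "LiCoO2 is a widely used lithium-ion battery cathode material.",
]
print(build_prompt("Why is LiCoO2 a good cathode material?", facts))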

Media type:

Preprint

Year of publication:

2024

Published:

2024

Contained in:

arXiv.org - (2024), dated 22 Apr. 2024

Language:

English

Contributors:

Liu, Hongxuan [Author]
Yin, Haoyu [Author]
Luo, Zhiyao [Author]
Wang, Xiaonan [Author]

Links:

Full text [free of charge]

Subjects:

000
Computer Science - Artificial Intelligence
Computer Science - Computation and Language

Funding institution / Project title:

PPN (Catalog ID):

XAR04336814X