Publication details - Single-sequence protein structure prediction by integrating protein language models

Protein structure prediction has been greatly improved by deep learning in the past few years. However, the most successful methods rely on a multiple sequence alignment (MSA) of the sequence homologs of the protein under prediction. In nature, a protein folds in the absence of its sequence homologs, and thus an MSA-free structure prediction method is desired. Here, we develop a single-sequence-based protein structure prediction method, RaptorX-Single, by integrating several protein language models and a structure generation module, and then study its advantage over MSA-based methods. Our experimental results indicate that, in addition to running much faster than MSA-based methods such as AlphaFold2, RaptorX-Single outperforms AlphaFold2 and other MSA-free methods in predicting the structure of antibodies (after fine-tuning on antibody data), proteins with very few sequence homologs, and single mutation effects. By comparing different protein language models, our results show that not only the scale but also the training data of protein language models affects performance. RaptorX-Single also compares favorably to MSA-based AlphaFold2 when the protein under prediction has a large number of sequence homologs.

Media type:	E-article

Year of publication:	2024
Published:	2024

Contained in:	To the complete record - volume:121
Contained in:	Proceedings of the National Academy of Sciences of the United States of America - 121(2024), 13, 26 March, page e2308788121

Language:	English

Contributors:	Jing, Xiaoyang [author]; Wu, Fandi [author]; Luo, Xiao [author]; Xu, Jinbo [author]

Links:	Full text

Topics:	Antibodies; Antibody structure prediction; Journal Article; Protein language model; Protein structure prediction; Proteins; Single mutation effect; Single-sequence protein structure prediction

Notes:	Date Completed 22.03.2024; Date Revised 05.04.2024; published: Print-Electronic; Citation Status MEDLINE

doi:	10.1073/pnas.2308788121

PPN (catalog ID):	NLM369970489

Internal format


LEADER	01000caa a22002652 4500
001	NLM369970489
003	DE-627
005	20240405233912.0
007	cr uuu---uuuuu
008	240322s2024 xx \|\|\|\|\|o 00\| \|\|eng c
024	7		\|a 10.1073/pnas.2308788121 \|2 doi
028	5	2	\|a pubmed24n1366.xml
035			\|a (DE-627)NLM369970489
035			\|a (NLM)38507445
040			\|a DE-627 \|b ger \|c DE-627 \|e rakwb
041			\|a eng
100	1		\|a Jing, Xiaoyang \|e verfasserin \|4 aut
245	1	0	\|a Single-sequence protein structure prediction by integrating protein language models
264		1	\|c 2024
336			\|a Text \|b txt \|2 rdacontent
337			\|a Computermedien \|b c \|2 rdamedia
338			\|a Online-Ressource \|b cr \|2 rdacarrier
500			\|a Date Completed 22.03.2024
500			\|a Date Revised 05.04.2024
500			\|a published: Print-Electronic
500			\|a Citation Status MEDLINE
520			\|a Protein structure prediction has been greatly improved by deep learning in the past few years. However, the most successful methods rely on multiple sequence alignment (MSA) of the sequence homologs of the protein under prediction. In nature, a protein folds in the absence of its sequence homologs and thus, a MSA-free structure prediction method is desired. Here, we develop a single-sequence-based protein structure prediction method RaptorX-Single by integrating several protein language models and a structure generation module and then study its advantage over MSA-based methods. Our experimental results indicate that in addition to running much faster than MSA-based methods such as AlphaFold2, RaptorX-Single outperforms AlphaFold2 and other MSA-free methods in predicting the structure of antibodies (after fine-tuning on antibody data), proteins of very few sequence homologs, and single mutation effects. By comparing different protein language models, our results show that not only the scale but also the training data of protein language models will impact the performance. RaptorX-Single also compares favorably to MSA-based AlphaFold2 when the protein under prediction has a large number of sequence homologs
650		4	\|a Journal Article
650		4	\|a antibody structure prediction
650		4	\|a protein language model
650		4	\|a protein structure prediction
650		4	\|a single mutation effect
650		4	\|a single-sequence protein structure rediction
650		7	\|a Proteins \|2 NLM
650		7	\|a Antibodies \|2 NLM
700	1		\|a Wu, Fandi \|e verfasserin \|4 aut
700	1		\|a Luo, Xiao \|e verfasserin \|4 aut
700	1		\|a Xu, Jinbo \|e verfasserin \|4 aut
773	0	8	\|i Enthalten in \|t Proceedings of the National Academy of Sciences of the United States of America \|d 1915 \|g 121(2024), 13 vom: 26. März, Seite e2308788121 \|w (DE-627)NLM000008982 \|x 1091-6490 \|7 nnns
773	1	8	\|g volume:121 \|g year:2024 \|g number:13 \|g day:26 \|g month:03 \|g pages:e2308788121
856	4	0	\|u http://dx.doi.org/10.1073/pnas.2308788121 \|3 Volltext
912			\|a GBV_USEFLAG_A
912			\|a GBV_NLM
951			\|a AR
952			\|d 121 \|j 2024 \|e 13 \|b 26 \|c 03 \|h e2308788121
