Bioactive Peptide Recognition Based on NLP Pre-Train Algorithm
Bioactive peptides are peptide sequences within a protein that can regulate important bodily functions through their diverse activities. With the development of machine learning, more computational methods have been proposed for bioactive peptide recognition, so that the task no longer relies solely on tedious and time-consuming wet-lab experiments. However, the training and testing of existing models are limited to small datasets, which affects model performance. Inspired by the success of sequence classification with unlabeled data in natural language processing, we propose a pre-training method for bioactive peptide recognition. Pre-trained on large-scale protein sequence data, our method achieved the best performance in multi-functional peptide identification, covering anti-cancer, anti-diabetic, anti-hypertensive, anti-inflammatory and anti-microbial peptides. Compared with the advanced existing model, our model improves precision, coverage, accuracy and absolute true by 7.2%, 6.9%, 6.1% and 4.2%, respectively, in 5-fold cross-validation. In addition, the results indicate that the model has superior prediction performance in single-function peptide recognition, especially for anti-cancer and anti-microbial peptides, which have longer sequences.
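The abstract reports four multi-label metrics (precision, coverage, accuracy and absolute true) without defining them. The sketch below assumes the example-based definitions commonly used in multi-functional peptide prediction, where each score is averaged over peptides and computed from the overlap between the true and predicted label sets. This is an illustration under that assumption, not the authors' evaluation code. The function name, the label abbreviations (`ACP`, `AMP`, and so on) and the handling of empty label sets are hypothetical choices made here.

```python
from typing import Dict, List, Set


def multilabel_metrics(true_sets: List[Set[str]],
                       pred_sets: List[Set[str]]) -> Dict[str, float]:
    """Example-based multi-label scores, averaged over all samples.

    Assumed definitions (not taken from the abstract):
      precision     = |Y ∩ P| / |P|
      coverage      = |Y ∩ P| / |Y|
      accuracy      = |Y ∩ P| / |Y ∪ P|
      absolute true = 1 if Y == P else 0
    """
    if not true_sets or len(true_sets) != len(pred_sets):
        raise ValueError("inputs must be non-empty and of equal length")

    totals = {"precision": 0.0, "coverage": 0.0,
              "accuracy": 0.0, "absolute_true": 0.0}
    for y, p in zip(true_sets, pred_sets):
        inter = len(y & p)
        union = len(y | p)
        # Assumption: an empty denominator contributes 0, except that two
        # empty label sets count as a perfect match for accuracy.
        totals["precision"] += inter / len(p) if p else 0.0
        totals["coverage"] += inter / len(y) if y else 0.0
        totals["accuracy"] += inter / union if union else 1.0
        totals["absolute_true"] += 1.0 if y == p else 0.0

    n = len(true_sets)
    return {name: value / n for name, value in totals.items()}


# Toy data with hypothetical label abbreviations.
true_labels = [{"ACP", "AMP"}, {"ADP"}, {"AHP"}]
pred_labels = [{"ACP"}, {"ADP"}, {"AIP"}]
scores = multilabel_metrics(true_labels, pred_labels)
```

Under these definitions, absolute true is the strictest score, because a peptide counts only when its predicted label set matches the true set exactly.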
Field | Value
---|---
Media type | E-article
Year of publication | 2023
Published | 2023
Contained in | IEEE/ACM transactions on computational biology and bioinformatics - 20(2023), no. 6, 1 Nov., pages 3809-3819
Language | English
Contributors | Jiang, Likun [author]
Notes | Date Completed 26.12.2023; Date Revised 26.12.2023; published: Print-Electronic; Citation Status MEDLINE
doi | 10.1109/TCBB.2023.3323295
PPN (catalog ID) | NLM363091068
LEADER | 01000caa a22002652 4500 | ||
---|---|---|---|
001 | NLM363091068 | ||
003 | DE-627 | ||
005 | 20231227141258.0 | ||
007 | cr uuu---uuuuu | ||
008 | 231226s2023 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1109/TCBB.2023.3323295 |2 doi | |
028 | 5 | 2 | |a pubmed24n1239.xml |
035 | |a (DE-627)NLM363091068 | ||
035 | |a (NLM)37815965 | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
100 | 1 | |a Jiang, Likun |e verfasserin |4 aut | |
245 | 1 | 0 | |a Bioactive Peptide Recognition Based on NLP Pre-Train Algorithm |
264 | 1 | |c 2023 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
500 | |a Date Completed 26.12.2023 | ||
500 | |a Date Revised 26.12.2023 | ||
500 | |a published: Print-Electronic | ||
500 | |a Citation Status MEDLINE | ||
520 | |a Bioactive peptides are defined as peptide sequences within a protein that can regulate important bodily functions through their myriad activities. With the development of machine learning, more computational methods were proposed for bioactive peptides recognition so that this task does not only rely on tedious and time-consuming wet-experiment. But the training and testing process of existing models are limited to small datasets, which affects model performance. Inspired by the success of sequence classification in natural language processing with unlabeled data, we proposed a pre-training method for Bioactive peptides recognition. By pre-trained with large-scale of protein sequences, our method achieved the best performance in multiple functional peptides identification including anti-cancer, anti-diabetic, anti-hypertensive, anti-inflammatory and anti-microbial peptides. Compared with the advanced model, our model's precision, coverage, accuracy and absolute true are improved by 7.2%, 6.9%, 6.1% and 4.2% in the result of 5-fold cross-validation. In addition, the results indicate the model has superior prediction performance in single functional peptides recognition, especially for anti-cancer peptides and anti-microbial peptides which with longer sequences | ||
650 | 4 | |a Journal Article | |
650 | 7 | |a Peptides |2 NLM | |
650 | 7 | |a Anti-Inflammatory Agents |2 NLM | |
700 | 1 | |a Sun, Nan |e verfasserin |4 aut | |
700 | 1 | |a Zhang, Yue |e verfasserin |4 aut | |
700 | 1 | |a Yu, Xinyu |e verfasserin |4 aut | |
700 | 1 | |a Liu, Xiangrong |e verfasserin |4 aut | |
773 | 0 | 8 | |i Enthalten in |t IEEE/ACM transactions on computational biology and bioinformatics |d 2004 |g 20(2023), 6 vom: 01. Nov., Seite 3809-3819 |w (DE-627)NLM16601530X |x 1557-9964 |7 nnns |
773 | 1 | 8 | |g volume:20 |g year:2023 |g number:6 |g day:01 |g month:11 |g pages:3809-3819 |
856 | 4 | 0 | |u http://dx.doi.org/10.1109/TCBB.2023.3323295 |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a GBV_NLM | ||
951 | |a AR | ||
952 | |d 20 |j 2023 |e 6 |b 01 |c 11 |h 3809-3819 |