Seq-BEL: Sequence-Based Ensemble Learning for Predicting Virus-Human Protein-Protein Interaction
Infectious diseases are among the most important and widespread health problems, and identifying viral infection mechanisms is critical for controlling diseases caused by highly infectious viruses. Because verified non-interacting protein pairs are scarce and positive and negative samples are severely imbalanced, conventional supervised learning algorithms are poorly suited to predicting virus-human protein-protein interactions (PPIs). Moreover, because information on viral proteins is limited and their sequences are highly dissimilar, some ensemble learning models generalize poorly. In this paper, we propose a Sequence-Based Ensemble Learning (Seq-BEL) method to predict potential virus-human PPIs. Specifically, using the amino acid sequences of proteins and the currently known virus-human PPI network, Seq-BEL computes multiple features and similarities for human and viral proteins, then combines them to score candidate virus-human PPIs. The computational results show that Seq-BEL successfully predicts potential virus-human PPIs and outperforms other state-of-the-art methods. More importantly, Seq-BEL also performs well on new human proteins and new viral proteins. In addition, the model is robust and generalizes well, and can serve as an effective tool for virus-human PPI prediction.
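The abstract describes scoring candidate pairs by combining sequence-derived similarities with the known interaction network. The sketch below illustrates that general idea only; it is not the Seq-BEL algorithm, whose actual features, similarity measures, and ensemble weights are not given in this record. The dipeptide-frequency features, cosine similarity, equal 0.5 weighting, and all function names are assumptions chosen for illustration.

```python
from collections import Counter
from itertools import product
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_profile(seq, k=2):
    """Normalized k-mer frequency vector (assumed feature, not from the paper)."""
    kmers = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers) or 1  # avoid division by zero
    return [counts[m] / total for m in kmers]


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def similarity_matrix(seqs, k=2):
    """Pairwise cosine similarity between the k-mer profiles of the sequences."""
    profiles = [kmer_profile(s, k) for s in seqs]
    return [[cosine(p, q) for q in profiles] for p in profiles]


def weighted_mean(pairs):
    """Mean of the values weighted by similarity; 0.0 if all weights are zero."""
    total = sum(w for w, _ in pairs)
    return sum(w * x for w, x in pairs) / total if total else 0.0


def score_pairs(viral_seqs, human_seqs, known):
    """Score every (viral i, human j) pair.

    known[i][j] is 1 if viral protein i is known to interact with human
    protein j, else 0. Each score averages two neighbor votes: one from
    similar viral proteins, one from similar human proteins.
    """
    sv = similarity_matrix(viral_seqs)
    sh = similarity_matrix(human_seqs)
    nv, nh = len(viral_seqs), len(human_seqs)
    scores = [[0.0] * nh for _ in range(nv)]
    for i in range(nv):
        for j in range(nh):
            # Viral proteins similar to i that are known to bind human j.
            virus_side = [(sv[i][a], known[a][j]) for a in range(nv) if a != i]
            # Human proteins similar to j that are known to bind viral i.
            human_side = [(sh[j][b], known[i][b]) for b in range(nh) if b != j]
            scores[i][j] = 0.5 * (weighted_mean(virus_side)
                                  + weighted_mean(human_side))
    return scores
```

In this sketch, a viral protein with no known interactions can still receive a non-zero score through the virus-side vote, provided its sequence resembles a viral protein that does have known partners. That is one simple way a similarity-based scorer can handle previously unseen proteins.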
Media type: | E-article
---|---
Year of publication: | 2022
Published: | 2022
Contained in: | IEEE/ACM transactions on computational biology and bioinformatics - 19(2022), 3, 15 May, pages 1322-1333
Language: | English
Contributors: | Ma, Yingjun [author]
Subjects: | Journal Article
Notes: | Date Completed 08.06.2022; Date Revised 08.06.2022; published: Print-Electronic; Citation Status MEDLINE
doi: | 10.1109/TCBB.2020.3008157
PPN (catalog ID): | NLM313261253

LEADER | 01000naa a22002652 4500 | ||
---|---|---|---|
001 | NLM313261253 | ||
003 | DE-627 | ||
005 | 20231225150210.0 | ||
007 | cr uuu---uuuuu | ||
008 | 231225s2022 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1109/TCBB.2020.3008157 |2 doi | |
028 | 5 | 2 | |a pubmed24n1044.xml |
035 | |a (DE-627)NLM313261253 | ||
035 | |a (NLM)32750886 | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
100 | 1 | |a Ma, Yingjun |e verfasserin |4 aut | |
245 | 1 | 0 | |a Seq-BEL |b Sequence-Based Ensemble Learning for Predicting Virus-Human Protein-Protein Interaction |
264 | 1 | |c 2022 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
500 | |a Date Completed 08.06.2022 | ||
500 | |a Date Revised 08.06.2022 | ||
500 | |a published: Print-Electronic | ||
500 | |a Citation Status MEDLINE | ||
520 | |a Infectious diseases are currently the most important and widespread health problem, and identifying viral infection mechanisms is critical for controlling diseases caused by highly infectious viruses. Because of the lack of non-interactive protein pairs and serious imbalance between positive and negative sample ratios, the supervised learning algorithm is not suitable for prediction. At the same time, due to the lack of information on viral proteins and significant dissimilarity in sequence, some ensemble learning models have poor generalization ability. In this paper, we propose a Sequence-Based Ensemble Learning (Seq-BEL) method to predict the potential virus-human PPIs. Specifically, based on the amino acid sequence of proteins and the currently known virus-human PPI network, Seq-BEL calculates various features and similarities of human proteins and viral proteins, and then combines these similarities and features to score the potential of virus-human PPIs. The computational results show that Seq-BEL achieves success in predicting potential virus-human PPIs and outperforms other state-of-the-art methods. More importantly, Seq-BEL also has good predictive performance for new human proteins and new viral proteins. In addition, the model has the advantages of strong robustness and good generalization ability, and can be used as an effective tool for virus-human PPI prediction | ||
650 | 4 | |a Journal Article | |
650 | 4 | |a Research Support, Non-U.S. Gov't | |
650 | 7 | |a Viral Proteins |2 NLM | |
700 | 1 | |a He, Tingting |e verfasserin |4 aut | |
700 | 1 | |a Tan, Yuting |e verfasserin |4 aut | |
700 | 1 | |a Jiang, Xingpeng |e verfasserin |4 aut | |
773 | 0 | 8 | |i Enthalten in |t IEEE/ACM transactions on computational biology and bioinformatics |d 2004 |g 19(2022), 3 vom: 15. Mai, Seite 1322-1333 |w (DE-627)NLM16601530X |x 1557-9964 |7 nnns |
773 | 1 | 8 | |g volume:19 |g year:2022 |g number:3 |g day:15 |g month:05 |g pages:1322-1333 |
856 | 4 | 0 | |u http://dx.doi.org/10.1109/TCBB.2020.3008157 |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a GBV_NLM | ||
951 | |a AR | ||
952 | |d 19 |j 2022 |e 3 |b 15 |c 05 |h 1322-1333 |