Seq-BEL : Sequence-Based Ensemble Learning for Predicting Virus-Human Protein-Protein Interaction

Infectious diseases are currently among the most important and widespread health problems, and identifying viral infection mechanisms is critical for controlling diseases caused by highly infectious viruses. Because verified non-interacting protein pairs are scarce and the ratio of positive to negative samples is severely imbalanced, conventional supervised learning algorithms are poorly suited to this prediction task. At the same time, because information on viral proteins is limited and their sequences are highly dissimilar, some ensemble learning models generalize poorly. In this paper, we propose a Sequence-Based Ensemble Learning (Seq-BEL) method to predict potential virus-human protein-protein interactions (PPIs). Specifically, using the amino acid sequences of proteins and the currently known virus-human PPI network, Seq-BEL calculates various features and similarities of human and viral proteins, then combines them to score potential virus-human PPIs. The computational results show that Seq-BEL predicts potential virus-human PPIs successfully and outperforms other state-of-the-art methods. More importantly, Seq-BEL also performs well on new human proteins and new viral proteins. In addition, the model is robust and generalizes well, and can serve as an effective tool for virus-human PPI prediction.
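
The general idea of scoring a candidate pair from sequence-derived features and a known interaction network can be illustrated with a minimal sketch. This is not the Seq-BEL algorithm, whose features, similarity measures, and ensemble scheme are not given in this record. As assumptions for illustration only, the sketch uses amino acid composition as the feature, cosine similarity as the similarity measure, and a similarity-weighted average over known interactions as the score; the names `composition`, `cosine`, and `score_pair` are hypothetical.

```python
from collections import Counter
from math import sqrt

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Return the fraction of each standard amino acid in a sequence.

    Assumption: amino acid composition stands in for the (unspecified)
    sequence features used by the actual method.
    """
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def score_pair(virus_seq, human_seq, known_ppis, virus_seqs, human_seqs):
    """Score a candidate virus-human pair against known interactions.

    known_ppis: set of (virus_id, human_id) pairs known to interact.
    virus_seqs, human_seqs: dicts mapping protein id -> sequence.
    The score is the mean, over known pairs, of the product of the
    candidate's similarity to the known viral and human partners.
    """
    if not known_ppis:
        return 0.0
    v_feat = composition(virus_seq)
    h_feat = composition(human_seq)
    total = 0.0
    for v_id, h_id in known_ppis:
        sim_v = cosine(v_feat, composition(virus_seqs[v_id]))
        sim_h = cosine(h_feat, composition(human_seqs[h_id]))
        total += sim_v * sim_h
    return total / len(known_ppis)
```

Because the score depends only on sequences, a sketch like this can be applied to viral or human proteins that do not appear in the known network, which is the setting the abstract highlights. A candidate whose sequences resemble a known interacting pair scores close to 1, while one sharing no residues with any known viral partner scores 0.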

Media type:

E-article

Year of publication:

2022

Published:

2022

Contained in:

IEEE/ACM transactions on computational biology and bioinformatics - 19(2022), 3, dated 15 May, pages 1322-1333

Language:

English

Contributors:

Ma, Yingjun [Author]
He, Tingting [Author]
Tan, Yuting [Author]
Jiang, Xingpeng [Author]

Subjects:

Journal Article
Research Support, Non-U.S. Gov't
Viral Proteins

Notes:

Date Completed 8 June 2022

Date Revised 8 June 2022

published: Print-Electronic

Citation Status MEDLINE

doi:

10.1109/TCBB.2020.3008157

PPN (catalog ID):

NLM313261253