Compositional features analysis by machine learning in genome represents linear adaptation of monkeypox virus
Copyright © 2024 Zhang, Li, Cai, Kang, Feng, Li, Chen, Li, Bao and Jiang.
Introduction: Global headlines have been dominated by the sudden and widespread outbreak of monkeypox, a rare and endemic zoonotic disease caused by the monkeypox virus (MPXV). Machine learning (ML) methods based on genomic composition have recently shown promise in identifying the host adaptability and evolutionary patterns of viruses. This study aimed to analyze the genomic characteristics and evolutionary patterns of MPXV using ML methods.

Methods: The open reading frame (ORF) regions of full-length MPXV genomes were filtered, and 165 ORFs were selected as the clusters with the highest homology. The unsupervised ML methods t-distributed stochastic neighbor embedding (t-SNE), principal component analysis (PCA), and hierarchical clustering were applied to examine the dinucleotide composition representation (DCR) characteristics of the selected ORF clusters.

Results: MPXV sequences post-2022 showed an obvious linear adaptive evolution, indicating that the virus has become more adapted to the human host after accumulating mutations. For a more precise analysis, the ORF regions with larger variations were filtered out based on the ranking of homology differences to narrow down the key ORF clusters, and this analysis led to the same conclusion of linear adaptability. The structures of key differential proteins were then predicted with AlphaFold 2, suggesting that differences in the main domains might be one of the internal reasons for linear adaptive evolution.

Discussion: Understanding the process of linear adaptation is critical in the constant evolutionary struggle between viruses and their hosts and plays a significant role in crafting effective measures to tackle viral diseases. The present study therefore provides valuable insights into the evolutionary patterns of MPXV in 2022 from the perspective of genomic composition characteristics analyzed through ML methods.
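As an illustration of the dinucleotide composition representation (DCR) named in this record, the following is a minimal sketch only. It assumes that DCR can be approximated by normalized counts of overlapping dinucleotides; the record does not give the authors' exact definition, and the function names and the two toy sequences are hypothetical.

```python
from itertools import product
from math import sqrt

# All 16 dinucleotides in a fixed order, so every vector shares one layout.
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]


def dinucleotide_composition(seq):
    """Return the 16 overlapping-dinucleotide frequencies of a DNA sequence.

    Assumption: a simple normalized count, not necessarily the DCR
    definition used in the study.
    """
    seq = seq.upper()
    counts = dict.fromkeys(DINUCLEOTIDES, 0)
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # skip pairs containing N or other ambiguity codes
            counts[pair] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(DINUCLEOTIDES)
    return [counts[d] / total for d in DINUCLEOTIDES]


def euclidean(u, v):
    """Euclidean distance between two composition vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical toy sequences standing in for two ORFs.
orf_a = "ATGGCGTATATTAA"
orf_b = "ATGGCGTACATTAA"
vec_a = dinucleotide_composition(orf_a)
vec_b = dinucleotide_composition(orf_b)
distance = euclidean(vec_a, vec_b)
```

Vectors of this kind, one per ORF or genome, are the sort of input that PCA, t-SNE, or hierarchical clustering would then operate on; those steps are omitted here.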
Media type: | E-article |
---|---|
Year of publication: | 2024 |
Published: | 2024 |
Contained in: | To the parent record - volume:15 |
Contained in: | Frontiers in genetics - 15(2024), dated: 12., page 1361952 |
Language: | English |
Contributors: | Zhang, Sen [author] |
Subjects: | Dinucleotide composition representation (DCR) |
Notes: | Date Revised 19.03.2024; published: Electronic-eCollection; Citation Status PubMed-not-MEDLINE |
doi: | 10.3389/fgene.2024.1361952 |
PPN (catalog ID): | NLM369853563 |
LEADER | 01000caa a22002652 4500 | ||
---|---|---|---|
001 | NLM369853563 | ||
003 | DE-627 | ||
005 | 20240319233211.0 | ||
007 | cr uuu---uuuuu | ||
008 | 240318s2024 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.3389/fgene.2024.1361952 |2 doi | |
028 | 5 | 2 | |a pubmed24n1336.xml |
035 | |a (DE-627)NLM369853563 | ||
035 | |a (NLM)38495668 | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
100 | 1 | |a Zhang, Sen |e verfasserin |4 aut | |
245 | 1 | 0 | |a Compositional features analysis by machine learning in genome represents linear adaptation of monkeypox virus |
264 | 1 | |c 2024 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
500 | |a Date Revised 19.03.2024 | ||
500 | |a published: Electronic-eCollection | ||
500 | |a Citation Status PubMed-not-MEDLINE | ||
650 | 4 | |a Journal Article | |
650 | 4 | |a dinucleotide composition representation (DCR) | |
650 | 4 | |a linear adaptation | |
650 | 4 | |a machine learning | |
650 | 4 | |a monkeypox viruses | |
650 | 4 | |a open reading frame clusters | |
700 | 1 | |a Li, Ya-Dan |e verfasserin |4 aut | |
700 | 1 | |a Cai, Yu-Rong |e verfasserin |4 aut | |
700 | 1 | |a Kang, Xiao-Ping |e verfasserin |4 aut | |
700 | 1 | |a Feng, Ye |e verfasserin |4 aut | |
700 | 1 | |a Li, Yu-Chang |e verfasserin |4 aut | |
700 | 1 | |a Chen, Yue-Hong |e verfasserin |4 aut | |
700 | 1 | |a Li, Jing |e verfasserin |4 aut | |
700 | 1 | |a Bao, Li-Li |e verfasserin |4 aut | |
700 | 1 | |a Jiang, Tao |e verfasserin |4 aut | |
773 | 0 | 8 | |i Enthalten in |t Frontiers in genetics |d 2010 |g 15(2024) vom: 12., Seite 1361952 |w (DE-627)NLM20904649X |x 1664-8021 |7 nnns |
773 | 1 | 8 | |g volume:15 |g year:2024 |g day:12 |g pages:1361952 |
856 | 4 | 0 | |u http://dx.doi.org/10.3389/fgene.2024.1361952 |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a GBV_NLM | ||
951 | |a AR | ||
952 | |d 15 |j 2024 |b 12 |h 1361952 |