SARS-CoV-2 lineage assignments using phylogenetic placement/UShER are superior to pangoLEARN machine-learning method

© The Author(s) 2024. Published by Oxford University Press.

With the rapid spread and evolution of SARS-CoV-2, the ability to monitor its transmission and distinguish among viral lineages is critical for pandemic response efforts. The most commonly used software for the lineage assignment of newly isolated SARS-CoV-2 genomes is pangolin, which offers two methods of assignment, pangoLEARN and pUShER. PangoLEARN rapidly assigns lineages using a machine-learning algorithm, while pUShER performs a phylogenetic placement to identify the lineage corresponding to a newly sequenced genome. In a preliminary study, we observed that pangoLEARN (decision tree model), while substantially faster than pUShER, offered less consistency across different versions of pangolin v3. Here, we expand upon this analysis to include v3 and v4 of pangolin, which moved the default algorithm for lineage assignment from pangoLEARN in v3 to pUShER in v4, and perform a thorough analysis confirming that pUShER is not only more stable across versions but also more accurate. Our findings suggest that future lineage assignment algorithms for various pathogens should consider the value of phylogenetic placement.

Media type:

E-article

Year of publication:

2024

Published:

2024

Contained in:

Virus evolution - 10(2024), 1 dated: 19., page vead085

Language:

English

Contributors:

de Bernardi Schneider, Adriano [Author]
Su, Michelle [Author]
Hinrichs, Angie S [Author]
Wang, Jade [Author]
Amin, Helly [Author]
Bell, John [Author]
Wadford, Debra A [Author]
O'Toole, Áine [Author]
Scher, Emily [Author]
Perry, Marc D [Author]
Turakhia, Yatish [Author]
De Maio, Nicola [Author]
Hughes, Scott [Author]
Corbett-Detig, Russ [Author]

Links:

Full text

Subjects:

Bioinformatics
COVID-19
Journal Article
Phylogenetics
Variants

Notes:

Date Revised 17.02.2024

published: Electronic-eCollection

Citation Status PubMed-not-MEDLINE

doi:

10.1093/ve/vead085

PPN (catalog ID):

NLM368518965