Automated Agnostic Designation of Pathogen Lineages

Abstract:

Pathogen nomenclature systems are a key component of effective communication and collaboration for researchers and public health workers. Since February 2021, the Pango nomenclature for SARS-CoV-2 has been sustained by crowdsourced lineage proposals as new isolates were added to a growing global dataset. This approach to dynamic lineage designation depends on a large and active epidemiological community identifying and curating each new lineage. The process is vulnerable to time-critical delays as well as regional and personal bias. To address these issues, we developed a simple heuristic approach that divides a phylogenetic tree into lineages based on shared ancestral genotypes. We additionally provide a framework, extensible to any pathogen, that automatically prioritizes the lineages by growth rate and association with key mutations or locations. Our implementation is efficient on extremely large phylogenetic trees and produces similar results to existing Pango lineage designations when applied to SARS-CoV-2. This method offers a simple, automated, and consistent approach to pathogen nomenclature that can assist researchers in developing and maintaining phylogeny-based classifications in the face of ever-increasing genomic datasets.

Media type:

Preprint

Year of publication:

2023

Published:

2023

Contained in:

bioRxiv.org - (2023), 23 March

Language:

English

Contributors:

McBroome, Jakob [Author]
de Bernardi Schneider, Adriano [Author]
Roemer, Cornelius [Author]
Wolfinger, Michael T. [Author]
Hinrichs, Angie S. [Author]
O’Toole, Aine Niamh [Author]
Ruis, Christopher [Author]
Turakhia, Yatish [Author]
Rambaut, Andrew [Author]
Corbett-Detig, Russell [Author]

Links:

Full text [free of charge]

Subjects:

570
Biology

DOI:

10.1101/2023.02.03.527052

PPN (catalog ID):

XBI03860194X