Modeling the mosaic structure of bacterial genomes to infer their evolutionary history
The chronology and phylogeny of bacterial evolution are difficult to reconstruct due to a scarce fossil record. The analysis of bacterial genomes remains challenging because of large sequence divergence, the plasticity of bacterial genomes due to frequent gene loss, horizontal gene transfer, and differences in selective pressure from one locus to another. Therefore, taking advantage of the rich and rapidly accumulating genomic data requires accurate modeling of genome evolution. An important technical consideration is that loci with high effective mutation rates may diverge beyond the detection limit of the alignment algorithms used, biasing the genome-wide divergence estimates toward smaller divergences. In this article, we propose a novel method to gain insight into bacterial evolution based on statistical properties of genome comparisons. We find that the length distribution of sequence matches is shaped by the effective mutation rates of different loci, by horizontal transfers, and by the aligner sensitivity. Based on these inputs, we build a model and show that it accounts for the empirically observed distributions, taking the Enterobacteriaceae family as an example. Our method makes it possible to distinguish segments of vertical and horizontal origin and to estimate the divergence time and exchange rate between any pair of taxa from genome-wide alignments. Based on the estimated divergence times, we construct a time-calibrated phylogenetic tree to demonstrate the accuracy of the method.

| Field | Value |
|---|---|
| Media type | E-article |
| Year of publication | 2024 |
| Published | 2024 |
| Contained in | Proceedings of the National Academy of Sciences of the United States of America - 121(2024), 13, 26 March, page e2313367121 |
| Language | English |
| Contributors | Sheinman, Michael [author] |
| Subjects | Bacterial evolution |
| Notes | Date Completed 25.03.2024; Date Revised 05.04.2024; published: Print-Electronic; Citation Status MEDLINE |
| doi | 10.1073/pnas.2313367121 |
| PPN (catalog ID) | NLM370075536 |

LEADER | 01000caa a22002652 4500 | ||
---|---|---|---|
001 | NLM370075536 | ||
003 | DE-627 | ||
005 | 20240405233933.0 | ||
007 | cr uuu---uuuuu | ||
008 | 240323s2024 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1073/pnas.2313367121 |2 doi | |
028 | 5 | 2 | |a pubmed24n1366.xml |
035 | |a (DE-627)NLM370075536 | ||
035 | |a (NLM)38517978 | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
100 | 1 | |a Sheinman, Michael |e verfasserin |4 aut | |
245 | 1 | 0 | |a Modeling the mosaic structure of bacterial genomes to infer their evolutionary history |
264 | 1 | |c 2024 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
500 | |a Date Completed 25.03.2024 | ||
500 | |a Date Revised 05.04.2024 | ||
500 | |a published: Print-Electronic | ||
500 | |a Citation Status MEDLINE | ||
520 | |a The chronology and phylogeny of bacterial evolution are difficult to reconstruct due to a scarce fossil record. The analysis of bacterial genomes remains challenging because of large sequence divergence, the plasticity of bacterial genomes due to frequent gene loss, horizontal gene transfer, and differences in selective pressure from one locus to another. Therefore, taking advantage of the rich and rapidly accumulating genomic data requires accurate modeling of genome evolution. An important technical consideration is that loci with high effective mutation rates may diverge beyond the detection limit of the alignment algorithms used, biasing the genome-wide divergence estimates toward smaller divergences. In this article, we propose a novel method to gain insight into bacterial evolution based on statistical properties of genome comparisons. We find that the length distribution of sequence matches is shaped by the effective mutation rates of different loci, by horizontal transfers, and by the aligner sensitivity. Based on these inputs, we build a model and show that it accounts for the empirically observed distributions, taking the Enterobacteriaceae family as an example. Our method makes it possible to distinguish segments of vertical and horizontal origin and to estimate the divergence time and exchange rate between any pair of taxa from genome-wide alignments. Based on the estimated divergence times, we construct a time-calibrated phylogenetic tree to demonstrate the accuracy of the method. | ||
650 | 4 | |a Journal Article | |
650 | 4 | |a bacterial evolution | |
650 | 4 | |a horizontal gene transfer | |
650 | 4 | |a maximal exact matches | |
650 | 4 | |a molecular clock | |
650 | 4 | |a mutation rate | |
700 | 1 | |a Arndt, Peter F |e verfasserin |4 aut | |
700 | 1 | |a Massip, Florian |e verfasserin |4 aut | |
773 | 0 | 8 | |i Contained in |t Proceedings of the National Academy of Sciences of the United States of America |d 1915 |g 121(2024), 13, 26 March, page e2313367121 |w (DE-627)NLM000008982 |x 1091-6490 |7 nnns |
773 | 1 | 8 | |g volume:121 |g year:2024 |g number:13 |g day:26 |g month:03 |g pages:e2313367121 |
856 | 4 | 0 | |u http://dx.doi.org/10.1073/pnas.2313367121 |3 Full text |
912 | |a GBV_USEFLAG_A | ||
912 | |a GBV_NLM | ||
951 | |a AR | ||
952 | |d 121 |j 2024 |e 13 |b 26 |c 03 |h e2313367121 |