Publication details - Optimising machine learning prediction of minimum inhibitory concentrations in Klebsiella pneumoniae

Optimising machine learning prediction of minimum inhibitory concentrations in Klebsiella pneumoniae

Minimum Inhibitory Concentrations (MICs) are the gold standard for quantitatively measuring antibiotic resistance. However, lab-based MIC determination can be time-consuming and suffers from low reproducibility, and interpretation as sensitive or resistant relies on guidelines which change over time. Genome sequencing and machine learning promise to allow in silico MIC prediction as an alternative approach which overcomes some of these difficulties, although the predicted MIC still needs to be interpreted. Nevertheless, precisely how MIC data should be handled in predictive models remains unclear, since MICs are measured semi-quantitatively, with varying resolution, and are typically also left- and right-censored within varying ranges. We therefore investigated genome-based prediction of MICs in the pathogen Klebsiella pneumoniae using 4367 genomes with both simulated semi-quantitative traits and real MICs. As we were focused on clinical interpretation, we used interpretable rather than black-box machine learning models, namely Elastic Net, Random Forests, and linear mixed models. Simulated traits were generated accounting for oligogenic, polygenic, and homoplastic genetic effects with different levels of heritability. We then assessed how model prediction accuracy was affected when MIC prediction was framed as regression or as classification. Our results showed that treating the MICs differently depending on the number of antibiotic concentration levels available was the most promising learning strategy. Specifically, to optimise both prediction accuracy and inference of the correct causal variants, we recommend considering the MICs as continuous and framing the learning problem as a regression when the number of observed antibiotic concentration levels is large, whereas with a smaller number of concentration levels they should be treated as a categorical variable and the learning problem should be framed as a classification. Our findings also underline how predictive models can be improved when prior biological knowledge is taken into account, due to the varying genetic architecture of each antibiotic resistance trait. Finally, we emphasise that expanding the population database is pivotal for the future clinical implementation of these models to support routine machine-learning-based diagnostics.
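
The recommendation in the abstract (regression when many concentration levels are observed, classification when few) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `parse_mic` and `choose_framing`, the string format of the MIC values, and the cut-off of 5 levels are all assumptions made for illustration, since the abstract does not state a threshold.

```python
import math

# Censoring prefixes, longest first so "<=" is matched before "<".
CENSOR_PREFIXES = (("<=", "left"), (">=", "right"), ("<", "left"), (">", "right"))


def parse_mic(raw):
    """Turn a MIC string such as '<=0.5', '>32' or '4' into (log2 MIC, censoring).

    The string format is an assumption; real datasets may encode censoring differently.
    """
    text = raw.strip()
    for prefix, side in CENSOR_PREFIXES:
        if text.startswith(prefix):
            return math.log2(float(text[len(prefix):])), side
    return math.log2(float(text)), "none"


def choose_framing(raw_mics, min_levels_for_regression=5):
    """Pick 'regression' or 'classification' from the number of observed MIC levels.

    A censored value (e.g. '>8') is counted as a level distinct from the
    uncensored value at the same concentration ('8').
    The default threshold of 5 is hypothetical, not taken from the publication.
    """
    levels = {parse_mic(m) for m in raw_mics}
    if len(levels) >= min_levels_for_regression:
        return "regression"
    return "classification"


if __name__ == "__main__":
    many_levels = ["<=0.5", "1", "2", "4", "8", ">8"]
    few_levels = ["<=2", "4", ">4"]
    print(choose_framing(many_levels))
    print(choose_framing(few_levels))
```

In this sketch the MICs are log2-transformed because dilution series usually double the concentration at each step; whether censored values are kept as separate categories or handled by a censoring-aware model is a design choice the sketch leaves open.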

Media type:	E-article

Year of publication:	2024
Published:	2024

Contained in:	Parent record - volume:10
Contained in:	Microbial genomics - 10(2024), no. 3, 2 March

Language:	English

Contributors:	Batisti Biffignandi, Gherard [Author]; Chindelevitch, Leonid [Author]; Corbella, Marta [Author]; Feil, Edward J [Author]; Sassera, Davide [Author]; Lees, John A [Author]

Links:	Full text

Subjects:	AMR; Anti-Bacterial Agents; Antibiotic resistance; Bacterial genomics; GWAS; Journal Article; Klebsiella pneumoniae; MIC; Machine learning

Notes:	Date Completed 27.03.2024; Date Revised 07.04.2024; published: Print; Citation Status MEDLINE

doi:	10.1099/mgen.0.001222

PPN (catalogue ID):	NLM370195019

Internal format


LEADER	01000caa a22002652 4500
001	NLM370195019
003	DE-627
005	20240407232415.0
007	cr uuu---uuuuu
008	240327s2024 xx \|\|\|\|\|o 00\| \|\|eng c
024	7		\|a 10.1099/mgen.0.001222 \|2 doi
028	5	2	\|a pubmed24n1368.xml
035			\|a (DE-627)NLM370195019
035			\|a (NLM)38529944
040			\|a DE-627 \|b ger \|c DE-627 \|e rakwb
041			\|a eng
100	1		\|a Batisti Biffignandi, Gherard \|e verfasserin \|4 aut
245	1	0	\|a Optimising machine learning prediction of minimum inhibitory concentrations in Klebsiella pneumoniae
264		1	\|c 2024
336			\|a Text \|b txt \|2 rdacontent
337			\|a Computermedien \|b c \|2 rdamedia
338			\|a Online-Ressource \|b cr \|2 rdacarrier
500			\|a Date Completed 27.03.2024
500			\|a Date Revised 07.04.2024
500			\|a published: Print
500			\|a Citation Status MEDLINE
520			\|a Minimum Inhibitory Concentrations (MICs) are the gold standard for quantitatively measuring antibiotic resistance. However, lab-based MIC determination can be time-consuming and suffers from low reproducibility, and interpretation as sensitive or resistant relies on guidelines which change over time. Genome sequencing and machine learning promise to allow in silico MIC prediction as an alternative approach which overcomes some of these difficulties, albeit the interpretation of MIC is still needed. Nevertheless, precisely how we should handle MIC data when dealing with predictive models remains unclear, since they are measured semi-quantitatively, with varying resolution, and are typically also left- and right-censored within varying ranges. We therefore investigated genome-based prediction of MICs in the pathogen Klebsiella pneumoniae using 4367 genomes with both simulated semi-quantitative traits and real MICs. As we were focused on clinical interpretation, we used interpretable rather than black-box machine learning models, namely, Elastic Net, Random Forests, and linear mixed models. Simulated traits were generated accounting for oligogenic, polygenic, and homoplastic genetic effects with different levels of heritability. Then we assessed how model prediction accuracy was affected when MICs were framed as regression and classification. Our results showed that treating the MICs differently depending on the number of concentration levels of antibiotic available was the most promising learning strategy. Specifically, to optimise both prediction accuracy and inference of the correct causal variants, we recommend considering the MICs as continuous and framing the learning problem as a regression when the number of observed antibiotic concentration levels is large, whereas with a smaller number of concentration levels they should be treated as a categorical variable and the learning problem should be framed as a classification. 
Our findings also underline how predictive models can be improved when prior biological knowledge is taken into account, due to the varying genetic architecture of each antibiotic resistance trait. Finally, we emphasise that incrementing the population database is pivotal for the future clinical implementation of these models to support routine machine-learning based diagnostics
650		4	\|a Journal Article
650		4	\|a AMR
650		4	\|a GWAS
650		4	\|a Klebsiella pneumoniae
650		4	\|a MIC
650		4	\|a antibiotic resistance
650		4	\|a bacterial genomics
650		4	\|a machine learning
650		7	\|a Anti-Bacterial Agents \|2 NLM
700	1		\|a Chindelevitch, Leonid \|e verfasserin \|4 aut
700	1		\|a Corbella, Marta \|e verfasserin \|4 aut
700	1		\|a Feil, Edward J \|e verfasserin \|4 aut
700	1		\|a Sassera, Davide \|e verfasserin \|4 aut
700	1		\|a Lees, John A \|e verfasserin \|4 aut
773	0	8	\|i Enthalten in \|t Microbial genomics \|d 2015 \|g 10(2024), 3 vom: 02. März \|w (DE-627)NLM260056065 \|x 2057-5858 \|7 nnns
773	1	8	\|g volume:10 \|g year:2024 \|g number:3 \|g day:02 \|g month:03
856	4	0	\|u http://dx.doi.org/10.1099/mgen.0.001222 \|3 Volltext
912			\|a GBV_USEFLAG_A
912			\|a GBV_NLM
951			\|a AR
952			\|d 10 \|j 2024 \|e 3 \|b 02 \|c 03
