Cross-column density functional theory-based quantitative structure-retention relationship model development powered by machine learning

© 2024. The Author(s), under exclusive licence to Springer-Verlag GmbH, DE, part of Springer Nature.

Quantitative structure-retention relationship (QSRR) modeling has emerged as an efficient alternative for predicting analyte retention times from molecular descriptors. However, most reported QSRR models are column-specific, requiring a separate model for each high-performance liquid chromatography (HPLC) system. This study evaluates the potential of machine learning (ML) algorithms and quantum mechanical (QM) descriptors to develop QSRR models that can predict retention times across three different reversed-phase HPLC columns under varying conditions. Four machine learning methods, namely partial least squares (PLS) regression, ridge regression (RR), random forest (RF), and gradient boosting (GB), were compared on a dataset of 360 retention times for 15 aromatic analytes. Molecular descriptors were calculated using density functional theory (DFT). Column characteristics, such as particle size and pore size, and experimental conditions, such as temperature and gradient time, were used as additional descriptors. The GB-QSRR model showed the best predictive performance, with a Q² of 0.989 and a root mean square error of prediction (RMSEP) of 0.749 min on the test set. Feature analysis revealed that solvation energy (SE), HOMO-LUMO energy gap (∆E HOMO-LUMO), total dipole moment (Mtot), and global hardness (η) are among the most influential predictors of retention time, indicating the significance of electrostatic interactions and hydrophobicity. Our findings underscore the efficiency of the ensemble methods (the GB and RF models, which employ non-linear learners) in capturing local variations in retention times across diverse experimental setups. This study emphasizes the potential of cross-column QSRR modeling and highlights the utility of ML models in optimizing chromatographic analysis.
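The two figures of merit quoted in the abstract, Q² and RMSEP, can be illustrated with a minimal sketch. The abstract does not state which Q² variant the authors used; the version below assumes the common form 1 - PRESS/SS, with the reference mean defaulting to the mean of the observed test values (passing the training-set mean instead would give another widely used variant). The function names and the retention times are hypothetical and are not taken from the study.

```python
import math


def rmsep(observed, predicted):
    """Root mean square error of prediction, in the units of the inputs (e.g. minutes)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))


def q2(observed, predicted, reference_mean=None):
    """Predictive squared correlation coefficient, 1 - PRESS / SS.

    Assumption: reference_mean defaults to the mean of the observed test values.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and of equal length")
    if reference_mean is None:
        reference_mean = sum(observed) / len(observed)
    # PRESS: sum of squared prediction errors on the test set
    press = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # SS: sum of squared deviations of the observations from the reference mean
    ss = sum((o - reference_mean) ** 2 for o in observed)
    if ss == 0:
        raise ValueError("observed values have zero variance around the reference mean")
    return 1.0 - press / ss


# Hypothetical test-set retention times in minutes (illustrative only)
t_obs = [2.0, 4.0, 6.0, 8.0]
t_pred = [2.5, 3.5, 6.5, 7.5]

print(f"RMSEP = {rmsep(t_obs, t_pred):.3f} min")
print(f"Q2    = {q2(t_obs, t_pred):.3f}")
```

Read this way, an RMSEP of 0.749 min is a typical prediction error expressed in minutes, while a Q² close to 1 means the prediction errors are small relative to the spread of the observed retention times.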

Media type:

E-article

Year of publication:

2024

Published:

2024

Contained in:

Analytical and bioanalytical chemistry - 416(2024), 12, 20 Apr., pages 2951-2968

Language:

English

Contributors:

Mazraedoost, Sargol [Author]
Žuvela, Petar [Author]
Ulenberg, Szymon [Author]
Bączek, Tomasz [Author]
Liu, J Jay [Author]

Subjects:

Cheminformatics
Density functional theory (DFT)
Journal Article
Machine learning (ML)
Quantitative structure-retention relationship (QSRR)
Retention time prediction
Reversed-phase high-performance liquid chromatography (RP-HPLC)

Notes:

Date Revised 25.04.2024

published: Print-Electronic

Citation Status PubMed-not-MEDLINE

doi:

10.1007/s00216-024-05243-7

PPN (catalog ID):

NLM369966511