Publication details - Development and validation of machine learning models for the prediction of SH-2 containing protein tyrosine phosphatase 2 inhibitors

Development and validation of machine learning models for the prediction of SH-2 containing protein tyrosine phosphatase 2 inhibitors

© 2023. The Author(s), under exclusive licence to Springer Nature Switzerland AG.

Discovering and developing a new drug and bringing it to market is a highly challenging and resource-intensive process. Although modern drug discovery technologies have enabled the rapid identification of lead compounds, translating those leads into successful clinical candidates remains a major challenge. In recent years, the availability of massive structural and biological data on diverse small molecules and macromolecules has allowed researchers to mine this multidimensional data with artificial intelligence-based predictive tools and draw useful insights into structural features of biological or therapeutic significance. The aim of this study was to use the available data on small-molecule (SH2)-containing protein tyrosine phosphatase 2 (SHP2) inhibitors to build machine learning (ML) models that can predict the SHP2 inhibitory potential of new compounds. The dataset contained 2739 unique small-molecule SHP2 inhibitors obtained from BindingDB, ChEMBL and recent literature. After data curation, predictive models such as XGBoost, k-nearest neighbours and neural networks were developed and validated through a tenfold cross-validation procedure. Of the seven models developed, the XGBoost model showed excellent performance, with a ROC AUC score of 0.96 and an accuracy of 0.97 on the test data. Moreover, the Shapley Additive Explanations method was applied to gain a more in-depth understanding of the influence of variables on the model's predictions. In summary, the XGBoost model developed in this study can be useful in identifying novel SHP2 inhibitors and can therefore accelerate the discovery of novel therapeutics for cancer therapy.
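The abstract names tenfold cross-validation, ROC AUC and accuracy but gives no implementation details. The following is a minimal, standard-library-only sketch of how those pieces fit together, not the authors' pipeline. The function names, the pooling of out-of-fold scores into a single AUC, and the toy midpoint classifier (a hypothetical stand-in for a real model such as XGBoost) are all assumptions made for illustration.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle indices 0..n_samples-1 and deal them into k near-equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def roc_auc(labels, scores):
    """ROC AUC: chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.0):
    """Fraction of samples whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def out_of_fold_scores(X, y, fit, predict, k=10, seed=0):
    """Score every sample with a model trained on the other k-1 folds."""
    scores = [None] * len(y)
    for held_out in k_fold_indices(len(y), k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        for i, s in zip(held_out, predict(model, [X[i] for i in held_out])):
            scores[i] = s
    return scores


# Hypothetical toy model (NOT XGBoost): score = feature minus the
# midpoint between the two class means seen during training.
def fit_midpoint(X, y):
    mean0 = sum(x[0] for x, t in zip(X, y) if t == 0) / y.count(0)
    mean1 = sum(x[0] for x, t in zip(X, y) if t == 1) / y.count(1)
    return (mean0 + mean1) / 2


def predict_midpoint(midpoint, X):
    return [x[0] - midpoint for x in X]


if __name__ == "__main__":
    # Synthetic, well-separated data purely for demonstration.
    X = [[float(v)] for v in list(range(10)) + list(range(20, 30))]
    y = [0] * 10 + [1] * 10
    oof = out_of_fold_scores(X, y, fit_midpoint, predict_midpoint, k=10)
    print("ROC AUC:", roc_auc(y, oof), "accuracy:", accuracy(y, oof))
```

Pooling the out-of-fold scores before computing the AUC avoids undefined per-fold values when a small fold happens to contain only one class; averaging per-fold AUCs over stratified folds is an equally common alternative.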

Media type:	E-article

Year of publication:	2023
Published:	2023

Contained in:	To the full record - year:2023
Contained in:	Molecular diversity - (2023), 08 Aug.

Language:	English

Contributors:	Adhikari, Nilanjan [author]; Ayyannan, Senthil Raja [author]

Links:	Full text

Subjects:	Journal Article; Machine learning; QSAR; SH2-containing protein tyrosine phosphatase 2; SHP2 inhibitors; Virtual screening

Notes:	Date Revised 08.08.2023; published: Print-Electronic; Citation Status Publisher

doi:	10.1007/s11030-023-10710-x

funding:
Funding institution / project title:

PPN (catalogue ID):	NLM360520812

Internal format


LEADER	01000naa a22002652 4500
001	NLM360520812
003	DE-627
005	20231226083205.0
007	cr uuu---uuuuu
008	231226s2023 xx \|\|\|\|\|o 00\| \|\|eng c
024	7		\|a 10.1007/s11030-023-10710-x \|2 doi
028	5	2	\|a pubmed24n1201.xml
035			\|a (DE-627)NLM360520812
035			\|a (NLM)37552436
040			\|a DE-627 \|b ger \|c DE-627 \|e rakwb
041			\|a eng
100	1		\|a Adhikari, Nilanjan \|e verfasserin \|4 aut
245	1	0	\|a Development and validation of machine learning models for the prediction of SH-2 containing protein tyrosine phosphatase 2 inhibitors
264		1	\|c 2023
336			\|a Text \|b txt \|2 rdacontent
337			\|a Computermedien \|b c \|2 rdamedia
338			\|a Online-Ressource \|b cr \|2 rdacarrier
500			\|a Date Revised 08.08.2023
500			\|a published: Print-Electronic
500			\|a Citation Status Publisher
520			\|a © 2023. The Author(s), under exclusive licence to Springer Nature Switzerland AG.
520			\|a Discovering and developing a new drug and bringing it to market is a highly challenging and resource-intensive process. Although modern drug discovery technologies have enabled the rapid identification of lead compounds, translating those leads into successful clinical candidates remains a major challenge. In recent years, the availability of massive structural and biological data on diverse small molecules and macromolecules has allowed researchers to mine this multidimensional data with artificial intelligence-based predictive tools and draw useful insights into structural features of biological or therapeutic significance. The aim of this study was to use the available data on small-molecule (SH2)-containing protein tyrosine phosphatase 2 (SHP2) inhibitors to build machine learning (ML) models that can predict the SHP2 inhibitory potential of new compounds. The dataset contained 2739 unique small-molecule SHP2 inhibitors obtained from BindingDB, ChEMBL and recent literature. After data curation, predictive models such as XGBoost, k-nearest neighbours and neural networks were developed and validated through a tenfold cross-validation procedure. Of the seven models developed, the XGBoost model showed excellent performance, with a ROC AUC score of 0.96 and an accuracy of 0.97 on the test data. Moreover, the Shapley Additive Explanations method was applied to gain a more in-depth understanding of the influence of variables on the model's predictions. In summary, the XGBoost model developed in this study can be useful in identifying novel SHP2 inhibitors and can therefore accelerate the discovery of novel therapeutics for cancer therapy.
650		4	\|a Journal Article
650		4	\|a Machine learning
650		4	\|a QSAR
650		4	\|a SH2-containing protein tyrosine phosphatase 2
650		4	\|a SHP2 inhibitors
650		4	\|a Virtual screening
700	1		\|a Ayyannan, Senthil Raja \|e verfasserin \|4 aut
773	0	8	\|i Enthalten in \|t Molecular diversity \|d 1997 \|g (2023) vom: 08. Aug. \|w (DE-627)NLM091914590 \|x 1573-501X \|7 nnns
773	1	8	\|g year:2023 \|g day:08 \|g month:08
856	4	0	\|u http://dx.doi.org/10.1007/s11030-023-10710-x \|3 Volltext
912			\|a GBV_USEFLAG_A
912			\|a GBV_NLM
951			\|a AR
952			\|j 2023 \|b 08 \|c 08
