AI-Bind: Improving Binding Predictions for Novel Protein Targets and Ligands

Identifying novel drug-target interactions (DTI) is a critical and rate-limiting step in drug discovery. While deep learning models have been proposed to accelerate the identification process, we show that state-of-the-art models fail to generalize to novel (i.e., never-before-seen) structures. We first unveil the mechanisms responsible for this shortcoming, demonstrating how models rely on shortcuts that leverage the topology of the protein-ligand bipartite network, rather than learning the node features. Then, we introduce AI-Bind, a pipeline that combines network-based sampling strategies with unsupervised pre-training, allowing us to limit the annotation imbalance and improve binding predictions for novel proteins and ligands. We illustrate the value of AI-Bind by predicting drugs and natural compounds with binding affinity to SARS-CoV-2 viral proteins and the associated human proteins. We also validate these predictions via auto-docking simulations and comparison with recent experimental evidence, and advance the interpretation of machine learning predictions of protein-ligand binding by identifying potential active binding sites on the amino acid sequence. Overall, AI-Bind offers a high-throughput approach to identify drug-target combinations, with the potential of becoming a powerful tool in drug discovery.
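
The topological shortcut mentioned in the abstract can be illustrated with a small sketch. This is not the authors' code: the function names, the toy records, and the 0.5 fallback prior are all hypothetical, chosen only to show how a predictor can score protein-ligand pairs from annotation counts alone, without ever looking at sequences or chemical structures.

```python
from collections import defaultdict


def fit_degree_ratios(records):
    """Per node, compute the fraction of its training annotations that are positive.

    `records` is an iterable of (protein_id, ligand_id, binds) tuples,
    where `binds` is 1 for a binding pair and 0 for a non-binding pair.
    """
    positives = defaultdict(int)
    totals = defaultdict(int)
    for protein, ligand, binds in records:
        # Prefix node ids so a protein and a ligand can never collide.
        for node in (("P", protein), ("L", ligand)):
            totals[node] += 1
            positives[node] += int(binds)
    return {node: positives[node] / totals[node] for node in totals}


def shortcut_score(ratios, protein, ligand, prior=0.5):
    """Score a pair using only network annotations, ignoring node features.

    Nodes absent from training fall back to `prior` (an assumed value),
    so a fully novel pair receives an uninformative score.
    """
    p = ratios.get(("P", protein), prior)
    l = ratios.get(("L", ligand), prior)
    return (p + l) / 2


# Toy bipartite network (hypothetical data, for illustration only).
records = [
    ("P1", "L1", 1), ("P1", "L2", 1), ("P1", "L3", 1),
    ("P2", "L1", 0), ("P2", "L4", 0),
    ("P3", "L2", 1), ("P3", "L4", 0),
]
ratios = fit_degree_ratios(records)

seen_positive = shortcut_score(ratios, "P1", "L2")
seen_negative = shortcut_score(ratios, "P2", "L4")
novel_pair = shortcut_score(ratios, "P_new", "L_new")
```

In this toy setting the baseline separates pairs of previously seen nodes purely from their annotation history, while a pair of unseen nodes gets only the fallback prior. That is the kind of failure to generalize the abstract describes for never-before-seen proteins and ligands.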

Media type:

Preprint

Year of publication:

2021

Published:

2021

Contained in:

arXiv.org - (2021), 24 Dec. - year:2021

Language:

English

Contributors:

Chatterjee, Ayan [author]
Walters, Robin [author]
Shafi, Zohair [author]
Ahmed, Omair Shafi [author]
Sebek, Michael [author]
Gysi, Deisy [author]
Yu, Rose [author]
Eliassi-Rad, Tina [author]
Barabási, Albert-László [author]
Menichetti, Giulia [author]

PPN (catalog ID):

XAR033302154