In silico proof of principle of machine learning-based antibody design at unconstrained scale
Abstract: Generative machine learning (ML) has been postulated to be a major driver in the computational design of antigen-specific monoclonal antibodies (mAb). However, efforts to confirm this hypothesis have been hindered by the infeasibility of testing arbitrarily large numbers of antibody sequences for their most critical design parameters: paratope, epitope, affinity, and developability. To address this challenge, we leveraged a lattice-based antibody-antigen binding simulation framework, which incorporates a wide range of physiological antibody binding parameters. The simulation framework both computes antibody-antigen 3D structures and serves as an oracle for unrestricted prospective evaluation of the antigen specificity of ML-generated antibody sequences. We found that a deep generative model, trained exclusively on antibody sequence (1D) data, can be used to design native-like conformational (3D) epitope-specific antibodies, matching or exceeding the training dataset in affinity and developability variety. Furthermore, we show that transfer learning enables the generation of high-affinity antibody sequences from low-N training data. Finally, we validated that the antibody design insight gained from simulated antibody-antigen binding data is applicable to experimental real-world data. Our work establishes a priori feasibility and the theoretical foundation of high-throughput ML-based mAb design.
Highlights:
- A large-scale dataset of 70M synthetic antibody-antigen complexes (3 orders of magnitude larger than the current state of the art) that reflect biological complexity allows the prospective evaluation of antibody generative deep learning.
- Combination of generative learning, synthetic antibody-antigen binding data, and prospective evaluation shows that deep learning-driven antibody design and discovery at an unconstrained level is feasible.
- Transfer learning (low-N learning) coupled to generative learning shows that antibody-binding rules may be transferred across unrelated antibody-antigen complexes.
- Experimental validation of antibody-design conclusions drawn from deep learning on synthetic antibody-antigen binding data.
Graphical abstract: We leverage large synthetic ground-truth data to demonstrate (A, B) the unconstrained deep generative learning-based generation of native-like antibody sequences and (C) the prospective evaluation of conformational (3D) affinity, paratope-epitope pairs, and developability. (D) Finally, we show increased generation quality of low-N-based machine learning models via transfer learning.
Media type: |
Preprint |
---|
Year of publication: |
2023 |
---|---|
Published: |
2023 |
Contained in: |
bioRxiv.org - (2023), 5 Nov. - year:2023 |
---|
Language: |
English |
---|
Contributors: |
Akbar, Rahmad [Author] |
---|
Links: |
---|
Subjects: |
---|
doi: |
10.1101/2021.07.08.451480 |
---|
funding: |
|
---|---|
Funding institution / Project title: |
|
PPN (catalog ID): |
XBI032167830 |
---|
LEADER | 01000caa a22002652 4500 | ||
---|---|---|---|
001 | XBI032167830 | ||
003 | DE-627 | ||
005 | 20231205150330.0 | ||
007 | cr uuu---uuuuu | ||
008 | 210712s2023 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1101/2021.07.08.451480 |2 doi | |
035 | |a (DE-627)XBI032167830 | ||
035 | |a (biorXiv)10.1101/2021.07.08.451480 | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
100 | 1 | |a Akbar, Rahmad |e verfasserin |0 (orcid)0000-0002-6692-0876 |4 aut | |
245 | 1 | 0 | |a In silico proof of principle of machine learning-based antibody design at unconstrained scale |
264 | 1 | |c 2023 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
520 | |a Abstract: Generative machine learning (ML) has been postulated to be a major driver in the computational design of antigen-specific monoclonal antibodies (mAb). However, efforts to confirm this hypothesis have been hindered by the infeasibility of testing arbitrarily large numbers of antibody sequences for their most critical design parameters: paratope, epitope, affinity, and developability. To address this challenge, we leveraged a lattice-based antibody-antigen binding simulation framework, which incorporates a wide range of physiological antibody binding parameters. The simulation framework both computes antibody-antigen 3D structures and serves as an oracle for unrestricted prospective evaluation of the antigen specificity of ML-generated antibody sequences. We found that a deep generative model, trained exclusively on antibody sequence (1D) data, can be used to design native-like conformational (3D) epitope-specific antibodies, matching or exceeding the training dataset in affinity and developability variety. Furthermore, we show that transfer learning enables the generation of high-affinity antibody sequences from low-N training data. Finally, we validated that the antibody design insight gained from simulated antibody-antigen binding data is applicable to experimental real-world data. Our work establishes a priori feasibility and the theoretical foundation of high-throughput ML-based mAb design. Highlights: A large-scale dataset of 70M synthetic antibody-antigen complexes (3 orders of magnitude larger than the current state of the art) that reflect biological complexity allows the prospective evaluation of antibody generative deep learning. Combination of generative learning, synthetic antibody-antigen binding data, and prospective evaluation shows that deep learning-driven antibody design and discovery at an unconstrained level is feasible. Transfer learning (low-N learning) coupled to generative learning shows that antibody-binding rules may be transferred across unrelated antibody-antigen complexes. Experimental validation of antibody-design conclusions drawn from deep learning on synthetic antibody-antigen binding data. Graphical abstract: We leverage large synthetic ground-truth data to demonstrate (A, B) the unconstrained deep generative learning-based generation of native-like antibody sequences and (C) the prospective evaluation of conformational (3D) affinity, paratope-epitope pairs, and developability. (D) Finally, we show increased generation quality of low-N-based machine learning models via transfer learning. | ||
650 | 4 | |a Biology |7 (dpeaa)DE-84 | |
650 | 4 | |a 570 |7 (dpeaa)DE-84 | |
700 | 1 | |a Robert, Philippe A. |0 (orcid)0000-0003-1345-5015 |4 aut | |
700 | 1 | |a Weber, Cédric R. |0 (orcid)0000-0003-4802-8996 |4 aut | |
700 | 1 | |a Widrich, Michael |0 (orcid)0000-0002-5721-0135 |4 aut | |
700 | 1 | |a Frank, Robert |0 (orcid)0000-0001-9097-7963 |4 aut | |
700 | 1 | |a Pavlović, Milena |0 (orcid)0000-0002-2484-3868 |4 aut | |
700 | 1 | |a Scheffer, Lonneke |0 (orcid)0000-0001-8900-075X |4 aut | |
700 | 1 | |a Chernigovskaya, Maria |0 (orcid)0000-0002-1507-4171 |4 aut | |
700 | 1 | |a Snapkov, Igor |0 (orcid)0000-0001-5341-685X |4 aut | |
700 | 1 | |a Slabodkin, Andrei |0 (orcid)0000-0002-9320-1666 |4 aut | |
700 | 1 | |a Mehta, Brij Bhushan |0 (orcid)0000-0002-8501-7076 |4 aut | |
700 | 1 | |a Miho, Enkelejda |0 (orcid)0000-0001-6461-0519 |4 aut | |
700 | 1 | |a Lund-Johansen, Fridtjof |0 (orcid)0000-0002-2445-1258 |4 aut | |
700 | 1 | |a Andersen, Jan Terje |0 (orcid)0000-0003-1710-1628 |4 aut | |
700 | 1 | |a Hochreiter, Sepp |0 (orcid)0000-0001-7449-2528 |4 aut | |
700 | 1 | |a Haff, Ingrid Hobæk |4 aut | |
700 | 1 | |a Klambauer, Günter |0 (orcid)0000-0003-2861-5552 |4 aut | |
700 | 1 | |a Sandve, Geir Kjetil |0 (orcid)0000-0002-4959-1409 |4 aut | |
700 | 1 | |a Greiff, Victor |0 (orcid)0000-0003-2622-5032 |4 aut | |
773 | 0 | 8 | |i Enthalten in |t bioRxiv.org |g (2023) vom: 05. Nov. |
773 | 1 | 8 | |g year:2023 |g day:05 |g month:11 |
856 | 4 | 0 | |u https://doi.org/10.1080/19420862.2022.2031482 |z lizenzpflichtig |3 Volltext |
856 | 4 | 0 | |u http://dx.doi.org/10.1101/2021.07.08.451480 |z kostenfrei |3 Volltext |
912 | |a GBV_XBI | ||
951 | |a AR | ||
952 | |j 2023 |b 05 |c 11 |