Mega-scale experimental analysis of protein folding stability in biology and design
© 2023. The Author(s).
Advances in DNA sequencing and machine learning are providing insights into protein sequences and structures on an enormous scale [1]. However, the energetics driving folding are invisible in these structures and remain largely unknown [2]. The hidden thermodynamics of folding can drive disease [3,4], shape protein evolution [5-7] and guide protein engineering [8-10], and new approaches are needed to reveal these thermodynamics for every sequence and structure. Here we present cDNA display proteolysis, a method for measuring thermodynamic folding stability for up to 900,000 protein domains in a one-week experiment. From 1.8 million measurements in total, we curated a set of around 776,000 high-quality folding stabilities covering all single amino acid variants and selected double mutants of 331 natural and 148 de novo designed protein domains 40-72 amino acids in length. Using this extensive dataset, we quantified (1) environmental factors influencing amino acid fitness, (2) thermodynamic couplings (including unexpected interactions) between protein sites, and (3) the global divergence between evolutionary amino acid usage and protein folding stability. We also examined how our approach could identify stability determinants in designed proteins and evaluate design methods. The cDNA display proteolysis method is fast, accurate and uniquely scalable, and promises to reveal the quantitative rules for how amino acid sequences encode folding stability.
Media type: | E-article |
---|---|
Year of publication: | 2023 |
Published: | 2023 |
Contained in: | To the parent record - volume:620 |
Contained in: | Nature - 620(2023), 7973, 19 Aug., pages 434-444 |
Language: | English |
Contributors: | Tsuboyama, Kotaro [author] |
Links: | |
Subjects: | |
Notes: | Date Completed 11.08.2023; Date Revised 12.08.2023; published: Print-Electronic; Citation Status MEDLINE |
doi: | 10.1038/s41586-023-06328-6 |
funding: | |
Funding institution / Project title: | |
PPN (catalog ID): | NLM359690947 |
LEADER | 01000naa a22002652 4500 | ||
---|---|---|---|
001 | NLM359690947 | ||
003 | DE-627 | ||
005 | 20231226081419.0 | ||
007 | cr uuu---uuuuu | ||
008 | 231226s2023 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1038/s41586-023-06328-6 |2 doi | |
028 | 5 | 2 | |a pubmed24n1198.xml |
035 | |a (DE-627)NLM359690947 | ||
035 | |a (NLM)37468638 | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
100 | 1 | |a Tsuboyama, Kotaro |e verfasserin |4 aut | |
245 | 1 | 0 | |a Mega-scale experimental analysis of protein folding stability in biology and design |
264 | 1 | |c 2023 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
500 | |a Date Completed 11.08.2023 | ||
500 | |a Date Revised 12.08.2023 | ||
500 | |a published: Print-Electronic | ||
500 | |a Citation Status MEDLINE | ||
520 | |a © 2023. The Author(s). | ||
520 | |a Advances in DNA sequencing and machine learning are providing insights into protein sequences and structures on an enormous scale1. However, the energetics driving folding are invisible in these structures and remain largely unknown2. The hidden thermodynamics of folding can drive disease3,4, shape protein evolution5-7 and guide protein engineering8-10, and new approaches are needed to reveal these thermodynamics for every sequence and structure. Here we present cDNA display proteolysis, a method for measuring thermodynamic folding stability for up to 900,000 protein domains in a one-week experiment. From 1.8 million measurements in total, we curated a set of around 776,000 high-quality folding stabilities covering all single amino acid variants and selected double mutants of 331 natural and 148 de novo designed protein domains 40-72 amino acids in length. Using this extensive dataset, we quantified (1) environmental factors influencing amino acid fitness, (2) thermodynamic couplings (including unexpected interactions) between protein sites, and (3) the global divergence between evolutionary amino acid usage and protein folding stability. We also examined how our approach could identify stability determinants in designed proteins and evaluate design methods. The cDNA display proteolysis method is fast, accurate and uniquely scalable, and promises to reveal the quantitative rules for how amino acid sequences encode folding stability | ||
650 | 4 | |a Journal Article | |
650 | 7 | |a Amino Acids |2 NLM | |
650 | 7 | |a DNA, Complementary |2 NLM | |
650 | 7 | |a Proteins |2 NLM | |
700 | 1 | |a Dauparas, Justas |e verfasserin |4 aut | |
700 | 1 | |a Chen, Jonathan |e verfasserin |4 aut | |
700 | 1 | |a Laine, Elodie |e verfasserin |4 aut | |
700 | 1 | |a Mohseni Behbahani, Yasser |e verfasserin |4 aut | |
700 | 1 | |a Weinstein, Jonathan J |e verfasserin |4 aut | |
700 | 1 | |a Mangan, Niall M |e verfasserin |4 aut | |
700 | 1 | |a Ovchinnikov, Sergey |e verfasserin |4 aut | |
700 | 1 | |a Rocklin, Gabriel J |e verfasserin |4 aut | |
773 | 0 | 8 | |i Enthalten in |t Nature |d 1945 |g 620(2023), 7973 vom: 19. Aug., Seite 434-444 |w (DE-627)NLM000008257 |x 1476-4687 |7 nnns |
773 | 1 | 8 | |g volume:620 |g year:2023 |g number:7973 |g day:19 |g month:08 |g pages:434-444 |
856 | 4 | 0 | |u http://dx.doi.org/10.1038/s41586-023-06328-6 |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a GBV_NLM | ||
951 | |a AR | ||
952 | |d 620 |j 2023 |e 7973 |b 19 |c 08 |h 434-444 |