findMySequence : a neural-network-based approach for identification of unknown proteins in X-ray crystallography and cryo-EM
© Grzegorz Chojnowski et al. 2022.
Although experimental protein-structure determination usually targets known proteins, chains of unknown sequence are often encountered. They can be purified from natural sources, appear as an unexpected fragment of a well characterized protein or appear as a contaminant. Regardless of the source of the problem, the unknown protein always requires characterization. Here, an automated pipeline is presented for the identification of protein sequences from cryo-EM reconstructions and crystallographic data. The method's application to characterize the crystal structure of an unknown protein purified from a snake venom is presented. It is also shown that the approach can be successfully applied to the identification of protein sequences and validation of sequence assignments in cryo-EM protein structures.
Media type: | E-article |
---|---|
Year of publication: | 2022 |
Published: | 2022 |
Contained in: | To the full record - volume:9 |
Contained in: | IUCrJ - 9(2022), Pt 1 of 01 Jan., pages 86-97 |
Language: | English |
Contributors: | Chojnowski, Grzegorz [author] |
Subjects: | Bioinformatics |
Notes: | Date Revised 09.04.2022; published: Electronic-eCollection; Citation Status PubMed-not-MEDLINE |
doi: | 10.1107/S2052252521011088 |
PPN (catalog ID): | NLM335938418 |
LEADER | 01000naa a22002652 4500 | ||
---|---|---|---|
001 | NLM335938418 | ||
003 | DE-627 | ||
005 | 20231225231008.0 | ||
007 | cr uuu---uuuuu | ||
008 | 231225s2022 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1107/S2052252521011088 |2 doi | |
028 | 5 | 2 | |a pubmed24n1119.xml |
035 | |a (DE-627)NLM335938418 | ||
035 | |a (NLM)35059213 | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
100 | 1 | |a Chojnowski, Grzegorz |e verfasserin |4 aut | |
245 | 1 | 0 | |a findMySequence |b a neural-network-based approach for identification of unknown proteins in X-ray crystallography and cryo-EM |
264 | 1 | |c 2022 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
500 | |a Date Revised 09.04.2022 | ||
500 | |a published: Electronic-eCollection | ||
500 | |a Citation Status PubMed-not-MEDLINE | ||
520 | |a © Grzegorz Chojnowski et al. 2022. | ||
520 | |a Although experimental protein-structure determination usually targets known proteins, chains of unknown sequence are often encountered. They can be purified from natural sources, appear as an unexpected fragment of a well characterized protein or appear as a contaminant. Regardless of the source of the problem, the unknown protein always requires characterization. Here, an automated pipeline is presented for the identification of protein sequences from cryo-EM reconstructions and crystallographic data. The method's application to characterize the crystal structure of an unknown protein purified from a snake venom is presented. It is also shown that the approach can be successfully applied to the identification of protein sequences and validation of sequence assignments in cryo-EM protein structures | ||
650 | 4 | |a Journal Article | |
650 | 4 | |a SIMBAD | |
650 | 4 | |a bioinformatics | |
650 | 4 | |a cryo-EM | |
650 | 4 | |a findMySequence | |
650 | 4 | |a neural networks | |
650 | 4 | |a protein sequences | |
650 | 4 | |a protein structures | |
650 | 4 | |a structure determination | |
700 | 1 | |a Simpkin, Adam J |e verfasserin |4 aut | |
700 | 1 | |a Leonardo, Diego A |e verfasserin |4 aut | |
700 | 1 | |a Seifert-Davila, Wolfram |e verfasserin |4 aut | |
700 | 1 | |a Vivas-Ruiz, Dan E |e verfasserin |4 aut | |
700 | 1 | |a Keegan, Ronan M |e verfasserin |4 aut | |
700 | 1 | |a Rigden, Daniel J |e verfasserin |4 aut | |
773 | 0 | 8 | |i Enthalten in |t IUCrJ |d 2014 |g 9(2022), Pt 1 vom: 01. Jan., Seite 86-97 |w (DE-627)NLM240540468 |x 2052-2525 |7 nnns |
773 | 1 | 8 | |g volume:9 |g year:2022 |g number:Pt 1 |g day:01 |g month:01 |g pages:86-97 |
856 | 4 | 0 | |u http://dx.doi.org/10.1107/S2052252521011088 |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a GBV_NLM | ||
951 | |a AR | ||
952 | |d 9 |j 2022 |e Pt 1 |b 01 |c 01 |h 86-97 |