Assigning secondary structure in proteins using AI

© 2021. The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.

Knowledge of protein structure assignment enriches the structural and functional understanding of proteins. Accurate and reliable structure assignment data are crucial for secondary structure prediction systems. Since the 1980s, methods based on hydrogen bond analysis and atomic coordinate geometry, and later on machine learning, have been used for protein structure assignment. However, assignment becomes challenging when atoms are missing from the protein files. We propose DLFSA, a multi-class classifier that assigns protein secondary structure elements (SSE) using convolutional neural networks (CNNs). A fast and efficient GPU-based parallel procedure extracts fragments from protein files. The model is trained on a subset of the protein fragments and achieves a training accuracy of 88.1% and a test accuracy of 82.5%. It uses only Cα coordinates for secondary structure assignment and has also been tested successfully on a few full-length proteins. Results from the fragment-based studies demonstrate the feasibility of applying deep learning to structure assignment problems.
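
The abstract describes extracting fragments from protein files and using only Cα coordinates. As a rough illustration of that idea (not the authors' GPU-based procedure, which the abstract does not detail), the following minimal CPU sketch reads Cα coordinates from PDB-format ATOM records and cuts them into overlapping fixed-length fragments. The function names, the sliding-window scheme, and the default fragment length of 5 are assumptions made for this example only.

```python
from typing import List, Tuple

Coord = Tuple[float, float, float]


def parse_ca_coords(pdb_text: str) -> List[Coord]:
    """Collect C-alpha coordinates from PDB-format ATOM records.

    Uses the fixed-column PDB layout: atom name in columns 13-16,
    alternate-location flag in column 17, x/y/z in columns 31-54.
    """
    coords: List[Coord] = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        if line[12:16].strip() != "CA":
            continue
        # Keep only the first alternate location (blank or "A"); an assumption.
        if line[16] not in (" ", "A"):
            continue
        coords.append(
            (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        )
    return coords


def sliding_fragments(coords: List[Coord], length: int = 5) -> List[List[Coord]]:
    """Return all overlapping windows of `length` consecutive C-alpha atoms.

    The window length is a hypothetical choice, not a value from the paper.
    Returns an empty list if the chain is shorter than `length`.
    """
    return [coords[i:i + length] for i in range(len(coords) - length + 1)]
```

Because only Cα records are read, missing side-chain or backbone atoms elsewhere in the file do not affect the fragments produced by this sketch. It does not check for chain breaks or multiple chains, which a real pipeline would need to handle.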

Media type:

E-article

Year of publication:

2021

Published:

2021

Contained in:

Journal of molecular modeling - 27(2021), 9, 17 Aug., page 252

Language:

English

Contributors:

Antony, Jisna Vellara [Author]
Madhu, Prayagh [Author]
Balakrishnan, Jayaraj Pottekkattuvalappil [Author]
Yadav, Hemant [Author]

Topics:

Convolutional neural networks
Deep learning
Fragment library creation
Journal Article
Multi-class classifier
Protein fragments
Protein secondary structures
Protein structure assignment
Proteins

Notes:

Date Completed 24.01.2022

Date Revised 24.01.2022

published: Electronic

Citation Status MEDLINE

doi:

10.1007/s00894-021-04825-x

PPN (catalog ID):

NLM329465740