PPred-PCKSM : A multi-layer predictor for identifying promoter and its variants using position based features

Copyright © 2022 Elsevier Ltd. All rights reserved.

A promoter is a short region of DNA where the protein RNA polymerase binds, initiating transcription of a specific gene. In bacteria, which are prokaryotes, the sigma subunit that associates with RNA polymerase helps identify promoters. In Escherichia coli (E. coli), promoters are recognized by different sigma factors with different functions. Various methods have been used to predict the different classes of promoters. However, these methods need to be improved for better identification and classification of promoters. In this work, we propose a new multi-layer predictor named PPred-PCKSM that uses a position-correlation based k-mer scoring matrix (PCKSM), a new feature extraction strategy, and an artificial neural network (ANN) to predict promoters and their six types, namely σ70, σ24, σ28, σ32, σ38 and σ54, in E. coli. We employ the PCKSM technique to extract feature sets from different k-mers. The feature sets obtained from trimers and tetramers are concatenated and then passed through the ANN for final prediction. The resulting feature set contained effective features that contributed to an accuracy of 98.02% and a Matthews correlation coefficient (MCC) of 96.04% on the promoter prediction task. Our model was evaluated with 5-fold cross-validation on the benchmark dataset and outperformed all current state-of-the-art methods for predicting promoters and their types in E. coli.
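The abstract does not spell out how PCKSM is computed, so the following is only a minimal sketch of the general idea of position-based k-mer scoring, not the authors' method. It assumes equal-length sequences and scores each k-mer at each position as its frequency in the positive set minus its frequency in the negative set; that scoring rule, and all function names (`position_scoring_matrix`, `encode`, `features`, `mcc`), are illustrative assumptions. Trimer and tetramer features are concatenated as the abstract describes, the ANN is omitted, and `mcc` is the standard Matthews correlation coefficient.

```python
from collections import Counter
from math import sqrt


def position_scoring_matrix(positives, negatives, k):
    """Build one score table per k-mer start position.

    ASSUMPTION (not the published PCKSM definition): the score of a k-mer
    at a position is its frequency among positive sequences minus its
    frequency among negative sequences at that same position.
    All sequences are assumed to have the same length.
    """
    n_positions = len(positives[0]) - k + 1
    matrix = []
    for pos in range(n_positions):
        pos_counts = Counter(s[pos:pos + k] for s in positives)
        neg_counts = Counter(s[pos:pos + k] for s in negatives)
        scores = {}
        for kmer in set(pos_counts) | set(neg_counts):
            scores[kmer] = (pos_counts[kmer] / len(positives)
                            - neg_counts[kmer] / len(negatives))
        matrix.append(scores)
    return matrix


def encode(seq, matrix, k):
    """Replace each positional k-mer of seq by its score (0.0 if unseen)."""
    return [matrix[i].get(seq[i:i + k], 0.0)
            for i in range(len(seq) - k + 1)]


def features(seq, trimer_matrix, tetramer_matrix):
    """Concatenate trimer and tetramer feature vectors."""
    return encode(seq, trimer_matrix, 3) + encode(seq, tetramer_matrix, 4)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy data, invented purely for illustration.
positives = ["ACGTAC", "ACGTTT"]
negatives = ["TTTTAC", "GGGTAC"]
m3 = position_scoring_matrix(positives, negatives, 3)
m4 = position_scoring_matrix(positives, negatives, 4)
vector = features("ACGTAC", m3, m4)
```

In a full pipeline, vectors like `vector` would be fed to a classifier and scored under 5-fold cross-validation. Note that `mcc` returns a value in [-1, 1], so the abstract's 96.04% would correspond to 0.9604 on that scale.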

Media type:

E-article

Year of publication:

2022

Published:

2022

Contained in:

Computational biology and chemistry - 97(2022), 15 Apr., page 107623

Language:

English

Contributors:

Bhukya, Raju [Author]
Kumari, Archana [Author]
Amilpur, Santhosh [Author]
Dasari, Chandra Mohan [Author]

Subjects:

Artificial neural network
DNA-Directed RNA Polymerases
EC 2.7.7.6
Gene regulation
Journal Article
Machine learning
Position-correlation based k-mer scoring matrix (PCKSM)
Promoters
Sigma Factor
Sigma factors

Notes:

Date Completed 08.03.2022

Date Revised 08.03.2022

published: Print-Electronic

Citation Status MEDLINE

doi:

10.1016/j.compbiolchem.2022.107623

PPN (catalog ID):

NLM335999913