PPred-PCKSM : A multi-layer predictor for identifying promoter and its variants using position based features
Copyright © 2022 Elsevier Ltd. All rights reserved.
A promoter is a short region of DNA where RNA polymerase binds to initiate transcription of a specific gene. In bacteria, the sigma subunit that associates with RNA polymerase helps it recognize promoters. In Escherichia coli (E. coli), promoters are recognized by different sigma factors, each with a different function. Various methods have been used to predict the different classes of promoters, but they leave room for improvement in both identification and classification. In this work, we propose a new multi-layer predictor named PPred-PCKSM, which combines a new feature extraction strategy, the position-correlation based k-mer scoring matrix (PCKSM), with an artificial neural network (ANN) to predict promoters and their six types, namely σ70, σ24, σ28, σ32, σ38 and σ54, in E. coli. We employ the PCKSM technique to extract feature sets from different k-mers. The feature sets obtained from trimers and tetramers are concatenated and then passed through the ANN for the final prediction. The resulting feature set contributed to an accuracy of 98.02% and a Matthews correlation coefficient (MCC) of 96.04% on the promoter prediction task. Evaluated with 5-fold cross-validation on the benchmark dataset, our model outperformed all current state-of-the-art methods for predicting promoters and their types in E. coli.
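The abstract does not spell out how PCKSM is computed, so the following is only a minimal sketch of the general idea of position-based k-mer scoring, not the authors' method. It assumes equal-length sequences, a per-position log-odds score with a pseudocount, and a neutral score of 0.0 for unseen k-mers; the function names `kmers`, `position_scores`, `encode`, `features` and `mcc` are hypothetical. Trimer and tetramer scores are concatenated into one vector, which would then be passed to a classifier such as an ANN (not shown).

```python
from collections import Counter
from math import log2, sqrt


def kmers(seq, k):
    """Return the overlapping k-mers of seq, ordered by start position."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def position_scores(positives, negatives, k, pseudocount=1.0):
    """Build one {k-mer: log-odds score} dict per position.

    Assumption: a smoothed log2 ratio of positive to negative frequency
    stands in for the paper's scoring matrix, which is not given here.
    """
    n_positions = len(positives[0]) - k + 1
    n_kmers = 4 ** k  # number of possible DNA k-mers, used for smoothing
    table = []
    for p in range(n_positions):
        pos_counts = Counter(s[p:p + k] for s in positives)
        neg_counts = Counter(s[p:p + k] for s in negatives)
        scores = {}
        for kmer in set(pos_counts) | set(neg_counts):
            f_pos = (pos_counts[kmer] + pseudocount) / (len(positives) + pseudocount * n_kmers)
            f_neg = (neg_counts[kmer] + pseudocount) / (len(negatives) + pseudocount * n_kmers)
            scores[kmer] = log2(f_pos / f_neg)
        table.append(scores)
    return table


def encode(seq, table, k):
    """Map a sequence to one score per position; unseen k-mers score 0.0."""
    return [table[p].get(kmer, 0.0) for p, kmer in enumerate(kmers(seq, k))]


def features(seq, tables):
    """Concatenate the per-position scores for every k in tables."""
    vector = []
    for k, table in tables.items():
        vector.extend(encode(seq, table, k))
    return vector


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

With scoring tables built for k = 3 and k = 4 from labelled training sequences, `features(seq, {3: table3, 4: table4})` yields the concatenated trimer and tetramer vector. Note that `mcc` returns a value in [-1, 1]; the abstract reports MCC as a percentage.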
Media type: | E-article |
---|---|
Year of publication: | 2022 |
Published: | 2022 |
Contained in: | Computational biology and chemistry - 97(2022), 15 Apr., page 107623 |
Language: | English |
Contributors: | Bhukya, Raju [author] |
Notes: | Date Completed 08.03.2022; Date Revised 08.03.2022; published: Print-Electronic; Citation Status MEDLINE |
doi: | 10.1016/j.compbiolchem.2022.107623 |
PPN (catalog ID): | NLM335999913 |
LEADER | 01000naa a22002652 4500 | ||
---|---|---|---|
001 | NLM335999913 | ||
003 | DE-627 | ||
005 | 20231225231136.0 | ||
007 | cr uuu---uuuuu | ||
008 | 231225s2022 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1016/j.compbiolchem.2022.107623 |2 doi | |
028 | 5 | 2 | |a pubmed24n1119.xml |
035 | |a (DE-627)NLM335999913 | ||
035 | |a (NLM)35065417 | ||
035 | |a (PII)S1476-9271(22)00003-2 | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
100 | 1 | |a Bhukya, Raju |e verfasserin |4 aut | |
245 | 1 | 0 | |a PPred-PCKSM |b A multi-layer predictor for identifying promoter and its variants using position based features |
264 | 1 | |c 2022 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
500 | |a Date Completed 08.03.2022 | ||
500 | |a Date Revised 08.03.2022 | ||
500 | |a published: Print-Electronic | ||
500 | |a Citation Status MEDLINE | ||
520 | |a Copyright © 2022 Elsevier Ltd. All rights reserved. | ||
520 | |a Promoter is a small region of DNA where a protein called RNA polymerase binds thus resulting in initiation of transcription of a specific gene. In bacteria with prokaryotic cell type, the sigma subunit that combines with RNA polymerase helps in identifying promoters. In Escherichia coli (E.coli), the promoters are identified by different sigma factors consisting of different functionalities. There have been various methods used for prediction of different class of promoters. However, these methods need to be improved for better identification and classification of promoters. In this work, we propose a new multi-layer predictor named PPred-PCKSM that uses position-correlation based k-mer scoring matrix (PCKSM), a new feature extraction strategy and an artificial neural network (ANN) for predicting promoters and its six types, namely σ70, σ24, σ28, σ32, σ38 and σ54 in E.coli bacteria. We employ PCKSM technique to extract feature sets from different k-mers. The feature sets obtained from trimers and tetramers are concatenated and then passed through ANN for final prediction. The resultant feature set contained effective features that contributed towards achieving an accuracy of 98.02% and Matthews correlation coefficient (MCC) of 96.04% for promoter prediction task. Our model used 5-fold cross validation on the benchmark dataset and outperformed all the current state-of-art-methods used for prediction of promoters and its different types in E.coli bacteria | ||
650 | 4 | |a Journal Article | |
650 | 4 | |a Artificial neural network | |
650 | 4 | |a Gene regulation | |
650 | 4 | |a Machine learning | |
650 | 4 | |a Position-correlation based k-mer scoring matrix (PCKSM) | |
650 | 4 | |a Promoters | |
650 | 4 | |a Sigma factors | |
650 | 7 | |a Sigma Factor |2 NLM | |
650 | 7 | |a DNA-Directed RNA Polymerases |2 NLM | |
650 | 7 | |a EC 2.7.7.6 |2 NLM | |
700 | 1 | |a Kumari, Archana |e verfasserin |4 aut | |
700 | 1 | |a Amilpur, Santhosh |e verfasserin |4 aut | |
700 | 1 | |a Dasari, Chandra Mohan |e verfasserin |4 aut | |
773 | 0 | 8 | |i Enthalten in |t Computational biology and chemistry |d 2003 |g 97(2022) vom: 15. Apr., Seite 107623 |w (DE-627)NLM125623690 |x 1476-928X |7 nnns |
773 | 1 | 8 | |g volume:97 |g year:2022 |g day:15 |g month:04 |g pages:107623 |
856 | 4 | 0 | |u http://dx.doi.org/10.1016/j.compbiolchem.2022.107623 |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a GBV_NLM | ||
951 | |a AR | ||
952 | |d 97 |j 2022 |b 15 |c 04 |h 107623 |