Publication details - A sequence-based two-layer predictor for identifying enhancers and their strength through enhanced feature extraction

A sequence-based two-layer predictor for identifying enhancers and their strength through enhanced feature extraction

Enhancers are short regulatory DNA fragments that are bound by proteins called activators. They are free-bound, distal elements that play a vital role in controlling gene expression. Identifying enhancers and their strength is challenging because of their dynamic nature. Although some machine learning methods exist to accelerate the identification process, their prediction accuracy and efficiency still need improvement. To this end, we propose a two-layer prediction model with an enhanced feature extraction strategy that combines features from an improved position-specific amino acid propensity (PSTKNC) method with Enhanced Nucleic Acid Composition (ENAC) and Composition of k-spaced Nucleic Acid Pairs (CKSNAP). The feature sets from all three feature extraction approaches are concatenated and passed through a simple artificial neural network (ANN) to identify enhancers in the first layer and their strength in the second layer. Experiments are conducted on a benchmark chromatin dataset covering nine cell lines. 10-fold cross-validation is used to evaluate the model's performance. The results show that the proposed model achieves an accuracy of 94.50% and a Matthews correlation coefficient (MCC) of 0.8903 in predicting enhancers, and that it also performs fairly well on an independent test set when compared with all other existing methods.
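
The abstract names the descriptors but does not spell them out. As a rough illustration only, the following Python sketch computes CKSNAP and ENAC as they are commonly defined and concatenates the two vectors. It is not the authors' implementation: the function names and the `max_gap` and `window` values are assumptions chosen for the example, and the PSTKNC part is left out because it depends on statistics from a labelled training set.

```python
from itertools import product

NUCLEOTIDES = "ACGT"


def cksnap(seq, max_gap=2):
    """Frequencies of nucleotide pairs separated by 0..max_gap positions.

    max_gap is an illustrative value, not one taken from the paper.
    """
    pairs = ["".join(p) for p in product(NUCLEOTIDES, repeat=2)]
    features = []
    for gap in range(max_gap + 1):
        counts = dict.fromkeys(pairs, 0)
        total = 0
        for i in range(len(seq) - gap - 1):
            pair = seq[i] + seq[i + gap + 1]
            if pair in counts:  # skip pairs containing non-ACGT symbols
                counts[pair] += 1
                total += 1
        # 16 normalised pair frequencies per gap value
        features.extend(counts[p] / total if total else 0.0 for p in pairs)
    return features


def enac(seq, window=5):
    """Nucleotide frequencies inside each sliding window of the sequence.

    window is an illustrative value, not one taken from the paper.
    """
    features = []
    for start in range(len(seq) - window + 1):
        chunk = seq[start:start + window]
        features.extend(chunk.count(n) / window for n in NUCLEOTIDES)
    return features


def combined_features(seq, max_gap=2, window=5):
    """Concatenate the two descriptor vectors into one feature vector."""
    return cksnap(seq, max_gap) + enac(seq, window)
```

For an 8-nucleotide toy sequence such as `"ACGTACGT"`, `combined_features` returns 48 CKSNAP values (3 gaps × 16 pairs) followed by 16 ENAC values (4 windows × 4 nucleotides). A vector like this would then be the input to a classifier such as the ANN described above.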

Media type:	E-article

Year of publication:	2022
Published:	2022

Contained in:	To the overall record - volume:20
Contained in:	Journal of bioinformatics and computational biology - 20(2022), 2, 19 Apr., page 2250005

Language:	English

Contributors:	Amilpur, Santhosh [author] Bhukya, Raju [author]

Links:	Full text

Subjects:	9007-49-2 Artificial neural network DNA Enhancers Feature extraction Gene regulation Genome annotation Journal Article Research Support, Non-U.S. Gov't

Notes:	Date Completed 10.05.2022 Date Revised 15.07.2022 published: Print-Electronic Citation Status MEDLINE

doi:	10.1142/S0219720022500056

PPN (catalog ID):	NLM337949573

Internal format


LEADER	01000naa a22002652 4500
001	NLM337949573
003	DE-627
005	20231225235557.0
007	cr uuu---uuuuu
008	231225s2022 xx \|\|\|\|\|o 00\| \|\|eng c
024	7		\|a 10.1142/S0219720022500056 \|2 doi
028	5	2	\|a pubmed24n1126.xml
035			\|a (DE-627)NLM337949573
035			\|a (NLM)35264081
040			\|a DE-627 \|b ger \|c DE-627 \|e rakwb
041			\|a eng
100	1		\|a Amilpur, Santhosh \|e verfasserin \|4 aut
245	1	2	\|a A sequence-based two-layer predictor for identifying enhancers and their strength through enhanced feature extraction
264		1	\|c 2022
336			\|a Text \|b txt \|2 rdacontent
337			\|a Computermedien \|b c \|2 rdamedia
338			\|a Online-Ressource \|b cr \|2 rdacarrier
500			\|a Date Completed 10.05.2022
500			\|a Date Revised 15.07.2022
500			\|a published: Print-Electronic
500			\|a Citation Status MEDLINE
520			\|a Enhancers are short regulatory DNA fragments that are bound by proteins called activators. They are free-bound, distal elements that play a vital role in controlling gene expression. Identifying enhancers and their strength is challenging because of their dynamic nature. Although some machine learning methods exist to accelerate the identification process, their prediction accuracy and efficiency still need improvement. To this end, we propose a two-layer prediction model with an enhanced feature extraction strategy that combines features from an improved position-specific amino acid propensity (PSTKNC) method with Enhanced Nucleic Acid Composition (ENAC) and Composition of k-spaced Nucleic Acid Pairs (CKSNAP). The feature sets from all three feature extraction approaches are concatenated and passed through a simple artificial neural network (ANN) to identify enhancers in the first layer and their strength in the second layer. Experiments are conducted on a benchmark chromatin dataset covering nine cell lines. 10-fold cross-validation is used to evaluate the model's performance. The results show that the proposed model achieves an accuracy of 94.50% and a Matthews correlation coefficient (MCC) of 0.8903 in predicting enhancers, and that it also performs fairly well on an independent test set when compared with all other existing methods.
650		4	\|a Journal Article
650		4	\|a Research Support, Non-U.S. Gov't
650		4	\|a Enhancers
650		4	\|a artificial neural network
650		4	\|a feature extraction
650		4	\|a gene regulation
650		4	\|a genome annotation
650		7	\|a DNA \|2 NLM
650		7	\|a 9007-49-2 \|2 NLM
700	1		\|a Bhukya, Raju \|e verfasserin \|4 aut
773	0	8	\|i Enthalten in \|t Journal of bioinformatics and computational biology \|d 2003 \|g 20(2022), 2 vom: 19. Apr., Seite 2250005 \|w (DE-627)NLM149554192 \|x 1757-6334 \|7 nnns
773	1	8	\|g volume:20 \|g year:2022 \|g number:2 \|g day:19 \|g month:04 \|g pages:2250005
856	4	0	\|u http://dx.doi.org/10.1142/S0219720022500056 \|3 Volltext
912			\|a GBV_USEFLAG_A
912			\|a GBV_NLM
951			\|a AR
952			\|d 20 \|j 2022 \|e 2 \|b 19 \|c 04 \|h 2250005
