DeepReg : a deep learning hybrid model for predicting transcription factors in eukaryotic and prokaryotic genomes
© 2024. The Author(s).
Deep learning models (DLMs) have gained importance in predicting, detecting, translating, and classifying diverse inputs. In bioinformatics, DLMs have been used to predict protein structures, transcription factor-binding sites, and promoters. In this work, we propose a hybrid model, named Deep Regulation (DeepReg), to identify transcription factors (TFs) among prokaryotic and eukaryotic protein sequences. The model combines two architectures: a convolutional neural network (CNN) and a bidirectional long short-term memory (BiLSTM) network. DeepReg reached a precision of 0.99, a recall of 0.97, and an F1-score of 0.98. The quality of our predictions, the bias-variance trade-off, and the characterization of new TF predictions were evaluated and compared against those produced by DeepTFactor, as well as against experimental data from three model organisms. Predictions based on our DLM tended to exhibit less variance and bias than those from DeepTFactor, thus increasing reliability and decreasing overfitting.
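The three reported metrics are related: the F1-score is the harmonic mean of precision and recall. The following minimal sketch checks that the reported values are mutually consistent. The function name `f1_score` and the variable names are illustrative choices, not taken from the DeepReg code; only the three numbers come from the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    # Guard against division by zero when both metrics are 0.
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract.
reported_precision = 0.99
reported_recall = 0.97

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 2))  # prints 0.98, matching the reported F1-score
```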
Media type: |
E-article |
---|
Year of publication: |
2024 |
---|---|
Published: |
2024 |
Contained in: |
To the parent record - volume:14 |
---|---|
Contained in: |
Scientific reports - 14(2024), 1, 21 Apr., page 9155 |
Language: |
English |
---|
Contributors: |
Ledesma-Dominguez, Leonardo [author] |
---|
Subjects: |
Journal Article |
---|
Notes: |
Date Completed 23.04.2024; Date Revised 24.04.2024; published: Electronic; Citation Status MEDLINE |
---|
doi: |
10.1038/s41598-024-59487-5 |
---|
funding: |
|
---|---|
Funding institution / Project title: |
|
PPN (catalog ID): |
NLM371333644 |
---|
LEADER | 01000caa a22002652 4500 | ||
---|---|---|---|
001 | NLM371333644 | ||
003 | DE-627 | ||
005 | 20240424232254.0 | ||
007 | cr uuu---uuuuu | ||
008 | 240422s2024 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1038/s41598-024-59487-5 |2 doi | |
028 | 5 | 2 | |a pubmed24n1385.xml |
035 | |a (DE-627)NLM371333644 | ||
035 | |a (NLM)38644393 | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
100 | 1 | |a Ledesma-Dominguez, Leonardo |e verfasserin |4 aut | |
245 | 1 | 0 | |a DeepReg |b a deep learning hybrid model for predicting transcription factors in eukaryotic and prokaryotic genomes |
264 | 1 | |c 2024 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
500 | |a Date Completed 23.04.2024 | ||
500 | |a Date Revised 24.04.2024 | ||
500 | |a published: Electronic | ||
500 | |a Citation Status MEDLINE | ||
520 | |a © 2024. The Author(s). | ||
520 | |a Deep learning models (DLMs) have gained importance in predicting, detecting, translating, and classifying a diversity of inputs. In bioinformatics, DLMs have been used to predict protein structures, transcription factor-binding sites, and promoters. In this work, we propose a hybrid model to identify transcription factors (TFs) among prokaryotic and eukaryotic protein sequences, named Deep Regulation (DeepReg) model. Two architectures were used in the DL model: a convolutional neural network (CNN), and a bidirectional long-short-term memory (BiLSTM). DeepReg reached a precision of 0.99, a recall of 0.97, and an F1-score of 0.98. The quality of our predictions, the bias-variance trade-off approach, and the characterization of new TF predictions were evaluated and compared against those produced by DeepTFactor, as well as against experimental data from three model organisms. Predictions based on our DLM tended to exhibit less variance and bias than those from DeepTFactor, thus increasing reliability and decreasing overfitting | ||
650 | 4 | |a Journal Article | |
650 | 4 | |a Research Support, Non-U.S. Gov't | |
650 | 7 | |a Transcription Factors |2 NLM | |
700 | 1 | |a Carbajal-Degante, Erik |e verfasserin |4 aut | |
700 | 1 | |a Moreno-Hagelsieb, Gabriel |e verfasserin |4 aut | |
700 | 1 | |a Perez-Rueda, Ernesto |e verfasserin |4 aut | |
773 | 0 | 8 | |i Enthalten in |t Scientific reports |d 2011 |g 14(2024), 1 vom: 21. Apr., Seite 9155 |w (DE-627)NLM215703936 |x 2045-2322 |7 nnns |
773 | 1 | 8 | |g volume:14 |g year:2024 |g number:1 |g day:21 |g month:04 |g pages:9155 |
856 | 4 | 0 | |u http://dx.doi.org/10.1038/s41598-024-59487-5 |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a GBV_NLM | ||
951 | |a AR | ||
952 | |d 14 |j 2024 |e 1 |b 21 |c 04 |h 9155 |