Discovery of a non-canonical GRHL1 binding site using deep convolutional and recurrent neural networks
© 2023. The Author(s).
BACKGROUND: Transcription factors regulate gene expression by binding to transcription factor binding sites (TFBSs). Most models for predicting TFBSs are based on position weight matrices (PWMs), which require a specific motif to be present in the DNA sequence and do not consider interdependencies of nucleotides. Novel approaches such as Transcription Factor Flexible Models or recurrent neural networks consequently provide higher accuracies. However, it is unclear whether such approaches can uncover novel non-canonical, hitherto unexpected TFBSs relevant to human transcriptional regulation.
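To illustrate why a PWM cannot capture interdependencies between nucleotides, the following minimal sketch scores a candidate site as a sum of independent per-position log-odds terms. The frequency matrix `PFM`, the uniform background of 0.25, and the function names are illustrative assumptions, not GRHL1 data or code from the study.

```python
import math

# Hypothetical position frequency matrix for a 4-bp toy motif with
# consensus ACGT (illustrative values only, not GRHL1 data).
PFM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
BACKGROUND = 0.25  # assumed uniform background frequency per nucleotide


def pwm_score(site):
    """Sum of per-position log-odds scores; each position contributes independently."""
    if len(site) != len(PFM):
        raise ValueError("site length must equal motif width")
    return sum(math.log2(col[base] / BACKGROUND) for col, base in zip(PFM, site))


def best_window(sequence):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(PFM)
    windows = [
        (pwm_score(sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(windows)
```

Because the score is a plain sum, changing one base shifts the total by the same amount regardless of the neighbouring bases. That is the independence assumption that models aware of nucleotide context are meant to relax.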
RESULTS: In this study, we trained a convolutional recurrent neural network with HT-SELEX data for GRHL1 binding and applied it to a set of GRHL1 binding sites obtained from ChIP-Seq experiments in human cells. We identified 46 non-canonical GRHL1 binding sites that were not found by a conventional PWM approach. Unexpectedly, some of the newly predicted binding sequences lacked the CNNG core motif, so far considered obligatory for GRHL1 binding. Using isothermal titration calorimetry, we experimentally confirmed binding between the GRHL1 DNA-binding domain and predicted GRHL1 binding sites, including a non-canonical GRHL1 binding site. Mutagenesis of individual nucleotides revealed a correlation between predicted binding strength and experimentally measured binding affinity across representative sequences. This correlation was observed neither with a PWM-based approach nor with another deep learning approach.
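As a rough illustration of what it means for a sequence to lack the CNNG core motif, the sketch below lists every position at which a C is followed three bases later by a G. The function name and the example sequences are hypothetical and are not taken from the study.

```python
import re

# C, any two nucleotides, G. The lookahead makes overlapping matches visible.
CNNG = re.compile(r"(?=C[ACGT]{2}G)")


def cnng_positions(sequence):
    """Return 0-based start positions of all CNNG occurrences (overlaps included)."""
    return [m.start() for m in CNNG.finditer(sequence.upper())]


def lacks_cnng_core(sequence):
    """True if the sequence contains no CNNG occurrence at all."""
    return not cnng_positions(sequence)
```

The reverse complement of CNNG is again CNNG, so scanning one strand is enough for this pattern.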
CONCLUSIONS: Our results show that convolutional recurrent neural networks may uncover unanticipated binding sites and facilitate quantitative transcription factor binding predictions.
Media type: |
E-article |
---|
Year of publication: |
2023 |
---|---|
Published: |
2023 |
Contained in: |
To the parent record - volume:24 |
---|---|
Contained in: |
BMC genomics - 24(2023), 1, 4 Dec., page 736 |
Language: |
English |
---|
Contributors: |
Proft, Sebastian [author] |
---|
Links: |
---|
Subjects: |
GRHL1 protein, human |
---|
Notes: |
Date Completed 06.12.2023; Date Revised 07.12.2023; published: Electronic; Citation Status MEDLINE |
---|
doi: |
10.1186/s12864-023-09830-3 |
---|
funding: |
|
---|---|
Funding institution / Project title: |
|
PPN (catalog ID): |
NLM365406732 |
---|
LEADER | 01000naa a22002652 4500 | ||
---|---|---|---|
001 | NLM365406732 | ||
003 | DE-627 | ||
005 | 20231226101608.0 | ||
007 | cr uuu---uuuuu | ||
008 | 231226s2023 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1186/s12864-023-09830-3 |2 doi | |
028 | 5 | 2 | |a pubmed24n1217.xml |
035 | |a (DE-627)NLM365406732 | ||
035 | |a (NLM)38049725 | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
100 | 1 | |a Proft, Sebastian |e verfasserin |4 aut | |
245 | 1 | 0 | |a Discovery of a non-canonical GRHL1 binding site using deep convolutional and recurrent neural networks |
264 | 1 | |c 2023 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
500 | |a Date Completed 06.12.2023 | ||
500 | |a Date Revised 07.12.2023 | ||
500 | |a published: Electronic | ||
500 | |a Citation Status MEDLINE | ||
520 | |a © 2023. The Author(s). | ||
520 | |a BACKGROUND: Transcription factors regulate gene expression by binding to transcription factor binding sites (TFBSs). Most models for predicting TFBSs are based on position weight matrices (PWMs), which require a specific motif to be present in the DNA sequence and do not consider interdependencies of nucleotides. Novel approaches such as Transcription Factor Flexible Models or recurrent neural networks consequently provide higher accuracies. However, it is unclear whether such approaches can uncover novel non-canonical, hitherto unexpected TFBSs relevant to human transcriptional regulation | ||
520 | |a RESULTS: In this study, we trained a convolutional recurrent neural network with HT-SELEX data for GRHL1 binding and applied it to a set of GRHL1 binding sites obtained from ChIP-Seq experiments in human cells. We identified 46 non-canonical GRHL1 binding sites that were not found by a conventional PWM approach. Unexpectedly, some of the newly predicted binding sequences lacked the CNNG core motif, so far considered obligatory for GRHL1 binding. Using isothermal titration calorimetry, we experimentally confirmed binding between the GRHL1 DNA-binding domain and predicted GRHL1 binding sites, including a non-canonical GRHL1 binding site. Mutagenesis of individual nucleotides revealed a correlation between predicted binding strength and experimentally measured binding affinity across representative sequences. This correlation was observed neither with a PWM-based approach nor with another deep learning approach | ||
520 | |a CONCLUSIONS: Our results show that convolutional recurrent neural networks may uncover unanticipated binding sites and facilitate quantitative transcription factor binding predictions | ||
650 | 4 | |a Journal Article | |
650 | 4 | |a Genetics | |
650 | 4 | |a Grainyhead-like 1 | |
650 | 4 | |a Machine learning | |
650 | 4 | |a Neural networks | |
650 | 4 | |a Transcription factor binding | |
650 | 7 | |a Transcription Factors |2 NLM | |
650 | 7 | |a Nucleotides |2 NLM | |
650 | 7 | |a GRHL1 protein, human |2 NLM | |
650 | 7 | |a Repressor Proteins |2 NLM | |
700 | 1 | |a Leiz, Janna |e verfasserin |4 aut | |
700 | 1 | |a Heinemann, Udo |e verfasserin |4 aut | |
700 | 1 | |a Seelow, Dominik |e verfasserin |4 aut | |
700 | 1 | |a Schmidt-Ott, Kai M |e verfasserin |4 aut | |
700 | 1 | |a Rutkiewicz, Maria |e verfasserin |4 aut | |
773 | 0 | 8 | |i Enthalten in |t BMC genomics |d 2000 |g 24(2023), 1 vom: 04. Dez., Seite 736 |w (DE-627)NLM10921594X |x 1471-2164 |7 nnns |
773 | 1 | 8 | |g volume:24 |g year:2023 |g number:1 |g day:04 |g month:12 |g pages:736 |
856 | 4 | 0 | |u http://dx.doi.org/10.1186/s12864-023-09830-3 |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a GBV_NLM | ||
951 | |a AR | ||
952 | |d 24 |j 2023 |e 1 |b 04 |c 12 |h 736 |