RNAirport : a deep neural network-based database characterizing representative gene models in plants

Copyright © 2024 The Authors. Published by Elsevier Ltd. All rights reserved.

The 5'-leader, originally known as the 5'-untranslated region, exists as multiple isoforms because of alternative splicing (aS) and alternative transcription start sites (aTSS). A representative 5'-leader is therefore needed to examine how the embedded RNA regulatory elements control translation efficiency. Here, we develop a ranking algorithm and a deep-learning model to annotate representative 5'-leaders for five plant species. We rank the intra- and inter-sample frequency of aS-mediated transcript isoforms using a Kruskal-Wallis test-based algorithm and identify the representative aS-5'-leader. To further assign a representative 5'-end, we train the deep-learning model 5'leaderP to learn aTSS-mediated 5'-end distribution patterns from cap analysis of gene expression (CAGE) data. The model accurately predicts the 5'-end, as confirmed experimentally in Arabidopsis and rice. The gene models containing representative 5'-leaders and 5'leaderP can be accessed at RNAirport (http://www.rnairport.com/leader5P/). This stage 1 5'-leader annotation records 5'-leader diversity and will pave the way to Ribo-Seq ORF annotation, similar to the project recently initiated by human GENCODE.
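
The abstract does not spell out the ranking algorithm, so the following is only a minimal sketch of how a Kruskal-Wallis-style comparison of isoform usage across samples could look, not the authors' implementation. The function names (`average_ranks`, `kruskal_wallis`, `rank_isoforms`), the input layout (one list of per-sample usage fractions per isoform) and the toy numbers are all assumptions made for illustration. Each isoform is treated as one group, all observations are ranked together, and the isoform with the highest mean rank is taken as the candidate representative.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis(groups):
    """Return (H statistic with tie correction, mean rank of each group)."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    if n < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need non-empty groups and at least two observations")
    ranks = average_ranks(pooled)
    mean_ranks, total, start = [], 0.0, 0
    for g in groups:
        r_sum = sum(ranks[start:start + len(g)])
        mean_ranks.append(r_sum / len(g))
        total += r_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    ties = sum(c ** 3 - c for c in counts.values())
    correction = 1 - ties / (n ** 3 - n)
    return (h / correction if correction > 0 else 0.0), mean_ranks


def rank_isoforms(usage):
    """Order isoforms by mean pooled rank, highest first (hypothetical helper).

    usage maps an isoform name to its usage fractions across samples.
    """
    names = list(usage)
    h, mean_ranks = kruskal_wallis([usage[name] for name in names])
    pairs = sorted(zip(names, mean_ranks), key=lambda p: p[1], reverse=True)
    return [name for name, _ in pairs], h


# Toy, made-up usage fractions for three isoforms of one gene in three samples.
toy_usage = {
    "iso_A": [0.70, 0.60, 0.80],
    "iso_B": [0.20, 0.30, 0.15],
    "iso_C": [0.10, 0.10, 0.05],
}
ordered, h_stat = rank_isoforms(toy_usage)
print("ranking:", ordered, "H =", round(h_stat, 3))
```

In practice the H statistic would be compared against a chi-squared distribution with (number of isoforms - 1) degrees of freedom to decide whether one isoform is consistently dominant; that step, and any handling of intra-sample replicates, is left out of this sketch.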

Media type:

E-article

Year of publication:

2024

Published:

2024

Contained in:

Journal of genetics and genomics = Yi chuan xue bao - (2024), 20 March

Language:

English

Contributors:

Zhu, Sitao [Author]
Yuan, Shu [Author]
Niu, Ruixia [Author]
Zhou, Yulu [Author]
Wang, Zhao [Author]
Xu, Guoyong [Author]

Subjects:

5′-leader
Deep learning
Journal Article
RNA regulatory elements
Synthetic biology
Transcript isoforms
Translational control
uORF

Notes:

Date Revised 22.03.2024

published: Print-Electronic

Citation Status Publisher

doi:

10.1016/j.jgg.2024.03.004

PPN (catalog ID):

NLM370085477