Rapid and accurate identification of ribosomal RNA sequences via deep learning

© The Author(s) 2022. Published by Oxford University Press on behalf of Nucleic Acids Research.

Advances in transcriptomic and translatomic techniques enable in-depth studies of RNA activity profiles and RNA-based regulatory mechanisms. Ribosomal RNA (rRNA) sequences are highly abundant among cellular RNA, but if the target sequences are not polyadenylated, rRNA cannot be easily removed during library preparation and must instead be removed post hoc with computational techniques to accelerate and improve downstream analyses. Here, we describe RiboDetector, a novel software tool based on a Bi-directional Long Short-Term Memory (BiLSTM) neural network, which rapidly and accurately identifies rRNA reads from transcriptomic, metagenomic, metatranscriptomic, noncoding RNA, and ribosome profiling sequence data. Compared with state-of-the-art approaches, RiboDetector produced at least six times fewer misclassifications on the benchmark datasets. Importantly, the few false positives of RiboDetector were not enriched in particular Gene Ontology (GO) terms, suggesting a low bias for downstream functional profiling. RiboDetector also demonstrated remarkable generalizability in detecting novel rRNA sequences that diverge from the training data, with sequence identities of <90%. On a personal computer, RiboDetector processed 40M reads in less than 6 min, which was ∼50 times faster in GPU mode and ∼15 times faster in CPU mode than other methods. RiboDetector is available under a GPL v3.0 license at https://github.com/hzi-bifo/RiboDetector.

Media type:

E-article

Year of publication:

2022

Published:

2022

Contained in:

Nucleic acids research - 50(2022), 10, 10 June, page e60

Language:

English

Contributors:

Deng, Zhi-Luo [Author]
Münch, Philipp C [Author]
Mreches, René [Author]
McHardy, Alice C [Author]

Subjects:

63231-63-0
Journal Article
RNA
RNA, Ribosomal
Research Support, Non-U.S. Gov't

Notes:

Date Completed 10.06.2022

Date Revised 16.07.2022

published: Print

Citation Status MEDLINE

doi:

10.1093/nar/gkac112

PPN (catalog ID):

NLM337203717