ESPRESSO: Robust discovery and quantification of transcript isoforms from error-prone long-read RNA-seq data

Long-read RNA sequencing (RNA-seq) holds great potential for characterizing transcriptome variation and full-length transcript isoforms, but the relatively high error rate of current long-read sequencing platforms poses a major challenge. We present ESPRESSO, a computational tool for robust discovery and quantification of transcript isoforms from error-prone long reads. ESPRESSO jointly considers alignments of all long reads aligned to a gene and uses error profiles of individual reads to improve the identification of splice junctions and the discovery of their corresponding transcript isoforms. On both a synthetic spike-in RNA sample and human RNA samples, ESPRESSO outperforms multiple contemporary tools in not only transcript isoform discovery but also transcript isoform quantification. In total, we generated and analyzed ~1.1 billion nanopore RNA-seq reads covering 30 human tissue samples and three human cell lines. ESPRESSO and its companion dataset provide a useful resource for studying the RNA repertoire of eukaryotic transcriptomes.

Media type:

E-article

Year of publication:

2023

Published:

2023

Contained in:

To the parent record - volume: 9

Contained in:

Science advances - 9(2023), 3, dated 20 Jan., page eabq5072

Language:

English

Contributors:

Gao, Yuan [Author]
Wang, Feng [Author]
Wang, Robert [Author]
Kutschera, Eric [Author]
Xu, Yang [Author]
Xie, Stephan [Author]
Wang, Yuanyuan [Author]
Kadash-Edmondson, Kathryn E [Author]
Lin, Lan [Author]
Xing, Yi [Author]

Subjects:

63231-63-0
Journal Article
Protein Isoforms
RNA

Notes:

Date Completed 24.01.2023

Date Revised 24.11.2023

published: Print-Electronic

Citation Status MEDLINE

doi:

10.1126/sciadv.abq5072

PPN (catalog ID):

NLM351735887