Procrustes is a machine-learning approach that removes cross-platform batch effects from clinical RNA sequencing data

© 2024. The Author(s).

With the increased use of gene expression profiling in personalized oncology, optimized RNA sequencing (RNA-seq) protocols and algorithms are needed to provide comparable expression measurements between exome capture (EC)-based and poly-A RNA-seq. Here, we developed and optimized an EC-based protocol for processing formalin-fixed, paraffin-embedded samples, together with a machine-learning algorithm, Procrustes, that overcomes batch effects across RNA-seq data obtained with different sample preparation protocols, such as EC-based and poly-A RNA-seq. When Procrustes was applied to samples processed with both EC and poly-A RNA-seq protocols, the expression of 61% of genes (N = 20,062) correlated across the two protocols (concordance correlation coefficient > 0.8, versus 26% before transformation by Procrustes), including 84% of cancer-specific and cancer microenvironment-related genes (N = 1,438; versus 36% before applying Procrustes). Benchmarking analyses also showed that Procrustes outperforms other batch correction methods. Finally, we showed that Procrustes can project RNA-seq data for a single sample onto a larger cohort of RNA-seq data. Future application of Procrustes will enable direct gene expression analysis for single tumor samples to support gene expression-based treatment decisions.
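The per-gene agreement criterion quoted in the abstract (concordance correlation coefficient > 0.8) can be illustrated with a minimal sketch. It assumes Lin's concordance correlation coefficient with population (1/n) moments; the abstract does not state which estimator the authors used. The function names `concordance_ccc` and `fraction_concordant`, the gene labels, and the toy values are hypothetical and are not taken from the paper or from the Procrustes software.

```python
from statistics import fmean


def concordance_ccc(x, y):
    """Lin's concordance correlation coefficient for paired measurements.

    Assumption: population (1/n) variances and covariance are used.
    Returns NaN when both inputs are constant and equal (undefined case).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = fmean(x), fmean(y)
    var_x = fmean([(a - mean_x) ** 2 for a in x])
    var_y = fmean([(b - mean_y) ** 2 for b in y])
    cov_xy = fmean([(a - mean_x) * (b - mean_y) for a, b in zip(x, y)])
    denom = var_x + var_y + (mean_x - mean_y) ** 2
    if denom == 0:
        return float("nan")
    return 2 * cov_xy / denom


def fraction_concordant(ec, polya, threshold=0.8):
    """Fraction of shared genes whose CCC across paired samples exceeds threshold.

    `ec` and `polya` map a gene name to its expression values, one value per
    sample, with samples in the same order in both dictionaries.
    """
    genes = sorted(set(ec) & set(polya))
    if not genes:
        raise ValueError("no genes in common")
    # A NaN CCC compares as False, so undefined genes count as non-concordant.
    hits = sum(concordance_ccc(ec[g], polya[g]) > threshold for g in genes)
    return hits / len(genes)


if __name__ == "__main__":
    # Hypothetical toy data: GENE_A agrees exactly, GENE_B is shifted by 1.
    ec = {"GENE_A": [1.0, 2.0, 3.0, 4.0], "GENE_B": [1.0, 2.0, 3.0, 4.0]}
    polya = {"GENE_A": [1.0, 2.0, 3.0, 4.0], "GENE_B": [2.0, 3.0, 4.0, 5.0]}
    print(fraction_concordant(ec, polya))
```

Unlike the Pearson correlation, this coefficient penalizes a constant offset between protocols: the shifted gene in the toy data is perfectly correlated yet falls below the 0.8 threshold, which is why such a measure is a reasonable fit for checking whether a batch correction makes values directly comparable.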

Media type:

E-article

Year of publication:

2024

Published:

2024

Contained in:

Communications biology - 7(2024), 1, dated 30 March, page 392

Language:

English

Contributors:

Kotlov, Nikita [Author]
Shaposhnikov, Kirill [Author]
Tazearslan, Cagdas [Author]
Chasse, Madison [Author]
Baisangurov, Artur [Author]
Podsvirova, Svetlana [Author]
Fernandez, Dawn [Author]
Abdou, Mary [Author]
Kaneunyenye, Leznath [Author]
Morgan, Kelley [Author]
Cheremushkin, Ilya [Author]
Zemskiy, Pavel [Author]
Chelushkin, Maxim [Author]
Sorokina, Maria [Author]
Belova, Ekaterina [Author]
Khorkova, Svetlana [Author]
Lozinsky, Yaroslav [Author]
Nuzhdina, Katerina [Author]
Vasileva, Elena [Author]
Kravchenko, Dmitry [Author]
Suryamohan, Kushal [Author]
Nomie, Krystle [Author]
Curran, John [Author]
Fowler, Nathan [Author]
Bagaev, Alexander [Author]

Subjects:

63231-63-0
Journal Article
RNA

Notes:

Date Completed 01.04.2024

Date Revised 02.04.2024

published: Electronic

Citation Status MEDLINE

doi:

10.1038/s42003-024-06020-z

PPN (catalog ID):

NLM370449290