metaExpertPro: a computational workflow for metaproteomics spectral library construction and data-independent acquisition mass spectrometry data analysis

Abstract

Background: Analysis of mass spectrometry-based metaproteomic data, in particular large-scale data-independent acquisition MS (DIA-MS) data, remains a computational challenge. Here, we aim to develop a software tool for efficiently constructing spectral libraries and analyzing extensive datasets of DIA-based metaproteomics.

Results: We present a computational pipeline called metaExpertPro for metaproteomics data analysis. This pipeline encompasses spectral library generation using data-dependent acquisition MS (DDA-MS), protein identification and quantification using DIA-MS, functional and taxonomic annotation, as well as quantitative matrix generation for both microbiota and hosts. To enhance accessibility and ease of use, all modules and dependencies are encapsulated within a Docker container. By integrating FragPipe and DIA-NN, metaExpertPro offers compatibility with both Orbitrap-based and PASEF-based DDA and DIA data. To evaluate the depth and accuracy of identification and quantification, we conducted extensive assessments using human fecal samples and benchmark tests. Performance tests on human fecal samples demonstrated that metaExpertPro quantified an average of 45,000 peptides in a 60-minute diaPASEF injection. Notably, metaExpertPro outperformed three existing software tools by characterizing a higher number of peptides and proteins. Importantly, metaExpertPro maintained a low factual false discovery rate (FDR) of less than 5% for protein groups across four benchmark tests. Applying a filter of five peptides per genus, metaExpertPro achieved relatively high accuracy (F-score = 0.67–0.90) in genus diversity and demonstrated a high correlation (Spearman's r = 0.73–0.82) between the measured and true genus relative abundance in benchmark tests. Additionally, the quantitative results at the protein, taxonomy, and function levels exhibited high reproducibility and consistency across the commonly adopted public human gut microbial protein databases IGC and UHGP. In a metaproteomic analysis of dyslipidemia patients, metaExpertPro revealed characteristic alterations in microbial functions and potential interactions between the microbiota and the host.

Conclusions: metaExpertPro presents a robust one-stop computational solution for constructing metaproteomics spectral libraries, analyzing DIA-MS data, and annotating taxonomic as well as functional data.
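The genus-level metrics quoted in the abstract (a minimum-peptide filter per genus, an F-score for genus detection, and a Spearman correlation between measured and true relative abundance) can be illustrated with a minimal sketch. This is not metaExpertPro's implementation: the function names, the toy peptide assignments, and the abundance values below are all hypothetical and chosen only to show how such metrics are typically computed.

```python
from collections import Counter


def genera_passing_filter(peptide_genera, min_peptides=5):
    """Return the genera supported by at least `min_peptides` peptides."""
    counts = Counter(peptide_genera)
    return {genus for genus, n in counts.items() if n >= min_peptides}


def f_score(detected, truth):
    """Harmonic mean of precision and recall for two sets of genera."""
    true_positives = len(detected & truth)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(detected)
    recall = true_positives / len(truth)
    return 2 * precision * recall / (precision + recall)


def _average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = _average_ranks(x), _average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    return cov / (sd_x * sd_y)


# Hypothetical toy data: one genus label per identified peptide.
peptide_genera = (
    ["Bacteroides"] * 8
    + ["Prevotella"] * 6
    + ["Roseburia"] * 5
    + ["Blautia"] * 2       # present in the sample, but below the filter
    + ["Escherichia"] * 5   # passes the filter, but not in the sample
)
true_genera = {"Bacteroides", "Prevotella", "Roseburia", "Blautia"}

detected_genera = genera_passing_filter(peptide_genera, min_peptides=5)
genus_f_score = f_score(detected_genera, true_genera)

# Hypothetical relative abundances for four genera, in the same order.
measured_abundance = [0.40, 0.30, 0.20, 0.10]
true_abundance = [0.35, 0.25, 0.30, 0.10]
abundance_correlation = spearman(measured_abundance, true_abundance)
```

In this toy case the filter trades one false negative (a real genus with too few peptides) for protection against weakly supported genera, which is the kind of trade-off an F-score summarizes.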

Media type:

Preprint

Year of publication:

2023

Published:

2023

Contained in:

bioRxiv.org - (2023), 11 Dec. - year:2023

Language:

English

Contributors:

Sun, Yingying [Author]
Xing, Ziyuan [Author]
Liang, Shuang [Author]
Miao, Zelei [Author]
Zhuo, Lai-bao [Author]
Jiang, Wenhao [Author]
Zhao, Hui [Author]
Gao, Huanhuan [Author]
Xie, Yuting [Author]
Zhou, Yan [Author]
Yue, Liang [Author]
Cai, Xue [Author]
Chen, Yu-ming [Author]
Zheng, Ju-Sheng [Author]
Guo, Tiannan [Author]

Links:

Full text [free of charge]

Subjects:

570
Biology

doi:

10.1101/2023.11.29.569331

PPN (catalog ID):

XBI041716930