Carbohydrate-active enzyme annotation in microbiomes using dbCAN

Carbohydrate-active enzymes (CAZymes) are critically important for human gut health, lignocellulose degradation, global carbon recycling, soil health, and plant disease. We developed dbCAN as a web server in 2012 and have actively maintained it for automated CAZyme annotation. To address data privacy and scalability, we have provided run_dbcan as a standalone software package since 2018, allowing users to perform more secure and scalable CAZyme annotation on their local servers. Here, we offer a comprehensive computational protocol for automated CAZyme annotation of microbiome sequencing data, covering everything from short-read pre-processing to visualization of CAZyme and glycan substrate occurrence and abundance across multiple samples. Using a real-world metagenomic sequencing dataset, this protocol describes commands for dataset and software preparation, metagenome assembly, gene prediction, CAZyme prediction, CAZyme gene cluster (CGC) prediction, glycan substrate prediction, and data visualization. The expected results include publication-quality plots of the abundance of CAZymes, CGCs, and substrates from multiple CAZyme annotation routes (individual sample assembly, co-assembly, and assembly-free). For the individual sample assembly route, this protocol takes ∼33 h on a Linux computer with 40 CPUs; the other routes are faster. The protocol requires no programming experience, but it does assume familiarity with the Linux command-line interface and the ability to run Python scripts in the terminal. The target audience includes the tens of thousands of microbiome researchers who routinely use our web server. This protocol will encourage them to perform more secure, rapid, and scalable CAZyme annotation on their local computer servers.

Media type:

E-article

Year of publication:

2024

Published:

2024

Contained in:

bioRxiv : the preprint server for biology - (2024), 11 Jan.

Language:

English

Contributors:

Zheng, Jinfang [Author]
Huang, Le [Author]
Yi, Haidong [Author]
Yan, Yuchen [Author]
Zhang, Xinpeng [Author]
Akresi, Jerry [Author]
Yin, Yanbin [Author]

Subjects:

Preprint

Notes:

Date Revised 23.01.2024

published: Electronic

Citation Status PubMed-not-MEDLINE

doi:

10.1101/2024.01.10.575125

PPN (catalog ID):

NLM36750801X