Faecal microbiome-based machine learning for multi-class disease diagnosis
© 2022. The Author(s).
Systemic characterisation of the human faecal microbiome provides the opportunity to develop non-invasive approaches in the diagnosis of a major human disease. However, shared microbial signatures across different diseases make accurate diagnosis challenging in single-disease models. Herein, we present a machine-learning multi-class model using a faecal metagenomic dataset of 2,320 individuals with nine well-characterised phenotypes, including colorectal cancer, colorectal adenomas, Crohn's disease, ulcerative colitis, irritable bowel syndrome, obesity, cardiovascular disease, post-acute COVID-19 syndrome and healthy individuals. Our processed data cover 325 microbial species derived from 14.3 terabytes of sequence. The trained model achieves an area under the receiver operating characteristic curve (AUROC) of 0.90 to 0.99 (interquartile range, IQR, 0.91-0.94) in predicting different diseases in the independent test set, with a sensitivity of 0.81 to 0.95 (IQR, 0.87-0.93) at a specificity of 0.76 to 0.98 (IQR, 0.83-0.95). Metagenomic analysis of public datasets comprising 1,597 samples across different populations yields comparable predictions, with an AUROC of 0.69 to 0.91 (IQR, 0.79-0.87). Correlation of the top 50 microbial species with disease phenotypes identifies 363 significant associations (FDR < 0.05). This microbiome-based multi-disease model has potential clinical application in disease diagnostics and treatment response monitoring and warrants further exploration.
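The abstract reports AUROC, sensitivity and specificity separately for each phenotype. The record does not say how these figures were computed, so the following is only a minimal sketch of one common approach: treating each class one-vs-rest and scoring it from the model's predicted class probabilities. The function names, the 0.5 threshold, and the toy labels and probabilities are all hypothetical and are not taken from the study.

```python
from typing import Dict, Sequence, Tuple


def auroc(scores: Sequence[float], is_positive: Sequence[bool]) -> float:
    """AUROC as the chance a positive sample outranks a negative one (ties count 0.5)."""
    pos = [s for s, y in zip(scores, is_positive) if y]
    neg = [s for s, y in zip(scores, is_positive) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sens_spec(scores: Sequence[float], is_positive: Sequence[bool],
              threshold: float) -> Tuple[float, float]:
    """Sensitivity and specificity when scores >= threshold are called positive.

    Assumes both positive and negative samples are present.
    """
    tp = sum(1 for s, y in zip(scores, is_positive) if y and s >= threshold)
    fn = sum(1 for s, y in zip(scores, is_positive) if y and s < threshold)
    tn = sum(1 for s, y in zip(scores, is_positive) if not y and s < threshold)
    fp = sum(1 for s, y in zip(scores, is_positive) if not y and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def one_vs_rest_report(labels: Sequence[str], probs: Sequence[Dict[str, float]],
                       threshold: float = 0.5) -> Dict[str, Dict[str, float]]:
    """Score each class against all other classes combined."""
    report = {}
    for cls in sorted(set(labels)):
        scores = [p[cls] for p in probs]          # predicted probability of this class
        truth = [lab == cls for lab in labels]    # one-vs-rest ground truth
        sens, spec = sens_spec(scores, truth, threshold)
        report[cls] = {"auroc": auroc(scores, truth),
                       "sensitivity": sens, "specificity": spec}
    return report


# Made-up toy data for illustration only (three of the nine phenotypes;
# "CRC" = colorectal cancer, "IBS" = irritable bowel syndrome).
toy_labels = ["CRC", "CRC", "IBS", "IBS", "healthy", "healthy"]
toy_probs = [
    {"CRC": 0.7, "IBS": 0.2, "healthy": 0.1},
    {"CRC": 0.4, "IBS": 0.5, "healthy": 0.1},
    {"CRC": 0.2, "IBS": 0.6, "healthy": 0.2},
    {"CRC": 0.1, "IBS": 0.7, "healthy": 0.2},
    {"CRC": 0.3, "IBS": 0.1, "healthy": 0.6},
    {"CRC": 0.1, "IBS": 0.2, "healthy": 0.7},
]
toy_report = one_vs_rest_report(toy_labels, toy_probs)
for cls, metrics in toy_report.items():
    print(cls, metrics)
```

In this toy example every class ranks perfectly, yet the second colorectal cancer sample falls below the fixed threshold, so sensitivity for that class is lower than its AUROC suggests. This is why ranking quality and threshold-dependent metrics are usually reported side by side.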
Media type: |
E-article |
---|
Year of publication: |
2022 |
---|---|
Published: |
2022 |
Contained in: |
To the complete record - volume:13 |
---|---|
Contained in: |
Nature communications - 13(2022), 1 of 10 Nov., page 6818 |
Language: |
English |
---|
Contributors: |
Su, Qi [author] |
---|
Links: |
---|
Subjects: |
---|
Notes: |
Date Completed 14.11.2022 Date Revised 26.12.2022 published: Electronic Citation Status MEDLINE |
---|
doi: |
10.1038/s41467-022-34405-3 |
---|
funding: |
|
---|---|
Funding institution / project title: |
|
PPN (catalogue ID): |
NLM348709803 |
---|
LEADER | 01000naa a22002652 4500 | ||
---|---|---|---|
001 | NLM348709803 | ||
003 | DE-627 | ||
005 | 20231226040909.0 | ||
007 | cr uuu---uuuuu | ||
008 | 231226s2022 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1038/s41467-022-34405-3 |2 doi | |
028 | 5 | 2 | |a pubmed24n1162.xml |
035 | |a (DE-627)NLM348709803 | ||
035 | |a (NLM)36357393 | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
100 | 1 | |a Su, Qi |e verfasserin |4 aut | |
245 | 1 | 0 | |a Faecal microbiome-based machine learning for multi-class disease diagnosis |
264 | 1 | |c 2022 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
500 | |a Date Completed 14.11.2022 | ||
500 | |a Date Revised 26.12.2022 | ||
500 | |a published: Electronic | ||
500 | |a Citation Status MEDLINE | ||
520 | |a © 2022. The Author(s). | ||
520 | |a Systemic characterisation of the human faecal microbiome provides the opportunity to develop non-invasive approaches in the diagnosis of a major human disease. However, shared microbial signatures across different diseases make accurate diagnosis challenging in single-disease models. Herein, we present a machine-learning multi-class model using faecal metagenomic dataset of 2,320 individuals with nine well-characterised phenotypes, including colorectal cancer, colorectal adenomas, Crohn's disease, ulcerative colitis, irritable bowel syndrome, obesity, cardiovascular disease, post-acute COVID-19 syndrome and healthy individuals. Our processed data covers 325 microbial species derived from 14.3 terabytes of sequence. The trained model achieves an area under the receiver operating characteristic curve (AUROC) of 0.90 to 0.99 (Interquartile range, IQR, 0.91-0.94) in predicting different diseases in the independent test set, with a sensitivity of 0.81 to 0.95 (IQR, 0.87-0.93) at a specificity of 0.76 to 0.98 (IQR 0.83-0.95). Metagenomic analysis from public datasets of 1,597 samples across different populations observes comparable predictions with AUROC of 0.69 to 0.91 (IQR 0.79-0.87). Correlation of the top 50 microbial species with disease phenotypes identifies 363 significant associations (FDR < 0.05). This microbiome-based multi-disease model has potential clinical application in disease diagnostics and treatment response monitoring and warrants further exploration | ||
650 | 4 | |a Journal Article | |
650 | 4 | |a Research Support, Non-U.S. Gov't | |
700 | 1 | |a Liu, Qin |e verfasserin |4 aut | |
700 | 1 | |a Lau, Raphaela Iris |e verfasserin |4 aut | |
700 | 1 | |a Zhang, Jingwan |e verfasserin |4 aut | |
700 | 1 | |a Xu, Zhilu |e verfasserin |4 aut | |
700 | 1 | |a Yeoh, Yun Kit |e verfasserin |4 aut | |
700 | 1 | |a Leung, Thomas W H |e verfasserin |4 aut | |
700 | 1 | |a Tang, Whitney |e verfasserin |4 aut | |
700 | 1 | |a Zhang, Lin |e verfasserin |4 aut | |
700 | 1 | |a Liang, Jessie Q Y |e verfasserin |4 aut | |
700 | 1 | |a Yau, Yuk Kam |e verfasserin |4 aut | |
700 | 1 | |a Zheng, Jiaying |e verfasserin |4 aut | |
700 | 1 | |a Liu, Chengyu |e verfasserin |4 aut | |
700 | 1 | |a Zhang, Mengjing |e verfasserin |4 aut | |
700 | 1 | |a Cheung, Chun Pan |e verfasserin |4 aut | |
700 | 1 | |a Ching, Jessica Y L |e verfasserin |4 aut | |
700 | 1 | |a Tun, Hein M |e verfasserin |4 aut | |
700 | 1 | |a Yu, Jun |e verfasserin |4 aut | |
700 | 1 | |a Chan, Francis K L |e verfasserin |4 aut | |
700 | 1 | |a Ng, Siew C |e verfasserin |4 aut | |
773 | 0 | 8 | |i Enthalten in |t Nature communications |d 2010 |g 13(2022), 1 vom: 10. Nov., Seite 6818 |w (DE-627)NLM199274525 |x 2041-1723 |7 nnns |
773 | 1 | 8 | |g volume:13 |g year:2022 |g number:1 |g day:10 |g month:11 |g pages:6818 |
856 | 4 | 0 | |u http://dx.doi.org/10.1038/s41467-022-34405-3 |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a GBV_NLM | ||
951 | |a AR | ||
952 | |d 13 |j 2022 |e 1 |b 10 |c 11 |h 6818 |