Unsupervised meta-clustering identifies risk clusters in acute myeloid leukemia based on clinical and genetic profiles
© 2023. The Author(s).
BACKGROUND: Increasingly large and complex biomedical data sets challenge conventional hypothesis-driven analytical approaches; however, data-driven unsupervised learning can detect inherent patterns in such data sets.
METHODS: While unsupervised analysis in the medical literature commonly utilizes only a single clustering algorithm for a given data set, we developed a large-scale model with 605 different combinations of target dimensionalities, transformation algorithms and clustering algorithms, followed by meta-clustering of the individual results. With this model, we investigated a large cohort of 1383 patients from 59 centers in Germany with newly diagnosed acute myeloid leukemia for whom 212 clinical, laboratory, cytogenetic and molecular genetic parameters were available.
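The idea of combining many base clusterings into one meta-clustering can be illustrated with a minimal sketch. The abstract does not state which transformations, clustering algorithms or meta-clustering method were used, so everything below is an assumption for illustration: a toy distance-threshold clusterer stands in for the clustering algorithms, keeping the first features stands in for dimensionality reduction, and a co-association matrix (the fraction of runs in which two samples share a cluster) serves as one possible meta-clustering step. All function names and the toy data are hypothetical.

```python
from itertools import product
from math import dist


def components(n, linked):
    """Label the connected components of an n-node graph; linked(i, j) -> bool."""
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and linked(i, j):
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


def minmax(points):
    """Rescale every feature to the range [0, 1] (stand-in transformation)."""
    cols = list(zip(*points))
    lo = [min(c) for c in cols]
    span = [(max(c) - min(c)) or 1.0 for c in cols]
    return [tuple((v - l) / s for v, l, s in zip(p, lo, span)) for p in points]


def base_clustering(points, n_dims, radius):
    """Toy base clusterer: keep the first n_dims features, link points closer than radius."""
    proj = [p[:n_dims] for p in points]
    return components(len(proj), lambda i, j: dist(proj[i], proj[j]) < radius)


def co_association(labelings):
    """Fraction of base clusterings that place each pair of samples together."""
    n = len(labelings[0])
    runs = len(labelings)
    return [[sum(l[i] == l[j] for l in labelings) / runs for j in range(n)]
            for i in range(n)]


def meta_cluster(points, dims=(1, 2), radii=(0.05, 0.15, 0.5, 1.0), agreement=0.5):
    """Run every (dimensionality, radius) combination, then cluster the agreement."""
    scaled = minmax(points)
    labelings = [base_clustering(scaled, d, r) for d, r in product(dims, radii)]
    coassoc = co_association(labelings)
    # Samples that co-cluster in more than `agreement` of the runs are merged.
    consensus = components(len(points), lambda i, j: coassoc[i][j] > agreement)
    return consensus, coassoc


# Hypothetical toy data: two well-separated groups of four samples each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
consensus, coassoc = meta_cluster(points)
print(consensus)  # [0, 0, 0, 0, 1, 1, 1, 1]
```

In this toy grid the individual runs disagree: the smallest radius fragments the groups and the largest radius on one feature merges everything, yet the consensus step still recovers the two groups because it relies only on pairs that most runs agree on.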
RESULTS: Unsupervised learning identifies four distinct patient clusters, and statistical analysis shows significant differences in rate of complete remissions, event-free, relapse-free and overall survival between the four clusters. In comparison to the standard-of-care hypothesis-driven European LeukemiaNet (ELN2017) risk stratification model, we find all three ELN2017 risk categories represented in all four clusters in varying proportions, indicating a complexity of AML biology that currently established risk stratification models do not capture. Further, using the assigned clusters as labels, we subsequently train a supervised model to validate cluster assignments on a large external multicenter cohort of 664 intensively treated AML patients.
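Using cluster assignments as labels for a supervised model can be sketched as follows. The abstract does not name the supervised model, so a nearest-centroid classifier is used here purely as an assumed stand-in; the function names, feature values and cohorts are hypothetical toy data, not study data.

```python
from math import dist


def fit_centroids(samples, labels):
    """Average the feature vectors of all samples sharing a cluster label."""
    groups = {}
    for x, y in zip(samples, labels):
        groups.setdefault(y, []).append(x)
    return {y: tuple(sum(col) / len(col) for col in zip(*xs))
            for y, xs in groups.items()}


def predict(centroids, sample):
    """Assign a sample to the cluster whose centroid is closest."""
    return min(centroids, key=lambda y: dist(centroids[y], sample))


# Hypothetical discovery cohort with labels from an unsupervised step.
discovery = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
             (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
cluster_labels = [0, 0, 0, 1, 1, 1]
centroids = fit_centroids(discovery, cluster_labels)

# Hypothetical external cohort that the trained model assigns to clusters.
external = [(0.5, 0.5), (9.0, 12.0), (2.0, 1.0)]
assigned = [predict(centroids, x) for x in external]
print(assigned)  # [0, 1, 0]
```

Once external patients carry a cluster assignment, outcome measures can be compared between clusters in that cohort in the same way as in the discovery cohort.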
CONCLUSIONS: In the context of increasingly complex medical data, dynamic data-driven models are likely more suitable for risk stratification than rigid hypothesis-driven models, allowing more personalized treatment allocation and novel insights into disease biology.
| Field | Value |
|---|---|
| Media type | E-article |
| Year of publication | 2023 |
| Published | 2023 |
| Contained in | To the complete record - volume:3 |
| Contained in | Communications medicine - 3(2023), 1, 17 May, page 68 |
| Language | English |
| Notes | Date Revised 20.05.2023; published: Electronic; Citation Status PubMed-not-MEDLINE |
| doi | 10.1038/s43856-023-00298-6 |
| PPN (catalog ID) | NLM357008138 |

LEADER | 01000naa a22002652 4500 | ||
---|---|---|---|
001 | NLM357008138 | ||
003 | DE-627 | ||
005 | 20231226071648.0 | ||
007 | cr uuu---uuuuu | ||
008 | 231226s2023 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1038/s43856-023-00298-6 |2 doi | |
028 | 5 | 2 | |a pubmed24n1189.xml |
035 | |a (DE-627)NLM357008138 | ||
035 | |a (NLM)37198246 | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
100 | 1 | |a Eckardt, Jan-Niklas |e verfasserin |4 aut | |
245 | 1 | 0 | |a Unsupervised meta-clustering identifies risk clusters in acute myeloid leukemia based on clinical and genetic profiles |
264 | 1 | |c 2023 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
500 | |a Date Revised 20.05.2023 | ||
500 | |a published: Electronic | ||
500 | |a Citation Status PubMed-not-MEDLINE | ||
520 | |a © 2023. The Author(s). | ||
520 | |a BACKGROUND: Increasingly large and complex biomedical data sets challenge conventional hypothesis-driven analytical approaches, however, data-driven unsupervised learning can detect inherent patterns in such data sets | ||
520 | |a METHODS: While unsupervised analysis in the medical literature commonly only utilizes a single clustering algorithm for a given data set, we developed a large-scale model with 605 different combinations of target dimensionalities as well as transformation and clustering algorithms and subsequent meta-clustering of individual results. With this model, we investigated a large cohort of 1383 patients from 59 centers in Germany with newly diagnosed acute myeloid leukemia for whom 212 clinical, laboratory, cytogenetic and molecular genetic parameters were available | ||
520 | |a RESULTS: Unsupervised learning identifies four distinct patient clusters, and statistical analysis shows significant differences in rate of complete remissions, event-free, relapse-free and overall survival between the four clusters. In comparison to the standard-of-care hypothesis-driven European Leukemia Net (ELN2017) risk stratification model, we find all three ELN2017 risk categories being represented in all four clusters in varying proportions indicating unappreciated complexity of AML biology in current established risk stratification models. Further, by using assigned clusters as labels we subsequently train a supervised model to validate cluster assignments on a large external multicenter cohort of 664 intensively treated AML patients | ||
520 | |a CONCLUSIONS: Dynamic data-driven models are likely more suitable for risk stratification in the context of increasingly complex medical data than rigid hypothesis-driven models to allow for a more personalized treatment allocation and gain novel insights into disease biology | ||
650 | 4 | |a Journal Article | |
700 | 1 | |a Röllig, Christoph |e verfasserin |4 aut | |
700 | 1 | |a Metzeler, Klaus |e verfasserin |4 aut | |
700 | 1 | |a Heisig, Peter |e verfasserin |4 aut | |
700 | 1 | |a Stasik, Sebastian |e verfasserin |4 aut | |
700 | 1 | |a Georgi, Julia-Annabell |e verfasserin |4 aut | |
700 | 1 | |a Kroschinsky, Frank |e verfasserin |4 aut | |
700 | 1 | |a Stölzel, Friedrich |e verfasserin |4 aut | |
700 | 1 | |a Platzbecker, Uwe |e verfasserin |4 aut | |
700 | 1 | |a Spiekermann, Karsten |e verfasserin |4 aut | |
700 | 1 | |a Krug, Utz |e verfasserin |4 aut | |
700 | 1 | |a Braess, Jan |e verfasserin |4 aut | |
700 | 1 | |a Görlich, Dennis |e verfasserin |4 aut | |
700 | 1 | |a Sauerland, Cristina |e verfasserin |4 aut | |
700 | 1 | |a Woermann, Bernhard |e verfasserin |4 aut | |
700 | 1 | |a Herold, Tobias |e verfasserin |4 aut | |
700 | 1 | |a Hiddemann, Wolfgang |e verfasserin |4 aut | |
700 | 1 | |a Müller-Tidow, Carsten |e verfasserin |4 aut | |
700 | 1 | |a Serve, Hubert |e verfasserin |4 aut | |
700 | 1 | |a Baldus, Claudia D |e verfasserin |4 aut | |
700 | 1 | |a Schäfer-Eckart, Kerstin |e verfasserin |4 aut | |
700 | 1 | |a Kaufmann, Martin |e verfasserin |4 aut | |
700 | 1 | |a Krause, Stefan W |e verfasserin |4 aut | |
700 | 1 | |a Hänel, Mathias |e verfasserin |4 aut | |
700 | 1 | |a Berdel, Wolfgang E |e verfasserin |4 aut | |
700 | 1 | |a Schliemann, Christoph |e verfasserin |4 aut | |
700 | 1 | |a Mayer, Jiri |e verfasserin |4 aut | |
700 | 1 | |a Hanoun, Maher |e verfasserin |4 aut | |
700 | 1 | |a Schetelig, Johannes |e verfasserin |4 aut | |
700 | 1 | |a Wendt, Karsten |e verfasserin |4 aut | |
700 | 1 | |a Bornhäuser, Martin |e verfasserin |4 aut | |
700 | 1 | |a Thiede, Christian |e verfasserin |4 aut | |
700 | 1 | |a Middeke, Jan Moritz |e verfasserin |4 aut | |
773 | 0 | 8 | |i Enthalten in |t Communications medicine |d 2021 |g 3(2023), 1 vom: 17. Mai, Seite 68 |w (DE-627)NLM330650033 |x 2730-664X |7 nnns |
773 | 1 | 8 | |g volume:3 |g year:2023 |g number:1 |g day:17 |g month:05 |g pages:68 |
856 | 4 | 0 | |u http://dx.doi.org/10.1038/s43856-023-00298-6 |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a GBV_NLM | ||
951 | |a AR | ||
952 | |d 3 |j 2023 |e 1 |b 17 |c 05 |h 68 |