Deep learning features encode interpretable morphologies within histological images
© 2022. The Author(s)..
Convolutional neural networks (CNNs) are revolutionizing digital pathology by enabling machine learning-based classification of a variety of phenotypes from hematoxylin and eosin (H&E) whole slide images (WSIs), but the interpretation of CNNs remains difficult. Most studies have considered interpretability in a post hoc fashion, e.g. by presenting example regions with strongly predicted class labels. However, such an approach does not explain the biological features that contribute to correct predictions. To address this problem, here we investigate the interpretability of H&E-derived CNN features (the feature weights in the final layer of a transfer-learning-based architecture). While many studies have incorporated CNN features into predictive models, there has been little empirical study of their properties. We show such features can be construed as abstract morphological genes ("mones") with strong independent associations to biological phenotypes. Many mones are specific to individual cancer types, while others are found in multiple cancers especially from related tissue types. We also observe that mone-mone correlations are strong and robustly preserved across related cancers. Importantly, linear mone-based classifiers can very accurately separate 38 distinct classes (19 tumor types and their adjacent normals, AUC = [Formula: see text] for each class prediction), and linear classifiers are also highly effective for universal tumor detection (AUC = [Formula: see text]). This linearity provides evidence that individual mones or correlated mone clusters may be associated with interpretable histopathological features or other patient characteristics. In particular, the statistical similarity of mones to gene expression values allows integrative mone analysis via expression-based bioinformatics approaches. We observe strong correlations between individual mones and individual gene expression values, notably mones associated with collagen gene expression in ovarian cancer. Mone-expression comparisons also indicate that immunoglobulin expression can be identified using mones in colon adenocarcinoma and that immune activity can be identified across multiple cancer types, and we verify these findings by expert histopathological review. Our work demonstrates that mones provide a morphological H&E decomposition that can be effectively associated with diverse phenotypes, analogous to the interpretability of transcription via gene expression values. Our work also demonstrates mones can be interpreted without using a classifier as a proxy.
Medienart: |
E-Artikel |
---|
Erscheinungsjahr: |
2022 |
---|---|
Erschienen: |
2022 |
Enthalten in: |
Zur Gesamtaufnahme - volume:12 |
---|---|
Enthalten in: |
Scientific reports - 12(2022), 1 vom: 08. Juni, Seite 9428 |
Sprache: |
Englisch |
---|
Beteiligte Personen: |
Foroughi Pour, Ali [VerfasserIn] |
---|
Links: |
---|
Themen: |
---|
Anmerkungen: |
Date Completed 10.06.2022 Date Revised 13.11.2022 published: Electronic Citation Status MEDLINE |
---|
doi: |
10.1038/s41598-022-13541-2 |
---|
funding: |
|
---|---|
Förderinstitution / Projekttitel: |
|
PPN (Katalog-ID): |
NLM341981656 |
---|
LEADER | 01000naa a22002652 4500 | ||
---|---|---|---|
001 | NLM341981656 | ||
003 | DE-627 | ||
005 | 20231226204817.0 | ||
007 | cr uuu---uuuuu | ||
008 | 231226s2022 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1038/s41598-022-13541-2 |2 doi | |
028 | 5 | 2 | |a pubmed24n1139.xml |
035 | |a (DE-627)NLM341981656 | ||
035 | |a (NLM)35676395 | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
100 | 1 | |a Foroughi Pour, Ali |e verfasserin |4 aut | |
245 | 1 | 0 | |a Deep learning features encode interpretable morphologies within histological images |
264 | 1 | |c 2022 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a ƒaComputermedien |b c |2 rdamedia | ||
338 | |a ƒa Online-Ressource |b cr |2 rdacarrier | ||
500 | |a Date Completed 10.06.2022 | ||
500 | |a Date Revised 13.11.2022 | ||
500 | |a published: Electronic | ||
500 | |a Citation Status MEDLINE | ||
520 | |a © 2022. The Author(s). | ||
520 | |a Convolutional neural networks (CNNs) are revolutionizing digital pathology by enabling machine learning-based classification of a variety of phenotypes from hematoxylin and eosin (H&E) whole slide images (WSIs), but the interpretation of CNNs remains difficult. Most studies have considered interpretability in a post hoc fashion, e.g. by presenting example regions with strongly predicted class labels. However, such an approach does not explain the biological features that contribute to correct predictions. To address this problem, here we investigate the interpretability of H&E-derived CNN features (the feature weights in the final layer of a transfer-learning-based architecture). While many studies have incorporated CNN features into predictive models, there has been little empirical study of their properties. We show such features can be construed as abstract morphological genes ("mones") with strong independent associations to biological phenotypes. Many mones are specific to individual cancer types, while others are found in multiple cancers especially from related tissue types. We also observe that mone-mone correlations are strong and robustly preserved across related cancers. Importantly, linear mone-based classifiers can very accurately separate 38 distinct classes (19 tumor types and their adjacent normals, AUC = [Formula: see text] for each class prediction), and linear classifiers are also highly effective for universal tumor detection (AUC = [Formula: see text]). This linearity provides evidence that individual mones or correlated mone clusters may be associated with interpretable histopathological features or other patient characteristics. In particular, the statistical similarity of mones to gene expression values allows integrative mone analysis via expression-based bioinformatics approaches. We observe strong correlations between individual mones and individual gene expression values, notably mones associated with collagen gene expression in ovarian cancer. Mone-expression comparisons also indicate that immunoglobulin expression can be identified using mones in colon adenocarcinoma and that immune activity can be identified across multiple cancer types, and we verify these findings by expert histopathological review. Our work demonstrates that mones provide a morphological H&E decomposition that can be effectively associated with diverse phenotypes, analogous to the interpretability of transcription via gene expression values. Our work also demonstrates mones can be interpreted without using a classifier as a proxy | ||
650 | 4 | |a Journal Article | |
650 | 4 | |a Research Support, N.I.H., Extramural | |
700 | 1 | |a White, Brian S |e verfasserin |4 aut | |
700 | 1 | |a Park, Jonghanne |e verfasserin |4 aut | |
700 | 1 | |a Sheridan, Todd B |e verfasserin |4 aut | |
700 | 1 | |a Chuang, Jeffrey H |e verfasserin |4 aut | |
773 | 0 | 8 | |i Enthalten in |t Scientific reports |d 2011 |g 12(2022), 1 vom: 08. Juni, Seite 9428 |w (DE-627)NLM215703936 |x 2045-2322 |7 nnns |
773 | 1 | 8 | |g volume:12 |g year:2022 |g number:1 |g day:08 |g month:06 |g pages:9428 |
856 | 4 | 0 | |u http://dx.doi.org/10.1038/s41598-022-13541-2 |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a GBV_NLM | ||
951 | |a AR | ||
952 | |d 12 |j 2022 |e 1 |b 08 |c 06 |h 9428 |