SAMPLER: unsupervised representations for rapid analysis of whole slide tissue images

Copyright © 2023 The Author(s). Published by Elsevier B.V. All rights reserved.

BACKGROUND: Deep learning has revolutionized digital pathology, allowing automatic analysis of hematoxylin and eosin (H&E) stained whole slide images (WSIs) for diverse tasks. WSIs are broken into smaller images called tiles, and a neural network encodes each tile. Many recent works use supervised attention-based models to aggregate tile-level features into a slide-level representation, which is then used for downstream analysis. Training supervised attention-based models is computationally intensive, architecture optimization of the attention module is non-trivial, and labeled data are not always available. Therefore, we developed an unsupervised and fast approach called SAMPLER to generate slide-level representations.

METHODS: Slide-level representations of SAMPLER are generated by encoding the cumulative distribution functions of multiscale tile-level features. To assess effectiveness of SAMPLER, slide-level representations of breast carcinoma (BRCA), non-small cell lung carcinoma (NSCLC), and renal cell carcinoma (RCC) WSIs of The Cancer Genome Atlas (TCGA) were used to train separate classifiers distinguishing tumor subtypes in FFPE and frozen WSIs. In addition, BRCA and NSCLC classifiers were externally validated on frozen WSIs. Moreover, SAMPLER's attention maps identify regions of interest, which were evaluated by a pathologist. To determine time efficiency of SAMPLER, we compared runtime of SAMPLER with two attention-based models. SAMPLER concepts were used to improve the design of a context-aware multi-head attention model (context-MHA).
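The idea of encoding per-feature cumulative distribution functions can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the CDF of each tile-level feature is summarized by its values at evenly spaced quantile levels, and that scales are combined by concatenation. The function names, the `n_quantiles` parameter, and its default are hypothetical.

```python
import numpy as np


def slide_representation(tile_features: np.ndarray, n_quantiles: int = 10) -> np.ndarray:
    """Summarize a slide by per-feature quantiles of its tile features.

    tile_features has shape (n_tiles, n_features). The output has a fixed
    length of n_features * n_quantiles regardless of how many tiles the
    slide contains. Assumption: quantiles stand in for the CDF encoding.
    """
    if tile_features.ndim != 2 or tile_features.shape[0] == 0:
        raise ValueError("expected a non-empty (n_tiles, n_features) array")
    # Evenly spaced quantile levels strictly inside (0, 1).
    levels = (np.arange(n_quantiles) + 0.5) / n_quantiles
    # Shape (n_quantiles, n_features): one inverse-CDF sample per level.
    quantiles = np.quantile(tile_features, levels, axis=0)
    # Flatten feature-major so each feature's quantiles stay adjacent.
    return quantiles.T.reshape(-1)


def multiscale_representation(features_by_scale, n_quantiles: int = 10) -> np.ndarray:
    """Concatenate per-scale representations (hypothetical way to combine scales)."""
    parts = [slide_representation(f, n_quantiles) for f in features_by_scale]
    return np.concatenate(parts)
```

Because no labels or training are involved, a representation of this kind can be computed once per slide and then passed to any standard classifier.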

FINDINGS: SAMPLER-based classifiers performed comparably to state-of-the-art attention-based deep learning models in distinguishing subtypes of BRCA (AUC = 0.911 ± 0.029), NSCLC (AUC = 0.940 ± 0.018), and RCC (AUC = 0.987 ± 0.006) on FFPE WSIs (internal test sets). However, training SAMPLER-based classifiers was >100 times faster. SAMPLER models successfully distinguished tumor subtypes on both internal and external test sets of frozen WSIs. Histopathological review confirmed that SAMPLER-identified high attention tiles contained subtype-specific morphological features. The improved context-MHA distinguished subtypes of BRCA and RCC (BRCA-AUC = 0.921 ± 0.027, RCC-AUC = 0.988 ± 0.010) with increased accuracy on internal test FFPE WSIs.

INTERPRETATION: Our unsupervised statistical approach is fast and effective for analyzing WSIs, with greatly improved scalability over attention-based deep learning methods. The high accuracy of SAMPLER-based classifiers and interpretable attention maps suggest that SAMPLER successfully encodes the distinct morphologies within WSIs and will be applicable to general histology image analysis problems.

FUNDING: This study was supported by the National Cancer Institute (Grant No. R01CA230031 and P30CA034196).

Errata and related records:

UpdateOf: bioRxiv. 2023 Aug 03;:. - PMID 37577691

Media type:

E-article

Year of publication:

2024

Published:

2024

Contained in:

EBioMedicine - 99(2024), 1 Jan., page 104908

Language:

English

Contributors:

Mukashyaka, Patience [Author]
Sheridan, Todd B [Author]
Foroughi Pour, Ali [Author]
Chuang, Jeffrey H [Author]

Links:

Full text

Subjects:

Deep learning
Digital pathology
Journal Article
Multiple instance learning
Representation learning
Unsupervised learning
WSI representation

Notes:

Date Completed 22.01.2024

Date Revised 31.01.2024

published: Print-Electronic

Citation Status MEDLINE

doi:

10.1016/j.ebiom.2023.104908

PPN (catalog ID):

NLM36591987X