Efficient Supervised Pretraining of Swin-Transformer for Virtual Staining of Microscopy Images
Fluorescence staining is an important technique in the life sciences for labeling cellular constituents. However, it is time-consuming and makes the simultaneous labeling of multiple constituents difficult. Virtual staining, which does not rely on chemical labeling, has therefore been introduced. Recently, deep learning models such as transformers have been applied to virtual staining tasks, but their performance relies on large-scale pretraining, which hinders their development in this field. To reduce the reliance on large amounts of computation and data, we construct a Swin-transformer model and propose an efficient supervised pretraining method based on the masked autoencoder (MAE). Specifically, we adopt downsampling and grid sampling to mask 75% of pixels and reduce the number of tokens, so that our method requires only 1/16 of the pretraining time of the original MAE. We also design a supervised proxy task that predicts stained images in multiple styles instead of reconstructing masked pixels. Additionally, most virtual staining approaches are built on private datasets and evaluated with different metrics, which makes fair comparison difficult. We therefore develop a standard benchmark based on three public datasets and provide a baseline for future researchers. Extensive experiments on the three benchmark datasets show that the proposed method achieves the best performance both quantitatively and qualitatively, and ablation studies confirm the effectiveness of the proposed pretraining method. The benchmark and code are available at https://github.com/birkhoffkiki/CAS-Transformer.
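As a rough illustration of the masking step described in the abstract, the sketch below shows how grid sampling with a fixed 2x2 stride keeps 25% of the pixels (masking 75%) and shrinks the token grid fed to the encoder to a quarter of its original size. This is a minimal, hypothetical PyTorch sketch, not the authors' code; the function name grid_sample_mask, the fixed stride of 2, and the input sizes are our assumptions.

import torch

def grid_sample_mask(images: torch.Tensor, phase=(0, 0)) -> torch.Tensor:
    # Keep every second pixel along H and W, starting at `phase`:
    # one pixel per 2x2 block survives, i.e. 25% visible, 75% masked.
    dy, dx = phase
    return images[:, :, dy::2, dx::2]

# Example: a batch of 256x256 single-channel microscopy patches shrinks to
# 128x128, so the encoder sees only a quarter of the original tokens.
x = torch.randn(8, 1, 256, 256)
visible = grid_sample_mask(x)
print(visible.shape)  # torch.Size([8, 1, 128, 128])

One plausible reading of the reported 1/16 pretraining time: with the quadratic cost of the global self-attention used in the original MAE, a 4x reduction in tokens yields roughly a 16x reduction in attention compute, though the actual speedup depends on the architecture and implementation.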
Media type: E-article
Year of publication: 2024
Published: 2024
Contained in: IEEE transactions on medical imaging - 43(2024), no. 4, 28 Apr., pages 1388-1399
Language: English
Contributors: Ma, Jiabo (author); Chen, Hao (author)
Links: http://dx.doi.org/10.1109/TMI.2023.3337253 (full text)
Topics: Journal Article
Notes: Date Completed 04.04.2024; Date Revised 04.04.2024; published: Print-Electronic; Citation Status MEDLINE
DOI: 10.1109/TMI.2023.3337253
PPN (catalogue ID): NLM365021385
LEADER 01000caa a22002652 4500
001 NLM365021385
003 DE-627
005 20240404234422.0
007 cr uuu---uuuuu
008 231226s2024 xx |||||o 00| ||eng c
024 7_ |a 10.1109/TMI.2023.3337253 |2 doi
028 52 |a pubmed24n1364.xml
035 __ |a (DE-627)NLM365021385
035 __ |a (NLM)38010933
040 __ |a DE-627 |b ger |c DE-627 |e rakwb
041 __ |a eng
100 1_ |a Ma, Jiabo |e verfasserin |4 aut
245 10 |a Efficient Supervised Pretraining of Swin-Transformer for Virtual Staining of Microscopy Images
264 _1 |c 2024
336 __ |a Text |b txt |2 rdacontent
337 __ |a Computermedien |b c |2 rdamedia
338 __ |a Online-Ressource |b cr |2 rdacarrier
500 __ |a Date Completed 04.04.2024
500 __ |a Date Revised 04.04.2024
500 __ |a published: Print-Electronic
500 __ |a Citation Status MEDLINE
520 __ |a Fluorescence staining is an important technique in the life sciences for labeling cellular constituents. However, it is time-consuming and makes the simultaneous labeling of multiple constituents difficult. Virtual staining, which does not rely on chemical labeling, has therefore been introduced. Recently, deep learning models such as transformers have been applied to virtual staining tasks, but their performance relies on large-scale pretraining, which hinders their development in this field. To reduce the reliance on large amounts of computation and data, we construct a Swin-transformer model and propose an efficient supervised pretraining method based on the masked autoencoder (MAE). Specifically, we adopt downsampling and grid sampling to mask 75% of pixels and reduce the number of tokens, so that our method requires only 1/16 of the pretraining time of the original MAE. We also design a supervised proxy task that predicts stained images in multiple styles instead of reconstructing masked pixels. Additionally, most virtual staining approaches are built on private datasets and evaluated with different metrics, which makes fair comparison difficult. We therefore develop a standard benchmark based on three public datasets and provide a baseline for future researchers. Extensive experiments on the three benchmark datasets show that the proposed method achieves the best performance both quantitatively and qualitatively, and ablation studies confirm the effectiveness of the proposed pretraining method. The benchmark and code are available at https://github.com/birkhoffkiki/CAS-Transformer
650 _4 |a Journal Article
700 1_ |a Chen, Hao |e verfasserin |4 aut
773 08 |i Enthalten in |t IEEE transactions on medical imaging |d 1982 |g 43(2024), 4 vom: 28. Apr., Seite 1388-1399 |w (DE-627)NLM082855269 |x 1558-254X |7 nnns
773 18 |g volume:43 |g year:2024 |g number:4 |g day:28 |g month:04 |g pages:1388-1399
856 40 |u http://dx.doi.org/10.1109/TMI.2023.3337253 |3 Volltext
912 __ |a GBV_USEFLAG_A
912 __ |a GBV_NLM
951 __ |a AR
952 __ |d 43 |j 2024 |e 4 |b 28 |c 04 |h 1388-1399