EPT-Net : Edge Perception Transformer for 3D Medical Image Segmentation
Convolutional neural networks have achieved remarkable results in most medical image segmentation applications. However, the intrinsic locality of the convolution operation limits its ability to model long-range dependencies. Although the Transformer, designed for sequence-to-sequence global prediction, addresses this problem, it can suffer from limited localization ability due to insufficient low-level detail features. Moreover, low-level features carry rich fine-grained information that strongly influences edge segmentation decisions for different organs. However, a simple CNN module struggles to capture the edge information in fine-grained features, and the computational power and memory consumed in processing high-resolution 3D features are costly. This paper proposes EPT-Net, an encoder-decoder network that effectively combines edge perception with a Transformer structure to segment medical images accurately. Within this framework, we propose a Dual Position Transformer to effectively enhance 3D spatial localization ability. In addition, as low-level features contain detailed information, we design an Edge Weight Guidance module that extracts edge information by minimizing an edge information function without adding network parameters. We verified the effectiveness of the proposed method on three datasets: SegTHOR 2019, Multi-Atlas Labeling Beyond the Cranial Vault, and KiTS19-M, our re-labeled version of the KiTS19 dataset. The experimental results show that EPT-Net significantly outperforms state-of-the-art medical image segmentation methods.
Media type: E-Article
Year of publication: 2023
Published: 2023
Contained in: Complete volume record - volume:42
Contained in: IEEE transactions on medical imaging - 42(2023), 11, 22 Nov., pages 3229-3243
Language: English
Contributors: Yang, Jingyi [author]
Notes: Date Completed 30.10.2023; Date Revised 30.10.2023; published: Print-Electronic; Citation Status MEDLINE
DOI: 10.1109/TMI.2023.3278461
PPN (catalog ID): NLM357187210
LEADER 01000naa a22002652 4500
001 NLM357187210
003 DE-627
005 20231226072039.0
007 cr uuu---uuuuu
008 231226s2023 xx |||||o 00| ||eng c
024 7_ |a 10.1109/TMI.2023.3278461 |2 doi
028 52 |a pubmed24n1190.xml
035 __ |a (DE-627)NLM357187210
035 __ |a (NLM)37216246
040 __ |a DE-627 |b ger |c DE-627 |e rakwb
041 __ |a eng
100 1_ |a Yang, Jingyi |e verfasserin |4 aut
245 10 |a EPT-Net |b Edge Perception Transformer for 3D Medical Image Segmentation
264 _1 |c 2023
336 __ |a Text |b txt |2 rdacontent
337 __ |a Computermedien |b c |2 rdamedia
338 __ |a Online-Ressource |b cr |2 rdacarrier
500 __ |a Date Completed 30.10.2023
500 __ |a Date Revised 30.10.2023
500 __ |a published: Print-Electronic
500 __ |a Citation Status MEDLINE
520 __ |a Convolutional neural networks have achieved remarkable results in most medical image segmentation applications. However, the intrinsic locality of the convolution operation limits its ability to model long-range dependencies. Although the Transformer, designed for sequence-to-sequence global prediction, addresses this problem, it can suffer from limited localization ability due to insufficient low-level detail features. Moreover, low-level features carry rich fine-grained information that strongly influences edge segmentation decisions for different organs. However, a simple CNN module struggles to capture the edge information in fine-grained features, and the computational power and memory consumed in processing high-resolution 3D features are costly. This paper proposes EPT-Net, an encoder-decoder network that effectively combines edge perception with a Transformer structure to segment medical images accurately. Within this framework, we propose a Dual Position Transformer to effectively enhance 3D spatial localization ability. In addition, as low-level features contain detailed information, we design an Edge Weight Guidance module that extracts edge information by minimizing an edge information function without adding network parameters. We verified the effectiveness of the proposed method on three datasets: SegTHOR 2019, Multi-Atlas Labeling Beyond the Cranial Vault, and KiTS19-M, our re-labeled version of the KiTS19 dataset. The experimental results show that EPT-Net significantly outperforms state-of-the-art medical image segmentation methods
650 _4 |a Journal Article
650 _4 |a Research Support, Non-U.S. Gov't
700 1_ |a Jiao, Licheng |e verfasserin |4 aut
700 1_ |a Shang, Ronghua |e verfasserin |4 aut
700 1_ |a Liu, Xu |e verfasserin |4 aut
700 1_ |a Li, Ruiyang |e verfasserin |4 aut
700 1_ |a Xu, Longchang |e verfasserin |4 aut
773 08 |i Enthalten in |t IEEE transactions on medical imaging |d 1982 |g 42(2023), 11 vom: 22. Nov., Seite 3229-3243 |w (DE-627)NLM082855269 |x 1558-254X |7 nnns
773 18 |g volume:42 |g year:2023 |g number:11 |g day:22 |g month:11 |g pages:3229-3243
856 40 |u http://dx.doi.org/10.1109/TMI.2023.3278461 |3 Volltext
912 __ |a GBV_USEFLAG_A
912 __ |a GBV_NLM
951 __ |a AR
952 __ |d 42 |j 2023 |e 11 |b 22 |c 11 |h 3229-3243