Simulation-driven training of vision transformers enables metal artifact reduction of highly truncated CBCT scans

© 2023 The Authors. Medical Physics published by Wiley Periodicals LLC on behalf of the American Association of Physicists in Medicine.

BACKGROUND: Due to the high attenuation of metals, severe artifacts occur in cone beam computed tomography (CBCT). Metal segmentation in CBCT projections usually serves as a prerequisite for metal artifact reduction (MAR) algorithms.

PURPOSE: Truncation caused by the limited detector size prevents threshold-based methods from acquiring complete metal masks in the CBCT volume. Therefore, this work pursues segmenting metal directly in CBCT projections.

METHODS: Since the generation of high-quality clinical training data is a constant challenge, this study proposes generating simulated digital radiographs (data I) based on real CT data combined with self-designed computer-aided design (CAD) implants. In addition to the simulated projections generated from 3D volumes, 2D x-ray images combined with projections of implants serve as a complementary data set (data II) to improve network performance. In this work, SwinConvUNet, consisting of shifted-window (Swin) vision transformers (ViTs) with patch merging as the encoder, is proposed for metal segmentation.
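The data-I pathway described above can be illustrated with a deliberately simplified sketch. This is a hypothetical illustration, not the authors' pipeline: a parallel-beam line integral stands in for the cone-beam forward projector, `mu_metal` is an assumed attenuation value, and the 2D metal label is derived from the implant volume alone.

```python
import numpy as np

def simulate_projection(ct_volume, implant_volume, mu_metal=5.0):
    """Toy forward simulation: insert a CAD implant into a CT volume,
    project along one axis, and derive the 2D metal label."""
    combined = ct_volume + mu_metal * implant_volume  # attenuation map with implant
    line_integrals = combined.sum(axis=0)             # parallel-beam line integrals
    projection = np.exp(-line_integrals)              # Beer-Lambert intensity image
    metal_mask = implant_volume.sum(axis=0) > 0       # metal label from implant alone
    return projection, metal_mask

# Tiny example: an empty "patient" volume with a single metal voxel.
ct = np.zeros((4, 3, 3))
implant = np.zeros((4, 3, 3))
implant[1, 1, 1] = 1.0
proj, mask = simulate_projection(ct, implant)
```

In the actual study the projector is cone-beam and the CT volume carries real anatomy; here the geometry, the attenuation value, and the projection axis are all stand-ins chosen only to make the labeling idea concrete.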

RESULTS: The model's performance is evaluated on accurately labeled test data obtained from cadaver scans as well as on unlabeled clinical projections. When trained on data I only, the convolutional neural network (CNN) encoder-based networks UNet and TransUNet achieve only limited performance on the cadaver test data, with average dice scores of 0.821 and 0.850, respectively. After training on both data I and data II, the average dice scores of the two models increase to 0.906 and 0.919, respectively. By replacing the CNN encoder with a Swin transformer, the proposed SwinConvUNet reaches an average dice score of 0.933 on cadaver projections when trained on data I only. Furthermore, SwinConvUNet achieves the highest average dice score, 0.953, on cadaver projections when trained on the combined data set.
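The dice scores quoted above are the standard overlap metric for binary segmentation masks, 2|A∩B| / (|A| + |B|); a minimal NumPy sketch (variable names are illustrative):

```python
import numpy as np

def dice_score(pred, target, eps=1e-8):
    """Dice coefficient between two binary masks: 2|A∩B| / (|A| + |B|)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    return 2.0 * intersection / (pred.sum() + target.sum() + eps)

perfect = dice_score([[1, 1], [0, 0]], [[1, 1], [0, 0]])   # identical masks -> ~1.0
disjoint = dice_score([[1, 0], [0, 0]], [[0, 0], [0, 1]])  # no overlap -> 0.0
```

The small `eps` term only guards against division by zero when both masks are empty; per-projection scores would typically be averaged over the test set to obtain figures like those reported.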

CONCLUSIONS: Our experiments quantitatively demonstrate the effectiveness of combining the projections simulated via the two pathways for network training. Moreover, the proposed SwinConvUNet, trained on the simulated projections, performs state-of-the-art, robust metal segmentation, as demonstrated in experiments on cadaver and clinical data sets. With the accurate segmentations from the proposed model, MAR can be conducted even for highly truncated CBCT scans.

Media type:

E-article

Year of publication:

2023

Published:

2023

Contained in:

Medical Physics - (2023), 27 Dec., page e16919

Language:

English

Contributors:

Fan, Fuxin [Author]
Ritschl, Ludwig [Author]
Beister, Marcel [Author]
Biniazan, Ramyar [Author]
Wagner, Fabian [Author]
Kreher, Björn [Author]
Gottschalk, Tristan M [Author]
Kappler, Steffen [Author]
Maier, Andreas [Author]

Links:

Full text

Topics:

Data augmentation
Journal Article
Metal artifact reduction
Metal segmentation
Swin vision transformer

Notes:

Date revised: 27 Dec 2023

published: Print-Electronic

Citation Status Publisher

DOI:

10.1002/mp.16919

Funding:

Funding institution / project title:

PPN (catalogue ID):

NLM366412213