Instrument-tissue Interaction Detection Framework for Surgical Video Understanding
Instrument-tissue interaction detection, which helps in understanding surgical activities, is vital for building computer-assisted surgery systems, yet it poses many challenges. First, most models represent instrument-tissue interaction in a coarse-grained way that focuses only on classification and cannot automatically detect instruments and tissues. Second, existing works do not fully consider intra-frame and inter-frame relations between instruments and tissues. In this paper, we propose to represent an instrument-tissue interaction as an ⟨instrument class, instrument bounding box, tissue class, tissue bounding box, action class⟩ quintuple and present an Instrument-Tissue Interaction Detection Network (ITIDNet) to detect this quintuple for surgical video understanding. Specifically, we propose a Snippet Consecutive Feature (SCF) Layer to enhance features by modeling relationships among proposals in the current frame using global context information from the video snippet. We also propose a Spatial Corresponding Attention (SCA) Layer to incorporate features of proposals between adjacent frames through spatial encoding. To reason about relationships between instruments and tissues, a Temporal Graph (TG) Layer is proposed, with intra-frame connections to exploit relationships between instruments and tissues in the same frame and inter-frame connections to model temporal information for the same instance. For evaluation, we build a cataract surgery video dataset (PhacoQ) and a cholecystectomy surgery video dataset (CholecQ). Experimental results demonstrate the promising performance of our model, which outperforms other state-of-the-art models on both datasets.
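The quintuple representation described in the abstract can be sketched as a simple typed record. This is an illustrative assumption only: the field names, the (x1, y1, x2, y2) box convention, and the example class labels are not taken from the paper.

```python
from typing import NamedTuple, Tuple

# Hypothetical sketch of the detection target described in the abstract:
# one quintuple per detected instrument-tissue interaction in a frame.
class InteractionQuintuple(NamedTuple):
    instrument_class: str
    instrument_box: Tuple[float, float, float, float]  # assumed (x1, y1, x2, y2)
    tissue_class: str
    tissue_box: Tuple[float, float, float, float]      # assumed (x1, y1, x2, y2)
    action_class: str

# Illustrative values only, not from the datasets.
q = InteractionQuintuple(
    instrument_class="forceps",
    instrument_box=(10.0, 20.0, 110.0, 140.0),
    tissue_class="lens",
    tissue_box=(50.0, 60.0, 200.0, 220.0),
    action_class="grasp",
)
print(q.action_class)
```

A frame-level prediction would then be a list of such quintuples, one per interacting instrument-tissue pair.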
Media type: E-Article
Year of publication: 2024
Published: 2024
Contained in: Full record - volume:PP
Contained in: IEEE transactions on medical imaging - PP(2024), 26 March
Language: English
Contributors: Lin, Wenjun [author]
Notes: Date Revised 26.03.2024; published: Print-Electronic; Citation Status Publisher
doi: 10.1109/TMI.2024.3381209
PPN (catalog ID): NLM370202740
LEADER | 01000naa a22002652 4500 | ||
001 | NLM370202740 | ||
003 | DE-627 | ||
005 | 20240328000642.0 | ||
007 | cr uuu---uuuuu | ||
008 | 240328s2024 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1109/TMI.2024.3381209 |2 doi | |
028 | 5 | 2 | |a pubmed24n1351.xml |
035 | |a (DE-627)NLM370202740 | ||
035 | |a (NLM)38530715 | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
100 | 1 | |a Lin, Wenjun |e verfasserin |4 aut | |
245 | 1 | 0 | |a Instrument-tissue Interaction Detection Framework for Surgical Video Understanding |
264 | 1 | |c 2024 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
500 | |a Date Revised 26.03.2024 | ||
500 | |a published: Print-Electronic | ||
500 | |a Citation Status Publisher | ||
520 | |a Instrument-tissue interaction detection, which helps in understanding surgical activities, is vital for building computer-assisted surgery systems, yet it poses many challenges. First, most models represent instrument-tissue interaction in a coarse-grained way that focuses only on classification and cannot automatically detect instruments and tissues. Second, existing works do not fully consider intra-frame and inter-frame relations between instruments and tissues. In this paper, we propose to represent an instrument-tissue interaction as an ⟨instrument class, instrument bounding box, tissue class, tissue bounding box, action class⟩ quintuple and present an Instrument-Tissue Interaction Detection Network (ITIDNet) to detect this quintuple for surgical video understanding. Specifically, we propose a Snippet Consecutive Feature (SCF) Layer to enhance features by modeling relationships among proposals in the current frame using global context information from the video snippet. We also propose a Spatial Corresponding Attention (SCA) Layer to incorporate features of proposals between adjacent frames through spatial encoding. To reason about relationships between instruments and tissues, a Temporal Graph (TG) Layer is proposed, with intra-frame connections to exploit relationships between instruments and tissues in the same frame and inter-frame connections to model temporal information for the same instance. For evaluation, we build a cataract surgery video dataset (PhacoQ) and a cholecystectomy surgery video dataset (CholecQ). Experimental results demonstrate the promising performance of our model, which outperforms other state-of-the-art models on both datasets | ||
650 | 4 | |a Journal Article | |
700 | 1 | |a Hu, Yan |e verfasserin |4 aut | |
700 | 1 | |a Fu, Huazhu |e verfasserin |4 aut | |
700 | 1 | |a Yang, Mingming |e verfasserin |4 aut | |
700 | 1 | |a Chng, Chin-Boon |e verfasserin |4 aut | |
700 | 1 | |a Kawasaki, Ryo |e verfasserin |4 aut | |
700 | 1 | |a Chui, Cheekong |e verfasserin |4 aut | |
700 | 1 | |a Liu, Jiang |e verfasserin |4 aut | |
773 | 0 | 8 | |i Enthalten in |t IEEE transactions on medical imaging |d 1982 |g PP(2024) vom: 26. März |w (DE-627)NLM082855269 |x 1558-254X |7 nnns |
773 | 1 | 8 | |g volume:PP |g year:2024 |g day:26 |g month:03 |
856 | 4 | 0 | |u http://dx.doi.org/10.1109/TMI.2024.3381209 |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a GBV_NLM | ||
951 | |a AR | ||
952 | |d PP |j 2024 |b 26 |c 03 |