Two-stream vision transformer based multi-label recognition for TCM prescriptions construction

Copyright © 2024. Published by Elsevier Ltd.

Traditional Chinese medicine (TCM) observation diagnosis images (including facial and tongue images) provide essential information about the human body and are important in clinical diagnosis and treatment. TCM prescriptions, known for their simplicity, non-invasiveness, and low side effects, are widely applied worldwide. Automated herbal prescription construction based on visual diagnosis is valuable both for exploring the correlation between external features and herbal prescriptions and for offering medical services in mobile healthcare systems. To effectively integrate multi-perspective visual diagnosis images and automate prescription construction, this study proposes a multi-herb recommendation framework based on Visual Transformer and multi-label classification. The framework comprises three key components: an image encoder, a label embedding module, and a cross-modal fusion classification module. The image encoder employs a dual-stream Visual Transformer to learn dependencies between different regions of input images, capturing both local and global features. The label embedding module utilizes Graph Convolutional Networks to capture associations between diverse herbal labels. Finally, two Multi-Modal Factorized Bilinear modules are introduced to fuse cross-modal vectors, creating an end-to-end multi-label image-herb recommendation model. In experiments with real facial and tongue images, the model generates prescription data closely resembling real samples, achieving a precision of 50.06 %, a recall of 48.33 %, and an F1-score of 49.18 %. This study validates the feasibility of automated herbal prescription construction from the perspective of visual diagnosis and provides insights for constructing herbal prescriptions automatically from additional physical information.
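
The three reported metrics can be cross-checked against each other. The sketch below assumes that the F1-score is the harmonic mean of the reported precision and recall; the abstract does not state the averaging scheme, and the function name `f1_score` is illustrative rather than taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract, in percent.
reported_precision = 50.06
reported_recall = 48.33

f1 = f1_score(reported_precision, reported_recall)
print(f"{f1:.2f}")  # prints 49.18
```

Under that assumption, the computed value agrees with the reported F1-score of 49.18 %.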

Media type:

E-article

Year of publication:

2024

Published:

2024

Contained in:

To the complete record - volume: 170

Contained in:

Computers in biology and medicine - 170 (2024), 20 Feb., page 107920

Language:

English

Contributors:

Zhao, Zijuan [Author]
Qiang, Yan [Author]
Yang, Fenghao [Author]
Hou, Xiao [Author]
Zhao, Juanjuan [Author]
Song, Kai [Author]

Subjects:

Facial and tongue images
Graph convolutional network
Journal Article
Multi-label image recognition
Prescriptions construction
Visual transformer

Notes:

Date Completed 28.02.2024

Date Revised 28.02.2024

published: Print-Electronic

Citation Status MEDLINE

doi:

10.1016/j.compbiomed.2024.107920

PPN (catalog ID):

NLM367349639