XTag-CLIP: Robust and Reliable Thyroid Scar Analysis with Limited Data via Cross-Attention

Speaker

Eunju Lee
Department of Imaging Science, Intelligent Information Processing Lab

Abstract

Thyroidectomy often leaves visible scars that affect patients’ quality of life, yet current scar classification models are impractical for small clinics because of limited data and restricted model access. We introduce XTag-CLIP, which augments CLIP with a feature tagging module and cross-attention fusion to enable accurate classification from small datasets. Trained first on general scar data and then fine-tuned on thyroidectomy-specific cases, our method improves accuracy by 14 percentage points and F1 score by 0.0670 over the CLIP baseline.

Eunju Lee received B.S. and M.S. degrees in imaging engineering from Chung-Ang University, Seoul, Korea, in 2020 and 2022, respectively. She is currently a Ph.D. student in imaging engineering at the Graduate School of Advanced Imaging Science, Multimedia and Film at Chung-Ang University. Her current research focuses on deep learning and computer vision.
