Preliminary Work Toward a Fine-Detail-Preserving Medical VLM
Jongsu Youn
Graduate School of Advanced Imaging Sciences, CAU
In medical artificial intelligence, model performance and reliability are critical, and the data are more complex than in general domains. Because coarse-grained predictions offer little clinical utility, real-world applications demand fine-grained, highly complex tasks. Although advances in artificial intelligence have improved both achievable task complexity and model performance, vision-language models (VLMs) capable of handling fine-grained tasks specific to the medical domain remain scarce. This paper presents preliminary work toward fine-detail-preserving vision-language pretraining tailored for medical applications.
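For context on the vision-language pretraining setup the abstract references, the following is a minimal sketch of a standard CLIP-style contrastive image-text objective. It is an illustrative assumption only, not the fine-detail-preserving method this preliminary work develops; the function name, embedding dimensions, and temperature value are all hypothetical.

# Minimal sketch of a CLIP-style contrastive pretraining objective.
# Illustrative only; the paper's fine-detail-preserving variant is not
# specified here, and all names/values below are placeholder assumptions.
import torch
import torch.nn.functional as F

def contrastive_loss(image_emb: torch.Tensor,
                     text_emb: torch.Tensor,
                     temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss over a batch of paired image/text embeddings."""
    # Normalize so that dot products become cosine similarities.
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    # Pairwise similarity logits, scaled by temperature.
    logits = image_emb @ text_emb.t() / temperature
    # Matched image-text pairs lie on the diagonal.
    targets = torch.arange(logits.size(0), device=logits.device)
    loss_i2t = F.cross_entropy(logits, targets)        # image -> text
    loss_t2i = F.cross_entropy(logits.t(), targets)    # text -> image
    return (loss_i2t + loss_t2i) / 2

if __name__ == "__main__":
    # Usage with random stand-in embeddings (batch of 8, dimension 512).
    img = torch.randn(8, 512)
    txt = torch.randn(8, 512)
    print(contrastive_loss(img, txt).item())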
Jongsu Youn is a Ph.D. candidate in the Department of Image Science at the Graduate School of Advanced Imaging, Chung-Ang University. His primary interest is multimodal learning: during his master's program he studied multimodal learning with audio–video and audio–image combinations, and in his doctoral program, after publishing journal research on audio–image multimodal learning, he has focused on multimodal learning for medical artificial intelligence. He is particularly interested in how multimodal learning differs between general and specialized domains, and his research aims to develop multimodal models that provide practical benefits in specialized domains.