Leveraging Prior Knowledge of Diffusion Model for Person Search

Speaker

Giyeol Kim
Graduate School of Advanced Imaging Science and Film, Chung-Ang University

Abstract

Person search aims to jointly perform person detection and re-identification by localizing and identifying a query person within a gallery of uncropped scene images. Existing methods predominantly utilize ImageNet pre-trained backbones, which may be suboptimal for capturing the complex spatial context and fine-grained identity cues necessary for person search. Moreover, they rely on a shared backbone feature for both person detection and re-identification, leading to suboptimal features due to conflicting optimization objectives. In this paper, we propose DiffPS (Diffusion Prior Knowledge for Person Search), a novel framework that leverages a pre-trained diffusion model while eliminating the optimization conflict between the two sub-tasks. We analyze key properties of diffusion priors and propose three specialized modules: (i) a Diffusion-Guided Region Proposal Network (DGRPN) for enhanced person localization, (ii) a Multi-Scale Frequency Refinement Network (MSFRN) to mitigate shape bias, and (iii) a Semantic-Adaptive Feature Aggregation Network (SFAN) to leverage text-aligned diffusion features. DiffPS sets a new state of the art on the CUHK-SYSU and PRW benchmarks.

Giyeol Kim has been a Master's student at Chung-Ang University since 2024, under the supervision of Professor Chanho Eom in the Perceptual AI Lab. His research interests lie in exploring how rapidly evolving AI technologies can be effectively applied to solve real-world problems. Within the field of computer vision, he is particularly interested in representation learning, person re-identification, and diffusion models.
