Subjective Evaluation on Generative Models for Mosaic Media Art
Dahyeon Kye, Graduate School of Advanced Imaging Science and Film, Chung-Ang University
This paper presents a subjective evaluation of generative models for creating mosaic media art videos. We use Runway and ChatGPT-4o for text-to-image generation, and Stable Video Diffusion (SVD) and HiPer for image-to-video generation. Images are generated from specific text prompts and then transformed into video sequences. Our evaluation criteria are visual fidelity, temporal consistency, and control accuracy. Through subjective comparison, we identify ChatGPT-4o as the most effective model for producing detailed and coherent mosaic images, while SVD excels at maintaining temporal consistency and visual coherence in mosaic video sequences. These findings highlight the strengths and limitations of each model and offer insight into their practical use in generative mosaic video creation. We discuss the implications of these results and outline directions for future research on mosaic media art.