Analyzing Visible Articulatory Movements in Speech Production for Speech-driven 3D Facial Animation

Speaker: Hyungkyu Kim
Department of Advanced Image and Arts, Chung-Ang University, South Korea
Abstract

Speech-driven 3D facial animation aims to generate realistic facial meshes from input speech signals. However, due to a limited understanding of visible articulatory movements, current state-of-the-art methods produce inaccurate lip and jaw motion. Traditional evaluation metrics, such as lip vertex error (LVE), often fail to reflect the quality of the visual results. Based on our observations, we reveal problems with existing evaluation metrics and argue that each of the three spatial axes should be evaluated separately. A comprehensive analysis shows that most recent methods struggle to precisely predict lip and jaw movements in 3D space.
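To make the two evaluation strategies concrete, the sketch below contrasts LVE with a per-axis error. This is a minimal NumPy illustration under common conventions in this literature (per-frame maximal L2 error over lip vertices, averaged over frames), not the speaker's implementation; the array shapes and lip-vertex indices are assumptions.

```python
import numpy as np

def lip_vertex_error(pred, gt, lip_idx):
    """LVE as commonly defined: for each frame, take the maximal L2 error
    over the lip vertices, then average over frames.
    pred, gt: (T, V, 3) vertex sequences; lip_idx: indices of lip vertices."""
    d = np.linalg.norm(pred[:, lip_idx] - gt[:, lip_idx], axis=-1)  # (T, L)
    return d.max(axis=1).mean()

def per_axis_lip_error(pred, gt, lip_idx):
    """Mean absolute lip-vertex error along each 3D axis separately,
    returning a (3,) array of errors along x, y, z."""
    diff = np.abs(pred[:, lip_idx] - gt[:, lip_idx])  # (T, L, 3)
    return diff.mean(axis=(0, 1))
```

A single scalar like LVE can hide which axis the error lives on: a prediction that is accurate vertically but wrong in the depth (protrusion) direction may still score a low LVE, which is one way an aggregate metric can fail to match perceived quality.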
