Creating high-fidelity 3D head avatars has long been an active research topic, but it remains highly challenging under lightweight sparse-view setups. In this paper, we propose Gaussian Head Avatar, a representation based on controllable 3D Gaussians for high-fidelity head avatar modeling. We optimize the neutral 3D Gaussians and a fully learned MLP-based deformation field to capture complex expressions. The two parts benefit each other, so our method can model fine-grained dynamic details while ensuring expression accuracy. Furthermore, we devise a geometry-guided initialization strategy based on implicit SDF and Deep Marching Tetrahedra to stabilize training and improve convergence. Experiments show that our approach outperforms other state-of-the-art sparse-view methods, achieving ultra-high-fidelity rendering at 2K resolution even under exaggerated expressions.
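To make the idea of neutral Gaussians driven by a learned deformation field concrete, the following is a minimal sketch and not the method's actual implementation. It assumes that the field is a small two-layer MLP, that it takes a Gaussian's neutral position concatenated with an expression code, and that it outputs only a position offset. The names `make_mlp`, `mlp_forward`, and `deform_gaussians`, the layer sizes, and the 4-dimensional expression code are all hypothetical, and the weights are random rather than trained.

```python
import math
import random


def make_mlp(in_dim, hidden_dim, out_dim, seed=0):
    """Randomly initialise a two-layer MLP (hypothetical sizes, untrained)."""
    rng = random.Random(seed)
    w1 = [[rng.gauss(0.0, 0.1) for _ in range(in_dim)] for _ in range(hidden_dim)]
    b1 = [0.0] * hidden_dim
    w2 = [[rng.gauss(0.0, 0.1) for _ in range(hidden_dim)] for _ in range(out_dim)]
    b2 = [0.0] * out_dim
    return w1, b1, w2, b2


def mlp_forward(params, x):
    """Evaluate the MLP on one input vector: linear, tanh, linear."""
    w1, b1, w2, b2 = params
    hidden = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w2, b2)]


def deform_gaussians(neutral_positions, expression, params):
    """Add an expression-conditioned offset to each neutral Gaussian centre.

    Assumption: only positions are deformed here; other Gaussian
    attributes are left out of this sketch.
    """
    deformed = []
    for pos in neutral_positions:
        offset = mlp_forward(params, list(pos) + list(expression))
        deformed.append([p + d for p, d in zip(pos, offset)])
    return deformed


# Two neutral Gaussian centres and a hypothetical 4-D expression code.
neutral = [[0.0, 0.0, 0.0], [0.1, 0.2, 0.3]]
expression = [0.5, -0.2, 0.0, 1.0]
params = make_mlp(in_dim=3 + 4, hidden_dim=16, out_dim=3)
deformed = deform_gaussians(neutral, expression, params)
```

In this sketch the neutral positions and the MLP weights would both be trainable, so a rendering loss can update the neutral model and the deformation field together.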