Realistic 3D human generation from text prompts is a desirable yet challenging task. Existing methods optimize 3D representations such as meshes or neural fields via score distillation sampling (SDS), and they suffer from inadequate fine details or excessive training time. In this paper, we propose an efficient yet effective framework, HumanGaussian, that generates high-quality 3D humans with fine-grained geometry and realistic appearance. Our key insight is that 3D Gaussian Splatting is an efficient renderer with periodic Gaussian shrinkage or growth, where such adaptive density control can be naturally guided by intrinsic human structures. Specifically, 1) we first propose a Structure-Aware SDS that simultaneously optimizes human appearance and geometry. The multi-modal score function from both RGB and depth space is leveraged to distill the Gaussian densification and pruning process. 2) Moreover, we devise an Annealed Negative Prompt Guidance by decomposing SDS into a noisier generative score and a cleaner classifier score, which effectively addresses the over-saturation issue. The floating artifacts are further eliminated based on Gaussian size in a prune-only phase to enhance generation smoothness. Extensive experiments demonstrate the superior efficiency and competitive quality of our framework, rendering vivid 3D humans under diverse scenarios.
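The decomposition of SDS into a generative score and a classifier score can be illustrated with a minimal sketch. It assumes the common classifier-free guidance convention `eps_hat = eps_uncond + s * (eps_cond - eps_uncond)`, which the abstract does not state. The function names, argument names, and the use of plain Python lists are hypothetical choices made for illustration. How the negative prompt and the annealing schedule enter is not specified in the abstract, so both are left out here.

```python
from typing import List, Tuple

Vector = List[float]


def decompose_sds_residual(
    eps_cond: Vector,
    eps_uncond: Vector,
    eps_noise: Vector,
) -> Tuple[Vector, Vector]:
    """Split the SDS residual into two additive terms.

    Assumed guidance convention (an assumption, not taken from the abstract):
        eps_hat = eps_uncond + s * (eps_cond - eps_uncond)
    Under it, eps_hat - eps_noise = generative + s * classifier.
    """
    # Generative term: unconditional prediction minus the injected noise.
    generative = [u - n for u, n in zip(eps_uncond, eps_noise)]
    # Classifier term: conditional minus unconditional prediction.
    classifier = [c - u for c, u in zip(eps_cond, eps_uncond)]
    return generative, classifier


def guided_residual(
    eps_cond: Vector,
    eps_uncond: Vector,
    eps_noise: Vector,
    guidance_scale: float,
) -> Vector:
    """Compute eps_hat - eps_noise directly, for comparison with the split form."""
    eps_hat = [u + guidance_scale * (c - u) for c, u in zip(eps_cond, eps_uncond)]
    return [h - n for h, n in zip(eps_hat, eps_noise)]


def recombine(generative: Vector, classifier: Vector, guidance_scale: float) -> Vector:
    """Sum the two terms, weighting the classifier term by the guidance scale."""
    return [g + guidance_scale * c for g, c in zip(generative, classifier)]
```

Under this assumed convention the two terms recombine to the usual guided residual, which is what makes it possible to treat them separately, for example by weighting or scheduling each one on its own.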