Neural Radiance Fields (NeRFs) have demonstrated remarkable potential in capturing complex 3D scenes with high fidelity. However, one persistent challenge that hinders the widespread adoption of NeRFs is the computational bottleneck of volumetric rendering. On the other hand, 3D Gaussian splatting (3DGS) has recently emerged as an alternative representation that leverages a 3D Gaussian-based representation and adopts a rasterization pipeline to render images rather than volumetric rendering, achieving very fast rendering speed and promising image quality. However, a significant drawback remains: 3DGS requires a substantial number of 3D Gaussians to maintain the high fidelity of the rendered images, which demands a large amount of memory and storage. To address this critical issue, we place a specific emphasis on two key objectives: reducing the number of Gaussian points without sacrificing performance, and compressing the Gaussian attributes, such as view-dependent color and covariance. To this end, we propose a learnable mask strategy that significantly reduces the number of Gaussians while preserving high performance. In addition, we propose a compact but effective representation of view-dependent color by employing a grid-based neural field rather than relying on spherical harmonics. Finally, we learn codebooks to compactly represent the geometric attributes of Gaussians via vector quantization. In our extensive experiments, we consistently show over 10× reduced storage and enhanced rendering speed, while maintaining the quality of the scene representation, compared to 3DGS. Our work provides a comprehensive framework for 3D scene representation, achieving high performance, fast training, compactness, and real-time rendering.
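To illustrate the learnable mask idea at a conceptual level, below is a minimal PyTorch sketch of per-Gaussian pruning with a straight-through binary mask applied to opacity. All names here (MaskedGaussians, mask_logits, threshold) are hypothetical illustrations, and the exact masking and loss formulation used in the full method may differ.

```python
import torch
import torch.nn as nn

class MaskedGaussians(nn.Module):
    """Toy sketch of a learnable per-Gaussian mask (hypothetical names/shapes).

    Each Gaussian carries a real-valued mask parameter. A hard binary mask is
    obtained by thresholding sigmoid(mask_logits), and gradients flow through
    the soft mask via a straight-through estimator. The binary mask scales
    opacity, so masked-out Gaussians contribute nothing to rendering and can
    be pruned after training.
    """

    def __init__(self, num_gaussians: int, threshold: float = 0.01):
        super().__init__()
        self.mask_logits = nn.Parameter(torch.zeros(num_gaussians))
        self.opacity = nn.Parameter(torch.rand(num_gaussians))
        self.threshold = threshold

    def forward(self):
        soft = torch.sigmoid(self.mask_logits)                  # soft mask in (0, 1)
        hard = (soft > self.threshold).float()                  # binary mask
        mask = hard.detach() - soft.detach() + soft             # straight-through estimator
        effective_opacity = mask * torch.sigmoid(self.opacity)  # masked Gaussians vanish
        sparsity_loss = soft.mean()                             # encourages pruning
        return effective_opacity, sparsity_loss


# Usage: add sparsity_loss (weighted) to the rendering loss during training;
# Gaussians whose soft mask falls below the threshold can then be removed.
model = MaskedGaussians(num_gaussians=100_000)
opacity, l_mask = model()
```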