This paper presents Planar Gaussian Splatting (PGS), a novel neural rendering approach to learn the 3D geometry and parse the 3D planes of a scene, directly from multiple RGB images. The PGS leverages Gaussian primitives to model the scene and employ a hierarchical Gaussian mixture approach to group them. Similar Gaussians are progressively merged probabilistically in the tree-structured Gaussian mixtures to identify distinct 3D plane instances and form the overall 3D scene geometry. In order to enable the grouping, the Gaussian primitives contain additional parameters, such as plane descriptors derived by lifting 2D masks from a general 2D segmentation model and surface normals. Experiments show that the proposed PGS achieves state-of-the-art performance in 3D planar reconstruction without requiring either 3D plane labels or depth supervision. In contrast to existing supervised methods that have limited generalizability and struggle under domain shift, PGS maintains its performance across datasets thanks to its neural rendering and scene-specific optimization mechanism, while also being significantly faster than existing optimization-based approaches.
本文提出了平面高斯喷溅(Planar Gaussian Splatting, PGS),一种新颖的神经渲染方法,用于从多张RGB图像直接学习场景的三维几何和解析三维平面结构。PGS利用高斯原语对场景进行建模,并采用分层高斯混合方法对这些高斯进行分组。通过树状高斯混合模型的概率性逐步合并,相似的高斯被识别为不同的三维平面实例,从而形成整体的三维场景几何。 为了实现分组,每个高斯原语包含额外的参数,例如从通用二维分割模型中提升的二维掩码平面描述符和表面法线。实验结果表明,PGS在三维平面重建中实现了最先进的性能,无需三维平面标签或深度监督。与现有的监督方法相比,这些方法通常在领域迁移下表现不佳且泛化能力有限,而PGS得益于其神经渲染和场景特定优化机制,能够在跨数据集的情况下保持优异表现,同时显著快于现有的基于优化的方法。 PGS为三维平面重建提供了一种无需显式监督的新路径,在保持高效性的同时,拓展了在未标注数据和多场景条件下的应用潜力。