We present a novel approach to 3D indoor scene reconstruction that combines 3D Gaussian Splatting (3DGS) with mesh representations. We use meshes for the room layout of the indoor scene (walls, ceilings, and floors) and 3D Gaussians for all other objects. This hybrid approach leverages the strengths of both representations, offering greater flexibility and easier editing. However, jointly training meshes and 3D Gaussians is challenging because it is unclear which primitive should account for which part of the rendered image. Objects close to the room layout are often difficult to optimize, particularly when the room layout is textureless, which can lead to incorrect optimization and unnecessary 3D Gaussians. To overcome these challenges, we employ the Segment Anything Model (SAM) to guide the selection of primitives. The SAM mask loss forces each instance to be represented by either Gaussians or meshes, ensuring clear separation and stable training. Furthermore, after the standard densification, we introduce an additional densification stage that does not reset the opacity. This stage mitigates the loss of image quality caused by the limited number of 3D Gaussians that remain after the standard densification.
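The abstract does not give the form of the SAM mask loss or the training schedule, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the renderer exposes per-pixel blending weights for the Gaussians and for the mesh (`alpha_gauss`, `alpha_mesh`) and that SAM provides an integer instance map (`instance_ids`); all of these names are hypothetical. The per-instance product penalty is one possible way to express "either Gaussians or meshes", chosen here for illustration. The second function sketches a schedule in which densification continues past the standard stage while opacity resets stop; the iteration counts are illustrative defaults, not values from the paper.

```python
import numpy as np


def sam_mask_loss(alpha_gauss, alpha_mesh, instance_ids):
    """Penalize SAM instances that are covered by both primitive types.

    alpha_gauss, alpha_mesh: (H, W) arrays in [0, 1] holding the per-pixel
        blending weight of the Gaussians and of the mesh (assumed inputs).
    instance_ids: (H, W) integer array of SAM instance labels;
        negative values mark unlabeled pixels (assumed convention).
    """
    losses = []
    for inst in np.unique(instance_ids):
        if inst < 0:
            continue  # skip unlabeled pixels
        region = instance_ids == inst
        g = alpha_gauss[region].mean()  # average Gaussian coverage
        m = alpha_mesh[region].mean()   # average mesh coverage
        # The product is zero when at least one primitive type
        # contributes nothing to this instance.
        losses.append(g * m)
    return float(np.mean(losses)) if losses else 0.0


def densification_plan(iteration, start=500, standard_end=15000,
                       extra_end=20000, interval=100, reset_interval=3000):
    """Return (densify, reset_opacity) flags for one training iteration.

    All iteration counts are hypothetical defaults. Densification runs
    through both stages; opacity resets happen only in the standard stage.
    """
    densify = start < iteration <= extra_end and iteration % interval == 0
    reset_opacity = (0 < iteration < standard_end
                     and iteration % reset_interval == 0)
    return densify, reset_opacity
```

In a training loop, a loss of this kind would be added to the photometric loss with some weight, and the flags from `densification_plan` would decide whether to densify or reset opacity at each step. An automatic-differentiation framework would be needed in practice; NumPy is used here only to keep the sketch self-contained.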