GSplatLoc: Grounding Keypoint Descriptors into 3D Gaussian Splatting for Improved Visual Localization

Although various visual localization approaches exist, such as scene coordinate regression and pose regression, these methods often struggle with high memory consumption or extensive optimization requirements. To address these challenges, we utilize recent advancements in novel view synthesis, particularly 3D Gaussian Splatting (3DGS), to enhance localization. 3DGS compactly encodes both 3D geometry and scene appearance through its spatial features. Our method leverages the dense description maps produced by XFeat's lightweight keypoint detection and description model. We propose distilling these dense keypoint descriptors into 3DGS to improve the model's spatial understanding, leading to more accurate camera pose predictions through 2D-3D correspondences. After estimating an initial pose, we refine it using a photometric warping loss. Benchmarking on popular indoor and outdoor datasets shows that our approach surpasses state-of-the-art Neural Render Pose (NRP) methods, including NeRFMatch and PNeRFLoc.

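Although many visual localization methods already exist, such as scene coordinate regression and pose regression, they typically suffer from high memory consumption or heavy optimization requirements. To address these challenges, we draw on recent progress in novel view synthesis, in particular 3D Gaussian Splatting (3DGS), to strengthen localization. 3DGS uses its spatial features to compactly encode both 3D geometry and scene appearance. Our method takes the dense description maps generated by XFeat's lightweight keypoint detection and description model and distills these dense keypoint descriptors into 3DGS, improving the model's spatial understanding and yielding more accurate camera pose predictions through 2D-3D correspondences. After estimating an initial pose, we refine it with a photometric warping loss. Benchmarks on popular indoor and outdoor datasets show that our method outperforms the latest Neural Render Pose (NRP) methods, including NeRFMatch and PNeRFLoc.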
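
The initial pose step described above, matching 2D keypoint descriptors against descriptors attached to 3D points and then solving for the camera pose, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes synthetic random descriptors in place of XFeat features, treats each 3D point as a Gaussian center carrying one distilled descriptor, and uses a plain linear DLT solve where a real pipeline would more likely use a robust PnP solver with outlier rejection. The names `mutual_nn_matches` and `pose_from_correspondences` are hypothetical.

```python
import numpy as np


def mutual_nn_matches(desc_2d, desc_3d):
    """Return index pairs (i, j) where 2D descriptor i and 3D descriptor j
    are mutual nearest neighbours under cosine similarity."""
    a = desc_2d / np.linalg.norm(desc_2d, axis=1, keepdims=True)
    b = desc_3d / np.linalg.norm(desc_3d, axis=1, keepdims=True)
    sim = a @ b.T
    nn_2d_to_3d = sim.argmax(axis=1)
    nn_3d_to_2d = sim.argmax(axis=0)
    idx = np.arange(len(a))
    keep = nn_3d_to_2d[nn_2d_to_3d] == idx
    return idx[keep], nn_2d_to_3d[keep]


def pose_from_correspondences(pts_2d, pts_3d, K):
    """Linear DLT estimate of the world-to-camera pose (R, t) from at least
    six non-coplanar 2D-3D correspondences and known intrinsics K."""
    ones = np.ones((len(pts_2d), 1))
    # Remove the intrinsics so the unknown matrix is a scaled [R | t].
    xn = (np.linalg.inv(K) @ np.hstack([pts_2d, ones]).T).T
    Xh = np.hstack([pts_3d, ones])
    rows = []
    for (u, v, _), X in zip(xn, Xh):
        rows.append(np.concatenate([X, np.zeros(4), -u * X]))
        rows.append(np.concatenate([np.zeros(4), X, -v * X]))
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    P = vt[-1].reshape(3, 4)
    # Project the left 3x3 block onto the nearest orthogonal matrix and
    # recover the unknown scale from its singular values.
    U, S, Vt = np.linalg.svd(P[:, :3])
    R = U @ Vt
    t = P[:, 3] / S.mean()
    # Resolve the sign ambiguity of the null vector.
    if np.linalg.det(R) < 0:
        R, t = -R, -t
    return R, t


# Synthetic scene: 50 3D points, each with a 64-D descriptor (stand-ins for
# Gaussian centers carrying distilled descriptors).
rng = np.random.default_rng(0)
pts_3d = rng.uniform(-1.0, 1.0, size=(50, 3))
desc_3d = rng.normal(size=(50, 64))

# Ground-truth camera: rotation about the y axis plus a translation that
# keeps every point in front of the camera.
angle = 0.3
R_gt = np.array([[np.cos(angle), 0.0, np.sin(angle)],
                 [0.0, 1.0, 0.0],
                 [-np.sin(angle), 0.0, np.cos(angle)]])
t_gt = np.array([0.1, -0.2, 4.0])
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])

# Project the points into the query image.
cam = pts_3d @ R_gt.T + t_gt
proj = cam @ K.T
pts_2d = proj[:, :2] / proj[:, 2:]

# Query keypoints: a shuffled subset with slightly perturbed descriptors.
perm = rng.permutation(50)[:30]
query_pts = pts_2d[perm]
query_desc = desc_3d[perm] + 0.01 * rng.normal(size=(30, 64))

i2d, i3d = mutual_nn_matches(query_desc, desc_3d)
R_est, t_est = pose_from_correspondences(query_pts[i2d], pts_3d[i3d], K)
```

On noise-free synthetic data like this, the linear solve recovers the pose directly. With real descriptors some matches will be wrong, so a robust estimator and a subsequent refinement stage, such as the photometric warping loss mentioned in the abstract, would be needed on top of this sketch.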