Novel view synthesis (NVS) without camera poses pre-computed by Structure-from-Motion (SfM), referred to here as SfM-free NVS, is crucial for rapid response and for robustness under variable operating conditions. Recent SfM-free methods integrate pose optimization into end-to-end frameworks that jointly estimate camera poses and perform NVS. However, most existing works rely on per-pixel image losses, such as the L2 loss. In SfM-free methods, inaccurate initial poses cause misalignment between the rendered and target images; under a per-pixel image loss, this misalignment produces excessive gradients, which lead to unstable optimization and poor convergence for NVS. In this study, we propose a correspondence-guided, SfM-free 3D Gaussian splatting method for NVS. We use correspondences between the target image and the rendered result to achieve better pixel alignment, which facilitates the optimization of relative poses between frames. We then apply the learned poses to optimize the entire scene. Each 2D screen-space pixel is associated with its corresponding 3D Gaussians through approximated surface rendering to facilitate gradient backpropagation. Experimental results demonstrate the superior performance and time efficiency of the proposed approach compared with state-of-the-art baselines.
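The contrast between a per-pixel image loss and a correspondence-based loss can be illustrated with a toy 1D example. This is only a sketch under simplifying assumptions and not the method proposed here: the "image" is a sinusoidal texture, the "pose" is a single scalar shift `t` whose true value is 0, the correspondences are assumed to be perfect, and the names `photometric_l2`, `correspondence_l2`, and `numerical_grad` are hypothetical.

```python
import numpy as np

def photometric_l2(t, freq=8.0, n=1024):
    """Per-pixel L2 loss between a 1D target texture and a render shifted by t."""
    x = np.arange(n) / n                           # pixel coordinates in [0, 1)
    target = np.sin(2 * np.pi * freq * x)          # target "image"
    render = np.sin(2 * np.pi * freq * (x - t))    # "render" under pose error t
    return np.mean((render - target) ** 2)

def correspondence_l2(t, n=1024):
    """L2 distance between matched point positions (perfect matches assumed)."""
    x = np.arange(n) / n
    matched_target = x       # where each point is observed in the target
    rendered = x + t         # where the same point lands in the render
    return np.mean((rendered - matched_target) ** 2)

def numerical_grad(f, t, eps=1e-6):
    """Central finite-difference derivative of a scalar function f at t."""
    return (f(t + eps) - f(t - eps)) / (2 * eps)

if __name__ == "__main__":
    for t in (1 / 32, 3 / 32):
        g_photo = numerical_grad(photometric_l2, t)
        g_corr = numerical_grad(correspondence_l2, t)
        print(f"t={t:.5f}  per-pixel grad={g_photo:+.3f}  correspondence grad={g_corr:+.4f}")
```

In this toy setting the per-pixel loss equals 1 - cos(16πt), so its gradient 16π sin(16πt) is large and changes sign as the shift grows, and the loss has a false minimum at t = 1/8. The correspondence loss equals t², so its gradient 2t stays small and always points toward the true shift.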