2412.01543.md


6DOPE-GS: Online 6D Object Pose Estimation using Gaussian Splatting

Efficient and accurate object pose estimation is an essential component of modern vision systems in many applications such as Augmented Reality, autonomous driving, and robotics. While research in model-based 6D object pose estimation has delivered promising results, model-free methods are hindered by the high computational load of rendering and inferring consistent poses of arbitrary objects in a live RGB-D video stream. To address this issue, we present 6DOPE-GS, a novel method for online 6D object pose estimation and tracking with a single RGB-D camera that effectively leverages advances in Gaussian Splatting. Thanks to the fast differentiable rendering capabilities of Gaussian Splatting, 6DOPE-GS can simultaneously optimize for 6D object poses and 3D object reconstruction. To achieve the efficiency and accuracy required for live tracking, our method uses incremental 2D Gaussian Splatting with an intelligent dynamic keyframe selection procedure to achieve high spatial object coverage and prevent erroneous pose updates. We also propose an opacity-statistic-based pruning mechanism for adaptive Gaussian density control to ensure training stability and efficiency. We evaluate our method on the HO3D and YCBInEOAT datasets and show that 6DOPE-GS matches the performance of state-of-the-art baselines for model-free simultaneous 6D pose tracking and reconstruction while providing a 5× speedup. We also demonstrate the method's suitability for live, dynamic object tracking and reconstruction in a real-world setting.
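The abstract names two mechanisms, dynamic keyframe selection and opacity-statistic-based pruning, but gives no implementation details. The sketch below is a minimal, illustrative guess at how such heuristics are commonly realized: a keyframe is accepted only if its pose adds new spatial coverage, and Gaussians with near-zero opacity or consistently low rendering contribution are pruned. The function names, thresholds, and the particular statistics used (per-Gaussian opacity plus an accumulated rendering weight) are assumptions for illustration, not the authors' actual criteria.

```python
import numpy as np


def rotation_geodesic_deg(R_a: np.ndarray, R_b: np.ndarray) -> float:
    """Geodesic angle (degrees) between two 3x3 rotation matrices."""
    cos_theta = (np.trace(R_a.T @ R_b) - 1.0) / 2.0
    return float(np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0))))


def should_add_keyframe(new_pose: np.ndarray,
                        keyframe_poses: list,
                        min_rot_deg: float = 15.0,   # assumed threshold
                        min_trans: float = 0.05) -> bool:
    """Accept a frame as a keyframe only if it adds spatial coverage,
    i.e. its 4x4 pose is sufficiently far from every existing keyframe."""
    for kf in keyframe_poses:
        rot_gap = rotation_geodesic_deg(new_pose[:3, :3], kf[:3, :3])
        trans_gap = np.linalg.norm(new_pose[:3, 3] - kf[:3, 3])
        if rot_gap < min_rot_deg and trans_gap < min_trans:
            return False  # too similar to an existing keyframe
    return True


def prune_by_opacity_statistics(opacities: np.ndarray,
                                accum_weights: np.ndarray,
                                opacity_thresh: float = 0.05,
                                weight_quantile: float = 0.1) -> np.ndarray:
    """Return a boolean keep-mask over Gaussians: drop those that are nearly
    transparent or whose accumulated rendering weight is in the lowest quantile.
    Both statistics and thresholds are hypothetical stand-ins."""
    low_opacity = opacities < opacity_thresh
    low_weight = accum_weights < np.quantile(accum_weights, weight_quantile)
    return ~(low_opacity | low_weight)


if __name__ == "__main__":
    # Toy usage with synthetic data (illustrative only).
    keyframes = [np.eye(4)]
    candidate = np.eye(4)
    candidate[:3, 3] = [0.2, 0.0, 0.0]
    print(should_add_keyframe(candidate, keyframes))   # True: 0.2 m away

    opac = np.random.rand(1000)
    weights = np.random.rand(1000)
    keep = prune_by_opacity_statistics(opac, weights)
    print(keep.sum(), "of", keep.size, "Gaussians kept")
```

In a live system, such checks would run per incoming RGB-D frame between pose optimization steps; the paper's actual procedure may differ in both the statistics tracked and the decision rules.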
