Scene reconstruction from casually captured videos has wide real-world applications. With recent advances in differentiable rendering, several methods have attempted to jointly optimize scene representations (NeRF or 3DGS) and camera poses. Despite this progress, existing methods that rely on conventional camera input tend to fail in high-speed (or, equivalently, low-frame-rate) scenarios. Event cameras, inspired by biological vision, asynchronously record pixel-wise intensity changes with high temporal resolution, providing valuable scene and motion information during the blind intervals between frames. In this paper, we introduce event cameras to aid scene reconstruction from casually captured video for the first time and propose Event-Aided Free-Trajectory 3DGS (EF-3DGS), which seamlessly integrates the advantages of event cameras into 3DGS through three key components. First, we leverage the Event Generation Model (EGM) to fuse events and frames, supervising the rendered views observed by the event stream. Second, we adopt the Contrast Maximization (CMax) framework in a piece-wise manner to extract motion information by maximizing the contrast of the Image of Warped Events (IWE), thereby calibrating the estimated poses. In addition, based on the Linear Event Generation Model (LEGM), the brightness information encoded in the IWE is used to constrain the 3DGS in the gradient domain. Third, to mitigate the lack of color information in events, we introduce photometric bundle adjustment (PBA) to ensure view consistency across events and frames. We evaluate our method on the public Tanks and Temples benchmark and a newly collected real-world dataset, RealEv-DAVIS.
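To make the EGM and CMax ideas concrete, the following is a minimal NumPy sketch, not the EF-3DGS implementation. It assumes events are stored as rows of `(x, y, t, polarity)`, uses a hypothetical contrast threshold of 0.2, and replaces the pose-induced warp with a constant image-plane velocity purely for illustration. The function names `egm_log_change`, `image_of_warped_events`, and `contrast` are our own.

```python
import numpy as np


def egm_log_change(events, shape, contrast_threshold=0.2):
    """EGM: each event marks a log-brightness step of +/- C at its pixel,
    so summing signed polarities approximates the log-intensity change."""
    delta = np.zeros(shape, dtype=np.float64)
    x = events[:, 0].astype(int)
    y = events[:, 1].astype(int)
    # np.add.at accumulates correctly when a pixel index repeats.
    np.add.at(delta, (y, x), contrast_threshold * events[:, 3])
    return delta


def image_of_warped_events(events, velocity, shape, t_ref=0.0):
    """Warp events to t_ref along a constant velocity (assumed motion model)
    and count how many land on each pixel."""
    dt = events[:, 2] - t_ref
    xi = np.rint(events[:, 0] - velocity[0] * dt).astype(int)
    yi = np.rint(events[:, 1] - velocity[1] * dt).astype(int)
    inside = (xi >= 0) & (xi < shape[1]) & (yi >= 0) & (yi < shape[0])
    iwe = np.zeros(shape, dtype=np.float64)
    np.add.at(iwe, (yi[inside], xi[inside]), 1.0)
    return iwe


def contrast(iwe):
    """CMax objective used here: variance of the IWE (sharper means higher)."""
    return float(iwe.var())


# Synthetic data: a vertical edge starting at x = 5 and moving at 20 px/s.
H, W = 16, 32
true_vx = 20.0
t = np.linspace(0.0, 1.0, 50)
ys, ts = np.meshgrid(np.arange(H), t, indexing="ij")
xs = np.rint(5.0 + true_vx * ts)
events = np.stack(
    [xs.ravel(), ys.ravel(), ts.ravel(), np.ones(xs.size)], axis=1
)

# EGM: accumulated log-brightness change per pixel.
delta_log = egm_log_change(events, (H, W))

# CMax: grid search over candidate velocities, keeping the sharpest IWE.
candidates = np.arange(0.0, 45.0, 5.0)
scores = [
    contrast(image_of_warped_events(events, (vx, 0.0), (H, W)))
    for vx in candidates
]
best_vx = float(candidates[int(np.argmax(scores))])
print("estimated vx:", best_vx)
```

The grid search stands in for the gradient-based optimization a real system would use. The point is only that the correct motion aligns events along the edge and maximizes IWE contrast, while a wrong motion smears them across pixels.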