Capturing and re-animating the 3D structure of articulated objects present significant barriers. On one hand, methods that require extensively calibrated multi-view setups are prohibitively complex and resource-intensive, which limits their practical applicability. On the other hand, while single-camera Neural Radiance Fields (NeRFs) offer a more streamlined approach, they incur excessive training and rendering costs. 3D Gaussian Splatting would be a suitable alternative but for two limitations: first, existing methods for dynamic 3D Gaussians require synchronized multi-view cameras; second, they lack controllability in dynamic scenarios. We present CoGS, a method for Controllable Gaussian Splatting that enables direct manipulation of scene elements and offers real-time control of dynamic scenes without requiring pre-computed control signals. We evaluate CoGS on both synthetic and real-world datasets containing dynamic objects of varying difficulty. In our evaluations, CoGS consistently outperformed existing dynamic and controllable neural representations in terms of visual fidelity.