Ola Shorinwa*, Johnathan Tucker*, Aliyah Smith, Aiden Swann, Timothy Chen, Roya Firoozi, Monroe Kennedy III, Mac Schwager
Stanford University
*Equal Contribution.
We present Splat-MOVER, a modular robotics stack for open-vocabulary robotic manipulation, which leverages the editability of Gaussian Splatting (GSplat) scene representations to enable multi-stage manipulation tasks. Splat-MOVER consists of: (i) ASK-Splat, a GSplat representation that distills semantic and grasp affordance features into the 3D scene. ASK-Splat enables geometric, semantic, and affordance understanding of 3D scenes, which is critical for many robotics tasks; (ii) SEE-Splat, a real-time scene-editing module that uses 3D semantic masking and infilling to visualize the motions of objects that result from robot interactions in the real world. SEE-Splat creates a “digital twin” of the evolving environment throughout the manipulation task; and (iii) Grasp-Splat, a grasp generation module that uses ASK-Splat and SEE-Splat to propose affordance-aligned candidate grasps for open-world objects.
This repository utilizes Nerfstudio, GraspNet, and VRB. This repo has been verified to work with the following package versions: nerfstudio==1.1.0, gsplat==0.1.13, and lang-sam at commit SHA a1a9557.
Please install these packages from source, then proceed with the following steps:
- Clone this repo.
git clone git@github.com:StanfordMSL/Splat-MOVER.git
- Install sagesplat as a Python package.
python -m pip install -e .
- Register sagesplat with Nerfstudio.
ns-install-cli
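Before training, it may help to confirm that the installed packages match the versions this repo was verified against. The following is a minimal sketch, not part of the repository: it assumes the distributions are registered under the names nerfstudio and gsplat, and it skips lang-sam because that dependency is pinned by commit SHA rather than by version. The helper name check_versions is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions taken from the README text above. lang-sam is pinned by commit SHA,
# so it is not checked here.
EXPECTED = {"nerfstudio": "1.1.0", "gsplat": "0.1.13"}


def check_versions(expected):
    """Return {package: (status, installed_version_or_None)}.

    status is "ok", "mismatch", or "missing".
    """
    report = {}
    for name, wanted in expected.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            report[name] = ("missing", None)
            continue
        status = "ok" if found == wanted else "mismatch"
        report[name] = (status, found)
    return report


if __name__ == "__main__":
    for name, (status, found) in check_versions(EXPECTED).items():
        print(f"{name}: expected {EXPECTED[name]}, found {found} [{status}]")
```

A "mismatch" result does not necessarily mean the setup is broken; it only indicates a version other than the one the repo was verified with.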
Now, you can run sagesplat like other models in Nerfstudio using:
ns-train sagesplat --data <path to the data directory for a given scene>
You can try out the data used in the experiments here.
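To train on several scenes in turn, the ns-train command shown above can be assembled in a small script. This is only a sketch under stated assumptions: the helper build_train_command and the scene paths are hypothetical, and only the ns-train sagesplat --data form given in this README is used.

```python
import shlex


def build_train_command(data_dir):
    """Build the ns-train invocation from this README for one scene directory."""
    return ["ns-train", "sagesplat", "--data", str(data_dir)]


if __name__ == "__main__":
    # Hypothetical scene directories; replace them with real data paths.
    for scene in ["data/scene_a", "data/scene_b"]:
        # Print the command instead of running it, so it can be inspected first.
        print(shlex.join(build_train_command(scene)))
```

Each returned list can be passed to subprocess.run(cmd, check=True) once the printed commands look right.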