This package provides the PyTorch implementation and the vision-based tactile sensor simulator for our AAAI 2021 paper. The tactile simulator is based on PyBullet and provides the simulation of the Semi-transparent Tactile Sensor (STS).
The recommended way to install the package and all of its dependencies is inside a virtual environment:
git clone https://github.com/SAIC-MONTREAL/multimodal-dynamics.git
cd multimodal-dynamics
pip install -e .
The sub-package tact_sim provides the components required for visuotactile simulation of the STS sensor and is implemented in PyBullet. The simulation is vision-based and is not meant to be a physically accurate model of contacts or soft-body dynamics.
To run an example script of an object falling onto the sensor, use:
python tact_sim/examples/demo.py --show_image --object winebottle
This loads the object model from graphics/objects and renders the resulting visual and tactile images.
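The core idea of the vision-based simulation is to drop an object onto a flat surface and render the contact region with a camera that stands in for the sensor-internal camera. The snippet below is a minimal, self-contained sketch of that idea using the standard PyBullet API; the object, camera pose, and image size are illustrative and do not correspond to the actual tact_sim implementation.

```python
import pybullet as p
import pybullet_data

# Minimal sketch of the vision-based idea: drop an object onto a flat surface
# and render the contact region with a camera. The camera pose, object, and
# image size here are illustrative and are not the tact_sim defaults.
p.connect(p.DIRECT)                                   # headless; use p.GUI to visualize
p.setAdditionalSearchPath(pybullet_data.getDataPath())
p.setGravity(0, 0, -9.81)

p.loadURDF("plane.urdf")                              # stands in for the sensor surface
p.loadURDF("duck_vhacd.urdf", basePosition=[0, 0, 0.3])

for _ in range(240):                                  # let the object fall and settle
    p.stepSimulation()

# Camera aimed at the contact region, acting as the sensor-internal camera.
view = p.computeViewMatrix(cameraEyePosition=[0, 0, 0.5],
                           cameraTargetPosition=[0, 0, 0.0],
                           cameraUpVector=[0, 1, 0])
proj = p.computeProjectionMatrixFOV(fov=60, aspect=1.0, nearVal=0.01, farVal=2.0)
width, height, rgb, depth, seg = p.getCameraImage(128, 128, viewMatrix=view,
                                                  projectionMatrix=proj)
p.disconnect()
```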
The example scripts with the name format experiments/exp_{ID}_{task}.py were used to generate the dataset of our AAAI 2021 paper. To run them, you need the ShapeNetSem dataset installed on your machine.
Follow the steps below to download and prepare the ShapeNetSem dataset:
- Register and get access to ShapeNetSem.
- Only the OBJ and texture files are needed. Download models-OBJ.zip and models-textures.zip.
- Download metadata.csv and categories.synset.csv.
- Unzip the compressed files and move the contents of models-textures.zip to models-OBJ/models:
.
└── ShapeNetSem
    ├── categories.synset.csv
    ├── metadata.csv
    └── models-OBJ
        └── models
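Once the files are unpacked as above, a quick sanity check of the layout can save a failed run later. This is an illustrative helper, not part of the repository; the dataset path is a placeholder.

```python
from pathlib import Path

# Illustrative check (not part of the repo): verify the expected ShapeNetSem layout.
root = Path("/path/to/ShapeNetSem")   # hypothetical location of the dataset
for rel in ["metadata.csv", "categories.synset.csv", "models-OBJ/models"]:
    print(rel, "OK" if (root / rel).exists() else "MISSING")
```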
To run the data collection scripts, use:
python experiments/exp_{ID}_{task}.py --logdir {path_to_logdir} --dataset_dir {path_to_ShapeNetSem} --category "WineBottle, Camera" --show_image
For the list of object classes suitable for these experiments, see tact_sim/config.py.
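If you want to check which ShapeNetSem models a given --category value will match before launching a long data collection run, a quick filter over metadata.csv can help. The sketch below is not part of the repository's loading code; it assumes the standard ShapeNetSem metadata.csv with fullId and category columns, where category holds comma-separated names.

```python
import csv
from pathlib import Path

# Illustrative helper (not from the repo): list ShapeNetSem model IDs whose
# category field matches any of the requested categories. Assumes metadata.csv
# has "fullId" and "category" columns, with comma-separated category names.
def models_in_categories(shapenet_dir, categories):
    wanted = {c.strip().lower() for c in categories}
    matches = []
    with open(Path(shapenet_dir) / "metadata.csv", newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            cats = {c.strip().lower() for c in row.get("category", "").split(",")}
            if cats & wanted:
                matches.append(row["fullId"])
    return matches

if __name__ == "__main__":
    ids = models_in_categories("/path/to/ShapeNetSem", ["WineBottle", "Camera"])
    print(f"{len(ids)} matching models")
```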
Once you have collected the dataset, you can start training the multimodal "resting state predictor" dynamics model, as described in the paper, using:
python main.py --dataset-path {absolute_path_dataset} --problem-type seq_modeling --input-type visuotactile --model-name cnn-mvae --use-pose
This trains the MVAE model that fuses the visual, tactile, and pose modalities into a shared latent space.
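For readers unfamiliar with MVAE-style fusion: each modality encoder typically produces a Gaussian posterior over the shared latent space, and the per-modality posteriors are combined with a product of experts. The sketch below illustrates only that fusion step in PyTorch; the shapes, names, and standalone function are illustrative and are not taken from this repository's model code.

```python
import torch

def product_of_experts(mus, logvars, eps=1e-8):
    """Fuse per-modality Gaussian posteriors (plus a standard-normal prior
    expert) into a single Gaussian. mus and logvars are lists of
    (batch, latent_dim) tensors, one per modality."""
    # Prepend the prior expert N(0, I).
    mus = [torch.zeros_like(mus[0])] + list(mus)
    logvars = [torch.zeros_like(logvars[0])] + list(logvars)

    precisions = [torch.exp(-lv) + eps for lv in logvars]        # 1 / sigma^2
    fused_var = 1.0 / torch.stack(precisions).sum(dim=0)
    fused_mu = fused_var * torch.stack(
        [m * p for m, p in zip(mus, precisions)]).sum(dim=0)
    return fused_mu, torch.log(fused_var)

# Illustrative usage with dummy encoder outputs for three modalities.
batch, latent = 4, 32
vis_mu, vis_lv = torch.randn(batch, latent), torch.randn(batch, latent)
tac_mu, tac_lv = torch.randn(batch, latent), torch.randn(batch, latent)
pose_mu, pose_lv = torch.randn(batch, latent), torch.randn(batch, latent)
mu, logvar = product_of_experts([vis_mu, tac_mu, pose_mu], [vis_lv, tac_lv, pose_lv])
z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)          # reparameterization
```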
To train the resting state predictor for a single modality (e.g., tactile or visual only), use:
python main.py --dataset-path {absolute_path_dataset} --problem-type seq_modeling --input-type visual --model-name cnn-vae
To train a standard one-step dynamics model, pass dyn_modeling as the --problem-type argument.
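The difference between the two problem types comes down to how training pairs are formed from a recorded trajectory: the resting state predictor maps an initial observation to the final, settled observation, while a one-step dynamics model maps each frame to its successor. Below is a minimal sketch of that pairing with hypothetical frame arrays; it is not the repository's data loader.

```python
import numpy as np

# Hypothetical trajectory: T observations of an object falling onto the sensor.
# frames[t] could be a visuotactile image (or its encoding) at time t.
frames = np.random.rand(20, 64, 64, 3)

# Resting-state prediction (seq_modeling): pair the initial frame of a
# trajectory with its final, settled frame.
resting_pairs = [(frames[0], frames[-1])]

# One-step dynamics (dyn_modeling): pair each frame with the next one.
one_step_pairs = [(frames[t], frames[t + 1]) for t in range(len(frames) - 1)]
```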
Check this video for a demo of the experiments:
This work by SECA is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
If you use this code in your research, please cite:
@article{rezaei2021learning,
  title={Learning Intuitive Physics with Multimodal Generative Models},
  author={Rezaei-Shoshtari, Sahand and Hogan, Francois Robert and Jenkin, Michael and Meger, David and Dudek, Gregory},
  journal={arXiv preprint arXiv:2101.04454},
  year={2021}
}
@inproceedings{hogan2021seeing,
  title={Seeing Through your Skin: Recognizing Objects with a Novel Visuotactile Sensor},
  author={Hogan, Francois R and Jenkin, Michael and Rezaei-Shoshtari, Sahand and Girdhar, Yogesh and Meger, David and Dudek, Gregory},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  pages={1218--1227},
  year={2021}
}