I used arl-affpose-ros-node with the following repos:
- LabelFusion for generating real images.
- NDDS for generating synthetic images.
- arl-affpose-dataset-utils for a custom dataset that I generated.
- pytorch-simple-affnet for predicting an object's affordance labels.
- densefusion for predicting an object's 6-DoF pose.
- barrett-wam-arm for robotic grasping experiments, specifically barrett_tf_publisher and barrett_trac_ik.
Here is an overview of our architecture.
Four main coordinate frames are used to grasp an object:
- base link of the manipulator
- camera frame
- object frame
- end effector frame
Note that the camera frame is a dynamic transform because we mounted our camera on our arm. See barrett_tf_publisher.
Note that the object frame was determined either using marker-based methods, such as aruco_ros, or using deep learning, such as DOPE or DenseFusion.
Note that we used an 8-DoF Barrett Hand for the end effector frame, which spans +/- 17.5 cm from fingertip to the center of the palm. By comparison, two-finger grippers require the object pose to be accurate within +/- 2 cm.
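To illustrate how these frames relate, here is a minimal sketch of chaining the base-to-camera and camera-to-object transforms to express the object pose in the base link frame. It is not taken from this repo: the helper names (`make_transform`, `rot_z`, `within_tolerance`) and all numeric values are hypothetical, and reading the +/- 2 cm requirement as a Euclidean distance on position is an assumption. On the robot, the base-to-camera transform would come from tf (see barrett_tf_publisher) rather than being hard-coded. The code uses `np.dot` so that it also runs under Python 2.7.

```python
import numpy as np


def make_transform(rotation, translation):
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a 3-vector."""
    T = np.eye(4)
    T[:3, :3] = rotation
    T[:3, 3] = translation
    return T


def rot_z(theta):
    """Rotation matrix for an angle theta (radians) about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def within_tolerance(estimated, reference, tol_m=0.02):
    """Assumed check: position error (Euclidean, metres) is within tol_m."""
    error = np.linalg.norm(np.asarray(estimated) - np.asarray(reference))
    return error <= tol_m


# Hypothetical base -> camera transform (dynamic in practice, since the
# camera is mounted on the arm).
T_base_camera = make_transform(rot_z(np.pi / 2), [0.5, 0.0, 0.8])

# Hypothetical camera -> object transform, e.g. from a pose estimator.
T_camera_object = make_transform(np.eye(3), [0.1, 0.0, 0.6])

# Chain the transforms: base -> object = (base -> camera) * (camera -> object).
T_base_object = np.dot(T_base_camera, T_camera_object)
object_position_in_base = T_base_object[:3, 3]

print(object_position_in_base)
```

The same composition extends to the end effector frame: a grasp target defined relative to the object can be multiplied onto `T_base_object` to obtain a goal pose in the base link frame for the inverse kinematics solver.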
- Ubuntu 18.04
- Cuda 10.0
- Python 2.7: 'conda create --name AFFDFROSNode python=2.7'
- Pytorch 1.4: 'conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.0 -c pytorch'
$ conda env create -f environment.yml --name AFFDFROSNode