
ieee 7410800


ICCV 2015

[ieee 7410800] Understanding Everyday Hands in Action from RGB-D Images [PDF] [notes]

Gregory Rogez, James S. Supancic, Deva Ramanan

Objectives

Use RGB-D input to predict hand pose, contacts with objects and exerted forces, via classification into 71 grasp classes (using a taxonomy from the robotics literature)

Synthesis

Single-image pose estimation from RGB-D data

Constructed the RGB-D GUN-71 (Grasp UNderstanding) dataset

Training on both real and synthetic depth data (3000 synthetic examples per grasp from freely available Poser models)

Pipeline

  • segment hand from background clutter using depth cues

  • depth-based hand detection using a hand-pose classifier trained on synthetic data, formulated as a Bayesian model conditioned on the pose

  • average predictions over superpixel regions (extracted from the RGB image) to predict the segmentation label
  • extract deep features from the full RGB image, the cropped window and the segmented image => concatenated vector of size 4096*3

  • multiclass SVM on the obtained concatenated vector (a short sketch follows this list)
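A minimal sketch of this classification stage, assuming 4096-D CNN features (e.g. an fc7-style layer) for each of the three inputs and scikit-learn's LinearSVC as the multiclass SVM; `extract_cnn_features` is a placeholder, not the paper's code.

```python
import numpy as np
from sklearn.svm import LinearSVC

FEAT_DIM = 4096  # assumed deep-feature size (fc7-style layer)

def grasp_descriptor(extract_cnn_features, full_rgb, cropped_window, segmented_hand):
    """Concatenate deep features from the three inputs -> 3 * FEAT_DIM vector.

    `extract_cnn_features` stands for any CNN feature extractor returning a
    FEAT_DIM-dimensional vector per image (an assumption, not the paper's net).
    """
    feats = [extract_cnn_features(img)
             for img in (full_rgb, cropped_window, segmented_hand)]
    return np.concatenate(feats)

def train_grasp_classifier(X, y):
    """X: (n_samples, 3 * FEAT_DIM) descriptors, y: grasp labels (71 classes)."""
    clf = LinearSVC(C=1.0)  # one-vs-rest multiclass linear SVM
    clf.fit(X, y)
    return clf
```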

Refinement

==> How to obtain a more precise estimate from the quantized predictions (71 classes)

  • select the closest training sample (from the synthetic dataset, which comes with ground-truth pose and estimated force orientations) by matching the hand's depth map
  • transfer the neighbour's contact points, forces and 3D pose to the observed sample (see the sketch below)
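
A rough sketch of the refinement step: within the predicted grasp class, find the synthetic exemplar whose depth map best matches the query and copy over its annotations. The L2 distance over mean-centred depth crops and the `synthetic_db` layout are assumptions for illustration, not the paper's exact matching score or data format.

```python
import numpy as np

def refine_with_nearest_exemplar(query_depth, predicted_class, synthetic_db):
    """Transfer 3D pose, contacts and force directions from the best-matching
    synthetic exemplar of the predicted grasp class.

    synthetic_db: list of dicts with keys 'depth', 'grasp_class', 'pose3d',
    'contacts', 'forces' (hypothetical layout); depth crops are assumed to be
    resized to a common resolution.
    """
    candidates = [s for s in synthetic_db if s['grasp_class'] == predicted_class]
    q = query_depth - query_depth.mean()

    def score(sample):
        d = sample['depth'] - sample['depth'].mean()
        return np.linalg.norm(d - q)  # assumed matching score (L2 on depth)

    best = min(candidates, key=score)
    return best['pose3d'], best['contacts'], best['forces']
```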

Synthetic dataset

Provides 3D pose, grasp label, contacts and force-direction vectors for each rendered example.

Contacts and force directions are estimated using the mesh representation of the hand (see the sketch below)
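
A hedged sketch of how contacts and force directions could be read off the hand mesh: vertices lying within a small threshold of the object surface are taken as contact points, with the force direction along the inward vertex normal. The threshold value and the KD-tree lookup are my assumptions, not the paper's exact procedure.

```python
import numpy as np
from scipy.spatial import cKDTree

def mesh_contacts_and_forces(hand_vertices, hand_normals, object_vertices,
                             contact_threshold=0.005):
    """hand_vertices, hand_normals, object_vertices: (N, 3) arrays.

    Returns contact points on the hand mesh and force directions,
    approximated as the negated (inward-pointing) vertex normals.
    """
    tree = cKDTree(object_vertices)
    dists, _ = tree.query(hand_vertices)          # distance to closest object vertex
    contact_mask = dists < contact_threshold      # threshold in metres (assumed)
    contact_points = hand_vertices[contact_mask]
    force_directions = -hand_normals[contact_mask]  # force pushes into the object
    return contact_points, force_directions
```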

Notes

Grasp taxonomy

Includes non-prehensile object interactions (pushing, pressing, ...)
