
ieee 7410800


ICCV 2015

[ieee 7410800] Understanding Everyday Hands in Action from RGB-D Images [PDF] [notes]

Gregory Rogez, James S. Supancic, Deva Ramanan

Objectives

Use RGB-D input to predict hand pose, contacts with objects and exerted forces, via classification into 71 grasp classes (using a taxonomy from the robotics literature)

Synthesis

Single-image pose estimation from RGB-D data

Constructed the RGB-D GUN-71 (Grasp UNderstanding) dataset

Training on both real and synthetic depth data (3000 synthetic examples per grasp from freely available Poser models)

Pipeline

  • segment hand from background clutter using depth cues

  • depth-based hand detection using a hand-pose classifier trained on synthetic data, formulated as a Bayesian model conditioned on the pose

  • average predictions over superpixel regions (extracted from the RGB image) to predict the segmentation label
  • extract deep features from the full RGB image, the cropped window and the segmented image => concatenated vector of size 4096*3

  • multiclass SVM on the obtained concatenated vector (a short sketch follows this list)
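A minimal sketch of this classification stage, assuming 4096-D CNN features (e.g. an fc7-style layer) for each of the three inputs and scikit-learn's LinearSVC as the multiclass SVM; `extract_cnn_features` is a placeholder, not the paper's code.

```python
import numpy as np
from sklearn.svm import LinearSVC

FEAT_DIM = 4096  # assumed deep-feature size (fc7-style layer)

def grasp_descriptor(extract_cnn_features, full_rgb, cropped_window, segmented_hand):
    """Concatenate deep features from the three inputs -> 3 * FEAT_DIM vector.

    `extract_cnn_features` stands for any CNN feature extractor returning a
    FEAT_DIM-dimensional vector per image (an assumption, not the paper's net).
    """
    feats = [extract_cnn_features(img)
             for img in (full_rgb, cropped_window, segmented_hand)]
    return np.concatenate(feats)

def train_grasp_classifier(X, y):
    """X: (n_samples, 3 * FEAT_DIM) descriptors, y: grasp labels (71 classes)."""
    clf = LinearSVC(C=1.0)  # one-vs-rest multiclass linear SVM
    clf.fit(X, y)
    return clf
```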

Refinement

==> How to obtain a more precise estimate from the quantized predictions (71 classes)

  • select the closest training sample (from the synthetic dataset, which comes with ground-truth pose and estimated force orientations) by matching the hand's depth map
  • transfer the neighbour's contact points, forces and 3D pose to the observed sample (see the sketch below)
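
A rough sketch of the refinement step: within the predicted grasp class, find the synthetic exemplar whose depth map best matches the query and copy over its annotations. The L2 distance over mean-centred depth crops and the `synthetic_db` layout are assumptions for illustration, not the paper's exact matching score or data format.

```python
import numpy as np

def refine_with_nearest_exemplar(query_depth, predicted_class, synthetic_db):
    """Transfer 3D pose, contacts and force directions from the best-matching
    synthetic exemplar of the predicted grasp class.

    synthetic_db: list of dicts with keys 'depth', 'grasp_class', 'pose3d',
    'contacts', 'forces' (hypothetical layout); depth crops are assumed to be
    resized to a common resolution.
    """
    candidates = [s for s in synthetic_db if s['grasp_class'] == predicted_class]
    q = query_depth - query_depth.mean()

    def score(sample):
        d = sample['depth'] - sample['depth'].mean()
        return np.linalg.norm(d - q)  # assumed matching score (L2 on depth)

    best = min(candidates, key=score)
    return best['pose3d'], best['contacts'], best['forces']
```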

Synthetic dataset

Provides 3D pose, grasp label, contacts and force-direction vectors for each rendered example.

Contacts and force directions are estimated using the mesh representation of the hand (see the sketch below)
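
A hedged sketch of how contacts and force directions could be read off the hand mesh: vertices lying within a small threshold of the object surface are taken as contact points, with the force direction along the inward vertex normal. The threshold value and the KD-tree lookup are my assumptions, not the paper's exact procedure.

```python
import numpy as np
from scipy.spatial import cKDTree

def mesh_contacts_and_forces(hand_vertices, hand_normals, object_vertices,
                             contact_threshold=0.005):
    """hand_vertices, hand_normals, object_vertices: (N, 3) arrays.

    Returns contact points on the hand mesh and force directions,
    approximated as the negated (inward-pointing) vertex normals.
    """
    tree = cKDTree(object_vertices)
    dists, _ = tree.query(hand_vertices)          # distance to closest object vertex
    contact_mask = dists < contact_threshold      # threshold in metres (assumed)
    contact_points = hand_vertices[contact_mask]
    force_directions = -hand_normals[contact_mask]  # force pushes into the object
    return contact_points, force_directions
```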

Notes

Grasp taxonomy

Includes non-prehensile object interactions (pushing, pressing, ...)
