
1603.07763.md

Yana edited this page May 29, 2020 · 1 revision
Seeing Invisible Poses: Estimating 3D Body Pose from Egocentric Video, CVPR'17 {paper} {project page} {code.gz} {dataset.zip}

Hao Jiang, Kristen Grauman

Objective

Go beyond previous work, which reconstructs only the visible parts of the first-person pose (visible arms)

Learn a prior on full-body motion given visible environment cues

Datasets
  • Collected with both 3rd (Kinect) and 1st person (chest-mounted GoPro) views, both provide RGB streams
    • Using Kinect V2 sensor, capture ground truth human poses.
    • 3D positions of 25 body joints defined in the MS Kinect SDK.
    • chest-mounted camera
    • 18 ground truth videos: 3 videos for training, rest for testing
    • 10 subjects, normal daily activities
Method
  • Handle pose estimation as per-frame classification task
    • k-means with L2 norm to obtain K=300 pose clusters (where poses are all ground-truth poses in training set).
    • dynamic features
      • use optical flow to compute point correspondences, which are used to compute the homography (underlying assumption that the scene is planar? Only one homography is computed for the full scene, as far as I understand. I am missing something here)
      • use homographies between consecutive frames to estimate camera rotation, assuming rotation dominates over translation, and camera intrinsics are known
    • static features
      • collect a dataset of standing vs sitting, train classifier on standing vs sitting
  • Additional temporal model on 1-3 minute sequences to:
    • constrain transitions between pose clusters based on the transitions observed in the training set
    • encourage consistency between predictions from static and motion features
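
To make the clustering and temporal steps above concrete, here is a minimal sketch, not the authors' implementation. It assumes poses are flattened into vectors, that all training labels form one continuous sequence, and that unseen transitions are forbidden outright (the paper may use a softer penalty). The function names `kmeans`, `allowed_transitions` and `viterbi` are made up for illustration, and `frame_cost` stands in for whatever per-frame classifier scores are available.

```python
import numpy as np

def kmeans(poses, k, iters=20, seed=0):
    """Plain Lloyd's k-means with L2 distance; poses has shape (N, D)."""
    poses = np.asarray(poses, dtype=float)
    rng = np.random.default_rng(seed)
    # initialise centers with k distinct training poses
    centers = poses[rng.choice(len(poses), size=k, replace=False)].copy()
    for _ in range(iters):
        # squared L2 distance from every pose to every center, shape (N, k)
        d2 = ((poses[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for c in range(k):
            members = poses[labels == c]
            if len(members):  # keep the old center if a cluster is empty
                centers[c] = members.mean(axis=0)
    # final assignment, consistent with the final centers
    d2 = ((poses[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, d2.argmin(axis=1)

def allowed_transitions(labels, k):
    """allowed[i, j] is True if cluster i was followed by cluster j in training."""
    allowed = np.eye(k, dtype=bool)  # assumption: staying in a cluster is always allowed
    allowed[labels[:-1], labels[1:]] = True
    return allowed

def viterbi(frame_cost, allowed):
    """Lowest-cost cluster sequence; frame_cost is (T, K), lower is better."""
    T, K = frame_cost.shape
    trans = np.where(allowed, 0.0, np.inf)  # assumption: hard constraint on transitions
    total = frame_cost[0].astype(float)
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        cand = total[:, None] + trans        # cand[i, j]: come from i, move to j
        back[t] = cand.argmin(axis=0)
        total = cand.min(axis=0) + frame_cost[t]
    # trace the best path backwards from the cheapest final state
    path = [int(total.argmin())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```

With K=300 and a few thousand frames, the (K, K) transition table and the dynamic program stay small. The consistency term between static and motion features is not modelled here; it would enter as an extra term in `frame_cost`.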

runtime: 0.5s per frame
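
On the dynamic features in the Method notes: when the camera motion between two frames is a pure rotation R, pixels are related by the homography H = K R K^-1 (up to scale) whatever the scene depth, so a single homography does not require a planar scene as long as rotation dominates translation. The sketch below recovers an approximate rotation from such a homography. It is an illustration under that pure-rotation assumption, not the paper's code; `rotation_from_homography` and `rotation_angle` are hypothetical helper names, and `H` is assumed to have been estimated already from the flow correspondences.

```python
import numpy as np

def rotation_from_homography(H, K):
    """Approximate inter-frame camera rotation from a 3x3 homography H
    and 3x3 intrinsics K, assuming translation is negligible."""
    M = np.linalg.inv(K) @ H @ K      # equals s * R for pure rotation (s: unknown scale)
    U, _, Vt = np.linalg.svd(M)
    R = U @ Vt                        # nearest orthogonal matrix to M (Frobenius norm)
    if np.linalg.det(R) < 0:          # H is only defined up to sign, so fix the handedness
        R = -R
    return R

def rotation_angle(R):
    """Rotation magnitude in radians, from the trace of a rotation matrix."""
    c = (np.trace(R) - 1.0) / 2.0
    return float(np.arccos(np.clip(c, -1.0, 1.0)))
```

Projecting onto the nearest rotation with an SVD absorbs the unknown scale of H and some estimation noise. If translation is not negligible, the result is only a rough approximation.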

Experiments

Report mean errors in cm for different joints

Compare to several 3rd person baselines on their dataset. For upper-body joints, results are slightly better than the always-standing baseline (which predicts a fixed standing pose); the improvement is clearer on lower joints.
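
As a rough illustration of the metric and the baseline mentioned above, the sketch below computes a mean Euclidean error per joint and builds an always-standing prediction. These notes do not give the exact evaluation protocol (for example, any alignment before measuring), so treat this as an assumption. The names `mean_joint_error_cm` and `always_standing` are hypothetical, and positions are assumed to be in centimetres already.

```python
import numpy as np

def mean_joint_error_cm(pred, gt):
    """pred, gt: (T, J, 3) joint positions in cm.
    Returns a (J,) array with the mean Euclidean error of each joint."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    per_frame = np.linalg.norm(pred - gt, axis=2)  # (T, J) distances
    return per_frame.mean(axis=0)

def always_standing(standing_pose, num_frames):
    """Baseline that repeats one fixed (J, 3) standing pose for every frame."""
    standing_pose = np.asarray(standing_pose, dtype=float)
    return np.broadcast_to(standing_pose, (num_frames,) + standing_pose.shape)
```

Averaging the returned array separately over the upper-body and lower-body joint indices gives the kind of comparison described in the notes.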
