Weird results when applying SemGCN to 2D pose from image #32

Open
duckduck-sys opened this issue Oct 28, 2020 · 7 comments

@duckduck-sys

Inference on in-the-wild images using SemGCN has been partially covered in this thread and others, but only the overall process has been made clear, i.e.:

  • Step 1: Use a 2D pose estimation network to generate 2D pose in MPII format.
  • Step 2: Convert 2D pose from MPII format to H36M format as done here.
  • Step 3: Pre-process the 2D input pose as done here.
  • Step 4: Use the pre-processed 2D pose in H36M format as input to the SemGCN SH model. It outputs 3D pose in H36M format.

Below I will follow each step, using the 300x600 test image shown below on the left.

[images: original_image (left), 2d_put (right)]

For Step 1, I use EfficientPose to generate the MPII-format 2D pose of the test image, shown above on the right. Here is the numeric output:

positions = [[[108. 512.]	# Right ankle
              [114. 428.]	# Right knee
              [124. 320.]	# Right hip
              [186. 324.]	# Left hip
              [178. 426.]	# Left knee
              [176. 512.]	# Left ankle
              [156. 322.]	# Pelvis
              [162. 152.]	# Thorax
              [164. 114.]	# Upper neck
              [166.  24.]	# Head top
              [ 60. 322.]	# Right wrist
              [ 78. 238.]	# Right elbow
              [ 96. 148.]	# Right shoulder
              [230. 154.]	# Left shoulder
              [240. 246.]	# Left elbow
              [224. 326.]]]	# Left wrist

For Step 2, I run this:

positions = positions[:, SH_TO_GT_PERM, :]

To get the output:

positions = [[[156. 322.]
              [124. 320.]
              [114. 428.]
              [108. 512.]
              [186. 324.]
              [178. 426.]
              [176. 512.]
              [162. 152.]
              [164. 114.]
              [166.  24.]
              [230. 154.]
              [240. 246.]
              [224. 326.]
              [ 96. 148.]
              [ 78. 238.]
              [ 60. 322.]]]
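For reference, SH_TO_GT_PERM is the index permutation that reorders the 16 MPII/Stacked-Hourglass joints into the 16-joint H36M order expected by the SemGCN SH model. The following is a sketch reconstructed from the two arrays above (joint names follow the Step 1 comments):

import numpy as np

# H36M slot                <- MPII source index (see the Step 1 comments)
SH_TO_GT_PERM = np.array([
    6,   # hip / pelvis    <- pelvis
    2,   # right hip       <- right hip
    1,   # right knee      <- right knee
    0,   # right foot      <- right ankle
    3,   # left hip        <- left hip
    4,   # left knee       <- left knee
    5,   # left foot       <- left ankle
    7,   # spine           <- thorax
    8,   # thorax / neck   <- upper neck
    9,   # head            <- head top
    13,  # left shoulder   <- left shoulder
    14,  # left elbow      <- left elbow
    15,  # left wrist      <- left wrist
    12,  # right shoulder  <- right shoulder
    11,  # right elbow     <- right elbow
    10,  # right wrist     <- right wrist
])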

For Step 3, I run this:

positions[..., :2] = normalize_screen_coordinates(positions[..., :2], w=300, h=600)

To get the output:

positions = [[[ 0.0399  0.1466 ]
              [-0.1733  0.1333 ]
              [-0.2400  0.8533 ]
              [-0.2799  1.4133 ]
              [ 0.2400  0.1600 ]
              [ 0.1866  0.8399 ]
              [ 0.1733  1.4133 ]
              [ 0.0800 -0.9866 ]
              [ 0.0933 -1.2400 ]
              [ 0.1066 -1.8400 ]
              [ 0.5333 -0.9733 ]
              [ 0.6000 -0.3600 ]
              [ 0.4933  0.1733 ]
              [-0.3600 -1.0133 ]
              [-0.4800 -0.4133 ]
              [-0.6000  0.1466 ]]]
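For reference, normalize_screen_coordinates maps x from [0, w] to [-1, 1] and scales y by the same factor to preserve the aspect ratio. A minimal sketch of what it computes, consistent with the numbers above:

import numpy as np

def normalize_screen_coordinates(X, w, h):
    # Map x from [0, w] to [-1, 1] while preserving the aspect ratio,
    # so y ends up in [-h/w, h/w]
    assert X.shape[-1] == 2
    return X / w * 2 - np.array([1, h / w])

# e.g. [108, 512] with w=300, h=600 -> [-0.28, 1.4133], matching the right-ankle row above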

For Step 4, the above is used as input to the SemGCN SH model by running this:

inputs_2d = torch.from_numpy(positions)               # (1, 16, 2) normalized 2D pose in H36M order
inputs_2d = inputs_2d.to(device)
outputs_3d = model_pos(inputs_2d).cpu()               # (1, 16, 3) predicted 3D pose
outputs = outputs_3d[:, :, :] - outputs_3d[:, :1, :]  # root-center: subtract the hip (joint 0)

Which gives the output:

outputs = [[[ 0.0000  0.0000  0.0000 ]
            [-0.0769 -0.6899 -0.2520 ]
            [ 0.0847 -0.4062 -0.0607 ]
            [ 0.4154  0.2318  0.4062 ]
            [ 0.2708 -0.5181 -0.0504 ]
            [ 0.3431 -0.7337  0.3018 ]
            [ 0.6379  0.6684  0.2033 ]
            [ 0.1650 -0.9141 -0.8496 ]
            [ 0.5825 -2.1341  0.2762 ]
            [ 1.1561 -1.5364 -0.6433 ]
            [ 1.1612 -1.1453 -0.2103 ]
            [ 0.9097 -0.6763  0.2361 ]
            [ 0.8202 -0.2971  0.2679 ]
            [ 0.8008 -1.1936 -0.1120 ]
            [ 0.2124 -1.3246  0.5563 ]
            [ 0.5093 -0.4762  0.3473 ]]]

When visualized, this looks completely wrong; see the image below. Can anyone point out where the problem lies? Is it a problem with the pre-processing, or with the model?

[image: 3d_output]

@develduan

@duckduck-sys I think there are two points to note about the data:

  1. location: the neck should be halfway between the shoulders, and the thorax should be roughly halfway between the neck and the hips.
  2. scale: the normalization depends on the image width (mapping to [-1, 1] based on w); thus the proportion of the human body in the image needs to match H36M. Both adjustments are sketched after the images below.
  • raw output: [image: pose_lifting_output271_unscale_raw]
  • after scaling the locations: [image: pose_lifting_output271_raw]
  • after modifying the location of the neck and the thorax: [image: pose_lifting_output271_modified]
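A minimal sketch of both adjustments, assuming positions_mpii is the (1, 16, 2) MPII-format array from Step 1 (indices follow its joint comments) and reusing SH_TO_GT_PERM and normalize_screen_coordinates from Steps 2-3; the final scale factor is only an example and depends on the image:

import numpy as np

# MPII joint indices, per the comments on the Step 1 output
I_PELVIS, I_THORAX, I_NECK = 6, 7, 8
I_RSHOULDER, I_LSHOULDER = 12, 13

pose = positions_mpii[0].copy()                       # (16, 2) keypoints in MPII order

# 1) location: neck = midpoint of the shoulders,
#    thorax = roughly halfway between the neck and the pelvis
pose[I_NECK] = (pose[I_RSHOULDER] + pose[I_LSHOULDER]) / 2
pose[I_THORAX] = (pose[I_NECK] + pose[I_PELVIS]) / 2

# 2) scale: reorder to H36M, normalize, then shrink so the body occupies
#    a proportion of the frame closer to H36M (factor 2 is only an example)
pose = pose[SH_TO_GT_PERM]
pose = normalize_screen_coordinates(pose, w=300, h=600)
pose = pose / 2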

@dandingol03

> @duckduck-sys I think there are two points to note about the data: […]

Hi, how do you calculate the spine point?

@dandingol03

Hi @duckduck-sys, your data looks a bit strange after normalization. Also, I'd like to ask whether you can now regress the 3D pose correctly. Mainly, I don't know how to compute the hip and spine joints, and should the hip joint be assumed to be (0, 0)?

@lisa676

lisa676 commented Dec 17, 2020

@develduan Hi Duan, can you share your solution? I'm also facing much the same problem.
Thanks

@dandingol03

@develduan Hi, I also face the same problem of how to figure out the spine point, because the stacked hourglass network doesn't output a spine point.

@develduan
Copy link

@lisa676 @dandingol03 Hi, I'm sorry, but I stopped following this project because it didn't work very well on my dataset (an in-the-wild environment). In my dataset all pedestrians stand upright, so I simply treated the midpoint of the neck and the pelvis as the thorax/spine: positions_mpii[i_thorax] = (positions_mpii[i_neck] + positions_mpii[i_pelvis]) / 2. After normalize_screen_coordinates, scale the locations by a factor so that the proportion of the human body in the image matches H36M; in my case: positions = positions / 2.

In my case, I want to get the 3D pose directly from the image instead of first estimating a 2D pose and then lifting it to 3D, and I got better results by following the paper "End-to-end Recovery of Human Shape and Pose" by Angjoo Kanazawa, Michael J. Black, David W. Jacobs, and Jitendra Malik.

@dandingol03

@develduan Firstly, thanks for your kind reply. Secondly, the paper "End-to-end Recovery of Human Shape and Pose" is cool; I will delve into it soon. And lastly, here is my email: [email protected]. Maybe someday we can exchange ideas about 3D pose estimation.
