
Inconsistency in HO3D Data's Prediction #17

Open
alakhag opened this issue Jul 30, 2024 · 0 comments


alakhag commented Jul 30, 2024

Thank you so much for this amazing work!

I see that you do not use the provided 3D annotations, but instead use COLMAP estimates for the HO3D objects. If I want to use the HO3D annotations (mainly the meta pickle files), how can I use them to create the dataset?

Just for reference (throughout, the camera coordinate system assumes the camera at the origin, facing along the -Z direction), HO3D's meta pickle file contains:

  • beta, pose, trans of hand in Camera Coordinate System
  • Object Rotation and Translation in Camera Coordinate System
  • Camera Intrinsic Matrix
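
For reference, a minimal sketch of reading those fields (the key names `handBeta`, `handPose`, `handTrans`, `objRot`, `objTrans`, `camMat` are what I see in my copy of HO3D; please double-check against yours). Note also that, since the camera looks down -Z, a Y/Z flip is typically needed to move HO3D annotations into an OpenCV-style (+Z forward) convention:

```python
import pickle

import numpy as np


def load_ho3d_meta(pkl_path):
    """Load one HO3D meta pickle (key names as in my copy of HO3D)."""
    with open(pkl_path, "rb") as f:
        meta = pickle.load(f)
    return {
        "betas": meta["handBeta"],      # (10,) MANO shape
        "pose": meta["handPose"],       # (48,) MANO pose, axis-angle
        "trans": meta["handTrans"],     # (3,)  hand translation, camera coords
        "obj_rot": meta["objRot"],      # object rotation, axis-angle
        "obj_trans": meta["objTrans"],  # (3,)  object translation
        "K": meta["camMat"],            # (3,3) camera intrinsics
    }


# Flip Y and Z to go from the -Z-facing (OpenGL-style) camera to an
# OpenCV-style convention; the matrix is its own inverse.
COORD_CHANGE = np.diag([1.0, -1.0, -1.0])
```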

We also have HO3D segmentations that can be directly used as masks.

I am just trying to figure out which transformations of the MANO hand parameters or the object transformation parameters I need to apply to be able to generate data.npy and train HOLD.

Edit: I tried visualizing the data.npy of ShSu10 to inspect the canonical space. The canonical camera locations seem very inconsistent, both (1) with the original HO3D models and (2) with each other. Below are the camera locations with respect to the object in canonical space.

[Screenshot from 2024-07-31 18-59-08: camera locations with respect to the object in canonical space]

Moreover, I chose two frames, 0000 and 0129, where I would expect the camera views to face roughly opposite each other.

However, I get these view directions of the camera's principal axis in canonical space:
[Screenshot from 2024-07-31 19-08-08: view directions of the camera's principal axis in canonical space]

How I get canonical space camera:

get_camera_param() gives 'cam_loc' and 'ray_dirs' in deformed space. Applying the deformer's 'tf_mat' with inverse=True should then give the camera in canonical space.
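
Concretely, this is what I compute (a sketch, assuming 'tf_mat' is a 4x4 rigid transform from canonical to deformed space; HOLD's actual deformer API may differ):

```python
import numpy as np


def to_canonical(cam_loc, ray_dirs, tf_mat):
    """Map a camera position (3,) and ray directions (N, 3) from deformed
    space back to canonical space by applying the inverse of a 4x4 rigid
    transform tf_mat (the inverse=True idea)."""
    tf_inv = np.linalg.inv(tf_mat)
    cam_h = np.append(cam_loc, 1.0)            # homogeneous point
    cam_canon = (tf_inv @ cam_h)[:3]           # points: full inverse transform
    dirs_canon = ray_dirs @ tf_inv[:3, :3].T   # directions: rotation part only
    return cam_canon, dirs_canon
```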

I also tried this:

I tried updating data.npy in the following way:

  • Update all scale_mat_i to the identity matrix
  • Update all world_mat_i to the camera intrinsic values, with 3rd column 0
  • Update normalize_shift to [0,0,0]
  • Update entity.right's hand pose, trans, and mean_shape from the given HO3D ShSu10 meta pickle files
  • Update entity.object's parameters:
  1. set obj_scale to 1.0
  2. set norm_mat to the identity matrix
  3. set object_poses from objRot and objTrans in the meta pickle files
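
For the world_mat_i update, my understanding is the IDR-style convention world_mat = K @ [R|t]; since the HO3D annotations are already in camera coordinates, R = I and t = 0, so the matrix is just K padded with a zero translation column (an assumed layout, please correct me if data.npy expects something else):

```python
import numpy as np


def make_world_mat(K):
    """Build a 3x4 projection P = K @ [R|t] with R = I, t = 0, i.e. the
    intrinsics K padded with a zero translation column (assumed data.npy
    layout)."""
    P = np.zeros((3, 4))
    P[:, :3] = K
    return P
```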

When I train with this information, even though the camera visualization etc. looks good, I apparently get NaN weights after the 1st iteration, and the training breaks in the 2nd iteration.
The warning in 1st iteration is:
FutureWarning: Non-finite norm encountered in torch.nn.utils.clip_grad_norm_; continuing anyway. Note that the default behavior will change in a future release to error out if a non-finite total norm is encountered. At that point, setting error_if_nonfinite=false will be required to retain the old behavior.

Not sure what is wrong.
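
For reference, here is a NumPy sketch of what torch.nn.utils.clip_grad_norm_ computes: the total L2 norm across all gradients, with the gradients scaled down when that norm exceeds max_norm. A non-finite total norm, which is exactly what the FutureWarning reports, means the loss or some intermediate quantity already produced NaN/Inf during backward:

```python
import numpy as np


def clip_grad_norm(grads, max_norm):
    """Total L2 norm over all gradient arrays, rescaling them when the norm
    exceeds max_norm (mirrors the math of the PyTorch utility). A NaN/Inf
    total norm is left as-is, just like the warned-about old behavior."""
    total = float(np.sqrt(sum((g.astype(np.float64) ** 2).sum() for g in grads)))
    if np.isfinite(total) and total > max_norm:
        scale = max_norm / (total + 1e-6)
        grads = [g * scale for g in grads]
    return grads, total
```

Logging this norm per step, and skipping the optimizer step when it is non-finite, might make it easier to find the first bad batch.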

@alakhag alakhag changed the title HO3D Data Inconsistency in HO3D Data's Prediction Aug 1, 2024