
Some questions on the different frames used in the project #6

Open

RanDing2000 opened this issue Dec 17, 2024 · 11 comments

@RanDing2000
Dear Eugenio,

I am currently trying to understand the different frames used in the project, such as the object frame, camera frame, world frame, and base frame. I noticed that for the base frame, there are multiple representations like palm, hand, and ttip. I have a few questions for clarification:

  1. Is the pose passed to the MPLibPlanner in the base frame with the ttip representation?
  2. Does CenterGraspNet first predict the grasp pose in the object frame (by multiplying the predicted object pose) and then convert it to the world frame using the camera pose? If so, how is this later converted to the base frame?
  3. Does GIGA predict the grasp pose in the palm coordinate system, and does it require a palm-to-hand transformation?

Thank you in advance for your clarification!

@RanDing2000
Author

Another question: In centergrasp.sgdf.mesh_grasps, the generate_grasps function lets us generate grasps for an object in the object frame with the ttip representation.

  1. However, how can we obtain the grasps for this object in a scene such that they can be executed by the MPLibPlanner?
  2. From sim.scene_objs, we can get a pose4x4 attribute. Does this pose4x4 represent an obj2world transformation?

@RanDing2000
Author

Another question: At least in the PickGigaPackedEnv environment, there is a scale parameter for each object, but I don't see any scaling operation being applied. Could you clarify how or where this parameter is used?

@RanDing2000
Author

RanDing2000 commented Dec 17, 2024

There are lots of -= operations for the "palm", "hand", and "ttip" transformations. Do these all apply to the object pose? Why are the -= values different?

In centergrasp/sgdf/data_structures.py:

```python
@classmethod
def from_torch(cls, pc_th: torch.Tensor, grasps_th: torch.Tensor, confs_th: torch.Tensor):
    pc, grasps, confs = data_utils.th_to_np(pc_th, grasps_th, confs_th)
    pc_o3d = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(pc))
    # Transform from palm to hand frame
    grasps[..., :3, 3] -= 0.0624 * grasps[..., :3, 2]
    return cls(pc_o3d, grasps, confs)
```

In centergrasp/sgdf/mesh_grasps.py:

```python
if frame == "palm":
    poses_data[..., :3, 3] -= 0.041 * poses_data[..., :3, 2]
elif frame == "hand":
    poses_data[..., :3, 3] -= 0.1034 * poses_data[..., :3, 2]
return poses_data
```

In centergrasp/graspnet/sgdf_data.py:

```python
if frame == "palm":
    poses_data[..., :3, 3] -= 0.02 * poses_data[..., :3, 2]
elif frame == "hand":
    poses_data[..., :3, 3] -= 0.0824 * poses_data[..., :3, 2]
return poses_data
```

In centergrasp/visualization.py:

```python
"""Create a 3D mesh visualizing a parallel yaw gripper. It consists of four cylinders.

It is represented in "hand_link" frame (i.e. origin is at the wrist)
palm frame: += 0.0624 in z direction
ttip frame: += 0.1034 in z direction

Args:
    color (list, optional): RGB values of marker. Defaults to [0, 0, 255].
```

@chisarie
Collaborator

The base frame is the base of the Franka arm. The object frame is at the centroid of the given object. The frames hand, palm, and ttip all refer to the franka-hand (the end-effector); the only difference is a translation in the z-direction: hand is at the wrist, palm is at the base of the fingers, tooltip (ttip) is at the end of the fingers.
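For intuition, here is a minimal sketch of converting between these frames (assuming the Franka offsets quoted later in this thread; the function name is illustrative, not the repo's API):

```python
import numpy as np

# Offsets along the gripper's local z-axis, in meters (Franka values from this thread):
HAND_TO_PALM = 0.0624  # wrist -> base of the fingers
HAND_TO_TTIP = 0.1034  # wrist -> end of the fingers

def shift_along_z(pose: np.ndarray, offset: float) -> np.ndarray:
    """Translate a 4x4 homogeneous pose by `offset` meters along its own z-axis."""
    out = pose.copy()
    out[:3, 3] += offset * out[:3, 2]  # column 2 of the rotation is the local z-axis
    return out

# Example: express a tooltip pose at the wrist ('hand' frame) instead:
# hand_pose = shift_along_z(ttip_pose, -HAND_TO_TTIP)
```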

QUESTION 1

  1. You pass MPLibPlanner an end-effector pose in the 'hand' frame, i.e. you pass the desired 'hand' pose expressed in the base frame.
  2. Yes, CenterGrasp predicts object poses in the camera frame and grasp poses in the object frame. You then transform all the grasps into the camera frame as well, by multiplying with the object pose. Once you have everything in the camera frame, you can transform it to the base frame of the robot by knowing your camera pose (see the sketch after this list).
  3. Yes, you can see the transformation here:
    wTeegoal = wTgrasp * self.palmThand
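As a hedged sketch of the chaining in point 2 (4x4 homogeneous matrices; all names are illustrative, not the repo's API):

```python
import numpy as np

def grasp_in_base_frame(
    baseTcam: np.ndarray,   # camera pose in the robot base frame (assumed known)
    camTobj: np.ndarray,    # object pose predicted in the camera frame
    objTgrasp: np.ndarray,  # grasp pose predicted in the object frame
) -> np.ndarray:
    """Chain the transforms: base <- camera <- object <- grasp."""
    return baseTcam @ camTobj @ objTgrasp
```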

QUESTION 2

  1. Just multiply with the object pose (see the sketch below)
  2. Yes
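A hedged one-liner for point 1 (dummy values; variable names are illustrative):

```python
import numpy as np

pose4x4 = np.eye(4)                          # obj2world transform from sim.scene_objs
obj_grasps = np.tile(np.eye(4), (10, 1, 1))  # (N, 4, 4) grasps in the object frame
world_grasps = pose4x4 @ obj_grasps          # (N, 4, 4) grasps in the world frame
```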

QUESTION 3
Yes, the scale parameter is used in multiple places, in particular when loading the mesh into the simulator, for example here:

```python
filename=str(obj.collision_fpath), scale=obj.scale, density=obj.density
```

QUESTION 4
The values in the graspnet folder are different because the GraspNet1B dataset uses a different gripper, not the Franka one that I use in the rest of the repo. All the others are consistent (note that 0.1034 - 0.0624 = 0.041).
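That consistency is easy to check numerically:

```python
# ttip -> palm (0.041) plus palm -> hand (0.0624) equals ttip -> hand (0.1034)
assert abs(0.041 + 0.0624 - 0.1034) < 1e-9
```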

@RanDing2000
Author

RanDing2000 commented Dec 17, 2024

Thanks so much for your clear clarification. It solves most of my questions.

  1. For question 2-1, I want to add: generate_grasps in centergrasp.sgdf.mesh_grasps generates grasps at the ttip in the object frame. Should we pass the scale parameter to the generate_grasps function? (It is very weird: if I set the value to 1, I get nearly 10% successful grasps, but if I set it to the object's scale parameter, the success rate drops to nearly zero.) Do we need to first convert to the hand frame and then multiply by the object pose?
  2. In this project, does the base frame equal the world frame?
  3. Do you know how to write a function in SAPIEN to save or restore states, like sim.save_state() and sim.restore_state() in PyBullet? (If needed, I can open another issue.)

@RanDing2000
Author

To clarify the task above, I aim to perform grasp generation in a cluttered scene within the Sapien environment using a Franka Arm and the MPLibPlanner. Specifically, I want to generate grasps similar to the GIGA method and retrain the GIGA model.

@chisarie
Collaborator

  1. It depends on which dataset you are using. The meshes from GIGA are already scaled to a proper size, so you can just leave the scale at 1. In other datasets, some object meshes are saved at too large a scale, so you need to scale them down for grasping. The way I generate my dataset is by using scales 0.7, 0.85, and 1.0 to have more variety, as you can see in make_grasp_labels.py (see the sketch after this list). If you want to visualize the scale of your objects, you can run generate_grasps with the argument vis_flag=True: it will spawn an Open3D viewer, and if you hold down Esc on your keyboard, you will see the grasp sampling process.

  2. Yes

  3. To be completely sure you should ask the sapien authors, but I think this is not possible.
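As a hedged illustration of the pre-scaling idea from point 1 (using trimesh; the mesh path and the sampling step are placeholders, not the repo's API):

```python
import trimesh

# The 0.7 / 0.85 / 1.0 scheme mentioned above, applied to a mesh before grasp sampling.
for scale in (0.7, 0.85, 1.0):
    mesh = trimesh.load("object.obj", force="mesh")  # hypothetical mesh path
    mesh.apply_scale(scale)                          # scale the geometry in place
    # ... sample grasps on the scaled mesh (e.g. via generate_grasps) ...
```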

@RanDing2000
Author

Dear Eugenio, thanks so much for the reply! Your responses fully answer my questions.

@RanDing2000
Author

In the mesh_utils module, the generate_grasps function is documented to produce grasp poses expressed as the gripper's tip pose in the object's coordinate frame.

The code includes transformations that adjust the grasp pose along the z-axis (out_grasp[:3, 2]), such as:

  • tip2hand: out_grasp[:3, 3] -= 0.0824 * out_grasp[:3, 2]
  • tip2palm: out_grasp[:3, 3] -= 0.02 * out_grasp[:3, 2]
  • palm2hand: out_grasp[:3, 3] -= 0.0624 * out_grasp[:3, 2]

Is that correct for the Franka gripper?

@chisarie
Collaborator

No, those transformations are valid for the gripper used in the GraspNet1B dataset (see its docs). Watch out, because everything in the graspnet folder refers to that gripper. In particular, the frame transforms are defined in the read_poses_data function. There are two of them:

  • For the GraspNet1B gripper: link
  • For the Franka Gripper: link

For the Franka gripper, you can also confirm my numbers by looking at the datasheet.
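Collecting the numbers quoted in this thread (offsets in meters along the gripper z-axis, measured from the ttip frame; subtracting offset * z from a ttip pose's translation moves it into the named frame):

```python
# Values from the read_poses_data snippets earlier in the thread, per gripper
TTIP_OFFSETS = {
    "franka":     {"palm": 0.041, "hand": 0.1034},  # centergrasp/sgdf/mesh_grasps.py
    "graspnet1b": {"palm": 0.02,  "hand": 0.0824},  # centergrasp/graspnet/sgdf_data.py
}
```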

@RanDing2000
Author

Thank you so much! I have a couple of questions:

  1. Does CenterGraspNet initially predict the hand pose? I'm referring to the from_torch implementation quoted earlier (def from_torch(cls, pc_th: torch.Tensor, grasps_th: torch.Tensor, confs_th: torch.Tensor):).
  2. In centergrasp/sapien/stick_gripper_fcl.py, does the move_to_grasp_pose function operate on the palm pose?
