
Some questions on the different frames used in the project #6

Open

RanDing2000 opened this issue Dec 17, 2024 · 11 comments

@RanDing2000
Dear Eugenio,

I am currently trying to understand the different frames used in the project, such as the object frame, camera frame, world frame, and base frame. I noticed that for the base frame, there are multiple representations like palm, hand, and ttip. I have a few questions for clarification:

  1. Is the pose passed to the MPLibPlanner in the base frame with the ttip representation?
  2. Does CenterGraspNet first predict the grasp pose in the object frame (by multiplying the predicted object pose) and then convert it to the world frame using the camera pose? If so, how is this later converted to the base frame?
  3. Does GIGA predict the grasp pose in the palm coordinate system, and does it require a palm-to-hand transformation?

Thank you in advance for your clarification!

@RanDing2000
Author

Another question: In centergrasp.sgdf.mesh_grasps, the generate_grasps function lets us generate grasps for an object in the object frame with the ttip representation.

  1. However, how can we obtain the grasps for this object in a scene such that they can be executed by the MPLibPlanner?
  2. From sim.scene_objs, we can get a pose4x4 attribute. Does this pose4x4 represent an obj2world transformation?

@RanDing2000
Author

Another question: At least in the PickGigaPackedEnv environment, there is a scale parameter for each object, but I don't see any scaling operation being applied. Could you clarify how or where this parameter is used?

@RanDing2000
Author

RanDing2000 commented Dec 17, 2024

There are lots of -= operations for the "palm", "hand", and "ttip" transformations. Do these all apply to the object pose? Why are the -= values different?

In centergrasp/sgdf/data_structures.py:

```python
@classmethod
def from_torch(cls, pc_th: torch.Tensor, grasps_th: torch.Tensor, confs_th: torch.Tensor):
    pc, grasps, confs = data_utils.th_to_np(pc_th, grasps_th, confs_th)
    pc_o3d = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(pc))
    # Transform from palm to hand frame
    grasps[..., :3, 3] -= 0.0624 * grasps[..., :3, 2]
    return cls(pc_o3d, grasps, confs)
```

In centergrasp/sgdf/mesh_grasps.py:

```python
if frame == "palm":
    poses_data[..., :3, 3] -= 0.041 * poses_data[..., :3, 2]
elif frame == "hand":
    poses_data[..., :3, 3] -= 0.1034 * poses_data[..., :3, 2]
return poses_data
```

In centergrasp/graspnet/sgdf_data.py:

```python
if frame == "palm":
    poses_data[..., :3, 3] -= 0.02 * poses_data[..., :3, 2]
elif frame == "hand":
    poses_data[..., :3, 3] -= 0.0824 * poses_data[..., :3, 2]
return poses_data
```

In centergrasp/visualization.py:

```python
"""Create a 3D mesh visualizing a parallel yaw gripper. It consists of four cylinders.

It is represented in "hand_link" frame (i.e. origin is at the wrist)
palm frame: += 0.0624 in z direction
ttip frame: += 0.1034 in z direction

Args:
    color (list, optional): RGB values of marker. Defaults to [0, 0, 255].
```

@chisarie
Collaborator

The base frame is the base of the Franka arm. The object frame is at the centroid of the given object. The frames hand, palm, and ttip all refer to the franka-hand (the end-effector); the only difference is a translation in the z-direction: hand is at the wrist, palm is at the base of the fingers, tooltip (ttip) is at the end of the fingers.
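For intuition, here is a minimal sketch of converting between these frames (assuming the Franka offsets quoted later in this thread; the function name is illustrative, not the repo's API):

```python
import numpy as np

# Offsets along the gripper's local z-axis, in meters (Franka values from this thread):
HAND_TO_PALM = 0.0624  # wrist -> base of the fingers
HAND_TO_TTIP = 0.1034  # wrist -> end of the fingers

def shift_along_z(pose: np.ndarray, offset: float) -> np.ndarray:
    """Translate a 4x4 homogeneous pose by `offset` meters along its own z-axis."""
    out = pose.copy()
    out[:3, 3] += offset * out[:3, 2]  # column 2 of the rotation is the local z-axis
    return out

# Example: express a tooltip pose at the wrist ('hand' frame) instead:
# hand_pose = shift_along_z(ttip_pose, -HAND_TO_TTIP)
```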

QUESTION 1

  1. You pass MPLibPlanner an end-effector pose in the 'hand' frame, i.e. you pass the desired 'hand' pose expressed in the base frame.
  2. Yes, CenterGrasp predicts object poses in the camera frame and grasp poses in the object frame. You then transform all the grasps into the camera frame as well, by multiplying with the object pose. Once you have everything in the camera frame, you can transform it to the base frame of the robot by knowing your camera pose (see the sketch after this list).
  3. Yes, you can see the transformation here:
    wTeegoal = wTgrasp * self.palmThand
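As a hedged sketch of the chaining in point 2 (4x4 homogeneous matrices; all names are illustrative, not the repo's API):

```python
import numpy as np

def grasp_in_base_frame(
    baseTcam: np.ndarray,   # camera pose in the robot base frame (assumed known)
    camTobj: np.ndarray,    # object pose predicted in the camera frame
    objTgrasp: np.ndarray,  # grasp pose predicted in the object frame
) -> np.ndarray:
    """Chain the transforms: base <- camera <- object <- grasp."""
    return baseTcam @ camTobj @ objTgrasp
```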

QUESTION 2

  1. Just multiply with the object pose (see the sketch below)
  2. Yes
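A hedged one-liner for point 1 (dummy values; variable names are illustrative):

```python
import numpy as np

pose4x4 = np.eye(4)                          # obj2world transform from sim.scene_objs
obj_grasps = np.tile(np.eye(4), (10, 1, 1))  # (N, 4, 4) grasps in the object frame
world_grasps = pose4x4 @ obj_grasps          # (N, 4, 4) grasps in the world frame
```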

QUESTION 3
Yes, the scale parameter is used in multiple places, in particular when loading the mesh into the simulator, for example here:

```python
filename=str(obj.collision_fpath), scale=obj.scale, density=obj.density
```

QUESTION 4
The values in the graspnet folder are different because the GraspNet1B dataset uses a different gripper, not the Franka one that I use in the rest of the repo. All the others are consistent (note that 0.1034 - 0.0624 = 0.041).
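That consistency is easy to check numerically:

```python
# ttip -> palm (0.041) plus palm -> hand (0.0624) equals ttip -> hand (0.1034)
assert abs(0.041 + 0.0624 - 0.1034) < 1e-9
```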

@RanDing2000
Author

RanDing2000 commented Dec 17, 2024

Thanks so much for your clear clarification. It solves most of my questions.

  1. For question 2-1, I want to add: generate_grasps in centergrasp.sgdf.mesh_grasps generates grasps at the ttip in the object frame. Should we pass the scale parameter to the generate_grasps function? (It is very weird: if I set the value to 1, I get nearly 10% successful grasps, but if I set it to the object's scale parameter, the success rate drops to nearly zero.) Do we need to first convert to the hand frame and then multiply by the object pose?
  2. In this project, does the base frame equal the world frame?
  3. Do you know how to write a function in SAPIEN to save or restore states, like sim.save_state() and sim.restore_state() in PyBullet? (If needed, I can open another issue.)

@RanDing2000
Author

To clarify the task above, I aim to perform grasp generation in a cluttered scene within the Sapien environment using a Franka Arm and the MPLibPlanner. Specifically, I want to generate grasps similar to the GIGA method and retrain the GIGA model.

@chisarie
Collaborator

  1. It depends on which dataset you are using. The meshes from GIGA are already scaled to a proper size, so you can just leave the scale at 1. In other datasets, some object meshes are saved at too large a scale, so you need to scale them down for grasping. The way I generate my dataset is by using scales 0.7, 0.85, and 1.0 to have more variety, as you can see in make_grasp_labels.py (see the sketch after this list). If you want to visualize the scale of your objects, you can run generate_grasps with the argument vis_flag=True: it will spawn an Open3D viewer, and if you hold down Esc on your keyboard, you will see the grasp sampling process.

  2. Yes

  3. To be completely sure you should ask the sapien authors, but I think this is not possible.
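As a hedged illustration of the pre-scaling idea from point 1 (using trimesh; the mesh path and the sampling step are placeholders, not the repo's API):

```python
import trimesh

# The 0.7 / 0.85 / 1.0 scheme mentioned above, applied to a mesh before grasp sampling.
for scale in (0.7, 0.85, 1.0):
    mesh = trimesh.load("object.obj", force="mesh")  # hypothetical mesh path
    mesh.apply_scale(scale)                          # scale the geometry in place
    # ... sample grasps on the scaled mesh (e.g. via generate_grasps) ...
```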

@RanDing2000
Author

Dear Eugenio, thanks so much for the reply! Your responses fully answer my questions.

@RanDing2000
Author

In the mesh_utils module, the generate_grasps function is documented to produce grasp poses expressed as the gripper's tip pose in the object's coordinate frame.

The code includes transformations that adjust the grasp pose along the z-axis (out_grasp[:3, 2]), such as:

  • tip2hand: out_grasp[:3, 3] -= 0.0824 * out_grasp[:3, 2]
  • tip2palm: out_grasp[:3, 3] -= 0.02 * out_grasp[:3, 2]
  • palm2hand: out_grasp[:3, 3] -= 0.0624 * out_grasp[:3, 2]

Is that correct for the Franka gripper?

@chisarie
Collaborator

No, those transformations are valid for the gripper used in the GraspNet1B dataset (see its docs). Watch out, because everything in the graspnet folder refers to that gripper. In particular, the frame transforms are defined in the read_poses_data function. There are two of them:

  • For the GraspNet1B gripper: link
  • For the Franka Gripper: link

For the Franka gripper, you can also confirm my numbers by looking at the datasheet.
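Collecting the numbers quoted in this thread (offsets in meters along the gripper z-axis, measured from the ttip frame; subtracting offset * z from a ttip pose's translation moves it into the named frame):

```python
# Values from the read_poses_data snippets earlier in the thread, per gripper
TTIP_OFFSETS = {
    "franka":     {"palm": 0.041, "hand": 0.1034},  # centergrasp/sgdf/mesh_grasps.py
    "graspnet1b": {"palm": 0.02,  "hand": 0.0824},  # centergrasp/graspnet/sgdf_data.py
}
```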

@RanDing2000
Author

Thank you so much! I have a couple of questions:

  1. Does CenterGraspNet initially predict the hand pose? I'm referring to the from_torch implementation quoted earlier (def from_torch(cls, pc_th: torch.Tensor, grasps_th: torch.Tensor, confs_th: torch.Tensor):).
  2. In centergrasp/sapien/stick_gripper_fcl.py, does the move_to_grasp_pose function operate on the palm pose?
