Object orientation seems to have a big effect on results #35

Open
raabuchanan opened this issue May 8, 2023 · 2 comments

raabuchanan commented May 8, 2023

Hello,

I've been trying out Flowbot on real data to see how robust it is. I'm finding that the viewing direction of the camera has a big effect on the direction of the articulation flow.

Here I placed a box on a table and manually segmented the articulated part. The articulation flow looks pretty good. (For the target flow I just used the predicted flow in order to make the visualization work.)
[Screenshot from 2023-05-08 10-27-38]

Now I simply rotate the box roughly 45 degrees about the gravity vector, and the articulation flow is skewed. In fact, it seems like the flow always points along the viewing direction.
[Screenshot from 2023-05-08 10-27-55]

I checked, and during training I use randomize_camera=True, so I don't understand what is going on. Is this expected behaviour? Is there perhaps a parameter I've set incorrectly?

Thanks!

@beneisner
Collaborator

Hi, thanks again for your interest in the project and for trying out the code. What you observed is expected, although I understand why randomize_camera is misleading in this case. The training code we released produces a model that makes predictions in one particular world reference frame (the frame in which you observed good results). What randomize_camera does is randomize the camera position used to sample points during training, not the reference frame in which the points fed to the model are expressed. In other words, there is one global reference frame, and you place a camera in it to sample points (which are always expressed in that global frame). You can vary the position of the camera, which gives you a different sampling of the points (e.g. different occlusions), but the reference frame those points are expressed in stays consistent.

There is no fundamental reason we couldn't train in the camera frame (and thereby get the results you might expect from any viewing direction). However, some visual affordances rely on the gravity vector being known, so we didn't want to deal with a custom/arbitrary distribution over camera poses. In retrospect, we probably could have just rotated the camera around the z-axis (keeping the gravity vector along negative z).
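
To make that concrete, here's a rough sketch of the distinction (this is not the released training code; the function names and camera-sampling details are made up for illustration):

```python
import numpy as np

def random_camera_pose(rng: np.random.Generator, radius: float = 2.0) -> np.ndarray:
    """Hypothetical stand-in for camera randomization: a camera-to-world
    transform T_world_cam (4x4) looking at the origin from a random direction.
    The sampling details here are invented for illustration."""
    yaw = rng.uniform(0.0, 2.0 * np.pi)
    pitch = rng.uniform(0.2, 1.2)  # keep the camera above the table
    eye = radius * np.array(
        [np.cos(yaw) * np.cos(pitch), np.sin(yaw) * np.cos(pitch), np.sin(pitch)]
    )
    forward = -eye / np.linalg.norm(eye)            # camera +z looks at the origin
    right = np.cross(forward, np.array([0.0, 0.0, 1.0]))
    right /= np.linalg.norm(right)
    down = np.cross(forward, right)                 # right-handed (OpenCV-style) frame
    T = np.eye(4)
    T[:3, :3] = np.stack([right, down, forward], axis=1)
    T[:3, 3] = eye
    return T

def to_world_frame(points_cam: np.ndarray, T_world_cam: np.ndarray) -> np.ndarray:
    """Re-express camera-frame points (N, 3) in the single global frame.
    This step is what keeps the model's inputs in one world frame,
    no matter where the camera was placed for sampling."""
    R, t = T_world_cam[:3, :3], T_world_cam[:3, 3]
    return points_cam @ R.T + t

# Randomizing the camera changes *which* points you see (occlusions),
# but the model is always fed points expressed in the world frame.
# Training "in the camera frame" would simply skip to_world_frame(...)
# and feed points_cam directly.
```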

@raabuchanan
Author

Hi @beneisner, thanks for your response. It's reassuring to know it's expected behaviour, at least.

I may try implementing your suggestion of representing the point cloud in the camera frame.
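
Roughly something like this (a minimal sketch; `T_world_cam` would come from my camera's extrinsics, it's not something provided by this repo):

```python
import numpy as np

def world_to_camera(points_world: np.ndarray, T_world_cam: np.ndarray) -> np.ndarray:
    """Express a world-frame point cloud (N, 3) in the camera frame,
    given the camera-to-world extrinsics T_world_cam (4x4)."""
    T_cam_world = np.linalg.inv(T_world_cam)
    R, t = T_cam_world[:3, :3], T_cam_world[:3, 3]
    return points_world @ R.T + t
```

Of course, the model would also need to be retrained on camera-frame clouds for this to help, and the predicted flow would then come back in the camera frame and need to be rotated into the world frame using T_world_cam.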
