IMU inference #114

Open
lixinghe1999 opened this issue Apr 15, 2024 · 0 comments

Dear contributors,

I am currently working on IMU inference.
I apply group normalization to the accelerometer (ACC) and gyroscope (GYRO) channels separately, following IMU2CLIP, and I have run into the following problems:

  1. A 5-second or 10-second clip is too long: it can cover more than 2 narrations, which makes it ambiguous. I changed this to a 2-second clip padded to 10 seconds, but I am not sure whether this is correct.
  2. From the paper, it is unclear whether the IMU embedding corresponds to a "summary of the full video" or to "one sentence of the narration".
  3. Since the IMU performance is relatively low and the signal is not readable by humans, it is really hard for me to confirm whether my setup is correct.

Thank you in advance for your help!
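
For concreteness, the preprocessing described in item 1 can be sketched roughly as below. This is only a minimal sketch under several assumptions that are not confirmed by the paper or the repository: a 200 Hz sampling rate, a `[channels][time]` layout with ACC in channels 0-2 and GYRO in channels 3-5, zero padding on the right, and "group normalization" meaning that each sensor group is standardized with statistics taken over its own channels and time steps. All function names are hypothetical.

```python
import math

SAMPLE_RATE_HZ = 200        # assumption: IMU resampled to 200 Hz
CLIP_SECONDS = 2.0          # window actually taken around one narration
TARGET_SECONDS = 10.0       # assumption: input length expected by the encoder
GROUPS = ((0, 3), (3, 6))   # assumed channel layout: ACC = 0..2, GYRO = 3..5


def crop_window(signal, center_s, clip_s=CLIP_SECONDS, rate=SAMPLE_RATE_HZ):
    """Cut a clip_s-second window centred on center_s from a [channels][time] signal."""
    n = int(round(clip_s * rate))
    total = len(signal[0])
    start = int(round(center_s * rate)) - n // 2
    start = max(0, min(start, max(0, total - n)))  # clamp to the recording bounds
    return [ch[start:start + n] for ch in signal]


def group_normalize(clip, groups=GROUPS, eps=1e-5):
    """Standardize each sensor group using statistics over its channels and time."""
    out = [None] * len(clip)
    for lo, hi in groups:
        values = [v for ch in clip[lo:hi] for v in ch]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        scale = math.sqrt(var + eps)
        for c in range(lo, hi):
            out[c] = [(v - mean) / scale for v in clip[c]]
    return out


def pad_to_length(clip, target_s=TARGET_SECONDS, rate=SAMPLE_RATE_HZ):
    """Right-pad every channel with zeros up to target_s seconds."""
    n = int(round(target_s * rate))
    return [ch[:n] + [0.0] * max(0, n - len(ch)) for ch in clip]


def preprocess(signal, narration_time_s):
    """Crop around the narration, normalize per sensor group, then zero-pad."""
    clip = crop_window(signal, narration_time_s)
    clip = group_normalize(clip)
    return pad_to_length(clip)
```

Note that this sketch normalizes the 2-second clip before padding. If normalization is instead applied after padding, the 8 seconds of zeros enter the mean and variance and change the scale of the real signal, which may matter when comparing against a model trained on full-length clips.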

I attach an image below, where each subplot shows the 2-second clip for one narration. You can see that 1 & 2 are identical, and 3 & 4 are identical too. The reason is that the narrations are simply too close to each other in time.
[Image: ego4d_imu]
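
One way to check how often this happens is to measure how much the windows of consecutive narrations overlap. The sketch below assumes each clip is a fixed-length window centred on the narration timestamp; the helper names, the 0.5 overlap threshold, and the example timestamps are all hypothetical.

```python
def window_overlap(t_a, t_b, clip_s=2.0):
    """Fraction of a clip_s-second window centred at t_a shared with one centred at t_b."""
    return max(0.0, clip_s - abs(t_a - t_b)) / clip_s


def ambiguous_pairs(times_s, clip_s=2.0, min_overlap=0.5):
    """Return (i, j, overlap) for time-adjacent narrations whose windows overlap heavily."""
    order = sorted(range(len(times_s)), key=lambda i: times_s[i])
    pairs = []
    for a, b in zip(order, order[1:]):
        overlap = window_overlap(times_s[a], times_s[b], clip_s)
        if overlap >= min_overlap:
            pairs.append((a, b, overlap))
    return pairs


# Hypothetical narration timestamps in seconds, not taken from Ego4D.
narration_times = [10.0, 10.2, 25.0, 25.1, 40.0]
for i, j, overlap in ambiguous_pairs(narration_times):
    print(f"narrations {i} and {j} share {overlap:.0%} of their IMU window")
```

Pairs flagged this way receive nearly the same IMU input but different text, so they could be merged or dropped before evaluation.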
