on Screen gaze #19
Hi, @younesmch. I think both the gaze vector and the head pose are in the camera coordinate system, so we should be able to get the gaze point on the screen with the following:

point_on_screen = face.center - face.gaze_vector * face.center[2] / face.gaze_vector[2]

As I moved my head around in front of the camera while staring at a fixed point, the computed gaze points on the screen seemed consistent to some extent, though they were not very accurate, and the y-coordinate came out lower than expected.
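(For reference, a minimal numpy sketch of that formula; it assumes `face.center` and `face.gaze_vector` are both in camera coordinates and approximates the screen by the z = 0 plane through the camera center. The function name is illustrative, not part of the library.)

```python
import numpy as np

def gaze_point_on_camera_plane(center: np.ndarray, gaze: np.ndarray) -> np.ndarray:
    """Intersect the gaze ray with the camera's z = 0 plane.

    Solving center + t * gaze = (x, y, 0) for t gives
    t = -center[2] / gaze[2], which reproduces the formula above.
    """
    t = -center[2] / gaze[2]
    return center + t * gaze  # z component is 0 by construction

# Toy example: face 50 cm from the camera, gazing slightly off-axis.
center = np.array([0.0, 0.0, 0.5])   # meters, camera coordinates
gaze = np.array([-0.1, 0.05, -1.0])  # unnormalized gaze direction
print(gaze_point_on_camera_plane(center, gaze))  # -> [-0.05, 0.025, 0.0]
```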
Hi @hysts, this is what I get on the plot (y data in red), with units in centimeters in the CCS. For the x data, the values are acceptable, between (-15, 15) cm in the CCS.
@hysts I tried the line of code you mentioned, but when my head position in 3D is fixed and I only change the orientation of my head while looking at the same point on my screen, the point_on_screen coordinates change depending on the orientation of my head. What is not clear to me is: relative to what are the pitch and yaw computed? When are they 0? Do we have to apply some kind of transformation to them, opposite to the head rotation relative to the camera? I can send you a video/example to showcase the problem if needed.
@tomilles the gaze vector is calculated in the screen coordinate system, and the units are in meters.
@younesmch so how do you get consistent/unchanged point_on_screen coordinates while gazing at the same point on the screen but moving your head around? I tried many ways but I cannot seem to figure it out. Could you walk me through your steps or show me?
@tomilles I think the head pose is injected into the training data, as mentioned in the paper, so that the model can predict the correct point independently of the head pose.
I experimented to see how the predicted gaze point on the screen shifts depending on the head pose, and the following are the results. I took 200 frames of video with my head pose fixed and looking in the direction of the camera, and plotted the results. The plots in the first row are for moving the head position in the X, Y, and Z directions, and the plots in the second row are for rotating the head in the pitch, yaw, and roll directions. The axes of the graphs are flipped for visualization, and the units here are centimeters. Also, the distance from the camera is basically about 50 cm, with "near" being about 25 cm and "far" being about 100 cm.

It seems that the predicted gaze vectors are off by about 20 degrees in the pitch direction, and that the predicted gaze point on the screen shifts in the X direction when the head rotates in the yaw direction. I think something is wrong, but I can't figure out what it is. This may take some time. Please let me know if you have any ideas.
@hysts I think the problem is with the datasets, which cover a limited range of gaze, so the model can't predict outside of this range. [Figure: distributions of head angle (h) and gaze angle (g)]
I forgot to mention, but in the above experiment I used a model pretrained on the ETH-XGaze dataset, which covers a much wider range of gaze and head directions. The distribution bias in the dataset could be the cause, but I'm not sure at this point.
@hysts I can't see the problem; the model's predictions of the vertical distance are totally insignificant and cover only a small range of gaze.
@hysts I recreated the pitch/yaw labels for the MPIIFaceGaze dataset using a MediaPipe-based head pose estimator, your great tools, and this script, which I borrowed from the official ETH-XGaze GitHub:

```python
import numpy as np

# Estimate the head pose and compute the face center with the ptgaze estimator.
estimator.face_model_3d.estimate_head_pose(face, estimator.camera)
estimator.face_model_3d.compute_3d_pose(face)
estimator.face_model_3d.compute_face_eye_centers(face, 'ETH-XGaze')
estimator.head_pose_normalizer.normalize(im, face)

# Build the normalization rotation from the head pose (ETH-XGaze style):
# the forward axis points from the camera toward the face center.
hR = face.head_pose_rot.as_matrix()
euler = face.head_pose_rot.as_euler('XYZ')
hRx = hR[:, 0]
forward = (face.center / face.distance).reshape(3)
down = np.cross(forward, hRx)
down /= np.linalg.norm(down)
right = np.cross(down, forward)
right /= np.linalg.norm(right)
R = np.c_[right, down, forward].T  # rotation matrix R (camera -> normalized)

# Gaze direction from the estimated face center to the annotated 3D gaze
# target (`line` is one row of the MPIIFaceGaze annotation file; see the
# note below).
gaze_point = np.array(line[24:27])
face_center = np.array(line[21:24])
gc = gaze_point - face.center * 1000  # face.center is in meters -> mm; the annotated face_center could be used instead
gc_normalized = np.dot(R, gc)
gc_normalized = gc_normalized / np.linalg.norm(gc_normalized)

# Convert the normalized gaze direction to pitch (theta) and yaw (phi).
gaze_theta = np.arcsin((-1) * gc_normalized[1])
gaze_phi = np.arctan2((-1) * gc_normalized[0], (-1) * gc_normalized[2])
gaze_norm_2d = np.asarray([gaze_theta, gaze_phi])
```

then finetuned your …
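(For context, `line` above is presumably one row of a per-subject MPIIFaceGaze annotation file `pXX.txt`; according to the dataset readme, fields 22-24 hold the 3D face center and fields 25-27 the 3D gaze target, both in millimeters in camera coordinates, which matches the `line[21:24]` and `line[24:27]` slices. A small sketch of reading it under that assumption:)

```python
import numpy as np

# Each row of pXX.txt: an image path followed by 27 numeric fields.
with open('p00.txt') as f:
    for raw in f:
        fields = raw.split()
        line = [fields[0]] + [float(v) for v in fields[1:]]
        face_center = np.array(line[21:24])  # 3D face center fc, mm
        gaze_point = np.array(line[24:27])   # 3D gaze target gt, mm
```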
Hi, @ffletcherr. Oh, that's wonderful! Thank you very much for the information. I had thought that the discrepancy in the pitch direction could be due to differences in the 3D models, but I hadn't checked it myself. Looking at your results, it seems likely that this was indeed the case.

By the way, it's just a small detail, but I'm not sure the original normalization was "wrong"; I think it's simply a difference in the 3D models used. I mean, head pose estimation is like rotating a rigid face mask in 3D space to get the best fit based on facial landmarks, and if a different mask is used, the best-fit pose can be different.

Anyway, I will check the differences in the 3D models and the model you trained soon. Thank you again; this is really helpful in narrowing down the problem.
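(As an aside on the "rigid face mask" fitting mentioned above: head pose estimation of this kind is typically a perspective-n-point (PnP) solve. A self-contained OpenCV sketch with made-up landmark coordinates, just to illustrate that the recovered pose depends on the 3D model being fitted:)

```python
import numpy as np
import cv2

# Six rigid landmarks of a hypothetical 3D face model (meters, model coords).
model_points = np.array([
    [0.00,   0.00,  0.00],   # nose tip
    [0.00,  -0.06, -0.01],   # chin
    [-0.04,  0.03, -0.03],   # left eye corner
    [0.04,   0.03, -0.03],   # right eye corner
    [-0.03, -0.03, -0.02],   # left mouth corner
    [0.03,  -0.03, -0.02],   # right mouth corner
])
camera_matrix = np.array([[600.0,   0.0, 320.0],
                          [0.0,   600.0, 240.0],
                          [0.0,     0.0,   1.0]])

# Synthesize 2D detections from a known ground-truth pose...
rvec_true = np.array([0.1, 0.2, 0.0])
tvec_true = np.array([0.0, 0.0, 0.5])
image_points, _ = cv2.projectPoints(model_points, rvec_true, tvec_true,
                                    camera_matrix, None)

# ...then "fit the mask": solvePnP recovers the pose that best explains them.
# Swapping in a different 3D model would yield a different best-fit pose.
ok, rvec, tvec = cv2.solvePnP(model_points, image_points, camera_matrix, None)
print(rvec.ravel(), tvec.ravel())  # approximately rvec_true, tvec_true
```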
|
Hi @ffletcherr, any updates on code to resolve the on-screen gaze location? I would even be open to starting with something simpler, such as just determining whether the face in the video keeps its eyes on the camera throughout the video.
Hi, thanks for the great work again. I just want to get the on-screen gaze point. I calculate the point of intersection between the screen plane and the gaze ray (as in the source code at https://git.hcics.simtech.uni-stuttgart.de/public-projects/opengaze/-/wikis/API-calls), then transform the intersection point to the screen coordinate system, but I get a wrong result. Can someone help me with which coordinate system the gaze vector and eye point are in? If someone has done this, help would be appreciated, thanks.
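(A hedged sketch of the intersection-plus-transform described above, assuming the gaze origin `o` (eye point) and direction `d` are in camera coordinates, and that the monitor pose is given by a rotation `R_s` and translation `t_s` mapping screen coordinates into camera coordinates, as produced by a typical camera-screen calibration. The names are illustrative, not the OpenGaze API.)

```python
import numpy as np

def intersect_gaze_with_screen(o, d, R_s, t_s):
    """Intersect the gaze ray o + t * d (camera coords) with the screen plane.

    R_s, t_s map screen coordinates to camera coordinates, so the screen
    plane has normal n = R_s[:, 2] (the screen's z axis) and passes through
    t_s. The hit point is returned in screen coordinates (its z is ~0).
    """
    n = R_s[:, 2]
    t = np.dot(n, t_s - o) / np.dot(n, d)  # ray parameter at the plane
    p_cam = o + t * d                      # intersection in camera coords
    return R_s.T @ (p_cam - t_s)           # express it in screen coords

# Toy example: screen plane through a point 10 cm above the camera,
# parallel to the camera's image plane; face 50 cm away, gazing up-forward.
R_s = np.eye(3)
t_s = np.array([0.0, -0.1, 0.0])
o = np.array([0.0, 0.0, 0.5])
d = np.array([0.0, -0.1, -1.0])
print(intersect_gaze_with_screen(o, d, R_s, t_s))
```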