Keypoint accuracy problem #9

Open
LebronJames0423 opened this issue Mar 21, 2022 · 3 comments

@LebronJames0423

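At a resolution of 640x480 the keypoint positions look fairly accurate, but when I set the resolution to 1280x720 the detected keypoint positions are no longer accurate. The face contour points in particular fall outside the face. What is the cause of this?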

@pntt3011
Owner

pntt3011 commented Mar 21, 2022

Hello,
I tried capturing my webcam, resizing the frames to 1280 x 720, and feeding them to the model. The results still look accurate to me.

However, I think there is one possible cause.
The frames are resized to 128x128 during preprocessing because that is the model's input shape, which has a 1:1 ratio. 640x480 has a 4:3 ratio and 1280x720 has a 16:9 ratio, so when resized, a 16:9 frame is "shrunk" more than a 4:3 one. Moreover, my SSD anchors are generated with a fixed 1:1 ratio (you can try replacing generateAnchors with the generate_anchors here), so when the detection results are mapped back to the original size, the ROI may come out rectangular instead of square. (Note that in calculateRoiFromDetection I multiply the height by 2 and the width by 1.5 to "hack" around this issue; you can try adjusting those numbers too.)
The face ROI is then passed to keypoint detection, and if the ROI is not "square enough", the results can be inaccurate (the keypoint model performs better on inputs with a 1:1 ratio).

(I used Google Translate to translate your issue. If I misunderstood your question, please let me know in English.)

@LebronJames0423
Author

Thank you for your advice! You are right:
it can be resolved by setting "w" and "h" to "detection.roi.width * origWidth * 1.f" and "detection.roi.height * origHeight * 2.f" in the function calculateRoiFromDetection.
I also found another problem: there is always a small amount of jitter in the keypoints. Although the jitter looks small, it degrades the result when the keypoints are used in other applications, such as the eye-enlarging and face-slimming functions of a beauty camera. How can I solve it?
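
For readers who want to experiment with the ROI multipliers mentioned in this comment, the following is a minimal sketch of the size computation only. The expressions for `w` and `h` come from the comment; the struct layout, the `Size2f` type, and the helper name `scaledRoiSize` are assumptions for illustration and may not match the real calculateRoiFromDetection.

```cpp
// Hypothetical stand-ins for the repository's types.
struct Roi       { float x_center, y_center, width, height; };  // normalized
struct Detection { Roi roi; };
struct Size2f    { float w, h; };

// Compute the ROI size in pixels with tunable multipliers.
// Defaults of 1.f and 2.f are the values reported to work at 1280x720;
// the earlier comment in this thread mentions 1.5 for the width.
Size2f scaledRoiSize(const Detection& detection, int origWidth, int origHeight,
                     float widthScale = 1.f, float heightScale = 2.f) {
    const float w = detection.roi.width  * origWidth  * widthScale;
    const float h = detection.roi.height * origHeight * heightScale;
    return { w, h };
}
```

For a detection with normalized size 0.25 x 0.25 on a 1280x720 frame, the defaults give 320 x 360 pixels, while a width multiplier of 1.5 gives 480 x 360, which is noticeably less square.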

@pntt3011
Owner

Hello, according to this issue, it is a Mediapipe problem. Some workarounds (including Mediapipe's own) are mentioned in that thread. I'll try to implement one when I have free time.
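
As a generic illustration of temporal smoothing, and not necessarily one of the workarounds in the linked thread, keypoint jitter can be damped with a simple exponential moving average across frames. The `Point2f` struct and the `KeypointSmoother` class below are hypothetical and written only for this sketch.

```cpp
#include <cstddef>
#include <vector>

struct Point2f { float x, y; };  // hypothetical keypoint type

// Exponential moving average over keypoints from consecutive frames.
// alpha in (0, 1]: 1 means no smoothing, smaller values smooth more.
class KeypointSmoother {
public:
    explicit KeypointSmoother(float alpha) : alpha_(alpha) {}

    // Blend the current frame's keypoints into the smoothed state.
    const std::vector<Point2f>& update(const std::vector<Point2f>& current) {
        if (state_.size() != current.size()) {
            state_ = current;  // first frame or keypoint count changed: reset
            return state_;
        }
        for (std::size_t i = 0; i < current.size(); ++i) {
            state_[i].x = alpha_ * current[i].x + (1.f - alpha_) * state_[i].x;
            state_[i].y = alpha_ * current[i].y + (1.f - alpha_) * state_[i].y;
        }
        return state_;
    }

private:
    float alpha_;
    std::vector<Point2f> state_;
};
```

The trade-off is lag: a small `alpha` hides jitter but makes the keypoints trail behind fast head movement, so the value would need tuning per application.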
