Implementation limitation #20
Hi! Yes, we did compare: our supervised descent model on COFW achieves a state-of-the-art average error of 0.071, which is even a bit better than reported in our paper (http://www.patrikhuber.ch/files/RCRC_SPL_2015.pdf). You're right, however, that the pre-trained LFPW model is not as good. It's not really possible to compare on the "original LFPW", though, since there's no standard set of images (everybody uses a different subset, because the images have to be downloaded from the web). In fact, we didn't use around 200 or so of the images that they used for training.

Since Intraface have never released their code (they say in their paper that they released code, but they only ever released a library in binary form) and they will not release it, it's not possible to compare the implementations. Our benchmarks show it's working fine, though; it's just a matter of choosing a perturbation/initialisation strategy, a large enough training database, and tuning the feature extraction windows a bit. I hope I'll have some time soon-ish to replace the model with one trained on a larger database, but you can actually also do that quite easily with …

Regarding their SIFT: we tested the HOG descriptor quite extensively and vlhog is really good. I don't think Xiong's xxsift is better; it's just a bit faster, I think.
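For reference, the COFW error figure above is the usual landmark-localisation metric: the mean point-to-point error normalized by the inter-ocular distance. A minimal sketch (the function name and eye-index parameters are mine; which landmark indices correspond to the eyes depends on the annotation scheme):

```python
import numpy as np

def normalized_error(pred, gt, left_eye_idx, right_eye_idx):
    """Mean landmark error normalized by the inter-ocular distance.

    pred, gt: (N, 2) arrays of predicted / ground-truth landmarks.
    left_eye_idx, right_eye_idx: indices of the two eye landmarks
    used for normalization (annotation-scheme dependent).
    """
    iod = np.linalg.norm(gt[right_eye_idx] - gt[left_eye_idx])
    per_point = np.linalg.norm(pred - gt, axis=1)
    return per_point.mean() / iod
```

So an average error of 0.071 means the landmarks are off by about 7% of the distance between the eyes, averaged over points and images.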
Hi, Patrik
Hi, I was actually working on 300-VW a few weeks ago and hope to be able to continue it at some point in the next months. In the meantime, I'm happy to help in any way I can.
Yeah, I've seen the discussion in #19, and I have to disagree.
The rotation is actually something I saw Tim Cootes show in a presentation yesterday; I definitely want to try that.
Could you please explain to me how to calculate "the shape with the mean shape (aligned on the image based on the previous frame's shape)"?
You can use Procrustes analysis to find the rotation, translation, and scaling that align the mean shape with the previous frame's shape.
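A minimal sketch of that alignment, assuming landmarks are stored as (N, 2) numpy arrays (the function name and the closed-form Umeyama/Kabsch solution are my choice; the thread only says "Procrustes analysis"):

```python
import numpy as np

def align_shape(mean_shape, target_shape):
    """Find the similarity transform (scale s, rotation R, translation t)
    minimizing ||s * R @ x + t - y||^2 over corresponding landmarks,
    and return the mean shape aligned to the target shape."""
    mu_m = mean_shape.mean(axis=0)
    mu_t = target_shape.mean(axis=0)
    Xm = mean_shape - mu_m          # centred mean shape
    Xt = target_shape - mu_t        # centred target (previous-frame) shape
    # Optimal rotation from the SVD of the cross-covariance (Procrustes).
    U, S, Vt = np.linalg.svd(Xt.T @ Xm)
    d = np.sign(np.linalg.det(U @ Vt))   # guard against reflections
    D = np.diag([1.0, d])
    R = U @ D @ Vt
    s = (S * np.diag(D)).sum() / (Xm ** 2).sum()
    t = mu_t - s * (R @ mu_m)
    return s * (mean_shape @ R.T) + t
```

The aligned mean shape then serves as the initialisation for the regression on the current frame.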
Thanks. To you and Patrik:
@Thanh-Binh: Yeah, more or less. I didn't correct the rotation, as mentioned before; I translate and scale the mean landmarks to best align them with the landmarks from the previous frame.

Regarding the training: well, after what @genawass posted, I'm actually not sure anymore that training with a different initialisation/perturbation strategy for tracking (i.e. one that perturbs around the ground truth) would be successful. Maybe initialising the current frame from the previous one using the mean landmarks is in fact the way to go. I think only a quantitative experiment, for example on 300-VW, would show which one works best! What I mean by a different initialisation/perturbation strategy for tracking is the following: in L420, the … See also #19.
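To make the "perturb around the ground truth" idea concrete, here is a sketch of how such tracking-style training initialisations could be generated: small random translations and scalings of the ground-truth shape, standing in for the shape carried over from the previous frame. The function name and the perturbation magnitudes are illustrative guesses, not values from the thread:

```python
import numpy as np

rng = np.random.default_rng(0)

def tracking_perturbations(gt_shape, num_samples=10,
                           sigma_trans=0.05, sigma_scale=0.05):
    """Generate training initialisations that mimic tracking: random
    translations/scalings of the ground-truth shape, instead of the
    perturbed face-detector boxes used for still images. Magnitudes
    are relative to the face size and are assumptions."""
    centre = gt_shape.mean(axis=0)
    face_size = gt_shape.max(axis=0) - gt_shape.min(axis=0)
    samples = []
    for _ in range(num_samples):
        scale = 1.0 + rng.normal(0.0, sigma_scale)
        trans = rng.normal(0.0, sigma_trans, size=2) * face_size
        samples.append((gt_shape - centre) * scale + centre + trans)
    return samples
```

The regressors would then be trained to map each perturbed initialisation back to the ground truth, matching the distribution of initialisations seen at tracking time.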
To Genawass:
It's similar to using still images; you only need to align the mean shape in the initial step using a different strategy (based on the shape from the previous frame). The learning phase remains the same.
You need to use the transformation found by Procrustes to transform the mean shape as close as possible to the ground-truth/previous-frame shape. Then you need to crop, scale, and rotate the image (and the transformed mean shape accordingly) in order to normalize the input data as much as possible, since the descriptor is not invariant to large perturbations in scale/rotation.
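The crop/scale/rotate step above amounts to applying one similarity transform to both the image and the landmarks. A sketch, assuming 2-D landmarks; the helper names are mine, and only the landmark part is shown in code (warping the image itself would use an existing routine such as OpenCV's `cv2.warpAffine` with the same matrix):

```python
import numpy as np

def similarity_matrix(s, theta, t):
    """2x3 affine matrix for the similarity transform
    p' = s * R(theta) @ p + t (the row layout warpAffine expects)."""
    c, si = s * np.cos(theta), s * np.sin(theta)
    return np.array([[c, -si, t[0]],
                     [si,  c, t[1]]])

def transform_points(M, pts):
    """Apply a 2x3 affine matrix to an (N, 2) array of points."""
    return pts @ M[:, :2].T + M[:, 2]

# Hypothetical usage for the image, with the same matrix M:
#   normalized = cv2.warpAffine(img, M, (out_w, out_h))
```

Because the image and the shape are transformed with the same matrix, the normalized landmarks stay registered to the normalized image, and the descriptor sees faces at a roughly canonical scale and orientation.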
Patrik, it is not PCA-SIFT, since they use the default 128-dimensional descriptor per keypoint (a 4x4 grid of spatial cells with 8 orientation bins each: 4x4x8 = 128). The trick is probably in the histogram calculation and the trilinear interpolation.
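As a sanity check on that descriptor-size arithmetic (this just restates the standard SIFT layout, nothing specific to Intraface):

```python
# Standard SIFT descriptor layout: a 4x4 grid of spatial cells,
# each holding an 8-bin gradient-orientation histogram.
spatial_cells = 4 * 4
orientation_bins = 8
descriptor_size = spatial_cells * orientation_bins
assert descriptor_size == 128  # full-size descriptor, i.e. no PCA reduction
```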
@genawass: Hmm! I thought I had read something suggesting it was a kind of PCA-SIFT. It was probably just an assumption my colleagues and I made, then. Thanks for the hint!
Also, I do not think that with still images you can get all the extreme poses that occur in video. A face detector can only find the face for a frontal head pose with slight rotations; for a tracker to deal with extreme rotations, you need to train it with extreme-pose data.
The original SDM demo was released in the Intraface project (Matlab and C++ versions).
Compared to Intraface, your implementation is not as robust and stable. The reasons may be (1) the descriptor type (Intraface uses a modified SIFT), (2) the training database (Intraface uses movies for tracking), or (3) implementation details.
Have you compared your SDM implementation with Intraface?