Weight for the interactive track of DAVIS 19 challenge #15

Open
zyy-cn opened this issue Apr 22, 2020 · 3 comments

Comments

@zyy-cn

zyy-cn commented Apr 22, 2020

Hi,
Thanks for sharing the code. I noticed that the currently released weights are for the semi-supervised track and differ from the weights you used in the interactive track of the DAVIS 19 challenge. I tested the released weights under the DAVIS interactive framework, following the official challenge settings, and only achieved an AUC of 67.74 on the DAVIS 17 validation set. Do you have any plans to release the weights trained for the interactive track of the DAVIS 19 challenge?

@seoungwugoh
Owner

The interactive version of STM is currently under review. We plan to upload the code for interactive VOS once the review process is finished.

@zyy-cn
Author

zyy-cn commented May 5, 2020

Thanks for your reply!

I have two more questions:

  1. I'm trying to train the model for the interactive VOS task with the following process:
    a) Prepare [A_image, A_mask, B_image, C_image] as input and [B_mask, C_mask] as GT.
    b) Memorize [A_image, scribble(A_mask)], where scribble(*) draws a scribble onto the mask according to the FP and FN areas.
    c) Segment [B_mask] with the memory of A.
    d) Memorize [B_image, scribble(B_mask)].
    e) Segment [C_mask] with the memory of A and B.
    The loss is computed with B_mask and C_mask.
    Is this process correct?

  2. What result (AUC of J&F) did you achieve on the DAVIS 17 validation set for the interactive VOS task with STM? And which GPU did you use for inference?
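To make the scribble(*) step in b) concrete, here is a minimal sketch of how the FP and FN areas could be derived from a predicted mask and its GT. The function names (`error_regions`, `scribble_target`) are hypothetical, and picking the larger error region is my own assumption. How the scribble itself is drawn is not specified here, so the sketch stops at choosing the region.

```python
from typing import List, Optional, Tuple

Mask = List[List[int]]  # binary mask as nested lists; a stand-in for a tensor


def error_regions(pred: Mask, gt: Mask) -> Tuple[Mask, Mask]:
    """Return (FP, FN) masks: predicted-but-wrong and missed pixels."""
    fp = [[int(p == 1 and g == 0) for p, g in zip(pr, gr)] for pr, gr in zip(pred, gt)]
    fn = [[int(p == 0 and g == 1) for p, g in zip(pr, gr)] for pr, gr in zip(pred, gt)]
    return fp, fn


def scribble_target(pred: Mask, gt: Mask) -> Tuple[Optional[str], Mask]:
    """Choose the region a simulated scribble would be drawn in.

    Assumption: the larger error region is corrected first. A positive
    scribble marks missed object pixels (FN); a negative one marks FP.
    """
    fp, fn = error_regions(pred, gt)
    fp_area = sum(map(sum, fp))
    fn_area = sum(map(sum, fn))
    if fp_area == 0 and fn_area == 0:
        return None, fp  # prediction already matches the GT
    if fn_area >= fp_area:
        return "positive", fn
    return "negative", fp


gt = [[1, 1], [1, 0]]
empty = [[0, 0], [0, 0]]
kind, region = scribble_target(empty, gt)
```

With an all-zero prediction there are no false positives, so the FN region equals the GT mask and the scribble is positive.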

@seoungwugoh
Owner

seoungwugoh commented Jun 8, 2020

Hi, the training protocol of the interactive model, described in the paper under review, is somewhat different from that of the semi-supervised model. In the DAVIS interactive scenario, the video does not need to be processed in sequential order, and there are multiple rounds. To briefly explain:

a) Memorize [A_image, scribble(A_mask_r0, A_mask_GT)], where the r0 mask is all zeros.
b) Segment [A_image, B_image, C_image] -> [A_mask_r1, B_mask_r1, C_mask_r1].
c) Memorize [C_image, scribble(C_mask_r1, C_mask_GT)].
d) Segment [A_image, B_image, C_image] using the two memories.
Losses are computed for all predictions.

STM was modified to work in the interactive mode. We used a 2080 Ti GPU. We will make the paper on interactive STM available after the review process.
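The rounds a) to d) above can be sketched as a plain loop. This is only an illustration of the control flow under stated assumptions, not the actual STM training code. `run_rounds` and the `toy_*` helpers are hypothetical stand-ins, masks are flattened 0/1 lists instead of tensors, and the memory is a simple list of (image, scribble) pairs.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Mask = List[int]  # flattened binary mask; a stand-in for a tensor


def run_rounds(
    frames: Dict[str, object],
    gt_masks: Dict[str, Mask],
    annotated: Sequence[str],
    segment: Callable[[object, list], Mask],
    scribble: Callable[[Mask, Mask], Mask],
    loss_fn: Callable[[Mask, Mask], float],
) -> Tuple[Dict[str, Mask], float, list]:
    memory: list = []
    # Round 0: every prediction is all zeros (the r0 mask in step a).
    preds = {name: [0] * len(gt_masks[name]) for name in frames}
    total_loss = 0.0
    for name in annotated:
        # Steps a) and c): memorize the annotated frame together with a
        # scribble built from its previous-round prediction and its GT.
        memory.append((frames[name], scribble(preds[name], gt_masks[name])))
        # Steps b) and d): re-segment every frame using all memories so far.
        preds = {k: segment(frames[k], memory) for k in frames}
        # Losses are accumulated over all predictions of every round.
        total_loss += sum(loss_fn(preds[k], gt_masks[k]) for k in frames)
    return preds, total_loss, memory


# Toy stand-ins so the sketch runs; none of these are STM components.
def toy_scribble(pred: Mask, gt: Mask) -> Mask:
    return [int(p != g) for p, g in zip(pred, gt)]


def toy_segment(image, memory: list) -> Mask:
    # Pretend each extra memory lets the model recover one more pixel.
    return [1 if i < len(memory) else 0 for i in range(len(image))]


def toy_loss(pred: Mask, gt: Mask) -> float:
    return float(sum(p != g for p, g in zip(pred, gt)))


frames = {"A": [0, 0, 0], "B": [0, 0, 0], "C": [0, 0, 0]}
gt = {"A": [1, 1, 0], "B": [1, 1, 0], "C": [1, 1, 0]}
preds, loss, memory = run_rounds(
    frames, gt, ["A", "C"], toy_segment, toy_scribble, toy_loss
)
```

Here the annotated frames are A in the first round and C in the second, matching steps a) and c). Every frame is re-segmented after each interaction, so the order of frames does not matter.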
