Some questions about SCM #2
Hello! I read your article carefully and was very interested in it! I have some questions as follows:

(1) Does the semantic similarity matrix E compute the semantic similarity between all patches?

(2) After printing E, I found negative values. What does a negative value in E mean? (For example, the value -0.0383 in the first row.)

tensor([[[ 1.0000,  0.3413,  0.3903,  ...,  0.1250, -0.0383,  0.1996],
         [ 0.3413,  1.0000,  0.4638,  ...,  0.0055,  0.0692,  0.2095],
         [ 0.3903,  0.4638,  1.0000,  ...,  0.0800, -0.1332,  0.2198],
         ...,

(3) Does SCM diffuse only according to the semantic and spatial relations of the four points of its first-order neighbors?

Hope to get your reply! Thank you very much!

Comments
Hi! Sorry for the late reply.

(1) Yes, the E matrix in Eq. (3) is the normalized outer product of the whole vertex set, i.e., it holds the pairwise semantic similarity between all patch tokens.

(2) The entries of E are cosine similarities (see Wikipedia's definition of cosine similarity), so a negative value simply means the two patch embeddings are not alike. A small sketch of both points follows.
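For readers following along, here is a minimal sketch of how such a similarity matrix can be computed: L2-normalize the patch tokens, then take the batched outer product, which yields pairwise cosine similarities in [-1, 1]. The shapes and the `tokens` variable are illustrative assumptions, not the repository's actual code:

```python
import torch
import torch.nn.functional as F

# tokens: (B, N, D) patch embeddings from the transformer backbone
# (illustrative shapes and names, not the repository's actual variables)
B, N, D = 2, 196, 384
tokens = torch.randn(B, N, D)

# L2-normalize each token, then take the batched outer product:
# E[b, i, j] = cos(tokens[b, i], tokens[b, j])
t = F.normalize(tokens, dim=-1)
E = torch.bmm(t, t.transpose(1, 2))  # (B, N, N), values in [-1, 1]

print(E[0, :3, :3])  # diagonal is 1.0; off-diagonal entries may be negative
```

Because cosine similarity ranges over [-1, 1], negative off-diagonal entries like the -0.0383 above are expected for semantically dissimilar patches.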
(3) That is correct to some extent. As illustrated in the supplementary materials, for simplicity we only consider the first-order neighbors, meaning the four adjacent points. (You could experiment with the difference of connecting second-order or higher neighbors!) SCM leverages the semantic and spatial relations to diffuse the raw attention so that it covers the complete object. Note that the critical design is that the semantic relations are constantly updated by the successive ADB layers, as shown in Fig. 6, and accordingly the updated E revises the later diffusion status. I think the most intriguing thing is that the diffusion can actually be done with a single layer! I hypothesize that some reinforcement-learning tricks could use one layer and receive a signal after each iteration step in Eq. (6), since we aim to find the intermediate status at which the attention happens to capture the object. A sketch of one such diffusion step follows.
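Here is a minimal, illustrative sketch of one diffusion step over first-order (4-connected) neighbors gated by semantic similarity, assuming a square patch grid. The function name, the `alpha` mixing weight, and the non-negative clamping are my own simplifications for clarity, not the paper's exact ADB update:

```python
import torch

def diffusion_step(attn, E, h, w, alpha=0.5):
    """One illustrative diffusion step.

    attn: (B, N) raw attention over N = h * w patches
    E:    (B, N, N) semantic similarity matrix
    alpha: mixing weight between the old attention and the diffused mass
    """
    B, N = attn.shape
    device = attn.device

    # Build the first-order (4-neighbor) adjacency on the h x w patch grid.
    idx = torch.arange(N, device=device).reshape(h, w)
    A = torch.zeros(N, N, device=device)
    for di, dj in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
        i_lo, i_hi = max(0, -di), h - max(0, di)
        j_lo, j_hi = max(0, -dj), w - max(0, dj)
        src = idx[i_lo:i_hi, j_lo:j_hi].reshape(-1)
        dst = idx[i_lo + di:i_hi + di, j_lo + dj:j_hi + dj].reshape(-1)
        A[src, dst] = 1.0

    # Gate the spatial adjacency by semantic similarity (clamped to >= 0),
    # then row-normalize so each patch redistributes a unit of mass.
    W = A.unsqueeze(0) * E.clamp(min=0)                    # (B, N, N)
    W = W / W.sum(dim=-1, keepdim=True).clamp(min=1e-6)

    # Mix the original attention with the mass diffused from neighbors.
    return alpha * attn + (1 - alpha) * torch.bmm(W, attn.unsqueeze(-1)).squeeze(-1)

# Example: diffuse a random attention map on a 14 x 14 grid.
h, w = 14, 14
attn = torch.softmax(torch.randn(1, h * w), dim=-1)
E = torch.randn(1, h * w, h * w)
attn = diffusion_step(attn, E, h, w)
```

Iterating this step lets attention spread outward along semantically consistent regions, which is the intuition behind diffusing to cover the whole object.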
I have another question: is the GT-Known evaluation metric compared in the paper GT-Known top-1 or GT-Known top-5?

GT-Known normally uses top-1. We give the top-k values only for convenience. See the sketch below for how the metric is typically computed.
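For reference, a minimal sketch of how GT-Known top-1 is conventionally computed in WSOL papers: the ground-truth class is assumed known, so a sample counts as correct whenever the predicted box reaches IoU >= 0.5 with the ground-truth box. The helper names and the 0.5 threshold are the usual conventions, not code from this repository:

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def gt_known_top1(pred_boxes, gt_boxes, thresh=0.5):
    """GT-Known: classification is assumed correct, so a sample is a hit
    whenever the predicted box overlaps the ground truth with IoU >= thresh."""
    hits = sum(iou(p, g) >= thresh for p, g in zip(pred_boxes, gt_boxes))
    return hits / len(gt_boxes)
```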
May I know the environment in which the experiments were conducted (for example, which GPU)?

We used an A100 whose memory can support a batch size of 256, probably the 40 GB model; I don't remember exactly.

I benefited a lot! Thank you for your reply!