I managed to get the visualisations of the key frame and support frames after putting `features_cls_ = MSA_yolov_visual(features_cls, features_reg, cls_scores, fg_scores, img=imgs, pred=pred_result)` in the last few lines of the YOLOXHead code, but I have a few questions about the `Attention_msa_visual` function above:
Why is `attn_total` just `attn_cls_raw` multiplied by 25? I would expect `attn_cls_raw` to give bad scores for poor support frames, but `attn_total` is also giving me very low scores (around 0.14) on support frames that are actually good matches.
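For reference, here is a minimal sketch (my own reconstruction, not the repo's code) of how a cosine-similarity attention score like `attn_cls_raw` could be computed and then scaled by a constant. The function name `cosine_attention` and the `scale=25.0` argument are hypothetical stand-ins for the hard-coded multiplier I see in `Attention_msa_visual`; if the scaling really is just a constant factor, it changes the magnitude of the scores but not their ranking, which is why a frame with a low raw score stays low after scaling:

```python
import numpy as np

def cosine_attention(q, k, scale=25.0):
    """Hypothetical sketch: cosine-similarity ('affinity manner')
    attention between key-frame features q and support-frame
    features k. `scale` stands in for the hard-coded multiplier."""
    # L2-normalize each feature vector so the dot product is a cosine
    q_n = q / np.linalg.norm(q, axis=-1, keepdims=True)
    k_n = k / np.linalg.norm(k, axis=-1, keepdims=True)
    raw = q_n @ k_n.T          # raw similarities in [-1, 1]
    return raw, raw * scale    # scaling preserves the ordering

rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))    # 4 key-frame proposals, dim 8
k = rng.normal(size=(6, 8))    # 6 support-frame proposals, dim 8
raw, scaled = cosine_attention(q, k)
```

Under this assumption, a raw score of 0.14 would scale to 3.5, but its rank among the support frames would be unchanged.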
Is there official code for the visualisation you showed in your paper, where you compared the QK manner, the affinity manner, and cosine similarity?