
The problem of using YOLOV's algorithm ideas in YOLOv5/v7 #93

Open
CangHaiQingYue opened this issue Aug 13, 2024 · 3 comments

CangHaiQingYue commented Aug 13, 2024

Hello, thanks for your great work!
YOLOV is built on the anchor-free YOLOX, and I would like to apply its ideas to an anchor-based detector. I have a question:

Currently, self.n_anchors = 1. For an input of size 1x3x640x640, the shape of features_cls is 1x8040x192 and the values in pred_idx lie in [0, 8039], which is fine.

However, when self.n_anchors = 3, the shape of features_cls is still 1x8040x192 while pred_idx now ranges over [0, 8040 * 3 - 1], so an error is raised in the function self.find_feature_score.

How can this conflict be resolved? Simply repeating features_cls three times seems impractical. The relevant code:

pred_result, pred_idx = self.postpro_woclass(decode_res, num_classes=self.num_classes,
                                             nms_thre=self.nms_thresh,
                                             topK=self.Afternum,
                                             ota_idxs=ota_idxs,
                                             )
if not self.training and imgs.shape[0] == 1:
    return self.postprocess_single_img(pred_result, self.num_classes)
cls_feat_flatten = torch.cat(
    [x.flatten(start_dim=2) for x in before_nms_features], dim=2
).permute(0, 2, 1)  # [b, features, channels]
reg_feat_flatten = torch.cat(
    [x.flatten(start_dim=2) for x in before_nms_regf], dim=2
).permute(0, 2, 1)
(features_cls, features_reg, cls_scores,
 fg_scores, locs, all_scores) = self.find_feature_score(cls_feat_flatten,
                                                        pred_idx,
                                                        reg_feat_flatten,
                                                        imgs,
                                                        pred_result)
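
To make the mismatch concrete, here is a minimal standalone sketch (the numbers follow the ones above; reducing find_feature_score's internals to a plain gather is an assumption for illustration):

import torch

num_points, channels, n_anchors = 8040, 192, 3
cls_feat_flatten = torch.randn(1, num_points, channels)  # [b, features, channels]

# n_anchors == 1: every index is a valid row of the flattened feature map
ok_idx = torch.randint(0, num_points, (30,))
_ = cls_feat_flatten[0, ok_idx]  # works, shape [30, 192]

# n_anchors == 3: proposal indices run up to num_points * 3 - 1,
# but the feature map still has only num_points rows
bad_idx = torch.tensor([num_points * n_anchors - 1])  # 24119, out of range
try:
    _ = cls_feat_flatten[0, bad_idx]
except IndexError as e:
    print("out-of-range gather:", e)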

YuHengsss (Owner)

Thanks for your attention to our work!

To use YOLOV in a detector with multiple anchors, you should rewrite the feature selection function, find_feature_score. Concretely, find the foreground proposals and their corresponding features. Note that multiple foreground proposals may correspond to one feature point; in such cases, that feature has to be repeated multiple times. Given this concern, we chose an anchor-free detector for our experiments. However, our strategy should also work in anchor-based detectors. Any attempt is appreciated, and we look forward to hearing of your success!
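
A minimal sketch of that idea, assuming anchors belonging to the same feature point are stored contiguously in the proposal index (gather_fg_features is an illustrative name, not a function from this repo):

import torch

def gather_fg_features(feat_flatten, fg_idx, n_anchors):
    # feat_flatten: [b, num_points, c]; fg_idx: foreground proposal indices
    # in [0, num_points * n_anchors - 1] for image 0 of the batch.
    # Assumption: proposal = point * n_anchors + anchor; if your layout is
    # anchor-major instead, use fg_idx % num_points.
    point_idx = fg_idx // n_anchors    # collapse anchors onto feature points
    return feat_flatten[0, point_idx]  # duplicate indices repeat the same row

feat = torch.randn(1, 8040, 192)
fg = torch.tensor([0, 1, 2, 3])        # anchors 0-2 of point 0, anchor 0 of point 1
out = gather_fg_features(feat, fg, n_anchors=3)
print(out.shape)                       # torch.Size([4, 192]); rows 0-2 share one feature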

CangHaiQingYue (Author)

Hi @YuHengsss, thanks for your answer. I solved this by mapping [0, 8040 * 3 - 1] to [0, 8040 - 1]: since 3 is the number of anchors, each feature point is simply repeated three times.
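
For the record, a two-line sketch of that remapping; which form applies depends on how the anchor dimension was flattened (pred_idx, num_points, and n_anchors follow the names used above):

point_idx = pred_idx // n_anchors    # anchors of the same point stored contiguously
# point_idx = pred_idx % num_points  # anchor-major layout (all of anchor 0, then anchor 1, ...)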

However, I encountered another problem, concerning ref_loss. I didn't see this part in the paper. Could you explain it in detail?

YuHengsss (Owner) commented Aug 19, 2024

This is a classification refinement loss for video object detection; you will also find an IoU-score refinement loss if you use YOLOV++. Both are intended to optimize the classification score and confidence of objects after the temporal refinement block. You can find the assignment strategy used in YOLOV for the classification part here:
https://github.com/YuHengsss/YOLOV/blob/2ea4eb90a44cb3db791c1ac5aac38685ebbc297c/yolox/models/yolovp_msa.py#L590C21-L605C1
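
As a rough illustration only (fg_idx, refined_cls_logits, and cls_targets are placeholder names, not identifiers from the repo; the real target construction lives at the link above), the refinement loss amounts to supervising the post-refinement classification scores on the assigned foreground proposals:

import torch
import torch.nn.functional as F

def ref_cls_loss(refined_cls_logits, cls_targets, fg_idx):
    # refined_cls_logits: [num_proposals, num_classes], scores after the
    # temporal refinement block; cls_targets: [num_fg, num_classes] targets
    # produced by the assignment strategy; fg_idx: indices of the assigned
    # foreground proposals.
    return F.binary_cross_entropy_with_logits(
        refined_cls_logits[fg_idx], cls_targets, reduction="sum"
    ) / max(len(fg_idx), 1)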

YOLOV++ updated the label assignment strategy to get better performance. It is a bit more complex; you can find it at the two places in the code guarded by

if not self.kwargs.get('cat_ota_fg', True):
