Question on MobileNet SSD architecture with focal loss. #5

Open
MAGI003769 opened this issue Mar 22, 2018 · 4 comments

Comments

@MAGI003769

Hello! First of all, thanks for your implementation. You're really awesome.
My question is how to merge focal loss into the SSD architecture, as I'm now working on SSD for my project.

  1. Is it correct to just replace the original softmax loss with focal loss? Or is it necessary to apply it to the localization loss as well?
  2. Since the strength of focal loss is that it addresses class imbalance, should I remove the hard negative mining operations described in the SSD paper? What approach did you take in your implementation?

Thanks a lot for your brilliant work and for your patience in reading these questions. I look forward to your reply.

@ChiefGodMan
Owner

Hi, thanks for your praise.

  1. You just need to replace the softmax loss with focal loss for classification.
  2. You can remove the hard mining operations, but I chose to change the config file to (num_hard_example=20000, max_neg_per_pos=1000, max_total_detection=20000) instead. Feel free to try it.
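
To illustrate why focal loss can stand in for the classification loss and make hard negative mining less critical, here is a minimal sketch in plain Python (not TensorFlow, so it runs without dependencies). It is not the code from this repository. The function names are hypothetical, and the defaults gamma=2.0 and alpha=0.25 are assumed from the focal loss paper.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_cross_entropy(logit, target):
    """Plain binary cross-entropy for one (box, class) entry."""
    p = sigmoid(logit)
    p_t = p if target == 1 else 1.0 - p  # probability of the true label
    return -math.log(p_t)


def sigmoid_focal_loss(logit, target, gamma=2.0, alpha=0.25):
    """Focal loss for one entry: alpha_t * (1 - p_t)^gamma * CE.

    gamma and alpha are assumed defaults, not values taken from this repo.
    """
    p = sigmoid(logit)
    p_t = p if target == 1 else 1.0 - p
    alpha_t = alpha if target == 1 else 1.0 - alpha
    return alpha_t * (1.0 - p_t) ** gamma * -math.log(p_t)


# An easy negative (confidently background) and a hard negative (wrongly confident).
for name, logit in [("easy negative", -4.0), ("hard negative", 2.0)]:
    ce = sigmoid_cross_entropy(logit, 0)
    fl = sigmoid_focal_loss(logit, 0)
    print(f"{name}: cross-entropy={ce:.6f} focal={fl:.6f} ratio={fl / ce:.6f}")
```

The easy negative keeps only a tiny fraction of its cross-entropy value, while the hard negative keeps most of it, so the many easy background boxes contribute little even when they are not filtered out by mining.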

@MAGI003769
Author

Thanks for your answer.

The output of the original softmax loss is a tensor of shape (batch_size, num_box), i.e. one scalar value per box. But in your implementation, focal loss returns just a single scalar. Does that mean one loss value for the whole batch? Doesn't the tf.reduce_sum() call on the last line need an axis argument, maybe -1? I think each individual sample the loss applies to is a box rather than a batch.

Could you please take a look at this at your earliest convenience?

@ChiefGodMan
Owner

Oh my god. The latest version of models has changed the return value of the loss function. My code is for an earlier version (maybe before version 1.2). You just need to return the per_entry_cross_ent variable rather than the tf.reduce_sum() result.
Actually, models already implements focal loss as 'class SigmoidFocalClassificationLoss(Loss)'; you can try that.
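
The difference between returning the per-entry loss and returning a fully reduced scalar can be sketched as follows. This is plain Python with nested lists standing in for tensors, not the repository's TensorFlow code. The [batch][box][class] layout and all function names are assumptions for illustration, so check the Loss base class in your own checkout for the shape it actually expects.

```python
import math


def focal_entry(logit, target, gamma=2.0, alpha=0.25):
    """Sigmoid focal loss for a single (box, class) entry (assumed defaults)."""
    p = 1.0 / (1.0 + math.exp(-logit))
    p_t = p if target == 1 else 1.0 - p
    alpha_t = alpha if target == 1 else 1.0 - alpha
    return alpha_t * (1.0 - p_t) ** gamma * -math.log(p_t)


def per_entry_focal_loss(logits, targets):
    """Keep the input layout: result[batch][box][class]."""
    return [
        [
            [focal_entry(l, t) for l, t in zip(box_logits, box_targets)]
            for box_logits, box_targets in zip(image_logits, image_targets)
        ]
        for image_logits, image_targets in zip(logits, targets)
    ]


def sum_over_classes(per_entry):
    """Analogous to a sum over the last axis: one value per box."""
    return [[sum(box) for box in image] for image in per_entry]


def sum_everything(per_entry):
    """Analogous to a sum with no axis: one value for the whole batch."""
    return sum(sum(sum(box) for box in image) for image in per_entry)


# Hypothetical inputs: batch_size=2, num_box=3, num_classes=2.
logits = [
    [[-3.0, 0.5], [2.0, -1.0], [-4.0, -4.0]],
    [[0.0, 1.5], [-2.0, -2.5], [3.0, -0.5]],
]
targets = [
    [[0, 1], [1, 0], [0, 0]],
    [[0, 1], [0, 0], [1, 0]],
]

per_entry = per_entry_focal_loss(logits, targets)
per_box = sum_over_classes(per_entry)
total = sum_everything(per_entry)

print("per-box layout:", len(per_box), "x", len(per_box[0]))
print("total is a single number:", isinstance(total, float))
```

Reducing everything discards the per-box structure that later steps, such as per-box weighting or hard example mining, rely on, which is why returning the unreduced values is the safer choice when the caller does its own reduction.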

@GallonDeng

Hi @ailias @MAGI003769, so I can directly use the TensorFlow object_detection models API to combine focal loss with SSD. However, I am not sure whether it can be used directly for a multi-label dataset. Should I make some modifications? Thanks.
