Training not working anymore #3
Comments
Hi Chris, if you look at the compute_loss function in train.py, the loss used previously was binary cross-entropy, whereas the latest version uses the Huber loss. Also note that total_loss = score_loss in both versions. It may be more suitable to learn the score first and then fine-tune on the other tasks.
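To illustrate how the two losses mentioned above can behave differently on a sparse confidence map, here is a minimal pure-Python sketch. It is not the repo's compute_loss: the functions `bce` and `huber`, the 100-cell flattened map, the peak position, and the `delta` value are all assumptions made up for this example.

```python
import math


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy over a flattened confidence map."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(pred)


def huber(pred, target, delta=1.0):
    """Mean Huber loss: quadratic for small errors, linear beyond delta."""
    total = 0.0
    for p, t in zip(pred, target):
        err = abs(p - t)
        if err <= delta:
            total += 0.5 * err * err
        else:
            total += delta * (err - 0.5 * delta)
    return total / len(pred)


# Hypothetical target: a single object peak among 100 cells.
target = [0.0] * 100
target[42] = 1.0

# Hypothetical predictions: a flat, near-zero map and one with the peak.
flat = [0.01] * 100
peaked = list(flat)
peaked[42] = 0.9

for name, pred in (("flat", flat), ("peaked", peaked)):
    print(name, "bce:", bce(pred, target), "huber:", huber(pred, target))
```

On this toy input, the flat prediction is penalized much less by the Huber loss than by binary cross-entropy, because the single missed peak contributes at most a bounded quadratic term. That could be one reason a network settles on maps without local maxima, but it is only a hypothesis and has not been verified against the actual training code.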
Hi Tom, thanks again for updating the repo and providing the inference script. However, there seems to be an issue. I got these values during training (only set batch size to 8), and the model does not seem to be training correctly. I would appreciate any help on that. Thanks again! Best regards,
Hi aloukkal, yes, I am aware of the changes affecting the loss function and the confidence map representation.
Can someone share the last known working version of this repo?
@chris-doe @aloukkal @yhkim8412 @jackkwok Do you have any update on the issues you described here?
I am also getting the same losses on the current version of the repo. How did you fix it?
Hi Tom,
first of all, thanks for updating the repo and providing the inference script.
However, there now seems to be an issue with the heatmap-based scores during training. I did a clean clone of the repo and launched training as explained in the readme. In TensorBoard, after 600 epochs, the confidence maps do not show any local maxima. With the previous version of the repo, the confidence maps correctly showed that the network resolved depth uncertainty as the number of epochs increased and learned to localize objects. Hyperparameters were left at their defaults (only set batch size to 8).
The inference script, using the old model checkpoints, worked for me after adapting the NMS stage. Only one method (bbox_corners) in utils.py was missing. Do you have any idea how to get the training running again? I would appreciate any help on that. Thank you!
Best regards,
Chris
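For anyone who wants to check the symptom numerically rather than by eye in TensorBoard, here is a small sketch that lists strict local maxima in a 2D confidence map. The function name `local_maxima`, the 3x3 neighbourhood, and the 0.5 threshold are assumptions for illustration and are not taken from the repo.

```python
def local_maxima(conf_map, threshold=0.5):
    """Return (row, col, value) for cells above threshold that are
    strictly greater than all of their 3x3 neighbours."""
    rows, cols = len(conf_map), len(conf_map[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            value = conf_map[r][c]
            if value < threshold:
                continue
            neighbours = [
                conf_map[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
                if (rr, cc) != (r, c)
            ]
            if all(value > n for n in neighbours):
                peaks.append((r, c, value))
    return peaks


# Hypothetical maps: one flat (the failure described above), one with a peak.
flat_map = [[0.01] * 4 for _ in range(3)]
peaked_map = [row[:] for row in flat_map]
peaked_map[1][2] = 0.9

print(local_maxima(flat_map))
print(local_maxima(peaked_map))
```

An empty result on maps exported from a trained checkpoint would match the behaviour described in this issue.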