-
The YOLO output grids have about 20,000 points, each capable of detecting an object. `build_targets()` simply identifies which grid points should match up with which targets (labels), if any.
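As a rough sanity check on the "about 20,000" figure, here is a small sketch that counts prediction slots. It assumes a square input image, the usual YOLOv3 strides of 8/16/32, and 3 anchors per grid cell; the image sizes and the helper name `count_grid_points` are illustrative only and do not come from the repo.

```python
# Count how many (cell, anchor) slots a YOLOv3-style head produces.
# Assumptions: square input, strides 8/16/32, 3 anchors per grid cell.
def count_grid_points(img_size, strides=(8, 16, 32), anchors_per_cell=3):
    total = 0
    for stride in strides:
        cells_per_side = img_size // stride  # grid width == grid height
        total += cells_per_side * cells_per_side * anchors_per_cell
    return total


for size in (416, 512, 640):
    print(size, count_grid_points(size))
# 416 10647
# 512 16128
# 640 25200
```

Under these assumptions the count lands somewhere between roughly 10,000 and 25,000 depending on the training image size, which is consistent with the ballpark above.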
The papers you've read likely correspond to the older, less performant Darknet versions of YOLOv3, which is why you've not seen this before.
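To make the matching idea more concrete, here is a minimal pure-Python sketch of shape-based anchor matching, under some loud assumptions: `match_anchors` is a hypothetical helper, not a function from the repo; the threshold of 4.0 and the anchor sizes are example values only; and the real `build_targets()` works on batched tensors and also assigns grid cells, which this sketch leaves out. The point is just that an anchor is kept for a target when neither its width nor its height differs from the target's by more than a factor of `anchor_t`, and dropped otherwise.

```python
# Sketch of shape-based anchor matching (illustrative, not the repo code).
def match_anchors(target_wh, anchors, anchor_t=4.0):
    """Return indices of anchors whose w/h are within a factor anchor_t of the target."""
    tw, th = target_wh
    matched = []
    for i, (aw, ah) in enumerate(anchors):
        rw, rh = tw / aw, th / ah
        # Worst-case ratio in either direction (target larger or smaller than anchor).
        worst = max(rw, 1 / rw, rh, 1 / rh)
        if worst < anchor_t:
            matched.append(i)
    return matched


anchors = [(10, 13), (16, 30), (33, 23)]  # example anchor sizes (w, h)
print(match_anchors((20, 40), anchors))    # [0, 1, 2]  all anchors are close enough
print(match_anchors((200, 300), anchors))  # []  target far too large for these anchors
```

A target that matches no anchor at one output scale simply contributes no positive samples there; larger anchors at another scale would be expected to pick it up.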
-
I have watched multiple lectures explaining the YOLO algorithm, but I found something in your code that doesn't seem to be discussed at all...
If I understand correctly, you generate a total of 9 anchor boxes (3 in each of 3 loops) for each ground-truth object/bbox in the training batch. You then compare the w/h ratio of the ground-truth bbox with the 9 anchors/priors, and whichever are smaller than the "anchor_t" threshold/hyperparameter are discarded. Later on, in the `compute_loss` function, you seem to calculate the IoU between the predicted bbox and the anchor bboxes that were generated in the `build_targets` function, to use later in the loss calculation.
This is very confusing to me, since from the lectures I understood that we discard all predicted bboxes with an objectness score beneath a certain threshold (e.g. 0.4), and then compare the remaining bboxes with the ground-truth bboxes directly, taking the predicted box with the highest IoU as the "final" prediction of our model and using it to calculate our final loss...
But obviously that doesn't seem to be the case. Is there any documentation that describes exactly what is being done during the training process of this YOLOv3 implementation? For instance, I have never heard of the
'anchor_t': (1, 2.0, 8.0), # anchor-multiple threshold
hyperparameter, and googling doesn't help either...
Please excuse my ignorance - I have been learning about object-detection algorithms for a little over a week now.