
Guide to export custom yolo models #51

Open
hardikdava opened this issue May 8, 2023 · 4 comments


@hardikdava

Hello,

  • Is there any guide on how to convert a custom Yolo version (other than YoloV5, YoloV6, YoloV7, YoloV8)?
  • Which layers should be modified in order to do on-device decoding?
@tersekmatija
Collaborator

@hardikdava
Hey, it might not be fully possible to do complete on-device decoding. I'll describe below how it works so you get a better understanding. To get from the raw output predictions to actual bounding boxes, you need to perform two main steps: the first is decoding and the second is NMS.

Decoding depends on each Yolo version (or sometimes even release...), and some versions share the same decoding approach. You can see how this is done by looking into the heads of each Yolo version (example here). In tools, we prune the head, add the sigmoid activation (for legacy reasons, as YoloV4 and V5 used it as well), and then decode it based on the version we read from the layer name. In the YoloV5 example we do exactly the same decoding as the head does, except it is done in the firmware, and we don't apply sigmoid there since it is already added to the model. After we get the bounding boxes out, we pass them to NMS (which is the same for all Yolos) and get the final list of bounding boxes.
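To make the decoding step above concrete, here is a rough NumPy sketch of a YoloV5-style decode for one output scale. This is illustrative only, not the tools' actual implementation: the `(na, ny, nx, 5 + nc)` layout, the function name, and the assumption that sigmoid has not yet been applied are all assumptions for the sketch.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def decode_yolov5(pred, anchors, stride):
    """Decode one raw YoloV5 head output of shape (na, ny, nx, 5 + nc)
    into absolute-pixel boxes (cx, cy, w, h) plus obj/class scores.

    pred    -- raw head output for one scale (before sigmoid)
    anchors -- (na, 2) anchor sizes in pixels for this scale
    stride  -- downsampling factor of this scale (e.g. 8, 16, 32)
    """
    na, ny, nx, _ = pred.shape
    y = sigmoid(pred)  # in the export tools this sigmoid is baked into the model

    # cell grid: grid[j, i] = (i, j), i.e. the cell's column and row index
    gy, gx = np.mgrid[0:ny, 0:nx]
    grid = np.stack((gx, gy), axis=-1).astype(np.float32)  # (ny, nx, 2)

    xy = (y[..., 0:2] * 2.0 - 0.5 + grid) * stride          # box centers in pixels
    wh = (y[..., 2:4] * 2.0) ** 2 * anchors[:, None, None]  # box sizes in pixels
    return np.concatenate([xy, wh, y[..., 4:]], axis=-1)
```

A decode like this is what would have to run either in the pruned head itself or in the firmware for each scale, before everything is concatenated and sent to NMS.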

If you are interested in doing the decoding of a specific Yolo version on device, you can check whether its bounding box decoding matches any of the supported versions.

  • If yes, then export the model so that you prune the head and rename the output layers to match the names of the supported version. On-device decoding should then work without a problem. If you run into issues, you can share the model and what you've tried, and we can likely help.
  • If no, then feel free to open a request/issue for the Yolo you'd like to have supported. We aim for releases that are somewhat standard, common, and have an advantage over other versions (such as better throughput on edge devices, significantly better detection performance, an open and permissive license, ...).

@hardikdava
Author

@tersekmatija , thanks for your reply.

I want to run the DAMO-YOLO detection model. The ONNX model has 2 output nodes, i.e. boxes (in xyxy form) and scores (shape = (number of classes, 1)). Since the boxes are already in xyxy form, I think I may not need decoding, but I have to find the best class and then pass the result to NMS. The NumPy operations are as follows.

import numpy as np

output = self.model.run(None, {self.model.get_inputs()[0].name: net_image})  # ONNX Runtime prediction
scores = output[0][0]  # per-box class scores
bboxes = output[1][0]  # boxes, already in xyxy form

confidences = np.max(scores, axis=1)    # best class score per box
valid_mask = confidences > conf_thresh  # drop low-confidence boxes
boxes = bboxes[valid_mask]
scores = scores[valid_mask]
class_ids = np.argmax(scores, axis=1)
confidences = confidences[valid_mask]

valid_boxes = non_maximum_suppression(boxes, confidences, iou_thresh)
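The `non_maximum_suppression` helper above is not shown in the snippet; a minimal class-agnostic greedy NMS in NumPy (a sketch only, matching the xyxy box format used above) could look like this:

```python
import numpy as np

def non_maximum_suppression(boxes, confidences, iou_thresh):
    """Greedy NMS over (N, 4) xyxy boxes; returns indices of the kept boxes."""
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = confidences.argsort()[::-1]  # highest confidence first

    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # intersection of the kept box with all remaining boxes
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # drop remaining boxes that overlap the kept one too much
        order = order[1:][iou <= iou_thresh]
    return np.array(keep, dtype=int)
```

This is the part of the pipeline that is the same for all Yolo versions, which is why exposing it as a reusable node (as discussed below in the thread) would be useful.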

@tersekmatija
Collaborator

Yeah, that's already wrapped by their API, you can look here. I'd have to dig deeper to see whether it matches any of the currently supported versions, but I won't have the time to do so anytime soon. Also, the accuracy doesn't seem that much better, and the latency is measured on a T4, which can deviate from OAK-D.

We could perhaps look into exposing the NMS node? CC: @themarpe

@hardikdava
Author

@tersekmatija Actually, exposing NMS is a good idea. That would make it easy to extend custom models so they run fully on the device itself.
