Add YOLOv8-OBB and some bug fixes (#9)
* Add YOLOv8-Obb & Refactor outputs

* Update README.md
jamjamjon authored Apr 21, 2024
1 parent 91049fc commit beda8ef
Showing 109 changed files with 2,532 additions and 1,930 deletions.
1 change: 1 addition & 0 deletions Cargo.toml
@@ -40,3 +40,4 @@ indicatif = "0.17.8"
image = "0.25.1"
imageproc = { version = "0.24" }
ab_glyph = "0.2.23"
geo = "0.28.0"
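The headline feature of this commit is oriented (rotated) bounding boxes. How the crate represents them, and how it uses the `geo` dependency added above, is not visible in this excerpt. As a rough illustration of the geometry only, the sketch below converts a center/size/angle box into its four corner points using the standard library. The `Obb` struct, its field names, and the radians convention are assumptions made for this example, not the crate's types.

```rust
/// Hypothetical oriented box: center, size, and rotation in radians.
/// This is an illustration, not a type from `usls` or `geo`.
#[derive(Debug, Clone, Copy)]
struct Obb {
    cx: f32,
    cy: f32,
    w: f32,
    h: f32,
    angle: f32,
}

impl Obb {
    /// Returns the four corners, obtained by rotating the half-extent
    /// offsets about the center with a standard 2D rotation matrix.
    fn corners(&self) -> [(f32, f32); 4] {
        let (sin, cos) = self.angle.sin_cos();
        let (hw, hh) = (self.w / 2.0, self.h / 2.0);
        let offsets = [(-hw, -hh), (hw, -hh), (hw, hh), (-hw, hh)];
        offsets.map(|(dx, dy)| {
            (
                self.cx + dx * cos - dy * sin,
                self.cy + dx * sin + dy * cos,
            )
        })
    }

    /// Area does not depend on the rotation angle.
    fn area(&self) -> f32 {
        self.w * self.h
    }
}

fn main() {
    let obb = Obb {
        cx: 10.0,
        cy: 10.0,
        w: 4.0,
        h: 2.0,
        angle: std::f32::consts::FRAC_PI_2,
    };
    for (x, y) in obb.corners() {
        println!("({x:.1}, {y:.1})");
    }
    println!("area = {}", obb.area());
}
```

With an angle of zero the corners reduce to an ordinary axis-aligned box, which is a quick sanity check for the rotation.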
85 changes: 55 additions & 30 deletions README.md
@@ -1,42 +1,65 @@
# usls

A Rust library integrated with **ONNXRuntime**, providing a collection of **Computer Vision** and **Vision-Language** models including [YOLOv8](https://github.com/ultralytics/ultralytics), [YOLOv9](https://github.com/WongKinYiu/yolov9), [RTDETR](https://arxiv.org/abs/2304.08069), [CLIP](https://github.com/openai/CLIP), [DINOv2](https://github.com/facebookresearch/dinov2), [FastSAM](https://github.com/CASIA-IVA-Lab/FastSAM), [YOLO-World](https://github.com/AILab-CVC/YOLO-World), [BLIP](https://arxiv.org/abs/2201.12086), [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR) and others.
A Rust library integrated with **ONNXRuntime**, providing a collection of **Computer Vision** and **Vision-Language** models including [YOLOv5](https://github.com/ultralytics/yolov5), [YOLOv8](https://github.com/ultralytics/ultralytics), [YOLOv9](https://github.com/WongKinYiu/yolov9), [RTDETR](https://arxiv.org/abs/2304.08069), [CLIP](https://github.com/openai/CLIP), [DINOv2](https://github.com/facebookresearch/dinov2), [FastSAM](https://github.com/CASIA-IVA-Lab/FastSAM), [YOLO-World](https://github.com/AILab-CVC/YOLO-World), [BLIP](https://arxiv.org/abs/2201.12086), [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR) and others.

## Recently Updated

| YOLOP-v2 | Face-Parsing | Text-Detection |
| :----------------------------: | :------------------------------: | :------------------------------: |
|<img src='examples/yolop/demo.png' height="240px">| <img src='examples/face-parsing/demo.png' height="240px"> | <img src='examples/db/demo.png' height="240px"> |


| YOLOv8-Obb |
| :----------------------------: |
|<img src='examples/yolov8/demo-obb-2.png' width="800px">|







## Supported Models

| Model | Task / Type | Example | CUDA<br />f32 | CUDA<br />f16 | TensorRT<br />f32 | TensorRT<br />f16 |
| :---------------------------------------------------------------: | :------------------------------------------------------------------------: | :----------------------: | :-----------: | :-----------: | :------------------------: | :-----------------------: |
| [YOLOv8-detection](https://github.com/ultralytics/ultralytics) | Object Detection | [demo](examples/yolov8) |||||
| [YOLOv8-pose](https://github.com/ultralytics/ultralytics) | Keypoint Detection | [demo](examples/yolov8) |||||
| [YOLOv8-classification](https://github.com/ultralytics/ultralytics) | Classification | [demo](examples/yolov8) |||||
| [YOLOv8-segmentation](https://github.com/ultralytics/ultralytics) | Instance Segmentation | [demo](examples/yolov8) |||||
| [YOLOv9](https://github.com/WongKinYiu/yolov9) | Object Detection | [demo](examples/yolov9) |||||
| [RT-DETR](https://arxiv.org/abs/2304.08069) | Object Detection | [demo](examples/rtdetr) |||||
| [FastSAM](https://github.com/CASIA-IVA-Lab/FastSAM) | Instance Segmentation | [demo](examples/fastsam) |||||
| [YOLO-World](https://github.com/AILab-CVC/YOLO-World) | Object Detection | [demo](examples/yolo-world) |||||
| [DINOv2](https://github.com/facebookresearch/dinov2) | Vision-Self-Supervised | [demo](examples/dinov2) |||||
| [CLIP](https://github.com/openai/CLIP) | Vision-Language | [demo](examples/clip) ||| ✅ visual<br />❌ textual | ✅ visual<br />❌ textual |
| [BLIP](https://github.com/salesforce/BLIP) | Vision-Language | [demo](examples/blip) ||| ✅ visual<br />❌ textual | ✅ visual<br />❌ textual |
| [DB](https://arxiv.org/abs/1911.08947) | Text Detection | [demo](examples/db) |||||
| [SVTR](https://arxiv.org/abs/2205.00159) | Text Recognition | [demo](examples/svtr) |||||
| [RTMO](https://github.com/open-mmlab/mmpose/tree/main/projects/rtmo) | Keypoint Detection | [demo](examples/rtmo) |||||
| [YOLOPv2](https://arxiv.org/abs/2208.11434) | Panoptic driving Perception | [demo](examples/yolop) |||||
| Model | Task / Type | Example | CUDA<br />f32 | CUDA<br />f16 | TensorRT<br />f32 | TensorRT<br />f16 |
| :---------------------------------------------------------------: | :-------------------------: | :----------------------: | :-----------: | :-----------: | :------------------------: | :-----------------------: |
| [YOLOv8-obb](https://github.com/ultralytics/ultralytics) | Oriented Object Detection | [demo](examples/yolov8) |||||
| [YOLOv8-detection](https://github.com/ultralytics/ultralytics) | Object Detection | [demo](examples/yolov8) |||||
| [YOLOv8-pose](https://github.com/ultralytics/ultralytics) | Keypoint Detection | [demo](examples/yolov8) |||||
| [YOLOv8-classification](https://github.com/ultralytics/ultralytics) | Classification | [demo](examples/yolov8) |||||
| [YOLOv8-segmentation](https://github.com/ultralytics/ultralytics) | Instance Segmentation | [demo](examples/yolov8) |||||
| [YOLOv9](https://github.com/WongKinYiu/yolov9) | Object Detection | [demo](examples/yolov9) |||||
| [RT-DETR](https://arxiv.org/abs/2304.08069) | Object Detection | [demo](examples/rtdetr) |||||
| [FastSAM](https://github.com/CASIA-IVA-Lab/FastSAM) | Instance Segmentation | [demo](examples/fastsam) |||||
| [YOLO-World](https://github.com/AILab-CVC/YOLO-World) | Object Detection | [demo](examples/yolo-world) |||||
| [DINOv2](https://github.com/facebookresearch/dinov2) | Vision-Self-Supervised | [demo](examples/dinov2) |||||
| [CLIP](https://github.com/openai/CLIP) | Vision-Language | [demo](examples/clip) ||| ✅ visual<br />❌ textual | ✅ visual<br />❌ textual |
| [BLIP](https://github.com/salesforce/BLIP) | Vision-Language | [demo](examples/blip) ||| ✅ visual<br />❌ textual | ✅ visual<br />❌ textual |
| [DB](https://arxiv.org/abs/1911.08947) | Text Detection | [demo](examples/db) |||||
| [SVTR](https://arxiv.org/abs/2205.00159) | Text Recognition | [demo](examples/svtr) |||||
| [RTMO](https://github.com/open-mmlab/mmpose/tree/main/projects/rtmo) | Keypoint Detection | [demo](examples/rtmo) |||||
| [YOLOPv2](https://arxiv.org/abs/2208.11434) | Panoptic driving Perception | [demo](examples/yolop) |||||
| [YOLOv5-classification](https://github.com/ultralytics/yolov5) | Object Detection | [demo](examples/yolov5) |||||
| [YOLOv5-segmentation](https://github.com/ultralytics/yolov5) | Instance Segmentation | [demo](examples/yolov5) |||||

## Solution Models

Additionally, this repo also provides some solution models.

| Model | Example | Result |
| :------------------------------------------------------------: | :------------------------------: | :------------------------------: |
| Lane Line Segmentation<br /> Drivable Area Segmentation<br />Car Detection<br />车道线-可行驶区域-车辆检测 | [demo](examples/yolov8-plastic-bag) |<img src='examples/yolop/demo.png' width="220px" height="140px">|
| Face Parsing<br /> 人脸解析 | [demo](examples/face-parsing) |<img src='examples/face-parsing/demo.png' width="220px" height="200px"> |
| Text Detection<br />(PPOCR-det v3, v4)<br />通用文本检测 | [demo](examples/db) |<img src='examples/db/demo.jpg' width="250px" height="200px">|
| Text Recognition<br />(PPOCR-rec v3, v4)<br />中英文-文本识别 | [demo](examples/svtr) ||
| Face-Landmark Detection<br />人脸 & 关键点检测 | [demo](examples/yolov8-face) |<img src='examples/yolov8-face/demo.jpg' width="220px" height="180px">|
| Head Detection<br /> 人头检测 | [demo](examples/yolov8-head) |<img src='examples/yolov8-head/demo.jpg' width="220px" height="180px">|
| Fall Detection<br /> 摔倒检测 | [demo](examples/yolov8-falldown) | <img src='examples/yolov8-falldown/demo.jpg' width="220px" height="180px">|
| Trash Detection<br /> 垃圾检测 | [demo](examples/yolov8-plastic-bag) |<img src='examples/yolov8-trash/demo.jpg' width="250px" height="180px">|
<details close>
<summary>Additionally, this repo also provides some solution models.</summary>

| Model | Example | Result |
| :---------------------------------------------------------------------------------------------------------: | :------------------------------: | :-----------------------------------------------------------------------------: |
| Lane Line Segmentation<br /> Drivable Area Segmentation<br />Car Detection<br />车道线-可行驶区域-车辆检测 | [demo](examples/yolov8-plastic-bag) | <img src='examples/yolop/demo.png' width="220px" height="140px"> |
| Face Parsing<br /> 人脸解析 | [demo](examples/face-parsing) | <img src='examples/face-parsing/demo.png' width="220px" height="200px"> |
| Text Detection<br />(PPOCR-det v3, v4)<br />通用文本检测 | [demo](examples/db) | <img src='examples/db/demo.png' width="250px" height="200px"> |
| Text Recognition<br />(PPOCR-rec v3, v4)<br />中英文-文本识别 | [demo](examples/svtr) | |
| Face-Landmark Detection<br />人脸 & 关键点检测 | [demo](examples/yolov8-face) | <img src='examples/yolov8-face/demo.png' width="220px" height="180px"> |
| Head Detection<br /> 人头检测 | [demo](examples/yolov8-head) | <img src='examples/yolov8-head/demo.png' width="220px" height="180px"> |
| Fall Detection<br /> 摔倒检测 | [demo](examples/yolov8-falldown) | <img src='examples/yolov8-falldown/demo.png' width="220px" height="180px"> |
| Trash Detection<br /> 垃圾检测 | [demo](examples/yolov8-plastic-bag) | <img src='examples/yolov8-trash/demo.png' width="250px" height="180px"> |

</details>

## Demo

@@ -59,8 +82,9 @@ check **[ort guide](https://ort.pyke.io/setup/linking)**

</details>


## Integrate into your own project
<details close>
<summary>Check Here</summary>

#### 1. Add `usls` as a dependency to your project's `Cargo.toml`

@@ -126,3 +150,4 @@ let y = model.run(&x)?;
let annotator = Annotator::default().with_saveout("YOLOv8");
annotator.annotate(&x, &y);
```
</details>
Binary file added assets/2.jpg
Binary file added assets/dota.png
2 changes: 2 additions & 0 deletions examples/blip/README.md
@@ -17,10 +17,12 @@ cargo run -r --example blip
```shell
[Unconditional image captioning]: a group of people walking around a bus
[Conditional image captioning]: three man walking in front of a bus
Some(["three man walking in front of a bus"])
```

## TODO

* [ ] Multi-batch inference for image caption
* [ ] VQA
* [ ] Retrieval
* [ ] TensorRT support for textual model
10 changes: 6 additions & 4 deletions examples/blip/main.rs
@@ -1,4 +1,4 @@
use usls::{models::Blip, Options};
use usls::{models::Blip, DataLoader, Options};

fn main() -> Result<(), Box<dyn std::error::Error>> {
// visual
@@ -22,9 +22,11 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
// build model
let mut model = Blip::new(options_visual, options_textual)?;

// image caption
model.caption("./assets/bus.jpg", None)?; // unconditional
model.caption("./assets/bus.jpg", Some("three man"))?; // conditional
// image caption (this demo uses batch_size=1)
let x = vec![DataLoader::try_read("./assets/bus.jpg")?];
let _y = model.caption(&x, None, true)?; // unconditional
let y = model.caption(&x, Some("three man"), true)?; // conditional
println!("{:?}", y[0].texts());

Ok(())
}
4 changes: 2 additions & 2 deletions examples/clip/main.rs
@@ -1,4 +1,4 @@
use usls::{models::Clip, ops, DataLoader, Options};
use usls::{models::Clip, DataLoader, Options};

fn main() -> Result<(), Box<dyn std::error::Error>> {
// visual
@@ -39,7 +39,7 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
let feats_image = model.encode_images(&images).unwrap();

// use image to query texts
let matrix = ops::dot2(&feats_image, &feats_text)?; // [m, n]
let matrix = feats_image.dot2(&feats_text)?;

// summary
for i in 0..paths.len() {
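The hunk above replaces the free function `ops::dot2` with a `dot2` method on the image features; the removed line's comment gives the result shape as `[m, n]`. To illustrate what such an image-to-text similarity matrix contains, here is a minimal standard-library sketch. The `dot2` and `argmax` functions below are hypothetical stand-ins operating on plain `Vec<f32>` rows, not the crate's implementation or signature.

```rust
/// Hypothetical stand-in for `dot2`: multiplies an [m, d] matrix of image
/// embeddings by the transpose of an [n, d] matrix of text embeddings,
/// producing an [m, n] matrix of dot products.
fn dot2(images: &[Vec<f32>], texts: &[Vec<f32>]) -> Result<Vec<Vec<f32>>, String> {
    let mut matrix = Vec::with_capacity(images.len());
    for img in images {
        let mut row = Vec::with_capacity(texts.len());
        for txt in texts {
            if img.len() != txt.len() {
                return Err(format!(
                    "dimension mismatch: {} vs {}",
                    img.len(),
                    txt.len()
                ));
            }
            row.push(img.iter().zip(txt).map(|(a, b)| a * b).sum::<f32>());
        }
        matrix.push(row);
    }
    Ok(matrix)
}

/// Index of the largest score in a row, i.e. the best-matching text.
fn argmax(row: &[f32]) -> Option<usize> {
    row.iter()
        .enumerate()
        .max_by(|a, b| a.1.total_cmp(b.1))
        .map(|(i, _)| i)
}

fn main() -> Result<(), String> {
    // Two toy image embeddings and three toy text embeddings (d = 2).
    let images = vec![vec![1.0, 0.0], vec![0.0, 1.0]];
    let texts = vec![vec![0.9, 0.1], vec![0.2, 0.8], vec![0.5, 0.5]];

    let matrix = dot2(&images, &texts)?; // [m, n] = [2, 3]
    for (i, row) in matrix.iter().enumerate() {
        println!("image {i} -> best text {:?}, scores {row:?}", argmax(row));
    }
    Ok(())
}
```

Row `i` of the matrix holds the scores of image `i` against every text, so querying texts with an image amounts to taking the argmax of that row.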
2 changes: 1 addition & 1 deletion examples/db/README.md
@@ -20,4 +20,4 @@ cargo run -r --example db

## Results

![](./demo.jpg)
![](./demo.png)
Binary file removed examples/db/demo.jpg
Binary file not shown.
Binary file added examples/db/demo.png
15 changes: 9 additions & 6 deletions examples/db/main.rs
@@ -15,18 +15,21 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
let mut model = DB::new(&options)?;

// load image
let x = vec![DataLoader::try_read("./assets/db.png")?];
let x = vec![
DataLoader::try_read("./assets/db.png")?,
// DataLoader::try_read("./assets/2.jpg")?,
];

// run
let y = model.run(&x)?;

// annotate
let annotator = Annotator::default()
.without_name(true)
.without_polygons(false)
.with_mask_alpha(0)
.without_bboxes(false)
.with_saveout("DB-Text-Detection");
.without_bboxes(true)
.with_masks_alpha(60)
.with_polygon_color([255, 105, 180, 255])
.without_mbrs(true)
.with_saveout("DB");
annotator.annotate(&x, &y);

Ok(())
Binary file modified examples/face-parsing/demo.png
7 changes: 3 additions & 4 deletions examples/face-parsing/main.rs
@@ -9,7 +9,6 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
.with_i03((416, 640, 800).into())
// .with_trt(0)
// .with_fp16(true)
// .with_dry_run(10)
.with_confs(&[0.5]);
let mut model = YOLO::new(&options)?;

@@ -21,10 +20,10 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {

// annotate
let annotator = Annotator::default()
.without_conf(true)
.without_name(true)
.without_polygons(false)
.without_bboxes(true)
.without_bboxes_conf(true)
.without_bboxes_name(true)
.without_polygons(false)
.with_masks_name(false)
.with_saveout("Face-Parsing");
annotator.annotate(&x, &y);
2 changes: 1 addition & 1 deletion examples/fastsam/README.md
@@ -20,4 +20,4 @@ cargo run -r --example fastsam

## Results

![](./demo.jpg)
![](./demo.png)
Binary file removed examples/fastsam/demo.jpg
Binary file not shown.
Binary file added examples/fastsam/demo.png
2 changes: 1 addition & 1 deletion examples/rtdetr/README.md
@@ -18,4 +18,4 @@ cargo run -r --example rtdetr

## Results

![](./demo.jpg)
![](./demo.png)
Binary file removed examples/rtdetr/demo.jpg
Binary file not shown.
Binary file added examples/rtdetr/demo.png
4 changes: 2 additions & 2 deletions examples/rtdetr/main.rs
@@ -1,11 +1,11 @@
use usls::{models::RTDETR, Annotator, DataLoader, Options, COCO_NAMES_80};
use usls::{coco, models::RTDETR, Annotator, DataLoader, Options};

fn main() -> Result<(), Box<dyn std::error::Error>> {
// build model
let options = Options::default()
.with_model("../models/rtdetr-l-f16.onnx")
.with_confs(&[0.4, 0.15]) // person: 0.4, others: 0.15
.with_names(&COCO_NAMES_80);
.with_names(&coco::NAMES_80);
let mut model = RTDETR::new(&options)?;

// load image
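The RT-DETR example passes two confidence values for an 80-class model, with the comment "person: 0.4, others: 0.15". One plausible reading is that the last value is reused for every class without its own entry. The sketch below illustrates that rule with the standard library only. `conf_for` and `keep` are hypothetical helpers, and the padding behaviour is an assumption drawn from the comment, not confirmed from the crate's source.

```rust
/// Hypothetical helper: threshold for `class_id`, reusing the last entry
/// when `confs` has fewer entries than there are classes (assumed rule).
fn conf_for(confs: &[f32], class_id: usize) -> Option<f32> {
    confs.get(class_id).or_else(|| confs.last()).copied()
}

/// Keeps the `(class_id, score)` detections whose score reaches the
/// threshold of their class. With an empty `confs`, nothing is kept.
fn keep(dets: &[(usize, f32)], confs: &[f32]) -> Vec<(usize, f32)> {
    dets.iter()
        .copied()
        .filter(|&(id, score)| conf_for(confs, id).map_or(false, |t| score >= t))
        .collect()
}

fn main() {
    // Class 0 (person) uses 0.4; every other class falls back to 0.15.
    let confs = [0.4, 0.15];
    let dets = [(0, 0.3), (0, 0.5), (5, 0.2), (7, 0.1)];
    println!("{:?}", keep(&dets, &confs));
}
```

Under this rule a low-scoring person detection is dropped while an equally low-scoring detection of another class can survive, which matches the intent stated in the example's comment.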
2 changes: 1 addition & 1 deletion examples/rtmo/README.md
@@ -15,4 +15,4 @@ cargo run -r --example rtmo

## Results

![](./demo.jpg)
![](./demo.png)
Binary file removed examples/rtmo/demo.jpg
Binary file not shown.
Binary file added examples/rtmo/demo.png