Revert "Add YOLOPv2 & Face-Parsing model" #4

Status: Closed · wants to merge 1 commit
120 changes: 66 additions & 54 deletions README.md
@@ -1,50 +1,49 @@
# usls

A Rust library integrated with **ONNXRuntime**, providing a collection of **Computer Vision** and **Vision-Language** models including [YOLOv8](https://github.com/ultralytics/ultralytics), [YOLOv9](https://github.com/WongKinYiu/yolov9), [RTDETR](https://arxiv.org/abs/2304.08069), [CLIP](https://github.com/openai/CLIP), [DINOv2](https://github.com/facebookresearch/dinov2), [FastSAM](https://github.com/CASIA-IVA-Lab/FastSAM), [YOLO-World](https://github.com/AILab-CVC/YOLO-World), [BLIP](https://arxiv.org/abs/2201.12086), [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR) and others.

A Rust library integrated with **ONNXRuntime**, providing a collection of **Computer Vision** and **Vision-Language** models including [YOLOv8](https://github.com/ultralytics/ultralytics), [YOLOv9](https://github.com/WongKinYiu/yolov9), [RTDETR](https://arxiv.org/abs/2304.08069), [CLIP](https://github.com/openai/CLIP), [DINOv2](https://github.com/facebookresearch/dinov2), [FastSAM](https://github.com/CASIA-IVA-Lab/FastSAM), [YOLO-World](https://github.com/AILab-CVC/YOLO-World), [BLIP](https://arxiv.org/abs/2201.12086), [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR) and others. Many execution providers are supported, such as `CUDA`, `TensorRT` and `CoreML`.

## Supported Models

| Model | Task / Type | Example | CUDA<br />f32 | CUDA<br />f16 | TensorRT<br />f32 | TensorRT<br />f16 |
| :---------------------------------------------------------------: | :------------------------------------------------------------------------: | :----------------------: | :-----------: | :-----------: | :------------------------: | :-----------------------: |
| [YOLOv8-detection](https://github.com/ultralytics/ultralytics) | Object Detection | [demo](examples/yolov8) | ✅ | ✅ | ✅ | ✅ |
| [YOLOv8-pose](https://github.com/ultralytics/ultralytics) | Keypoint Detection | [demo](examples/yolov8) | ✅ | ✅ | ✅ | ✅ |
| [YOLOv8-classification](https://github.com/ultralytics/ultralytics) | Classification | [demo](examples/yolov8) | ✅ | ✅ | ✅ | ✅ |
| [YOLOv8-segmentation](https://github.com/ultralytics/ultralytics) | Instance Segmentation | [demo](examples/yolov8) | ✅ | ✅ | ✅ | ✅ |
| [YOLOv9](https://github.com/WongKinYiu/yolov9) | Object Detection | [demo](examples/yolov9) | ✅ | ✅ | ✅ | ✅ |
| [RT-DETR](https://arxiv.org/abs/2304.08069) | Object Detection | [demo](examples/rtdetr) | ✅ | ✅ | ✅ | ✅ |
| [FastSAM](https://github.com/CASIA-IVA-Lab/FastSAM) | Instance Segmentation | [demo](examples/fastsam) | ✅ | ✅ | ✅ | ✅ |
| [YOLO-World](https://github.com/AILab-CVC/YOLO-World) | Object Detection | [demo](examples/yolo-world) | ✅ | ✅ | ✅ | ✅ |
| [DINOv2](https://github.com/facebookresearch/dinov2) | Vision-Self-Supervised | [demo](examples/dinov2) | ✅ | ✅ | ✅ | ✅ |
| [CLIP](https://github.com/openai/CLIP) | Vision-Language | [demo](examples/clip) | ✅ | ✅ | ✅ visual<br />❌ textual | ✅ visual<br />❌ textual |
| [BLIP](https://github.com/salesforce/BLIP) | Vision-Language | [demo](examples/blip) | ✅ | ✅ | ✅ visual<br />❌ textual | ✅ visual<br />❌ textual |
| [DB](https://arxiv.org/abs/1911.08947) | Text Detection | [demo](examples/db) | ✅ | | ✅ | ✅ |
| [SVTR](https://arxiv.org/abs/2205.00159) | Text Recognition | [demo](examples/svtr) | ✅ | | ✅ | ✅ |
| [RTMO](https://github.com/open-mmlab/mmpose/tree/main/projects/rtmo) | Keypoint Detection | [demo](examples/rtmo) | ✅ | | | |
| [YOLOPv2](https://arxiv.org/abs/2208.11434) | Panoptic Driving Perception | [demo](examples/yolop) | ✅ | ✅ | ✅ | ✅ |
| Model | Task / Type | Example | CUDA<br />f32 | CUDA<br />f16 | TensorRT<br />f32 | TensorRT<br />f16 |
| :---------------------------------------------------------------: | :----------------------: |:----------------------: | :-----------: | :-----------: | :------------------------: | :-----------------------: |
| **[YOLOv8-detection](https://github.com/ultralytics/ultralytics)** | Object Detection | [demo](examples/yolov8) | ✅ | ✅ | ✅ | ✅ |
| **[YOLOv8-pose](https://github.com/ultralytics/ultralytics)** | Keypoint Detection | [demo](examples/yolov8) | ✅ | ✅ | ✅ | ✅ |
| **[YOLOv8-classification](https://github.com/ultralytics/ultralytics)** | Classification | [demo](examples/yolov8) | ✅ | ✅ | ✅ | ✅ |
| **[YOLOv8-segmentation](https://github.com/ultralytics/ultralytics)** | Instance Segmentation | [demo](examples/yolov8) | ✅ | ✅ | ✅ | ✅ |
| **[YOLOv9](https://github.com/WongKinYiu/yolov9)** | Object Detection | [demo](examples/yolov9) | ✅ | ✅ | ✅ | ✅ |
| **[RT-DETR](https://arxiv.org/abs/2304.08069)** | Object Detection | [demo](examples/rtdetr) | ✅ | ✅ | ✅ | ✅ |
| **[FastSAM](https://github.com/CASIA-IVA-Lab/FastSAM)** | Instance Segmentation | [demo](examples/fastsam) | ✅ | ✅ | ✅ | ✅ |
| **[YOLO-World](https://github.com/AILab-CVC/YOLO-World)** | Object Detection | [demo](examples/yolo-world) | ✅ | ✅ | ✅ | ✅ |
| **[DINOv2](https://github.com/facebookresearch/dinov2)** | Vision-Self-Supervised | [demo](examples/dinov2) | ✅ | ✅ | ✅ | ✅ |
| **[CLIP](https://github.com/openai/CLIP)** | Vision-Language | [demo](examples/clip) | ✅ | ✅ | ✅ visual<br />❌ textual | ✅ visual<br />❌ textual |
| **[BLIP](https://github.com/salesforce/BLIP)** | Vision-Language | [demo](examples/blip) | ✅ | ✅ | ✅ visual<br />❌ textual | ✅ visual<br />❌ textual |
| [**DB**](https://arxiv.org/abs/1911.08947) | Text Detection | [demo](examples/db) | ✅ | | ✅ | ✅ |
| [**SVTR**](https://arxiv.org/abs/2205.00159) | Text Recognition | [demo](examples/svtr) | ✅ | | ✅ | ✅ |
| [**RTMO**](https://github.com/open-mmlab/mmpose/tree/main/projects/rtmo) | Keypoint Detection | [demo](examples/rtmo) | ✅ | ✅ | | |


## Solution Models

This repo also provides some ready-to-use solution models.

| Model | Example | Result |
| :------------------------------------------------------------: | :------------------------------: | :------------------------------: |
| Lane Line Segmentation<br />Drivable Area Segmentation<br />Car Detection | [demo](examples/yolop) |<img src='examples/yolop/demo.png' width="220px" height="140px">|
| Face Parsing | [demo](examples/face-parsing) |<img src='examples/face-parsing/demo.png' width="220px" height="200px"> |
| Text Detection<br />(PPOCR-det v3, v4) | [demo](examples/db) |<img src='examples/db/demo.jpg' width="250px" height="200px">|
| Text Recognition (Chinese & English)<br />(PPOCR-rec v3, v4) | [demo](examples/svtr) ||
| Face-Landmark Detection | [demo](examples/yolov8-face) |<img src='examples/yolov8-face/demo.jpg' width="220px" height="180px">|
| Head Detection | [demo](examples/yolov8-head) |<img src='examples/yolov8-head/demo.jpg' width="220px" height="180px">|
| Fall Detection | [demo](examples/yolov8-falldown) | <img src='examples/yolov8-falldown/demo.jpg' width="220px" height="180px">|
| Trash Detection | [demo](examples/yolov8-plastic-bag) |<img src='examples/yolov8-trash/demo.jpg' width="250px" height="180px">|
| Model | Example |
| :--------------------------------------------------------------------------------: | :------------------------------: |
| **Text Detection<br />(PPOCR-det v3, v4)** | [demo](examples/db) |
| **Text Recognition (Chinese & English)<br />(PPOCR-rec v3, v4)** | [demo](examples/svtr) |
| **Face-Landmark Detection** | [demo](examples/yolov8-face) |
| **Head Detection** | [demo](examples/yolov8-head) |
| **Fall Detection** | [demo](examples/yolov8-falldown) |
| **Trash Detection** | [demo](examples/yolov8-plastic-bag) |

## Demo

```
cargo run -r --example yolov8 # yolov9, blip, clip, dinov2, svtr, db, yolo-world...
cargo run -r --example yolov8 # fastsam, yolov9, blip, clip, dinov2, yolo-world...
```

## Installation
## Integrate into your own project

#### 1. Install [ort](https://github.com/pykeio/ort)

Check the **[ort guide](https://ort.pyke.io/setup/linking)** for linking instructions.

@@ -59,16 +58,13 @@ check **[ort guide](https://ort.pyke.io/setup/linking)**



## Integrate into your own project

#### 1. Add `usls` as a dependency to your project's `Cargo.toml`
#### 2. Add `usls` as a dependency to your project's `Cargo.toml`

```shell
cargo add --git https://github.com/jamjamjon/usls
```

#### 2. Set `Options` and build model
#### 3. Set `Options` and build model

```Rust
let options = Options::default()
    // ...
let mut model = YOLO::new(&options)?;
```

- If you want to run your model with TensorRT or CoreML
```Rust
let options = Options::default()
.with_trt(0) // using cuda by default
// .with_coreml(0)
```


- If your model has dynamic shapes
```Rust
let options = Options::default()
.with_i00((1, 2, 4).into()) // dynamic batch
.with_i02((416, 640, 800).into()) // dynamic height
.with_i03((416, 640, 800).into()) // dynamic width
```

- If you want to set a confidence level for each category
```Rust
let options = Options::default()
.with_confs(&[0.4, 0.15]) // person: 0.4, others: 0.15
```

- See [Options](src/options.rs) for more model options; a combined sketch is shown below.
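
Putting these together, a minimal sketch might look like the following. It only combines the builder calls already shown above; the model path and option values are placeholders, and the exact set of available methods should be checked against [Options](src/options.rs):

```Rust
// Rough sketch only: the model path and the option values are placeholders.
let options = Options::default()
    .with_model("yolov8m-dyn-f16.onnx")   // placeholder model path
    .with_trt(0)                          // or .with_coreml(0)
    .with_i00((1, 2, 4).into())           // dynamic batch
    .with_i02((416, 640, 800).into())     // dynamic height
    .with_i03((416, 640, 800).into())     // dynamic width
    .with_confs(&[0.4, 0.15]);            // per-class confidence thresholds
let mut model = YOLO::new(&options)?;
```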

#### 3. Prepare inputs, and then you're ready to go


#### 4. Prepare inputs, and then you're ready to go

- Build `DataLoader` to load images

```Rust
// read an image, then run the model
let x = vec![DataLoader::try_read("./assets/bus.jpg")?];
let y = model.run(&x)?;
```

#### 4. Annotate and save results

#### 5. Annotate and save results
```Rust
let annotator = Annotator::default().with_saveout("YOLOv8");
annotator.annotate(&x, &y);
```
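
For reference, steps 3 through 5 can be strung together into one small program. This is only a sketch that mirrors the snippets above; the import paths, model file name, and save-out name are assumptions and should be adjusted against the examples in this repo:

```Rust
// Sketch only: import paths and the model path are assumptions, not verified API.
use usls::{models::YOLO, Annotator, DataLoader, Options};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // set Options and build the model
    let options = Options::default().with_model("yolov8m-dyn.onnx"); // placeholder path
    let mut model = YOLO::new(&options)?;

    // load an image and run the model
    let x = vec![DataLoader::try_read("./assets/bus.jpg")?];
    let y = model.run(&x)?;

    // annotate and save the results
    let annotator = Annotator::default().with_saveout("YOLOv8");
    annotator.annotate(&x, &y);
    Ok(())
}
```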


## Script: convert an ONNX model from `float32` to `float16`

```python
import onnx
from pathlib import Path
from onnxconverter_common import float16

# load the float32 model and convert its weights to float16
model_f32 = "onnx_model.onnx"
model_f16 = float16.convert_float_to_float16(onnx.load(model_f32))

# save next to the original with an "-f16.onnx" suffix
saveout = Path(model_f32).with_name(Path(model_f32).stem + "-f16.onnx")
onnx.save(model_f16, saveout)
```
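
The script needs `onnx` and `onnxconverter-common` installed (`pip install onnx onnxconverter-common`) and writes the converted model next to the input file with an `-f16.onnx` suffix.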
Binary file removed assets/car.jpg
Binary file removed assets/nini.png
8 changes: 0 additions & 8 deletions convert2f16.py

This file was deleted.

32 changes: 29 additions & 3 deletions examples/blip/README.md
@@ -1,15 +1,41 @@
This demo shows how to use [BLIP](https://arxiv.org/abs/2201.12086) to do conditional or unconditional image captioning.


## Quick Start

```shell
cargo run -r --example blip
```

## BLIP ONNX Model
## Or you can set it up manually


### 1. Download BLIP ONNX Model

- [blip-visual-base](https://github.com/jamjamjon/assets/releases/download/v0.0.1/blip-visual-base.onnx)
- [blip-textual-base](https://github.com/jamjamjon/assets/releases/download/v0.0.1/blip-textual-base.onnx)


### 2. Specify the ONNX model path in `main.rs`

```Rust
// visual
let options_visual = Options::default()
.with_model("VISUAL_MODEL") // <= modify this
.with_profile(false);

// textual
let options_textual = Options::default()
.with_model("TEXTUAL_MODEL") // <= modify this
.with_profile(false);

```

### 3. Then, run

```bash
cargo run -r --example blip
```


## Results
40 changes: 36 additions & 4 deletions examples/clip/README.md
@@ -6,10 +6,37 @@ This demo showcases how to use [CLIP](https://github.com/openai/CLIP) to compute ...

```shell
cargo run -r --example clip
```

## CLIP ONNX Model
## Or you can set it up manually


### 1. Download CLIP ONNX Model

- [clip-b32-visual](https://github.com/jamjamjon/assets/releases/download/v0.0.1/clip-b32-visual.onnx)
- [clip-b32-textual](https://github.com/jamjamjon/assets/releases/download/v0.0.1/clip-b32-textual.onnx)


### 2. Specify the ONNX model path in `main.rs`

```Rust
// visual
let options_visual = Options::default()
.with_model("VISUAL_MODEL") // <= modify this
.with_i00((1, 1, 4).into())
.with_profile(false);

// textual
let options_textual = Options::default()
.with_model("TEXTUAL_MODEL") // <= modify this
.with_i00((1, 1, 4).into())
.with_profile(false);
```

### 3. Then, run

```bash
cargo run -r --example clip
```



## Results
```
(86.59852%) ./examples/clip/images/doll.jpg => There is a doll with red hair and a clock on a table
[0.07032883, 0.00053773675, 0.0006372929, 0.06066096, 0.0007378078, 0.8659852, 0.0011121632]
```
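
The percentages above are standard CLIP-style scores: each candidate text embedding is compared with the image embedding and the resulting similarities are normalized with a softmax. A minimal, generic sketch of that post-processing step (not the usls API; the raw scores below are made up) looks like this:

```Rust
/// Softmax over raw image-text similarity scores.
fn softmax(scores: &[f32]) -> Vec<f32> {
    let max = scores.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = scores.iter().map(|s| (s - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

fn main() {
    // hypothetical raw similarities of one image against 7 candidate texts
    let scores = [20.1_f32, 15.2, 15.4, 19.9, 15.5, 22.6, 15.9];
    println!("{:?}", softmax(&scores)); // the highest entry picks the caption
}
```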


## TODO

* [ ] TensorRT support for textual model
21 changes: 18 additions & 3 deletions examples/db/README.md
@@ -4,10 +4,25 @@
cargo run -r --example db
```

## ONNX Model
## Or you can set it up manually

### 1. Download ONNX Model

- [ppocr-v3-db-dyn](https://github.com/jamjamjon/assets/releases/download/v0.0.1/ppocr-v3-db-dyn.onnx)
- [ppocr-v4-db-dyn](https://github.com/jamjamjon/assets/releases/download/v0.0.1/ppocr-v4-db-dyn.onnx)

### 2. Specify the ONNX model path in `main.rs`

```Rust
let options = Options::default()
.with_model("ONNX_PATH") // <= modify this
```

### 3. Run

```bash
cargo run -r --example db
```

### Speed test

2 changes: 1 addition & 1 deletion examples/db/main.rs
@@ -22,9 +22,9 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {

// annotate
let annotator = Annotator::default()
.with_polygon_color([255u8, 0u8, 0u8])
.without_name(true)
.without_polygons(false)
.with_mask_alpha(0)
.without_bboxes(false)
.with_saveout("DB-Text-Detection");
annotator.annotate(&x, &y);