improve the segment-anything example (#385)
* improve the example

* remove some text

* remove provider selection from ui

* point to sd-turbo

* make wasm work on int8
guschmue authored Feb 28, 2024
1 parent e09a4a6 commit dfa685f
Showing 7 changed files with 534 additions and 371 deletions.
2 changes: 2 additions & 0 deletions js/README.md
@@ -47,3 +47,5 @@ Click links for README of each examples.
* [OpenAI Whisper](ort-whisper) - demonstrates how to run [whisper tiny.en](https://github.com/openai/whisper) in your browser using [onnxruntime-web](https://github.com/microsoft/onnxruntime) and the browser's audio interfaces.

* [Facebook Segment-Anything](segment-anything) - demonstrates how to run [segment-anything](https://github.com/facebookresearch/segment-anything) in your browser using [onnxruntime-web](https://github.com/microsoft/onnxruntime/js) with webgpu.

* [Stable Diffusion Turbo](sd-turbo) - demonstrates how to run [Stable Diffusion Turbo](https://huggingface.co/stabilityai/sd-turbo) in your browser using [onnxruntime-web](https://github.com/microsoft/onnxruntime/js) with webgpu.
74 changes: 29 additions & 45 deletions js/segment-anything/README.md
@@ -1,63 +1,47 @@
# Run Segment-Anything in your browser using webgpu and onnxruntime-web
# Segment-Anything: Browser-Based Image Segmentation with WebGPU and ONNX Runtime Web

This example demonstrates how to run [Segment-Anything](https://github.com/facebookresearch/segment-anything) in your
browser using [onnxruntime-web](https://github.com/microsoft/onnxruntime) and webgpu.
This repository contains an example of running [Segment-Anything](https://github.com/facebookresearch/segment-anything), an encoder/decoder model for image segmentation, in a browser using [ONNX Runtime Web](https://github.com/microsoft/onnxruntime) with WebGPU.

Segment-Anything is a encoder/decoder model. The encoder creates embeddings and using the embeddings the decoder creates the segmentation mask.
You can try out the live demo [here](https://guschmue.github.io/ort-webgpu/segment-anything/index.html).

One can run the decoder in onnxruntime-web using WebAssembly with latencies at ~200ms.
## Model Overview

The encoder is much more compute intensive and takes ~45sec using WebAssembly what is not practical.
Using webgpu we can speedup the encoder ~50 times and it becomes visible to run it inside the browser, even on a integrated GPU.
Segment-Anything creates embeddings for an image using an encoder. These embeddings are then used by the decoder to create and update the segmentation mask. The decoder can run in ONNX Runtime Web using WebAssembly with latencies at ~200ms.

## Usage
The encoder is more compute-intensive, taking ~45sec in WebAssembly, which is not practical. However, by using WebGPU, we can speed up the encoder, making it feasible to run it inside the browser, even on an integrated GPU.

## Getting Started

### Prerequisites

Ensure that you have [Node.js](https://nodejs.org/) installed on your machine.

### Installation
First, install the required dependencies by running the following command in your terminal:

1. Install the required dependencies:

```sh
npm install
```

### Build the code
Next, bundle the code using webpack by running:
### Building the Project

1. Bundle the code using webpack:

```sh
npm run build
```
this generates the bundle file `./dist/bundle.min.js`

### Create an ONNX Model
This command generates the bundle file `./dist/index.js`.

We use [samexporter](https://github.com/vietanhdev/samexporter) to export encoder and decoder to onnx.
Install samexporter:
```sh
pip install https://github.com/vietanhdev/samexporter
```
Download the pytorch model from [Segment-Anything](https://github.com/facebookresearch/segment-anything). We use the smallest flavor (vit_b).
```sh
curl -o models/sam_vit_b_01ec64.pth https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth
```
Export both encoder and decoder to onnx:
```sh
python -m samexporter.export_encoder --checkpoint models/sam_vit_b_01ec64.pth \
--output models/sam_vit_b_01ec64.encoder.onnx \
--model-type vit_b

python -m samexporter.export_decoder --checkpoint models/sam_vit_b_01ec64.pth \
--output models/sam_vit_b_01ec64.decoder.onnx \
--model-type vit_b \
--return-single-mask
```
### Start a web server
Use NPM package `light-server` to serve the current folder at http://localhost:8888/.
To start the server, run:
```sh
npx light-server -s . -p 8888
```
### The ONNX Model

### Point your browser at the web server
Once the web server is running, open your browser and navigate to http://localhost:8888/.
You should now be able to run Segment-Anything in your browser.
The model used in this project is hosted on [Hugging Face](https://huggingface.co/schmuell/sam-b-fp16). It was created using [samexporter](https://github.com/vietanhdev/samexporter).

## TODO
* add support for fp16
* add support for MobileSam
### Running the Project

Start a web server to serve the current folder at http://localhost:8888/. To start the server, run:

```sh
npm run dev
```
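
The README text above contrasts WebAssembly (adequate for the decoder, impractical for the encoder) with WebGPU. A minimal sketch of how a page might choose between the two is shown below. The helper names `pickProvider` and `sessionOptions` are hypothetical and do not come from this commit; `navigator.gpu` is the standard WebGPU entry point, and `executionProviders` is the session option that onnxruntime-web accepts in `InferenceSession.create`.

```javascript
// Hypothetical helper (not part of this commit): choose an execution
// provider name for onnxruntime-web. The navigator object is passed in
// so the function can also be exercised outside a browser.
function pickProvider(nav) {
  // navigator.gpu is only defined in browsers that expose WebGPU.
  return nav && nav.gpu ? "webgpu" : "wasm";
}

// Hypothetical helper: build the options object handed to
// ort.InferenceSession.create(modelUrl, options).
function sessionOptions(provider) {
  return { executionProviders: [provider] };
}

// Assumed browser usage, with `ort` imported from onnxruntime-web (the
// exact import path depends on the package version) and `encoderUrl`
// standing in for wherever the encoder model is hosted:
//
//   const provider = pickProvider(navigator);
//   const session = await ort.InferenceSession.create(
//     encoderUrl, sessionOptions(provider));
```

Falling back to `wasm` keeps the decoder usable where WebGPU is unavailable, although the README's ~45 s figure suggests the encoder would be very slow on that path.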
27 changes: 14 additions & 13 deletions js/segment-anything/index.html
@@ -3,9 +3,9 @@
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<link href="https://cdn.jsdelivr.net/npm/bootstrap@5.2.3/dist/css/bootstrap.min.css" rel="stylesheet"
integrity="sha384-rbsA2VBKQhggwzxH7pPCaAqO46MgnOM80zW1RWuH61DGLwZJEdK2Kadq2F9CUG65" crossorigin="anonymous">
<script src="./dist/bundle.min.js"></script>
<link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.1/dist/css/bootstrap.min.css" rel="stylesheet"
integrity="sha384-4bw+/aepP/YC94hEpVNVgiZdgIC5+VKNBQNGCHeKRQN+PtmoHDEXuppvnDJzQIu9" crossorigin="anonymous" />
<script type="module" src="dist/index.js"></script>

<style>
/* Add rounded corners to blocks */
@@ -23,7 +23,7 @@
left: 50%;
transform: translate(-50%, -50%);
padding: 5px 10px;
background-color: white;
background-color: #212529;
font-size: 18px;
}

@@ -38,7 +38,7 @@

</head>

<body>
<body data-bs-theme="dark">
<title>segment anything example</title>
<div class="container-fluid">
<h2>segment anything example</h2>
@@ -71,15 +71,16 @@ <h4>Latencies</h4>
accept=".jpg, .png, .jpeg, .gif, .bmp, .tif, .tiff|image/*">
</div>
</form>
<div class="form-group ">
<button id="cut-button" type="button" class="btn btn-primary">Cut</button>
<button id="clear-button" type="button" class="btn btn-primary">Clear</button>
</div>
<div style="margin-top: 30px;">
<div>Other providers:</div>
<a href="index.html?provider=wasm&model=sam_b_int8">wasm</a>
<a href="index.html?provider=webgpu&model=sam_b">webgpu</a>
</div>
</div>
<div style="margin-top: 30px;">
<div>Other providers:</div>
<a href="index.html?provider=wasm">wasm</a>
<a href="index.html?provider=webgpu">webgpu</a>
<a href="index.html?provider=webnn">webnn</a>
</div>

</div>

<p class="text-lg-start">
<div id="status" style="font: 1em consolas;"></div>
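
The "Other providers" links in this hunk pass `provider` and `model` as query-string parameters. The script that reads them is not shown in this part of the diff, so the following is only a sketch of how such parameters could be parsed. The function name `parseConfig` and the default values are assumptions; the parameter names and the values `wasm`, `webgpu`, `sam_b` and `sam_b_int8` are taken from the links above.

```javascript
// Assumed defaults; the real defaults live in the example's script,
// which this hunk does not show.
const DEFAULTS = { provider: "webgpu", model: "sam_b" };

// Hypothetical helper: merge query-string overrides into the defaults.
// `search` is a string such as "?provider=wasm&model=sam_b_int8".
function parseConfig(search) {
  // URLSearchParams ignores a leading "?" in its input.
  const params = new URLSearchParams(search);
  const config = { ...DEFAULTS };
  for (const key of Object.keys(DEFAULTS)) {
    const value = params.get(key);
    // Missing or empty parameters leave the default in place.
    if (value) config[key] = value;
  }
  return config;
}

// Assumed browser usage:
//   const config = parseConfig(window.location.search);
```

Restricting the loop to the keys of `DEFAULTS` means unrelated query parameters are ignored rather than copied into the configuration.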