Update webllm figures
CharlieFRuan committed Jun 13, 2024
1 parent bab37e6 commit c2ba8be
Showing 3 changed files with 1 addition and 1 deletion.
@@ -112,7 +112,7 @@ However, we demonstrate that WebLLM’s performance is close to native performan

 <p align="center">
 <img src="/img/webllm-engine/perf.png" width="60%">
-<figcaption>Table 1. Decode speed comparison of WebGPU and native Metal. Run with 64 prefill tokens, decoding 128 tokens. Both models are 4-bit quantized.</figcaption>
+<figcaption>Figure 9. Decode speed comparison of WebGPU and native Metal. Run with 64 prefill tokens, decoding 128 tokens. Both models are 4-bit quantized.</figcaption>
 </p>

 Our result shows that WebGPU can preserve up to 85% of the native performance. This is still an early stage of WebGPU support as most browsers just shipped it this year. We anticipate that the gap can continue to improve, as the WebGPU to native shader translation improves.
Binary file modified img/webllm-engine/arch-and-chat.gif
Binary file modified img/webllm-engine/perf.png
