Commit

fix
conglongli committed Nov 7, 2023
1 parent 0a5107a commit c01c4b9
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion blogs/deepspeed-fastgen/chinese/README.md
@@ -145,7 +145,7 @@ DeepSpeed-FastGen leverages blocked KV caching and Dynamic SplitFuse continuous batching,
</div><br>

<div align="center">
-<img src="../assets/images/throughput_latency_13B.png" alt="" width="850"/><br>
+<img src="../assets/images/throughput_latency_13B_no_arrow.png" alt="" width="850"/><br>

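*Figure 3: Throughput and latency of text generation using Llama 2 13B (A100-80GB GPU, no tensor parallelism). Prompt and generation lengths follow normal distributions with means of 1200/2600 and 60/128, respectively, and a 30% variance*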
</div>