Update the corresponding scripts; adjust the FAQ
TylunasLi committed Jul 19, 2024
1 parent e58ca77 commit 40a9622
Showing 2 changed files with 13 additions and 20 deletions.
20 changes: 6 additions & 14 deletions docs/faq.md
@@ -27,22 +27,14 @@ cmake .. -DUSE_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES=native
**Solution:**

Manually edit CMakeLists.txt and specify the GPU's [Compute Capability](https://developer.nvidia.com/cuda-gpus) according to the GPU model. For example:

``` diff
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -52,7 +52,7 @@
#message(${CMAKE_CUDA_IMPLICIT_LINK_DIRECTORIES})
set(FASTLLM_CUDA_SOURCES src/devices/cuda/cudadevice.cpp src/devices/cuda/cudadevicebatch.cpp src/devices/cuda/fastllm-cuda.cu)
set(FASTLLM_LINKED_LIBS ${FASTLLM_LINKED_LIBS} cublas)
- set(CMAKE_CUDA_ARCHITECTURES "native")
+ set(CMAKE_CUDA_ARCHITECTURES 61 75 86 89)
endif()

if (PY_API)
```
Specify the GPU's [Compute Capability](https://developer.nvidia.com/cuda-gpus) according to the GPU model. For example:

```shell
cmake .. -DUSE_CUDA=ON -DCUDA_ARCH="61;75;86;89"
```

To support multiple GPU architectures, separate the values with ";" (as in the example above).
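
If you are unsure which value to pass, the Compute Capability can also be queried at runtime. A minimal sketch, assuming a CUDA-enabled PyTorch installation (PyTorch is not required by this FAQ itself):

```python
# Minimal sketch, assuming CUDA-enabled PyTorch: print each visible GPU's
# Compute Capability, e.g. (8, 6) maps to "86" for -DCUDA_ARCH.
import torch

for i in range(torch.cuda.device_count()):
    major, minor = torch.cuda.get_device_capability(i)
    print(f"GPU {i} ({torch.cuda.get_device_name(i)}): CUDA_ARCH value {major}{minor}")
```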

### identifier "__hdiv" is undefined

**Symptom:**
13 changes: 7 additions & 6 deletions docs/models.md
@@ -4,26 +4,26 @@

Fastllm currently supports the following ways to load a model.

* **Convert after loading (two-line acceleration mode)** (convert on-the-fly)
  Load the original model as a HuggingFace model, then convert and accelerate it with the `from_hf()` method (see the sketch after this list). This approach uses a lot of memory and is slow, and is no longer recommended.

* **Offline conversion** (convert offline)
  Convert the original model to a .flm-format model; some [models](#flm模型库) have already been converted.

* **Direct loading** (load from Huggingface .safetensors)
  Directly load models published on HuggingFace; only models in the .safetensors format are supported.
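
A minimal sketch of the first two modes, assuming the `fastllm_pytools` package built from this repository; the HuggingFace model name and the .flm file path are placeholders:

```python
# Minimal sketch, assuming fastllm_pytools built from this repository.
from transformers import AutoModel, AutoTokenizer
from fastllm_pytools import llm

# Convert on-the-fly ("two-line acceleration mode"): load the HuggingFace
# model first, then convert it with from_hf(). Memory-hungry and slow.
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)
hf_model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)
model = llm.from_hf(hf_model, tokenizer, dtype="float16")

# Offline conversion: load an already-converted .flm file instead.
model = llm.model("chatglm2-6b-float16.flm")

print(model.response("Hello"))
```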


## Model List


* ✔ means this mode is supported and passes the test.

* ❌ means this mode should be supported, but failed in actual testing; it may be fixed in a later version.

* √ means supported, but not yet tested.

### GLM Series
@@ -61,6 +61,7 @@
| Qwen/Qwen1.5-14B-Chat | [](#其它模型) | [](#qwen模型导出) | ✔<sup>3</sup> |
| Qwen/Qwen1.5-72B-Chat | [](#其它模型) | [](#qwen模型导出) | ✔<sup>3</sup> |
| Qwen/Qwen1.5-32B-Chat | [](#其它模型) | [](#qwen模型导出) | ✔<sup>3</sup> |
| Qwen/Qwen1.5-110B-Chat | [](#其它模型) | [](#qwen模型导出) | √<sup>3</sup> |
| Qwen/CodeQwen1.5-7B-Chat | [](#其它模型) | [](#qwen模型导出) ||
| Qwen/Qwen2-0.5B-Instruct | [](#其它模型) | [](#qwen模型导出) ||
| Qwen/Qwen2-1.5B-Instruct | [](#其它模型) | [](#qwen模型导出) ||
