[AMD] Fix compilation issue with ROCm #137
base: main
Conversation
I can confirm. It compiles without an issue now (ROCm nightly and an old Vega 64 :)
Hello, installation completed without error, but prompting the model with

llm = AutoModelForCausalLM.from_pretrained("/models/llama-7b.Q3_K_M.gguf", model_type="llama", local_files_only=True, gpu_layers=100)
print(llm("AI is going to"))

exits with the error: CUDA error 98 at ~/ctransformers/models/ggml/ggml-cuda.cu:6045: invalid device function
When using ROCm, "CUDA error 98 ....... invalid device function" usually means (as far as I know) a version/implementation problem in the HIP stack. Most likely it's solvable with
Hi, apt show rocm-libs -a
But in "rocm/pytorch:latest-release" it raises the "invalid device function" error.
NOTE: I assume the "invalid device function" depends on the environment's library version(s). /lgtm
I get CUDA error 98 at /home/gingi/github/ctransformers/models/ggml/ggml-cuda.cu:6045: invalid device function, even though I have export HSA_OVERRIDE_GFX_VERSION=11.0.0 set.
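For reference, a minimal sketch of how HSA_OVERRIDE_GFX_VERSION is typically applied from Python (an assumption, not part of this PR): the override makes the ROCm runtime treat the GPU as a different architecture, and it only takes effect if it is set before the HIP/HSA runtime initializes, i.e. before importing any library that loads ROCm. The value "10.3.0" assumes a gfx1030 card such as the RX 6800 XT; "11.0.0" targets gfx1100.

```python
import os

# Must be set before the HIP/HSA runtime initializes, so do it before
# importing any library that loads ROCm (e.g. ctransformers, torch).
# "10.3.0" targets gfx1030 (e.g. RX 6800 XT); "11.0.0" targets gfx1100.
# Pick the value matching the kernels your build actually contains.
os.environ["HSA_OVERRIDE_GFX_VERSION"] = "10.3.0"

# Only now import and load the model, as in the snippet above:
#   from ctransformers import AutoModelForCausalLM
#   llm = AutoModelForCausalLM.from_pretrained(..., gpu_layers=100)
```

Setting the variable in the shell before launching Python (export HSA_OVERRIDE_GFX_VERSION=10.3.0) is equivalent.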
The command below worked well!
Problem:
Unable to install the package on a Linux machine with an AMD 6800XT GPU using ROCm.
Error logs: https://gist.github.com/bhargav/7f8c2984ba32ff99ce8e93433d9059a6
Solution:
The failures are due to references to CUDA library imports instead of the HIP versions when compiling for AMD.
Verified that the project builds with the fixes.
Build log: https://gist.github.com/bhargav/65bbbd039bda6f39504448656e88ab6b
The package installs successfully, and I was able to run model inference on the GPU.