Transformers fails with the following error when trying to use AWQ with TGI / the neural compression engine, or with Optimum Habana:
ValueError: AWQ is only available on GPU
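For anyone trying to confirm the same condition before loading a model, a pre-flight check along these lines may help. This is only a minimal sketch: it assumes the error is raised because no CUDA device is visible to PyTorch, which would be the case on a Gaudi host. `cuda_visible` and `awq_preflight` are hypothetical helper names, not part of Transformers, TGI, or Optimum Habana.

```python
import importlib.util


def cuda_visible() -> bool:
    """Return True only if PyTorch is installed and reports a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch

    return torch.cuda.is_available()


def awq_preflight(has_cuda: bool) -> str:
    """Hypothetical helper: describe the likely outcome of an AWQ load.

    Assumption: the AWQ device check rejects hosts without a CUDA device.
    """
    if has_cuda:
        return "CUDA device visible: the AWQ device check should pass"
    return "no CUDA device visible: expect 'AWQ is only available on GPU'"


if __name__ == "__main__":
    print(awq_preflight(cuda_visible()))
```

Running this on the target machine shows whether PyTorch sees a CUDA device at all, without downloading or loading any model weights.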
Information
The official example scripts
My own modified scripts
Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)
Reproduction
.
Expected behavior
.
The primary goal is to run Llama 405B on a single Gaudi node.
I had originally read that Hugging Face TGI was supposed to support AWQ, but I was unable to use any of the quantization methods provided by Hugging Face quants, including GPTQ, uint4, etc. The details are just spread across different issues.
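To show why quantization matters for this goal, here is a rough back-of-the-envelope estimate of the weight memory at different precisions. It is a sketch under stated assumptions: the node size (8 accelerators with 96 GB each) is assumed for illustration and is not stated above, and the estimate counts weights only, ignoring KV cache, activations, and quantization metadata such as scales and zero points.

```python
# Nominal parameter count for Llama 405B.
PARAMS = 405e9

# Bytes needed per parameter at each precision.
BYTES_PER_PARAM = {"bf16": 2.0, "8-bit": 1.0, "4-bit": 0.5}

# Assumption (not from the issue): one node with 8 devices of 96 GB each.
DEVICES_PER_NODE = 8
GB_PER_DEVICE = 96
NODE_GB = DEVICES_PER_NODE * GB_PER_DEVICE


def weight_gb(params: float, bytes_per_param: float) -> float:
    """Weight-only memory in GB (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


def fits_on_node(params: float, bytes_per_param: float, node_gb: float) -> bool:
    """True if the weights alone fit in the node's total device memory."""
    return weight_gb(params, bytes_per_param) <= node_gb


if __name__ == "__main__":
    for name, nbytes in BYTES_PER_PARAM.items():
        size = weight_gb(PARAMS, nbytes)
        verdict = "fits" if fits_on_node(PARAMS, nbytes, NODE_GB) else "does not fit"
        print(f"{name}: {size:.1f} GB of weights, {verdict} in {NODE_GB} GB")
```

Under these assumptions the bf16 weights alone exceed the node's memory, while 8-bit and 4-bit weights leave headroom for the KV cache and activations.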