Issues: vllm-project/llm-compressor
#688 · AttributeError: 'MllamaConfig' object has no attribute 'use_cache' · bug · opened Sep 26, 2024 by mgoin
#687 · SmoothQuant doesn't respect ignored modules for VLMs · bug · opened Sep 26, 2024 by mgoin
#660 · KV Cache Quantization example cause problem · bug · opened Sep 25, 2024 by weicheng59
#189 · Qwen1.5-MoE-A2.7B-Chat w4a16 Quantization Failed · bug · opened Sep 20, 2024 by donpromax
#175 · DeepseekV2-w8a8 weight needed in HF · enhancement · opened Sep 13, 2024 by Eviannn
#164 · [USAGE] FP8 W8A8 (+KV) with LORA Adapters · enhancement · opened Sep 11, 2024 by paulliwog
#154 · Error in the file 2:4_w4a16_group-128_recipe.yaml · bug · opened Sep 10, 2024 by carrot-o0o
#148 · When can we support w8a8 fp8 quantization and sparse2:4 llm compress and adapt it on vllm? · enhancement · opened Sep 9, 2024 by leoyuppieqnew
#129 · [Usage] Where to set the tokenizer in the sparse example? · opened Aug 29, 2024 by CharlesRiggins
#107 · SmoothQuant doesn't work with cpu offloading · bug · opened Aug 23, 2024 by anmarques
#106 · [Bug]: Index Error tuple out of range · bug · opened Aug 23, 2024 by SeanIsYoung
#105 · Yaml parsing fails with a custom mapping provided to SmoothQuantModifier recipe · bug · opened Aug 22, 2024 by aatkinson
#93 · [Feature] Add a "version": x.x.x entry in the final config.json · enhancement · opened Aug 15, 2024 by mgoin
#91 · Layers not skipped with ignore=[ "re:.*"] · bug · opened Aug 15, 2024 by horheynm
#83 · lm_eval compatibility with generated model · bug · opened Aug 13, 2024 by horheynm
#73 · Llava model quantization seems not be supported · bug · opened Aug 10, 2024 by caojinpei
#72 · Saving an existing 2:4 model as compressed doesn't produce a usable "quant_config" · bug · opened Aug 9, 2024 by mgoin
#69 · MODEL REQUESTS · enhancement · opened Aug 8, 2024 by robertgshaw2-neuralmagic
#68 · FEATURE REQUESTS · enhancement · opened Aug 8, 2024 by robertgshaw2-neuralmagic
#30 · Q3 ROADMAP · roadmap · opened Jul 22, 2024 by robertgshaw2-neuralmagic · 8 of 21 tasks
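
Several of the issues above (#687, #105, #91) revolve around SmoothQuantModifier's ignore and mappings options. For context, a minimal sketch of a one-shot run using both, modeled on the llmcompressor examples from this period; the model name, dataset, and parameter values are illustrative placeholders, not taken from any issue:

    from llmcompressor.transformers import oneshot
    from llmcompressor.modifiers.smoothquant import SmoothQuantModifier
    from llmcompressor.modifiers.quantization import GPTQModifier

    # Custom mapping: pair the projections to smooth with the norm layer
    # that feeds them. #105 reports that supplying a mapping like this
    # through a YAML recipe file fails to parse.
    mappings = [
        [["re:.*q_proj", "re:.*k_proj", "re:.*v_proj"], "re:.*input_layernorm"],
        [["re:.*gate_proj", "re:.*up_proj"], "re:.*post_attention_layernorm"],
    ]

    recipe = [
        SmoothQuantModifier(smoothing_strength=0.8, mappings=mappings),
        # ignore should exclude matching modules from quantization;
        # #687 and #91 report cases where it is not respected.
        GPTQModifier(targets="Linear", scheme="W8A8", ignore=["lm_head"]),
    ]

    oneshot(
        model="meta-llama/Meta-Llama-3-8B-Instruct",  # placeholder model
        dataset="open_platypus",                      # placeholder dataset
        recipe=recipe,
        max_seq_length=2048,
        num_calibration_samples=512,
        output_dir="llama-3-8b-w8a8-smoothquant",
    )

The ignore list accepts the same "re:" regex prefix used in the mappings, which is exactly what #91 exercises with ignore=[ "re:.*"].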