Issues: vllm-project/llm-compressor
#688 · AttributeError: 'MllamaConfig' object has no attribute 'use_cache' · bug · opened Sep 26, 2024 by mgoin
#687 · SmoothQuant doesn't respect ignored modules for VLMs · bug · opened Sep 26, 2024 by mgoin
#660 · KV Cache Quantization example cause problem · bug · opened Sep 25, 2024 by weicheng59
#189 · Qwen1.5-MoE-A2.7B-Chat w4a16 Quantization Failed · bug · opened Sep 20, 2024 by donpromax
#175 · DeepseekV2-w8a8 weight needed in HF · enhancement · opened Sep 13, 2024 by Eviannn
#164 · [USAGE] FP8 W8A8 (+KV) with LORA Adapters · enhancement · opened Sep 11, 2024 by paulliwog
#154 · Error in the file 2:4_w4a16_group-128_recipe.yaml · bug · opened Sep 10, 2024 by carrot-o0o
#148 · When can we support w8a8 fp8 quantization and sparse2:4 llm compress and adapt it on vllm? · enhancement · opened Sep 9, 2024 by leoyuppieqnew
#129 · [Usage] Where to set the tokenizer in the sparse example? · opened Aug 29, 2024 by CharlesRiggins
#107 · SmoothQuant doesn't work with cpu offloading · bug · opened Aug 23, 2024 by anmarques
#106 · [Bug]: Index Error tuple out of range · bug · opened Aug 23, 2024 by SeanIsYoung
#105 · Yaml parsing fails with a custom mapping provided to SmoothQuantModifier recipe · bug · opened Aug 22, 2024 by aatkinson
#93 · [Feature] Add a "version": x.x.x entry in the final config.json · enhancement · opened Aug 15, 2024 by mgoin
#91 · Layers not skipped with ignore=[ "re:.*"] · bug · opened Aug 15, 2024 by horheynm
#83 · lm_eval compatibility with generated model · bug · opened Aug 13, 2024 by horheynm
#73 · Llava model quantization seems not be supported · bug · opened Aug 10, 2024 by caojinpei
#72 · Saving an existing 2:4 model as compressed doesn't produce a usable "quant_config" · bug · opened Aug 9, 2024 by mgoin
#69 · MODEL REQUESTS · enhancement · opened Aug 8, 2024 by robertgshaw2-neuralmagic
#68 · FEATURE REQUESTS · enhancement · opened Aug 8, 2024 by robertgshaw2-neuralmagic
#30 · Q3 ROADMAP · roadmap · opened Jul 22, 2024 by robertgshaw2-neuralmagic · 8 of 21 tasks
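
Several of the issues above (#687, #105, #91) revolve around SmoothQuantModifier's ignore and mappings options. For context, a minimal sketch of a one-shot run using both, modeled on the llmcompressor examples from this period; the model name, dataset, and parameter values are illustrative placeholders, not taken from any issue:

    from llmcompressor.transformers import oneshot
    from llmcompressor.modifiers.smoothquant import SmoothQuantModifier
    from llmcompressor.modifiers.quantization import GPTQModifier

    # Custom mapping: pair the projections to smooth with the norm layer
    # that feeds them. #105 reports that supplying a mapping like this
    # through a YAML recipe file fails to parse.
    mappings = [
        [["re:.*q_proj", "re:.*k_proj", "re:.*v_proj"], "re:.*input_layernorm"],
        [["re:.*gate_proj", "re:.*up_proj"], "re:.*post_attention_layernorm"],
    ]

    recipe = [
        SmoothQuantModifier(smoothing_strength=0.8, mappings=mappings),
        # ignore should exclude matching modules from quantization;
        # #687 and #91 report cases where it is not respected.
        GPTQModifier(targets="Linear", scheme="W8A8", ignore=["lm_head"]),
    ]

    oneshot(
        model="meta-llama/Meta-Llama-3-8B-Instruct",  # placeholder model
        dataset="open_platypus",                      # placeholder dataset
        recipe=recipe,
        max_seq_length=2048,
        num_calibration_samples=512,
        output_dir="llama-3-8b-w8a8-smoothquant",
    )

The ignore list accepts the same "re:" regex prefix used in the mappings, which is exactly what #91 exercises with ignore=[ "re:.*"].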