forked from vllm-project/vllm
Pull requests: HabanaAI/vllm-fork
The [habana] label marks issues or PRs submitted by Habana Labs.

- #657: Port "Fix recompilations due to different batch_sizes in MSS PR637" (opened Dec 20, 2024 by iboiko-habana)
- #647: Support throughput benchmarking for mllama with vision input (opened Dec 19, 2024 by yisonzhu)
- #645: Fix: selecting correct backend for MultiHeadAttention [habana] (opened Dec 18, 2024 by adobrzyniewicz-habana)
- #644: Fix model OOM issue in llama-405 and mixtral - 2nd attempt [habana] (opened Dec 18, 2024 by afierka-intel)
- #641: Multimodality fix for llava [habana] (opened Dec 17, 2024 by adobrzyniewicz-habana)
- #609: [WIP] Add HPU support to vLLM v1 - cont. (opened Dec 10, 2024 by kzawora-intel; 21 of 23 tasks complete)
- #603: Optimize for topk=1 case if we do not handle duplicates (opened Dec 9, 2024 by ssarkar2)