Pull requests: HabanaAI/vllm-fork

Pull requests list

Enable splitting qkv and gate_up
#662 opened Dec 24, 2024 by tianmu-li
[SW-197036] - use torch._scaled_mm with hpu
#660 opened Dec 22, 2024 by nirda7
Draft: Delayed prompts
#659 opened Dec 20, 2024 by kamil-kaczor Draft
Chunked Prefill
#656 opened Dec 20, 2024 by hlahkar Draft
Lora manager tests fix
#652 opened Dec 19, 2024 by rsshaik1 Draft
Add mark_step for encoder layers
#650 opened Dec 19, 2024 by yisonzhu
[bugfix] fix RuntimeError on apc
#648 opened Dec 19, 2024 by kkimmk
Fix: selecting correct backend for MultiHeadAttention (label: habana)
#645 opened Dec 18, 2024 by adobrzyniewicz-habana
Fix model OOM issue in llama-405 and mixtral - 2nd attempt (label: habana)
#644 opened Dec 18, 2024 by afierka-intel
Selective merged prefill
#643 opened Dec 18, 2024 by xuechendi
Multimodality fix for llava (label: habana)
#641 opened Dec 17, 2024 by adobrzyniewicz-habana
Add inc fp8 quantization documentation
#635 opened Dec 16, 2024 by nirda7
Device Type HPU support for torch.generator() API
#628 opened Dec 13, 2024 by nageshdn
Fix long contexts in LoRA
#624 opened Dec 12, 2024 by SanjuCSudhakaran
[BUG fix] Rebase caused spec decode fix
#613 opened Dec 11, 2024 by xuechendi
[WIP] Add HPU support to vLLM v1 - cont.
#609 opened Dec 10, 2024 by kzawora-intel
21 of 23 tasks
Add in Dockerfile.hpu.ubi
#602 opened Dec 9, 2024 by Xaenalt
Add real BS & seq_len to profiling
#601 opened Dec 9, 2024 by kamil-kaczor
Documentation update for 1.19
#597 opened Dec 5, 2024 by PatrykWo