
add VLLM support? #1010

Open
cboettig opened this issue Sep 22, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

@cboettig

Problem/Solution

It has been great to see Ollama added as a first-class option in #646; this has made it easy to access a huge variety of models and has been working very well for us.

I increasingly see groups and university providers using VLLM for this as well. I'm out of my depth here, but my understanding is that VLLM is considered better suited when a group is serving a local model to multiple users (e.g. from a shared GPU cluster), rather than everyone running an independent Ollama instance. It gets passing mention in some threads here as well. I think supporting more providers is all to the good, and I would love to see VLLM supported as a backend in the same way Ollama is, though maybe I'm not understanding the details and that is unnecessary? (i.e. it looks like it might be possible to simply use the OpenAI configuration with an alternative endpoint to access a VLLM server?)
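For what it's worth, here is a minimal sketch of what "OpenAI configuration with an alternative endpoint" could look like against a VLLM server's OpenAI-compatible REST API, using only the Python standard library. The host, port, and model name are assumptions for illustration; adjust for your deployment.

```python
# Sketch: query a vLLM server through its OpenAI-compatible API.
# Assumes a vLLM server is running locally (host/port/model are placeholders).
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"          # assumed vLLM endpoint
MODEL = "meta-llama/Llama-3.1-8B-Instruct"     # must match the served model


def build_request(prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat-completion request for the server."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            # vLLM ignores the key unless the server was started with one.
            "Authorization": "Bearer EMPTY",
        },
    )


if __name__ == "__main__":
    # Sends the request; requires a live server at BASE_URL.
    with urllib.request.urlopen(build_request("Hello!")) as resp:
        reply = json.load(resp)
        print(reply["choices"][0]["message"]["content"])
```

Any client that speaks the OpenAI wire format (including the official `openai` package with `base_url` overridden) should work the same way, which may be why a dedicated backend feels optional here.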

@cboettig cboettig added the enhancement New feature or request label Sep 22, 2024
@cboettig
Author

cboettig commented Oct 3, 2024

It looks like the team at the National Research Platform has a nice workaround for this at the moment: LiteLLM, via its OpenAI-compatible API (https://docs.litellm.ai/docs/proxy/user_keys). This works, though it isn't really the same as direct VLLM support, but I thought it was worth mentioning.
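To illustrate: the request is the same OpenAI wire format, just aimed at the LiteLLM proxy instead of VLLM directly, so only the base URL, key, and model alias change. All three values below are placeholders.

```python
# Same OpenAI-style request, pointed at a LiteLLM proxy (placeholders).
import json
import urllib.request

PROXY_URL = "http://localhost:4000"   # assumed LiteLLM proxy address
API_KEY = "sk-1234"                   # a LiteLLM virtual key (placeholder)

req = urllib.request.Request(
    f"{PROXY_URL}/chat/completions",
    data=json.dumps({
        "model": "my-vllm-model",     # the model alias configured in the proxy
        "messages": [{"role": "user", "content": "Hello!"}],
    }).encode(),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}",
    },
)
# urllib.request.urlopen(req) would send it to a running proxy.
```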
