llama.cpp llama-server support #25

Open · wants to merge 2 commits into master
Conversation

@rain-1 commented Sep 20, 2024

I run `./llama-server --host 0.0.0.0 -m ./models/gpt2-xl.Q4_K_M.gguf`. It provides a text completion API that is similar to, but not quite compatible with, the OpenAI API. It only supports one completion at a time, so using Ctrl+Shift+Space a lot is a must.

I split the OpenAI-compat function into a helper function that both the OpenAI-compat and llama.cpp backends make use of.
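
A minimal sketch of what such a shared helper could look like (the function name, parameters, and structure here are hypothetical illustrations, not the actual code in this PR):

```python
import requests

def complete_text(base_url, prompt, n=1, max_tokens=64, temperature=0.8):
    """Request completions from an OpenAI-style /v1/completions endpoint.

    Shared by the OpenAI-compat backend and the llama.cpp backend; the main
    difference between the two is the base_url (and llama-server only
    returning a single completion per request).
    """
    response = requests.post(
        f"{base_url}/v1/completions",
        json={
            "prompt": prompt,
            "n": n,
            "max_tokens": max_tokens,
            "temperature": temperature,
        },
        timeout=120,
    )
    response.raise_for_status()
    data = response.json()
    # Both backends return completions under the "choices" key.
    return [choice["text"] for choice in data["choices"]]
```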

Here is a working loom config:

(screenshot of a working loom config)

Note: llama-server requires the endpoint to be `/v1/completions`; a trailing slash (`/v1/completions/`) will not work.
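
For example, with llama-server started as above, the hypothetical helper sketched earlier could be called like this (assuming llama-server is listening on its default port 8080):

```python
# Hypothetical usage of the complete_text() sketch above.
completions = complete_text(
    base_url="http://localhost:8080",  # no trailing slash after /v1/completions
    prompt="Once upon a time",
    n=1,            # llama-server handles one completion per request
    max_tokens=32,
)
print(completions[0])
```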

Future work: Get llama.cpp to support multiple completions.
