I run
./llama-server --host 0.0.0.0 -m ./models/gpt2-xl.Q4_K_M.gguf
It provides a text completion API that is similar to, but not quite compatible with, OpenAI's. It only supports one completion at a time, so using Ctrl+Shift+Space a lot is a must. I split the OpenAI-compat function into a helper function that both the openai-compat and llama.cpp support make use of.
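A rough sketch of what that shared-helper split could look like (function and parameter names here are my own illustration, not the actual code in this PR): both backends build the same OpenAI-style request body, and the llama.cpp path clamps the number of completions to one since llama-server returns a single completion per request.

```python
def build_completion_payload(prompt, model, max_tokens=256, temperature=1.0, n=1):
    """Build an OpenAI-style text completion request body.

    Hypothetical shared helper: the openai-compat backend and the
    llama.cpp backend would both construct their request from this.
    """
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "n": n,
    }


def llama_cpp_payload(prompt, model, **kwargs):
    """llama-server handles one completion at a time, so force n=1."""
    payload = build_completion_payload(prompt, model, **kwargs)
    payload["n"] = 1
    return payload
```

The openai-compat backend can pass `n > 1` through unchanged, while the llama.cpp backend silently requests a single completion, which is what makes repeated generation (Ctrl+Shift+Space) necessary.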
Here is a working loom config:
Note: llama-server requires the endpoint to be /v1/completions; a URL with a trailing slash (/v1/completions/) will not work.

Future work: get llama.cpp to support multiple completions.
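One way to guard against the trailing-slash pitfall is to normalize the base URL when building the endpoint (a small sketch; the function name is my own, not part of loom or llama.cpp):

```python
def completions_url(base_url):
    """Join a server base URL with the /v1/completions endpoint.

    Strips any trailing slash from the base first, so the result is
    always .../v1/completions and never .../v1/completions/ (which
    llama-server rejects).
    """
    return base_url.rstrip("/") + "/v1/completions"
```

For example, `completions_url("http://0.0.0.0:8080/")` and `completions_url("http://0.0.0.0:8080")` both yield `http://0.0.0.0:8080/v1/completions`.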