I run
./llama-server --host 0.0.0.0 -m ./models/gpt2-xl.Q4_K_M.gguf
It provides a text completion API that is similar to, but not quite compatible with, OpenAI's. It only supports one completion at a time, so using Ctrl+Shift+Space a lot is a must. I split the OpenAI-compat function into a helper function that both the openai-compat and llama.cpp support make use of.
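A rough sketch of what that shared-helper split could look like (function and parameter names here are my own illustration, not the actual code in this PR): both backends build the same OpenAI-style request body, and the llama.cpp path clamps the number of completions to one since llama-server returns a single completion per request.

```python
def build_completion_payload(prompt, model, max_tokens=256, temperature=1.0, n=1):
    """Build an OpenAI-style text completion request body.

    Hypothetical shared helper: the openai-compat backend and the
    llama.cpp backend would both construct their request from this.
    """
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "n": n,
    }


def llama_cpp_payload(prompt, model, **kwargs):
    """llama-server handles one completion at a time, so force n=1."""
    payload = build_completion_payload(prompt, model, **kwargs)
    payload["n"] = 1
    return payload
```

The openai-compat backend can pass `n > 1` through unchanged, while the llama.cpp backend silently requests a single completion, which is what makes repeated generation (Ctrl+Shift+Space) necessary.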
Here is a working loom config:
Note: llama-server requires the endpoint to be /v1/completions; a URL with a trailing slash (/v1/completions/) will not work.

Future work: get llama.cpp to support multiple completions.
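One way to guard against the trailing-slash pitfall is to normalize the base URL when building the endpoint (a small sketch; the function name is my own, not part of loom or llama.cpp):

```python
def completions_url(base_url):
    """Join a server base URL with the /v1/completions endpoint.

    Strips any trailing slash from the base first, so the result is
    always .../v1/completions and never .../v1/completions/ (which
    llama-server rejects).
    """
    return base_url.rstrip("/") + "/v1/completions"
```

For example, `completions_url("http://0.0.0.0:8080/")` and `completions_url("http://0.0.0.0:8080")` both yield `http://0.0.0.0:8080/v1/completions`.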