-
I'm using GPT4All and ChatGPT as backends. I discovered that the model I'm using with GPT4All (Llama 3 8B Instruct) needs a min_p sampling parameter in the request. ChatGPT, however, doesn't want that value and responds with an HTTP error if it's included, so I want to be able to include min_p only for the model that needs it. My current configuration:

(use-package! gptel
  :config
  (setq! gptel-model 'Meta-Llama-3-8B-Instruct.Q4_0.gguf
         gptel-backend (gptel-make-gpt4all "llama chat"
                         :protocol "http"
                         :host "localhost:4891"
                         :models '(Meta-Llama-3-8B-Instruct.Q4_0.gguf gpt-4o)))
  (defun my-gptel--request-data (original-fn backend prompts)
    "Wrap ORIGINAL-FN to conditionally include min_p in the request data for BACKEND."
    (let* ((data (funcall original-fn backend prompts))
           (model (plist-get data :model)))
      (when (string= model "Meta-Llama-3-8B-Instruct.Q4_0.gguf") ;; brittle and doesn't scale
        (plist-put data :min_p 0.1))
      data))
  (advice-add 'gptel--request-data :around #'my-gptel--request-data)
)

My idea so far is the :around advice above, but as the comment notes, matching on the model name string is brittle and doesn't scale.
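A variant of the same advice keyed off the backend name (using gptel-backend-name, gptel's accessor for a backend's name slot) would at least avoid hard-coding the model, though it's still a workaround. A sketch, untested:

;; Same :around advice idea, but matching on the backend's name
;; instead of the model string. Still a workaround.
(defun my-gptel--request-data-by-backend (original-fn backend prompts)
  "Add :min_p to the request data only for the \"llama chat\" BACKEND."
  (let ((data (funcall original-fn backend prompts)))
    (when (equal (gptel-backend-name backend) "llama chat")
      (plist-put data :min_p 0.1))
    data))
;; (advice-add 'gptel--request-data :around #'my-gptel--request-data-by-backend)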
-
Thanks for bringing this up; this has been bothering me for a while. This feature is also needed by others (see #330 and #393), so it's clear that we need a general way to do this. Adding a dedicated variable for each such request parameter doesn't scale. Here are two possible specifications, with the new key marked HERE.

Option 1, backend-wide params:
(gptel-make-gpt4all "llama chat"
  :protocol "http"
  :host "localhost:4891"
  :models '(Meta-Llama-3-8B-Instruct.Q4_0.gguf)
  :params '(:min_p 0.1)) ;; <-- HERE
Option 2, model-specific params:

(gptel-make-gpt4all "llama chat"
  :protocol "http"
  :host "localhost:4891"
  :models '((Meta-Llama-3-8B-Instruct.Q4_0.gguf :params (:min_p 0.1)) ;; <-- HERE
            gpt-4o))
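To make the semantics concrete, here's a rough, hypothetical sketch of the merge step (my--merge-request-params is illustrative only, not gptel internals): backend-wide params would be applied first, then model-specific params on top, so the model-level value wins on conflict:

(require 'cl-lib)

;; Hypothetical sketch: merge backend-wide params, then model-specific
;; params, into the request plist DATA. Later values overwrite earlier
;; ones because plist-put replaces an existing key in place.
(defun my--merge-request-params (data backend-params model-params)
  "Merge BACKEND-PARAMS, then MODEL-PARAMS, into the request plist DATA."
  (dolist (params (list backend-params model-params) data)
    (cl-loop for (key val) on params by #'cddr
             do (setq data (plist-put data key val)))))

;; (my--merge-request-params (list :model "m") '(:min_p 0.1) '(:min_p 0.2))
;;   => (:model "m" :min_p 0.2)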
What do you think?
-
I've added support for :request-params. Either of the above specifications (backend-wide or model-specific configuration) should now work. Please test and let me know if it's satisfactory.
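For reference, a minimal sketch of the resulting configuration, reusing the host and models from the original post; it assumes, per this thread, that :request-params is accepted both backend-wide and inside an individual model's plist (check gptel's README for the authoritative syntax):

;; Sketch only: :request-params supplied backend-wide and/or per model.
(gptel-make-gpt4all "llama chat"
  :protocol "http"
  :host "localhost:4891"
  ;; Backend-wide: sent with every request to this backend.
  :request-params '(:min_p 0.1)
  ;; Model-specific: sent only when this model is selected.
  :models '((Meta-Llama-3-8B-Instruct.Q4_0.gguf :request-params (:min_p 0.1))
            gpt-4o))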