
[Looking for feedback] Add support for Function Calls #209

Status: Open · wants to merge 10 commits into base: master
Conversation


@isaacphi isaacphi commented Feb 7, 2024

Demo

I made a quick screen recording showing the proof of concept:
https://screenapp.io/app/#/shared/6d146099-4bc3-4b34-afd7-bff3d16ee0ed

Overview

I've added support for Function Calling to gptel. This is just a proof of concept that I wanted to share before going any further.

First I want to thank you for making this package! It's become a seamless part of my Emacs workflow because of how well it blends in, in particular using gptel-send in some of my own scripts. I think that supporting function calls would fit well with the design philosophy of this package, allowing users to define their own callable functions for their own use cases. Some use cases I can think of offhand:

  • Asking GPT to create a file or files for you based on the context of an above chat
  • Requesting a diff of only specific lines of a file and interactively merging the diff
  • Orchestrating "Retrieval Augmented Generation", pulling in context from other files in your project or from the internet.
    I think that because of how flexible Emacs is, this could be the basis of a copilot more capable than what any other editor offers.

I should note that function calls are not specific to OpenAI; they're available in many, if not all, of the other models you support.

Before moving on, I'd like to know if this is even something you'd be interested in including in gptel. I'd also like to know your thoughts on how the user would use this feature. So far this is what I'm thinking:

  • The user defines a list of global functions that will be provided to the function-calling API. gptel can introspect these functions and automatically generate the structured information about them that the API requires.
  • In the transient, function calls can be toggled on or off, or a specific function can be forced to be called (the API allows for this).
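To make that concrete, here is a rough sketch of what a single entry of such a schema could look like in OpenAI's "tools" format, written in the plists-and-vectors style gptel already uses when serializing JSON. The `create_file` function and its parameters are purely hypothetical examples, not part of this PR:

```elisp
;; Hypothetical sketch of one `gptel-callable-functions' entry in the
;; OpenAI tools schema.  Plists become JSON objects, vectors become
;; JSON arrays when encoded.
(setq gptel-callable-functions
      '[(:type "function"
         :function
         (:name "create_file"
          :description "Create a file with the given contents"
          :parameters
          (:type "object"
           :properties
           (:path (:type "string" :description "Where to create the file")
            :contents (:type "string" :description "What to write into it"))
           :required ["path" "contents"])))])
```

If gptel introspects user functions as proposed above, entries like this could be generated automatically rather than written by hand.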

How to test this PR

  1. Run all the elisp in the file I included called test-user-config.el
  2. Open a chat using the model I called "OpenAI with function calls" (I turned off streaming because only non-streaming responses are supported for now)
  3. Ask the chat to create new files or to have a cow say something. If it doesn't recognize your prompt as relating to either of those topics, it will respond normally.

Let me know what you think! Feel free to open a discussion if you think that's a more appropriate venue for this.
Related issue: #76

@@ -386,6 +386,9 @@ To set the model for a chat session interactively call
(const :tag "GPT 4 32k" "gpt-4-32k")
(const :tag "GPT 4 1106 (preview)" "gpt-4-1106-preview")))

(defcustom gptel-callable-functions nil
isaacphi (Author):
I wasn't sure where to put this

@@ -73,16 +74,24 @@
(apply #'concat (nreverse content-strs))))

(cl-defmethod gptel--parse-response ((_backend gptel-openai) response _info)
(map-nested-elt response '(:choices 0 :message :content)))
;; If the reply specifies a function call, parse and return it instead of the message
isaacphi (Author):

If GPT decides a function should be called, the structured data is included in tool_calls instead of content, so the parsing needs to be updated.
For now I'm only parsing the full response; streaming is not supported yet (but could be).
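As a rough illustration of that parsing change (field names follow OpenAI's response format; this method body is a sketch, not necessarily the PR's exact code):

```elisp
;; Illustrative only: return the tool-call plist when present,
;; otherwise fall back to the plain message content.
(cl-defmethod gptel--parse-response ((_backend gptel-openai) response _info)
  (let ((message (map-nested-elt response '(:choices 0 :message))))
    (or (plist-get message :tool_calls)
        (plist-get message :content))))
```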

Comment on lines +93 to +94
(when gptel-callable-functions
(plist-put prompts-plist :tools gptel-callable-functions))
isaacphi (Author):

This is the key change. It could be toggled on or off based on a setting in the transient. A specific function call can also be forced by using the tool_choice parameter.
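Sketching both halves of that idea together (the `gptel-forced-function` variable here is hypothetical, standing in for whatever toggle the transient would expose; `tool_choice` follows the OpenAI request format):

```elisp
;; Sketch: attach the tool list to the request, and optionally force
;; one particular function via OpenAI's tool_choice parameter.
(when gptel-callable-functions
  (plist-put prompts-plist :tools gptel-callable-functions)
  (when gptel-forced-function   ; hypothetical setting from the transient
    (plist-put prompts-plist :tool_choice
               `(:type "function"
                 :function (:name ,gptel-forced-function)))))
```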


;;; Callable function schema
(setq! gptel-callable-functions
;; Hard coded variable specifying callable functions. This could be defined in a user's configuration
isaacphi (Author):

Given a list of Emacs functions, we should be able to introspect them to generate this schema data, if that's preferable.
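One hedged sketch of that introspection, using the function's arglist and docstring. Everything here (the `my/` name, defaulting every argument type to string) is illustrative; a real version would need some way for users to declare argument types and per-argument descriptions:

```elisp
(require 'cl-lib)
(require 'help-fns)  ; for `help-function-arglist'

(defun my/gptel-function-spec (fn)
  "Return an OpenAI-style tool spec for FN, a function symbol.
Sketch only: handles simple arglists and defaults every
argument type to \"string\"."
  (let ((args (help-function-arglist fn t))
        (doc (or (documentation fn) "")))
    `(:type "function"
      :function
      (:name ,(symbol-name fn)
       :description ,doc
       :parameters
       (:type "object"
        :properties
        ,(cl-loop for a in args
                  unless (memq a '(&optional &rest))
                  append (list (intern (concat ":" (symbol-name a)))
                               '(:type "string"))))))))
```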

(cdr item))))
plist))

(cl-defun gptel-run-function-on-region (beg end)
isaacphi (Author):

For now this is just a post-response hook function, but this would likely be more appropriate in the core gptel source. I'm not sure exactly where though.
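Purely to illustrate what such a hook could do, here is a sketch that parses a tool call out of the response region and dispatches it to the named elisp function. The `my/` name is hypothetical, and mapping the JSON argument object onto positional arguments in order is fragile; a real implementation would match arguments by name:

```elisp
(require 'cl-lib)
(require 'subr-x)  ; for `when-let*' on older Emacs

(defun my/gptel-run-tool-call (beg end)
  "Sketch: if the text between BEG and END is a tool call, run it.
Requires Emacs 27+ for the built-in `json-parse-string'."
  (when-let* ((call (ignore-errors
                      (json-parse-string
                       (buffer-substring-no-properties beg end)
                       :object-type 'plist)))
              (fn (plist-get call :function))
              (name (plist-get fn :name))
              (args (json-parse-string (plist-get fn :arguments)
                                       :object-type 'plist)))
    ;; Fragile by design of this sketch: arguments are applied in the
    ;; order they appear in the JSON object.
    (apply (intern name)
           (cl-loop for (_key value) on args by #'cddr collect value))))
```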

@karthink (Owner) commented Feb 7, 2024

This is fantastic -- and quite a small change too! Thanks for the PR, this looks quite promising.

I'm not familiar with the OpenAI function-calling API. I'll take a look at it soon and then think about the UI, it'll probably be a few days from now.

@cosmicz commented Oct 7, 2024

This PR would be nice to have for my use case. Anything I can do to push it along?

@karthink (Owner) commented Oct 7, 2024

> This PR would be nice to have for my use case. Anything I can do to push it along?

Function calling is planned to be part of the features I'm working on in the feature-capabilities branch. This branch adds the infrastructure needed for per-model capability specification in gptel, as well as image support for supported models. The design is not final yet.

If you're interested, you could switch to the feature-capabilities branch and add a tool-use capability to the model specification. Then we need to handle the different function calling APIs (as provided by OpenAI, Anthropic, Gemini, Ollama) and create a uniform interface for gptel. I'm not sure how much of this PR can carry over as I haven't looked at it in a while.
