-
Since you asked for feedback: I have two mental models when working in Emacs:
-
> I assume by "second model" you mean the one where anything typed by the user (anywhere in the buffer) is unconditionally assigned the user role, which is gptel's current behavior.
Seems like I got a lot of the models mixed up across multiple replies here, so I'll just make it clear: I would prefer that any change I make to the chat buffer be seen as a change to the conversational history. If I add text to the assistant's response, it is still the assistant's response. If I yank part of its text and add it to MY response, it becomes part of MY response.
The current behavior is something I usually try to avoid.
Right. Implicitly you are assuming that the response == a region of the buffer, and not the contents of the response text itself. This is the behavior you would expect, for example, if you used an overlay to track the response bounds. I understand the appeal of this approach, but when you implement it it turns out not to work very well.
gptel used to work this way and it was fine some of the time (~80%), but there were too many edge cases, and it was easy to lose this overlay (or overlay-like behavior) because of the various things Emacs and other minor modes do in buffers. If I can find a way to implement this behavior robustly I'll add it again.
-
potential need for quoting tags

Indeed, we really need to be able to edit the "response" part, which in the current case will fragment it and require

Also, you might want to use part of the response in your request. Currently, I paste it in the terminal with

I think the best solution would be a visible structure, similar to HTML/XML tags. Similar to JSON. Similar to Emacs' S-expressions. I mean using opening and closing tags.

That structure could be orthogonal to the existing structure in the document: It would be ignored by

So that would be

There would be no ambiguity anymore: We could edit, copy, or do whatever we want, provided we don't meddle with the intertwined new tags.

(Note: if we want to be able to quote the tags of the new structure, which is far from unimaginable, we can add a UUID to the tags:
-
@Inkbottle007, @daedsidog I think you are taking the "response == buffer region" semantic model for granted. This is why indicating responses visually, copying text (etc) doesn't work how you expect.

The "response == buffer region" model is fine, and I'm okay switching to it since it's the popular one so far. The question is how to implement it. The behavior you want maps 1:1 to overlays, so the easy fix would be to use overlays instead of (or in addition to) text properties to demarcate response boundaries. Then everything, including copying response text to other buffers and visual indications of response regions will work as you expect.

Switching to overlays is still a fair amount of work, though. As a precursor and preview to that, you can try the following:

```elisp
(setf (alist-get 'gptel text-property-default-nonsticky nil 'remove) nil)
```

This should do most of what you want, but it introduces some hidden gotchas, especially in markdown-mode.
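To see what that setting changes, here is a minimal sketch, assuming the only relevant knob is whether `gptel` is listed in `text-property-default-nonsticky`. The helper `demo-typed-text-role` is hypothetical and not part of gptel. It uses `insert-and-inherit`, which applies the same stickiness rules as typing, so it can run without user input.

```elisp
;; Sketch: how `text-property-default-nonsticky' decides whether text typed
;; right after a response inherits the `gptel' property.
;; `demo-typed-text-role' is a hypothetical helper, not part of gptel.
(defun demo-typed-text-role (nonsticky)
  "Return the `gptel' property of text inserted right after a response.
If NONSTICKY is non-nil, list `gptel' in `text-property-default-nonsticky'."
  (with-temp-buffer
    (setq-local text-property-default-nonsticky
                (if nonsticky
                    (cons '(gptel . t) text-property-default-nonsticky)
                  (assq-delete-all
                   'gptel (copy-alist text-property-default-nonsticky))))
    (insert (propertize "LLM response" 'gptel 'response))
    ;; `insert-and-inherit' honors stickiness, as self-inserted text does.
    (insert-and-inherit " typed by user")
    (get-text-property (1- (point-max)) 'gptel)))
```

With `gptel` listed as nonsticky, `(demo-typed-text-role t)` returns `nil`, so the new text counts as user text. With the entry removed, `(demo-typed-text-role nil)` returns `response`, so the new text joins the response, which is closer to the "response == buffer region" model.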
-
What about using invisible open/close tags? Just as the

This approach would address the issue where subsequent user text is mistaken for a response, as discussed earlier. Importantly, it wouldn't appear in the document's actual text, maintaining the goal of non-intrusiveness.

Of course, if the user meddles with those characters, everything will end up mixed up again. That's why I thought that showing those boundaries would help. However, I understand the intention to keep gptel seamless and invisible within the workflow.
-
In the
It needs some testing, so if you're interested please switch to the
-
I've found the first problem with using overlays to track responses -- if you kill text and undo, or even just undo and redo, the overlays don't come back so the tracking is gone.
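A minimal sketch of that failure mode, assuming that deleting a region and re-inserting the deleted string is a fair stand-in for kill followed by undo. The `demo-role` property and the function name are made up for illustration; they are not gptel's.

```elisp
;; Sketch: delete a marked region and re-insert the deleted text, which is
;; roughly what kill followed by undo does.  `demo-role' is a made-up name.
(defun demo-kill-and-restore ()
  "Return (PROPERTY-AFTER-RESTORE OVERLAY-START OVERLAY-END)."
  (with-temp-buffer
    (insert "LLM response")
    (let ((ov (make-overlay (point-min) (point-max))))
      ;; Mark the same text both ways.
      (put-text-property (point-min) (point-max) 'demo-role 'response)
      (overlay-put ov 'demo-role 'response)
      ;; The deleted string keeps its text properties; the overlay
      ;; collapses to an empty range at the deletion point.
      (let ((killed (delete-and-extract-region (point-min) (point-max))))
        (insert killed))
      (list (get-text-property (point-min) 'demo-role)
            (overlay-start ov)
            (overlay-end ov)))))
```

Calling `(demo-kill-and-restore)` returns `(response 1 1)`: the text property comes back with the text, while the overlay stays empty and no longer covers the restored response.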
-
I have a workflow that works. I use a simple function that unambiguously highlights the attributions, and then you just have to fix the inconsistencies by hand (using

I think we should keep the text-property-based attribution system. Since the text is natural language, I don't think there is a single (unified) solution that addresses all cases. However, a hybrid solution like the workflow described here should be sufficient. There is the question of hooking the highlighting function onto changes in the buffer, but I don't know how this could be done without being CPU intensive.

```elisp
(require 'text-property-search)

(defun gptel-highlight-responses ()
  "Highlight response segments with overlays."
  (interactive)
  ;; Clear overlays from a previous run so they do not stack up.
  (remove-overlays (point-min) (point-max) 'gptel-response-overlay t)
  (save-excursion
    (goto-char (point-max))
    ;; Bind `prop' locally instead of setting a global variable.
    (let (prop)
      ;; Walk backward over alternating response and non-response regions.
      (while (setq prop (text-property-search-backward
                         'gptel 'response
                         (when (get-char-property (max (point-min) (1- (point)))
                                                  'gptel)
                           t)))
        (let ((role (if (prop-match-value prop) "assistant" "user"))
              (overlay (make-overlay (prop-match-beginning prop)
                                     (prop-match-end prop))))
          (overlay-put overlay 'face
                       `(:background ,(if (equal role "assistant")
                                          "lightblue"
                                        "lightgreen")
                         :extend t))
          ;; Tag the overlay so it can be found and removed later.
          (overlay-put overlay 'gptel-response-overlay t))))))
```

Additional information can be found here.
-
I believe the use of highlighting is the right answer to this question. I've been using @daedsidog's take on this solution for two weeks, but have reverted to my own implementation based on overlays that "I have reason to believe" are best for the intended scenario.

My version seems very robust, perhaps to the point of being a drawback, and requires some tweaking. However, you can use it already if you want, because it is very convenient. Note that I am using the default convention of

(I already have specific ideas on how to do the tweaking and will work on it asap.)
-
Currently gptel tags the text of LLM responses so it can distinguish between its responses and user prompts. The exact way it does this in Elisp is irrelevant (or not yet relevant) to this discussion. As it turns out, there are several subtleties to this behavior that are unresolved.
To figure these out, I would like your input and feedback on the following two questions:
1. If you move the cursor into a response region and type in text, should that new text be considered part of the response, or should it break the response into two regions separated by a new user prompt?
2. If you copy some text from a response region and yank it -- elsewhere into this buffer or into another one -- should it continue to be recognized by gptel as an LLM response, or is it now part of the user prompt?
Before you reply: I've heard from users who believe it should obviously work this way, and would not understand why anyone would want the opposite behavior... for both values of this. Consider that there are situations where both possible behaviors are useful. The question is about your mental model of the response: is the LLM response a feature of the text itself, or is it a feature of the position and context of the text in the buffer? (As you might expect, these correspond roughly to two ways of marking text in Emacs buffers, with text-properties or overlays.)
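To make the second question concrete, here is a small sketch of how the two marking mechanisms behave when response text is copied. It is an illustration only: the `demo-role` property and the function name are invented here and say nothing about how gptel tags text internally.

```elisp
;; Sketch: compare how a text property and an overlay behave when
;; "response" text is copied elsewhere in the buffer.  The `demo-role'
;; property and the function name are made up for this illustration.
(defun demo-copy-marking ()
  "Return (PROPERTY-ON-COPY . OVERLAY-COUNT-ON-COPY) after copying marked text."
  (with-temp-buffer
    (insert "user prompt\n")
    (let* ((start (point))
           (end (progn (insert "LLM response") (point)))
           (ov (make-overlay start end)))
      ;; Mark the same region both ways.
      (put-text-property start end 'demo-role 'response)
      (overlay-put ov 'demo-role 'response)
      ;; Copy the response text and insert the copy on a new line.
      (let ((copy (buffer-substring start end))
            (yank-start (progn (goto-char (point-max))
                               (insert "\n")
                               (point))))
        (insert copy)
        (cons (get-text-property yank-start 'demo-role)
              (length (overlays-at yank-start)))))))
```

Here `(demo-copy-marking)` returns `(response . 0)`: the text property travels with the copied text (response as a feature of the text), while the overlay stays where it was created (response as a feature of the buffer position).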