
add rules to promptmixing #421

Closed
wants to merge 9 commits into from

Conversation

Contributor
@abeatrix commented Jul 27, 2023

RE: https://github.com/sourcegraph/sourcegraph/issues/52061

Issue: Cody forgets its identity and the rules (which prevent hallucination) that we defined for the LLM once the conversation gets longer, affecting output quality.

Background context: I was playing around with Custom Recipes and noticed Cody did very well when we provided detailed instructions in each prompt. However, it would forget its name after a few questions, once a dozen context items had been added to the transcript. BUT if I add "replying as Cody" in each recipe and ask Cody for its name, it always answers Cody. That's when I noticed this happens because we define Cody at the very beginning of the conversation; once attention shifts as the conversation gets longer, what was defined at the beginning no longer matters. This is why the rule added to PromptMixin in this PR should improve Cody's responses: it is prepended to every request the user submits, reminding Cody of the format it should follow.

Here are the rules we will be adding, following Anthropic's guidelines to prevent hallucinations and give the model room to think for better response quality:
Provide full workable code as code snippets. Reference only verified file names/paths. Don't make assumptions or fabricate information. Answer only if certain or tell me you don't know. Think step-by-step.

Changes in this PR should fix issues where Cody:

  • forgot its identity
  • hallucinated answers (TBC)
  • did not provide full code when answering code-related questions

Test plan

Here is the output from the same prompt, where "After" includes the Provide full workable code rule added to PromptMixin:

Before (v0.4.2):
[screenshot]

After:
[screenshot]

The prompt used for the test:

Please analyze the code and suggest constructive edits that follow best practices and improve code quality and readability. Focus your responses on being clear, thoughtful, coherent and easy for me to understand. Do not make changes that alter intended functionality without explaining why. Avoid responses that are overly complex or difficult to implement. Strive to provide helpful recommendations based on other shared code snippets.


export function withPreselectedOptions(editor: Editor, preselectedOptions: PrefilledOptions): Editor {
    const proxy = new Proxy<Editor>(editor, {
        get(target: Editor, property: string, receiver: unknown) {
            if (property === 'showQuickPick') {
                return async function showQuickPick(options: string[]): Promise<string | undefined> {
                    for (const [preselectedOption, selectedOption] of preselectedOptions) {
                        if (preselectedOption === options) {
                            return Promise.resolve(selectedOption)
                        }
                    }
                    return target.showQuickPick(options)
                }
            }
            return Reflect.get(target, property, receiver)
        },
    })

    return proxy
}
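As a side note on the snippet above: the Proxy intercepts only `showQuickPick` and falls through to the real editor for everything else. A hypothetical usage might look like this (the `Editor` shape and `PrefilledOptions` type here are minimal stand-ins for illustration, not the real interfaces):

```typescript
// Minimal stand-in types; the real Editor interface is larger.
interface Editor {
    showQuickPick(options: string[]): Promise<string | undefined>
}
type PrefilledOptions = [string[], string][]

function withPreselectedOptions(editor: Editor, preselectedOptions: PrefilledOptions): Editor {
    return new Proxy<Editor>(editor, {
        get(target: Editor, property: string, receiver: unknown) {
            if (property === 'showQuickPick') {
                return async function showQuickPick(options: string[]): Promise<string | undefined> {
                    for (const [preselectedOption, selectedOption] of preselectedOptions) {
                        // Note: reference equality — the caller must pass the same
                        // options array instance that was registered.
                        if (preselectedOption === options) {
                            return selectedOption
                        }
                    }
                    return target.showQuickPick(options)
                }
            }
            return Reflect.get(target, property, receiver)
        },
    })
}

// Usage: the proxied editor answers a registered quick-pick without showing UI.
const fixVariants = ['fix', 'refactor', 'document']
const realEditor: Editor = {
    showQuickPick: async () => undefined, // would open a UI picker in practice
}
const editor = withPreselectedOptions(realEditor, [[fixVariants, 'refactor']])
```

This is handy in tests, where a real quick-pick dialog cannot be shown; unregistered option arrays still reach the underlying editor.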

@abeatrix abeatrix requested review from a team July 27, 2023 23:13
Contributor
@dominiccooney left a comment


Your test plan appears to be about naming as Cody, but it looks like that change is already in. My gut feeling is, yes, reiterating some rules makes sense. But I think we need to do a more thorough evaluation of the effect of this on some different queries to be sure.

@@ -27,11 +27,13 @@ export class PromptMixin {
      * Prepends all mixins to `humanMessage`. Modifies and returns `humanMessage`.
      */
     public static mixInto(humanMessage: InteractionMessage): InteractionMessage {
+        const rules =
+            "Rules: Provide full workable code as code snippets. Reference only verified file names/paths. Don't make assumptions or fabricate information. Think step-by-step. Answer only if certain or tell me you don't know."
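Filling in the rest of the method around the added `rules` constant, the change might look roughly like this (a sketch only; the surrounding mixin-joining logic and type shapes are assumed from the doc comment, not copied from the actual source):

```typescript
// Assumed message shape; the real InteractionMessage has more fields.
interface InteractionMessage {
    speaker: 'human' | 'assistant'
    text?: string
}

class PromptMixin {
    private static mixins: PromptMixin[] = []

    constructor(private readonly prompt: string) {}

    public static add(mixin: PromptMixin): void {
        PromptMixin.mixins.push(mixin)
    }

    /**
     * Prepends all mixins to `humanMessage`. Modifies and returns `humanMessage`.
     */
    public static mixInto(humanMessage: InteractionMessage): InteractionMessage {
        const rules =
            "Rules: Provide full workable code as code snippets. Reference only verified file names/paths. Don't make assumptions or fabricate information. Think step-by-step. Answer only if certain or tell me you don't know."
        // The rules (and any registered mixins) land ahead of the user's text
        // on every turn, so they survive long transcripts.
        const prompts = [rules, ...PromptMixin.mixins.map(m => m.prompt)].join('\n\n')
        humanMessage.text = `${prompts}\n\n${humanMessage.text ?? ''}`
        return humanMessage
    }
}
```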
Contributor

I am unsure about this. I have seen the agent take prompts like "Provide full workable code as code snippets" on input which doesn't require code generation, and "force it" by generating weird code, or complaining about not being able to generate code.

Re: don't make assumptions or fabricate information, my anecdotal experience is that it's usually more reliable to tell it what to do. For example, for improved explain-code, I have been asking it to write down facts and then reference those facts when it summarizes.

Re: "Think step-by-step": Anthropic recommends having the assistant ask to think step-by-step, and then the human replies "OK."

Contributor

Reference for the Anthropic step-by-step recommendation: https://docs.anthropic.com/claude/docs/ask-claude-to-think-step-by-step
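The pattern described in that doc can be sketched as a transcript-priming step: the assistant asks permission to think step-by-step, and a canned human "OK" turn is inserted before the model answers. A rough illustration (the message shape and exact wording here are assumptions, not Cody's actual transcript format):

```typescript
interface Message {
    speaker: 'human' | 'assistant'
    text: string
}

// Build a short priming exchange around the user's real question, following
// the "ask Claude to think step-by-step" pattern: assistant asks, human says OK,
// and the model's next turn is then free to reason before answering.
function withStepByStepPriming(question: string): Message[] {
    return [
        { speaker: 'human', text: question },
        { speaker: 'assistant', text: 'Before I answer, may I think through this step-by-step?' },
        { speaker: 'human', text: 'OK.' },
    ]
}
```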

Contributor Author
@abeatrix commented Jul 28, 2023

Your test plan appears to be about naming as Cody, but it looks like that change is already in.

@dominiccooney The transcript before the naming-as-Cody change shows the differences after adding the Provide full workable code when applicable rule (which I later changed to Provide full workable code as code snippets). Will make an update to be more specific.

My gut feeling is, yes, reiterating some rules makes sense. But I think we need to do a more thorough evaluation of the effect of this on some different queries to be sure.

Something I'm trying to understand: if we are already adding the same rules at the start of the conversation, why would reiterating them be a problem when they are (almost) the same rules we want Cody to follow initially?

IMO adding the basic rules (e.g. asking it not to make stuff up) that apply to most questions would help the chat experience overall, as it sets the standard/fallback for each response. Plus they can be easily overwritten by user input.

For example, right now we have the Reply as Cody rule (already included in the stable release), and you can ask Cody to: Reply as Rick from Rick and Morty and tell me how to create a button in react.

Now Cody will reply as Rick:

[screenshot]

(An uneducated guess:) it seems to me that if we ever want Cody to ignore the "rules", we just need to be more specific in the prompts we build for each use case? Am I misunderstanding how this works? 🤔

What are your thoughts on this?

Re: don't make assumptions or fabricate information

I was trying to shorten the example provided by Anthropic: Answer the following question only if you know the answer or can make a well-informed guess; otherwise tell me you don't know it. 😆 I guess Answer only if certain or tell me you don't know. should be enough?
