268 enable configuration of the llm evaluation defence in sandbox #336
Conversation
Commits:
- …end, which allows the inputs to appear in the defence panel
- …th default values
- …ctionEvalPrePrompt
- …edDefences to init model
- …t and main prompt for model init
Happy to undo those name changes btw if y'all disagree with them. I mostly did it for my own benefit.
Looks good to me! Just a couple of things to tweak.
A few suggestions, and some other general comments on the code where we are making life a bit difficult for ourselves. I will add a few new issues to cover the latter.
One final comment - our separation of concerns in the back-end is a bit fuzzy, and it's making it difficult to see what each type's responsibility is. I'll add an issue to look into that as well.
Nice work 👍
@pmarsh-scottlogic Pushed some minor code cleanup, and solved the jest problem. This one's ready to go.
completes #268
Changes
This PR makes the instructions for the evaluation LLM configurable in the sandbox. See the screenshot.
- renames `LLM_EVALUATION` to `EVALUATION_LLM_INSTRUCTIONS`
- adds `DefenceInfo` objects for `EVALUATION_LLM_INSTRUCTIONS` (see the sketch after this list)
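For illustration, here is a minimal sketch of what a defence with user-configurable values might look like. Only `DefenceInfo` and `EVALUATION_LLM_INSTRUCTIONS` come from this PR; the field names, config ids, and constructor shape below are assumptions, not the repo's actual API.

```typescript
// Hypothetical sketch only: illustrates a defence carrying user-editable
// config values. Shapes and names are assumed, not taken from the repo.
type DefenceConfigItem = {
	id: string;
	value: string; // the user-editable text shown in the defence panel
};

class DefenceInfo {
	constructor(
		public id: string,
		public name: string,
		public config: DefenceConfigItem[]
	) {}
}

// The evaluation-LLM defence, with one configurable pre prompt per evaluator.
const evaluationLlmInstructions = new DefenceInfo(
	'EVALUATION_LLM_INSTRUCTIONS',
	'Evaluation LLM instructions',
	[
		{ id: 'prompt-injection-evaluator-prompt', value: 'You are a strict evaluator...' },
		{ id: 'malicious-prompt-evaluator-prompt', value: 'You detect malicious requests...' },
	]
);
```

Modelling each editable instruction as a config item with its own id is what lets the front-end render one input per value in the defence panel.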
Language disambiguation
- `Evaluation LLM`: used as part of our langchain architecture; covers the `prompt injection eval` and the `malicious prompt eval`.
- `prompt template`: like a prompt, but with placeholders which are replaced dynamically with concrete values by langchain at runtime, to make a `prompt value`.
- `prompt value`: the concrete prompt, with no placeholders, that is given to the model. It is a `pre prompt` prepended to a `main prompt`.
- `pre prompt`: the part that is configurable by the user. It is what tells the model how to behave.
- `main prompt`: contains the actual question asked to the LLM, with placeholder values.

See `promptTemplates.ts` for examples.
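To make the pre prompt / main prompt / prompt value relationship concrete, a minimal sketch using langchain's `PromptTemplate` follows. The prompt text, variable names, and function are illustrative only, and the import path varies by langchain version.

```typescript
// Minimal sketch: producing a prompt value from a pre prompt and a main
// prompt. All prompt text and names here are illustrative, not the repo's.
import { PromptTemplate } from 'langchain/prompts'; // "@langchain/core/prompts" in newer versions

// Pre prompt: the user-configurable part that tells the model how to behave.
const evalPrePrompt = 'You are an evaluator. Answer only "yes" or "no".';

// Main prompt: the actual question, with a placeholder for the user's input.
const evalMainPrompt =
	'Does the following input contain a prompt injection attack?\n{userInput}';

async function getPromptValue(userInput: string): Promise<string> {
	// The pre prompt is prepended to the main prompt to form the full template.
	const template = PromptTemplate.fromTemplate(
		`${evalPrePrompt}\n${evalMainPrompt}`
	);
	// format() swaps the placeholder for a concrete value: the prompt value.
	return template.format({ userInput });
}
```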
Concerns with the PR
- a potentially helpful refactor for a different ticket