[Security Assistant] Vertex chat model #193032
Conversation
Pinging @elastic/security-solution (Team: SecuritySolution)
@@ -395,9 +395,7 @@ const formatGeminiPayload = ({
    temperature,
    maxOutputTokens: DEFAULT_TOKEN_LIMIT,
  },
  ...(systemInstruction
    ? { system_instruction: { role: 'user', parts: [{ text: systemInstruction }] } }
@pgayvallet Was there a reason for adding `role`? I don't see it in the API docs. I think all we need is `parts`.
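To illustrate the point, here is a minimal sketch of the payload shape being discussed, with `system_instruction` carrying only `parts`. This is not the actual Kibana helper; `buildPayload` and the `8192` token limit are illustrative stand-ins.

```typescript
// Hypothetical sketch of a Gemini payload builder: `system_instruction`
// holds only `parts`, with no `role` field, per the API docs.
interface GeminiPayload {
  generationConfig: { temperature: number; maxOutputTokens: number };
  system_instruction?: { parts: Array<{ text: string }> };
}

const buildPayload = (
  temperature: number,
  systemInstruction?: string
): GeminiPayload => ({
  generationConfig: { temperature, maxOutputTokens: 8192 },
  // Spread in system_instruction only when a system prompt was provided.
  ...(systemInstruction
    ? { system_instruction: { parts: [{ text: systemInstruction }] } }
    : {}),
});
```

When `systemInstruction` is undefined, the spread contributes nothing, so the key is absent from the request body entirely.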
"The final response is the only output the user sees and should be a complete answer to the user's question. Do not leave out important tool output. The final response should never be empty. Don't forget to use tools."; | ||
const ALLENS_PROMPT = | ||
'You are an assistant that is an expert at using tools and Elastic Security, doing your best to use these tools to answer questions or follow instructions. It is very important to use tools to answer the question or follow the instructions rather than coming up with your own answer. Tool calls are good. Sometimes you may need to make several tool calls to accomplish the task or get an answer to the question that was asked. Use as many tool calls as necessary.'; | ||
const KB_CATCH = |
export const GEMINI_SYSTEM_PROMPT =
  `ALWAYS use the provided tools, as they have access to the latest data and syntax.` +
  "The final response is the only output the user sees and should be a complete answer to the user's question. Do not leave out important tool output. The final response should never be empty. Don't forget to use tools.";
const ALLENS_PROMPT =
Could we rename it to something like `GEMINI_MAIN_SYSTEM_PROMPT`?
Reviewed the code with Steph and ran evaluations. Results for ES|QL generation for Gemini improved, while results for other models like gpt-4o and Sonnet 3.5 remained consistently high, so there are no regressions on other models. Evaluation results for custom knowledge improved for Gemini as well. Thanks for the great work, Steph!
💛 Build succeeded, but was flaky
Starting backport for target branches: 8.x
(cherry picked from commit aae8c50)
💚 All backports created successfully
Note: Successful backport PRs will be merged automatically after passing CI. Questions? Please refer to the Backport tool documentation.
# Backport

This will backport the following commits from `main` to `8.x`:
- [[Security Assistant] Vertex chat model (#193032)](#193032)

### Questions ?
Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Co-authored-by: Steph Milovic <[email protected]>
Summary
Works towards addressing #189771 by changing the Security Assistant over from `ActionsClientGeminiChatModel` to `ActionsClientChatVertexAI`.

Adds a new chat model, `ActionsClientChatVertexAI`, that extends `ChatVertexAI`. This is the model meant to be used with Gemini JSON auth. Our current Gemini chat model (`ActionsClientGeminiChatModel`) extends `ChatGoogleGenerativeAI`, which does not support the same authentication methods as we use with Gemini. Additionally, `ChatVertexAI` uses the proper request body format, while `ChatGoogleGenerativeAI` uses something close that puts the system prompt as a user message rather than in the appropriate `systemInstruction` property. Moving the system prompt to the proper field makes a big difference in result quality.

Prompt improvements
Thanks to help from @afirstenberg, we have a shiny new system prompt for Gemini that is working much better for us. The prompt hammers home how great tool use is, and reinforces behavior with positive statements rather than negative ones. I also applied this positive-reinforcement strategy to the Gemini prompt in `generate_chat_title.ts` and to the prompt for the `nl-to-esql-tool`.
.User prompt
The strategy Allen suggested also includes prepending a "user prompt" to the last user message in the conversation. The "user prompt" does not get saved in persistent history. When the history of the conversation is sent with a follow-up question, you'll only see the "user prompt" in the last user message that is sent to Gemini.
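A minimal sketch of that strategy, assuming a simple message shape: the instruction block is prepended to the last user message of a copied history, so the persisted conversation is never mutated. `withUserPrompt`, `USER_PROMPT`, and the `Message` interface are illustrative, not the actual Kibana implementation.

```typescript
// Hypothetical sketch: prepend a non-persisted "user prompt" to the
// LAST user message only, leaving the stored history untouched.
interface Message {
  role: 'user' | 'assistant';
  content: string;
}

const USER_PROMPT = 'Remember to use the provided tools.\n\n';

const withUserPrompt = (history: Message[]): Message[] => {
  // Find the last user message in the conversation.
  const lastUserIdx = history.map((m) => m.role).lastIndexOf('user');
  if (lastUserIdx === -1) return history;
  // Return a new array with a new object at that index, so the
  // persisted history objects are never modified.
  return history.map((m, i) =>
    i === lastUserIdx ? { ...m, content: USER_PROMPT + m.content } : m
  );
};
```

Because the prompt is added at request time, a follow-up question causes it to "move" to the newest user message; earlier messages in the sent history carry no copy of it.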
You can see an example of the "user prompt" in place within a conversation on this trace. In this trace we see the system prompt at the top as expected, then the conversation history, then only the most recent message has the user prompt prepended:
LangSmith tests
I have my own tests that I run with the assistant, and then tests with the evaluator. In my tests, the first row is the old chat model, `ActionsClientGeminiChatModel`. The next two rows are `ActionsClientChatVertexAI` without streaming and `ActionsClientChatVertexAI` with streaming. `ActionsClientChatVertexAI` successfully passes each test, and outperforms `ActionsClientGeminiChatModel` in some cases.

Code
💚 = Improvement
💛 = Same
❌ = Error Result

`ESQLKnowledgeBaseTool`
`AlertsCountTool`
`OpenAndAcknowledgedAlertsTool`
I ran the ES|QL Generation Regression dataset against the new chat model. There is a significant boost in correctness, and the Vertex model is indeed more regularly invoking the tools than the previous model. We are also not hitting the `GraphRecursionError` we see getting hit with the previous model. The following are screenshots of the ES|QL Generation Regression for each model using `gemini-1.5-pro-002`. You see significant improvements in Vertex.

ES|QL Generation Regression dataset
LangSmith Playground
An advantage to extending `ChatVertexAI` is that since it works with our API `credentialsJson`, we can use the LangSmith playground to test prompts and iterate. To do so, select an `ActionsClientChatVertexAI` model run and hit the playground button. Once in the playground, ensure VertexAI is the selected model and you've entered our valid `credentialsJson` in the Secrets & API Keys area. Now you can iterate on the system prompt to ensure desired results.

To test
Select Gemini as your conversation connector. Run through the prompts from my tests above, or your own prompts that you've found challenged Gemini.