
[Security Assistant] Vertex chat model #193032

Merged: 32 commits into elastic:main on Oct 4, 2024

Conversation

stephmilovic
Contributor

@stephmilovic stephmilovic commented Sep 16, 2024

Summary

Works towards addressing #189771 by changing the Security Assistant over from ActionsClientGeminiChatModel to ActionsClientChatVertexAI.

Adds a new chat model, ActionsClientChatVertexAI, that extends ChatVertexAI. This is the model meant to be used with Gemini JSON auth. Our current Gemini chat model (ActionsClientGeminiChatModel) extends ChatGoogleGenerativeAI, which does not support the authentication methods we use with Gemini. Additionally, ChatVertexAI uses the proper request body format, while ChatGoogleGenerativeAI uses a similar but incorrect format that sends the system prompt as a user message rather than in the dedicated systemInstruction property. Moving the system prompt to the proper field makes a big difference in result quality.
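The payload difference can be sketched roughly as follows. The shapes and function names here are illustrative only, not the actual Kibana or LangChain code:

```typescript
// Illustrative sketch of the two payload shapes. All names here are
// hypothetical; they are not the real Kibana or LangChain internals.

interface GeminiPart {
  text: string;
}
interface GeminiContent {
  role: 'user' | 'model';
  parts: GeminiPart[];
}

// Old shape: the system prompt is smuggled in as an extra user message.
function buildLegacyPayload(systemPrompt: string, userMessage: string) {
  const contents: GeminiContent[] = [
    { role: 'user', parts: [{ text: systemPrompt }] },
    { role: 'user', parts: [{ text: userMessage }] },
  ];
  return { contents };
}

// Vertex shape: the system prompt lives in its own systemInstruction field.
function buildVertexPayload(systemPrompt: string, userMessage: string) {
  const contents: GeminiContent[] = [
    { role: 'user', parts: [{ text: userMessage }] },
  ];
  return {
    systemInstruction: { parts: [{ text: systemPrompt }] },
    contents,
  };
}
```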

Prompt improvements

Thanks to help from @afirstenberg, we have a shiny new system prompt for Gemini that is working much better for us. The prompt hammers home how great tool use is, and reinforces behavior with positive statements rather than negative. I also applied this positive reinforcement strategy to the Gemini prompt in generate_chat_title.ts and the prompt for the nl-to-esql-tool.

User prompt

The strategy Allen suggested also includes prepending a "user prompt" to the last user message in the conversation. The "user prompt" is not saved in persistent history. When the conversation history is sent with a follow-up question, you'll only see the "user prompt" in the last user message sent to Gemini.
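The ephemeral prepend could look roughly like this. This is a sketch with a hypothetical prompt string and helper name, not the actual implementation:

```typescript
// Illustrative sketch: prepend an ephemeral "user prompt" to the final
// user message right before sending to the model, without mutating the
// persisted conversation history. Names and text are hypothetical.

interface Message {
  role: 'user' | 'assistant';
  content: string;
}

// Hypothetical prompt text, not the real one used in the PR.
const USER_PROMPT = 'Remember to use the available tools.';

function withUserPrompt(history: Message[]): Message[] {
  const lastUserIdx = history.map((m) => m.role).lastIndexOf('user');
  if (lastUserIdx === -1) return history;
  // Copy both the array and the last user message so the stored history
  // is left untouched.
  return history.map((m, i) =>
    i === lastUserIdx ? { ...m, content: `${USER_PROMPT}\n${m.content}` } : m
  );
}
```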

You can see an example of the "user prompt" in place within a conversation on this trace. In this trace we see the system prompt at the top as expected, then the conversation history, then only the most recent message has the user prompt prepended:
[Screenshot 2024-10-03 at 12:28:45 PM: user prompt in place within a conversation trace]

LangSmith tests

I have my own tests that I run with the assistant, and then tests with the evaluator. In my tests, the first row is the old chat model ActionsClientGeminiChatModel. The next 2 rows are ActionsClientChatVertexAI without streaming and ActionsClientChatVertexAI with streaming. ActionsClientChatVertexAI successfully passes each test, and outperforms ActionsClientGeminiChatModel in some cases.

Code

💚 = Improvement
💛 = Same
❌ = Error Result

| ESQLKnowledgeBaseTool | AlertsCountTool | OpenAndAcknowledgedAlertsTool | Alert Summary | Remediate Conversation p1 | p2 | p3 |
| --- | --- | --- | --- | --- | --- | --- |
| ActionsClientGeminiChatModel | ActionsClientGeminiChatModel | ActionsClientGeminiChatModel | ActionsClientGeminiChatModel | ActionsClientGeminiChatModel | ActionsClientGeminiChatModel | Could not complete as previous step errored |
| 💚 ActionsClientChatVertexAI | 💛 ActionsClientChatVertexAI | 💚 ActionsClientChatVertexAI | 💛 ActionsClientChatVertexAI | 💛 ActionsClientChatVertexAI | 💚 ActionsClientChatVertexAI | 💚 ActionsClientChatVertexAI |
| 💚 ActionsClientChatVertexAI streaming | 💛 ActionsClientChatVertexAI streaming | 💚 ActionsClientChatVertexAI streaming | 💛 ActionsClientChatVertexAI streaming | 💛 ActionsClientChatVertexAI streaming | 💚 ActionsClientChatVertexAI streaming | 💚 ActionsClientChatVertexAI streaming |

I ran the ES|QL Generation Regression dataset against the new chat model. There is a significant boost in correctness, and the Vertex model invokes the tools more consistently than the previous model. We also no longer hit the GraphRecursionError seen with the previous model. The following screenshots show the ES|QL Generation Regression results for each model using gemini-1.5-pro-002; the Vertex results show significant improvements.

ES|QL Generation Regression dataset

[Screenshot 2024-10-03 at 1:06:20 PM: ES|QL Generation Regression results for each model]

LangSmith Playground

An advantage of extending ChatVertexAI is that, because it works with our API credentialsJson, we can use the LangSmith playground to test prompts and iterate. To do so, select an ActionsClientChatVertexAI model run and hit the playground button:
[Screenshot 2024-09-16 at 6:12:20 PM: playground button on a model run]

Once in the playground, ensure VertexAI is the selected model and that you've entered our valid credentialsJson in the Secrets & API Keys area. You can then iterate on the system prompt to confirm the desired results.
[Screenshot 2024-09-16 at 6:13:37 PM: LangSmith playground with VertexAI selected]

To test

Select Gemini as your conversation connector. Run through the prompts from my tests above, or your own prompts that you've found challenged Gemini.

@stephmilovic stephmilovic added release_note:skip Skip the PR/issue when compiling release notes Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. Team:Security Generative AI Security Generative AI v8.16.0 labels Sep 16, 2024
@stephmilovic stephmilovic marked this pull request as ready for review September 17, 2024 00:40
@stephmilovic stephmilovic requested a review from a team as a code owner September 17, 2024 00:40
@elasticmachine
Contributor

Pinging @elastic/security-solution (Team: SecuritySolution)

@stephmilovic stephmilovic requested review from a team as code owners October 3, 2024 17:19
@stephmilovic stephmilovic added the backport:prev-minor Backport to (8.x) the previous minor version (i.e. one version back from main) label Oct 3, 2024
```diff
@@ -395,9 +395,7 @@ const formatGeminiPayload = ({
     temperature,
     maxOutputTokens: DEFAULT_TOKEN_LIMIT,
   },
   ...(systemInstruction
     ? { system_instruction: { role: 'user', parts: [{ text: systemInstruction }] } }
```
stephmilovic (Contributor, author) commented:
@pgayvallet Was there a reason for adding role? I don't see it in the API docs. I think all we need is parts
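If `role` is indeed unnecessary, the spread could be simplified along these lines. This is a sketch mirroring the snippet above, not the merged change:

```typescript
// Sketch only: mirrors the spread in formatGeminiPayload, assuming the
// API needs just `parts` inside system_instruction and no `role`.
function formatSystemInstruction(systemInstruction?: string) {
  return systemInstruction
    ? { system_instruction: { parts: [{ text: systemInstruction }] } }
    : {};
}
```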

```typescript
  "The final response is the only output the user sees and should be a complete answer to the user's question. Do not leave out important tool output. The final response should never be empty. Don't forget to use tools.";
const ALLENS_PROMPT =
  'You are an assistant that is an expert at using tools and Elastic Security, doing your best to use these tools to answer questions or follow instructions. It is very important to use tools to answer the question or follow the instructions rather than coming up with your own answer. Tool calls are good. Sometimes you may need to make several tool calls to accomplish the task or get an answer to the question that was asked. Use as many tool calls as necessary.';
const KB_CATCH =
```
stephmilovic (Contributor, author) commented:

I found that if I do not add this catch and Gemini gets empty knowledge base results from a user question, it will not try to answer. See the differences in these traces from before and after I added KB_CATCH.
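The KB_CATCH text itself isn't visible in the snippet, but the pattern it describes can be sketched as follows. The constant text and helper name here are hypothetical, not the actual PR code:

```typescript
// Hypothetical sketch: the real KB_CATCH string isn't shown above, so
// this text is illustrative only. The pattern is to append a fallback
// instruction so Gemini still answers when the knowledge base tool
// returns empty results.
const KB_CATCH_EXAMPLE =
  'If the knowledge base tool returns no results, still answer the question from your own knowledge.';

function buildGeminiSystemPrompt(basePrompt: string, includeKbCatch: boolean): string {
  return includeKbCatch ? `${basePrompt} ${KB_CATCH_EXAMPLE}` : basePrompt;
}
```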

```typescript
export const GEMINI_SYSTEM_PROMPT =
  `ALWAYS use the provided tools, as they have access to the latest data and syntax.` +
  "The final response is the only output the user sees and should be a complete answer to the user's question. Do not leave out important tool output. The final response should never be empty. Don't forget to use tools.";
const ALLENS_PROMPT =
```
A reviewer (Contributor) commented:

Could we rename it to something like GEMINI_MAIN_SYSTEM_PROMPT?

@peluja1012 (Contributor) left a review comment:

Reviewed the code with Steph and ran evaluations. Results for ES|QL generation for Gemini improved. Results for other models like gpt-4o and Sonnet 3.5 remained consistently high. So no regressions on other models. Evaluation results for custom knowledge improved for Gemini as well. Thanks for the great work, Steph!

@kibana-ci
Collaborator

💛 Build succeeded, but was flaky

Failed CI Steps

Test Failures

  • [job] [logs] FTR Configs #68 / discover/esql discover esql view switch modal should not show switch modal when switching to a data view while a saved search is open

Metrics [docs]

Unknown metric groups

ESLint disabled line counts

| id | before | after | diff |
| --- | --- | --- | --- |
| @kbn/langchain | 1 | 2 | +1 |

Total ESLint disabled count

| id | before | after | diff |
| --- | --- | --- | --- |
| @kbn/langchain | 1 | 2 | +1 |

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@stephmilovic stephmilovic merged commit aae8c50 into elastic:main Oct 4, 2024
41 checks passed
@kibanamachine
Contributor

Starting backport for target branches: 8.x

https://github.com/elastic/kibana/actions/runs/11181020258

kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Oct 4, 2024
@kibanamachine
Contributor

💚 All backports created successfully

| Branch | Result |
| --- | --- |
| 8.x | Created |

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

kibanamachine added a commit that referenced this pull request Oct 4, 2024
# Backport

This will backport the following commits from `main` to `8.x`:
- [[Security Assistant] Vertex chat model
(#193032)](#193032)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Steph
Milovic","email":"[email protected]"},"sourceCommit":{"committedDate":"2024-10-04T13:39:46Z","message":"[Security
Assistant] Vertex chat model
(#193032)","sha":"aae8c50f4083208508a49ab5324b31047aea5e68","branchLabelMapping":{"^v9.0.0$":"main","^v8.16.0$":"8.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","v9.0.0","Team:
SecuritySolution","backport:prev-minor","Team:Security Generative
AI","v8.16.0"],"title":"[Security Assistant] Vertex chat
model","number":193032,"url":"https://github.com/elastic/kibana/pull/193032","mergeCommit":{"message":"[Security
Assistant] Vertex chat model
(#193032)","sha":"aae8c50f4083208508a49ab5324b31047aea5e68"}},"sourceBranch":"main","suggestedTargetBranches":["8.x"],"targetPullRequestStates":[{"branch":"main","label":"v9.0.0","branchLabelMappingKey":"^v9.0.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/193032","number":193032,"mergeCommit":{"message":"[Security
Assistant] Vertex chat model
(#193032)","sha":"aae8c50f4083208508a49ab5324b31047aea5e68"}},{"branch":"8.x","label":"v8.16.0","branchLabelMappingKey":"^v8.16.0$","isSourceBranch":false,"state":"NOT_CREATED"}]}]
BACKPORT-->

Co-authored-by: Steph Milovic <[email protected]>
tiansivive pushed a commit to tiansivive/kibana that referenced this pull request Oct 7, 2024
Labels
backport:prev-minor Backport to (8.x) the previous minor version (i.e. one version back from main) release_note:skip Skip the PR/issue when compiling release notes Team:Security Generative AI Security Generative AI Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. v8.16.0 v9.0.0