
[8.x] [Auto Import] Use larger number of samples on the backend (#196233) #196386

Merged
merged 1 commit into elastic:8.x on Oct 15, 2024

Conversation

kibanamachine
Contributor

Backport

This will backport the following commits from main to 8.x:

Questions?

Please refer to the Backport tool documentation


## Release Notes

Automatic Import now analyses a larger number of samples to generate an
integration.

## Summary

Closes elastic/security-team#9844

**Added: Backend Sampling**

We pass 100 rows (these numeric values are adjustable) to the
backend.[^1]

[^1]: As before, deterministically selected on the frontend; see
elastic#191598
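The deterministic selection might look roughly like the sketch below. This is an illustration only: `selectSamplesDeterministically` and `FRONTEND_SAMPLE_ROWS` are hypothetical names, not the actual frontend code (see elastic#191598 for that); an evenly spaced pick is one simple way to make the subset deterministic.

```typescript
// Hypothetical sketch: pick a fixed-size subset of rows such that the same
// input always yields the same samples (no randomness involved).
const FRONTEND_SAMPLE_ROWS = 100; // adjustable, per the PR description

function selectSamplesDeterministically(
  rows: string[],
  limit: number = FRONTEND_SAMPLE_ROWS
): string[] {
  if (rows.length <= limit) return rows;
  // Evenly spaced indices across the input: deterministic and spread out.
  const step = rows.length / limit;
  return Array.from({ length: limit }, (_, i) => rows[Math.floor(i * step)]);
}
```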

The Categorization chain now processes the samples in batches: after the
initial categorization it performs a number of review cycles (at most 5,
tuned so that we stay under the 2-minute limit for a single API call).

To decide when to stop processing, we maintain a list of _stable_
samples as follows:

1. The list is initially empty.
2. For each review we select a random subset of 40 samples, preferring
samples that are not yet stable.
3. After each review – when the LLM may add new processors or change
existing ones – we compare the new pipeline results with the old
pipeline results.
4. Those reviewed samples that did not change their categorization are
added to the stable list.
5. Any samples that have changed their categorization are removed from
the stable list.
6. If all samples are stable, we finish processing.
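The loop above can be sketched as follows. This is an illustration of the stopping rule, not the actual chain code: the names, the `Categorize` stand-in, and the batch-picking shortcut (a sort instead of a true random shuffle) are all assumptions.

```typescript
// Illustrative sketch of the stability loop described above.
const REVIEW_BATCH_SIZE = 40;
const MAX_REVIEW_CYCLES = 5; // tuned to stay under the 2-minute API limit

// Stand-in for running the categorization pipeline over all samples:
// maps each sample to its assigned category.
type Categorize = (samples: string[]) => Map<string, string>;

function reviewUntilStable(samples: string[], categorize: Categorize): Set<string> {
  const stable = new Set<string>(); // 1. the stable list starts empty
  let previous = categorize(samples); // initial categorization
  for (let cycle = 0; cycle < MAX_REVIEW_CYCLES; cycle++) {
    // 2. pick a review batch, preferring samples that are not yet stable
    // (the real chain picks randomly; a sort keeps this sketch short)
    const ordered = [...samples].sort(
      (a, b) => Number(stable.has(a)) - Number(stable.has(b))
    );
    const batch = new Set(ordered.slice(0, REVIEW_BATCH_SIZE));
    // 3. re-run the pipeline and compare with the previous results
    const current = categorize(samples);
    for (const s of samples) {
      if (current.get(s) !== previous.get(s)) {
        stable.delete(s); // 5. changed categorization → no longer stable
      } else if (batch.has(s)) {
        stable.add(s); // 4. reviewed and unchanged → stable
      }
    }
    previous = current;
    if (stable.size === samples.length) break; // 6. all stable → done
  }
  return stable;
}
```

With a pipeline that never changes its answers, every sample stabilizes as soon as it is reviewed; with one that changes answers every cycle, the loop still terminates after `MAX_REVIEW_CYCLES`.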

**Removed: User Notification**

Using 100 samples balances the expected pipeline complexity against the
time budget we work with. We may change this number in the future,
possibly dynamically, so the specific value is of no importance to the
user. We therefore remove the truncation notification.

**Unchanged:**

- No batching is done in the related chain: it seems to work as-is.

**Refactored:**

- We centralize the sizing constants in the
`x-pack/plugins/integration_assistant/common/constants.ts` file.
- We remove the unused state key `formattedSamples` and combine
`modelJSONInput` back into `modelInput`.
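The centralized file might have roughly this shape. The constant names and grouping here are assumptions for illustration; only the values (100 rows, batches of 40, at most 5 review cycles) come from this description, not from the actual `constants.ts`.

```typescript
// Hypothetical sketch of centralized sizing constants; names are illustrative.
export const BACKEND_SAMPLE_ROWS = 100; // rows passed from frontend to backend
export const CATEGORIZATION_REVIEW_BATCH_SIZE = 40; // samples per review cycle
export const CATEGORIZATION_MAX_REVIEW_CYCLES = 5; // keeps a single API call under 2 minutes
```

Keeping these in one `common/constants.ts` lets both the frontend and the backend chains agree on the sizing without duplicated magic numbers.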

> [!NOTE]
> I had difficulty generating new graph diagrams, so they remain
unchanged.

(cherry picked from commit fc3ce54)
@kibanamachine kibanamachine merged commit a4938bc into elastic:8.x Oct 15, 2024
26 checks passed
@elasticmachine
Contributor

💛 Build succeeded, but was flaky

Failed CI Steps

Test Failures

  • [job] [logs] FTR Configs #11 / dashboard Export import saved objects between versions should render all panels on the dashboard

Metrics [docs]

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run `node scripts/build_api_docs --plugin [yourplugin] --stats comments` for more detailed information.

| id | before | after | diff |
| --- | --- | --- | --- |
| integrationAssistant | 55 | 56 | +1 |

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

| id | before | after | diff |
| --- | --- | --- | --- |
| integrationAssistant | 961.0KB | 960.7KB | -297.0B |

Unknown metric groups

API count

| id | before | after | diff |
| --- | --- | --- | --- |
| integrationAssistant | 66 | 71 | +5 |

cc @ilyannn
