[ML] Use field caps option `include_empty_fields` to identify populated fields. #178606

walterra · 2024-03-13T09:44:59Z

As of elastic/elasticsearch#103651 there is a new field caps option `include_empty_fields`. Discover is making use of this already: #174063

We have various places where we use custom code to identify the populated fields of an index: the code fetches a random sample of docs and then checks which fields are populated. These queries use `random_score`, which can be expensive. We should migrate to the new field caps option, which will be available as of 8.13.
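As a rough illustration of the target approach, the sketch below builds field caps request parameters with `include_empty_fields` set to `false` and extracts the populated field names from a field caps style response. The helper names (`buildPopulatedFieldCapsParams`, `getPopulatedFieldNames`) and the simplified response type are assumptions made for this example; they are not taken from the Kibana code base, and how the parameters are handed to a client is left open.

```typescript
// Simplified shape of a field caps response; only the parts used below.
interface FieldCapability {
  type: string;
  searchable: boolean;
  aggregatable: boolean;
  metadata_field?: boolean;
}

interface FieldCapsResponse {
  indices: string[];
  // field name -> (type name -> capabilities)
  fields: Record<string, Record<string, FieldCapability>>;
}

// Hypothetical helper: parameters mirroring the REST query options of the
// field caps API. `include_empty_fields: false` asks Elasticsearch to leave
// out fields that have no values.
function buildPopulatedFieldCapsParams(index: string) {
  return {
    index,
    fields: '*',
    include_empty_fields: false,
  };
}

// Hypothetical helper: returns the sorted names of all non-metadata fields
// in the response. With `include_empty_fields: false`, these are the
// populated fields.
function getPopulatedFieldNames(resp: FieldCapsResponse): string[] {
  return Object.keys(resp.fields)
    .filter((name) => {
      const caps = resp.fields[name];
      const isMetadata = Object.keys(caps).every((t) => caps[t].metadata_field === true);
      return !isMetadata;
    })
    .sort();
}

// Hand-written example response, not real cluster output.
const exampleResponse: FieldCapsResponse = {
  indices: ['my-index'],
  fields: {
    _id: { _id: { type: '_id', searchable: true, aggregatable: false, metadata_field: true } },
    message: { text: { type: 'text', searchable: true, aggregatable: false } },
    'host.name': { keyword: { type: 'keyword', searchable: true, aggregatable: true } },
  },
};

// Logs the two non-metadata fields: host.name and message.
console.log(getPopulatedFieldNames(exampleResponse));
```

Compared with sampling documents, this keeps the "which fields are populated" decision inside Elasticsearch, so no documents need to be fetched or scored.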

`plugins/ml`

x-pack/plugins/ml/public/application/data_frame_analytics/pages/analytics_creation/hooks/use_index_data.ts
Code that identifies populated fields for data grid.
x-pack/plugins/ml/public/application/components/field_stats_flyout/populated_fields/get_merged_populated_fields_query.ts
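For comparison, the custom sampling approach this issue wants to replace can be sketched as follows: a `function_score` query with `random_score` fetches a random sample of documents, and client-side code then collects the fields that carry values. This is an assumed, simplified reconstruction for illustration only; the function names and the hit shape are hypothetical and do not reproduce the code in the files listed above.

```typescript
// Hypothetical request body for a random document sample. The
// `function_score` query with an empty `random_score` object assigns each
// matching document a random score, so the top `sampleSize` hits form a
// random sample.
function buildRandomSampleRequest(index: string, sampleSize: number) {
  return {
    index,
    size: sampleSize,
    fields: ['*'],
    _source: false,
    query: {
      function_score: {
        query: { match_all: {} },
        random_score: {},
      },
    },
  };
}

// Simplified hit shape: only the `fields` section of a search hit.
type SampledHit = { fields?: Record<string, unknown[]> };

// Hypothetical helper: a field counts as populated if at least one sampled
// document returned a value for it.
function getPopulatedFieldsFromHits(hits: SampledHit[]): string[] {
  const seen: Record<string, boolean> = {};
  for (const hit of hits) {
    const fields = hit.fields ?? {};
    for (const name of Object.keys(fields)) {
      if (fields[name].length > 0) {
        seen[name] = true;
      }
    }
  }
  return Object.keys(seen).sort();
}

// Hand-written example hits, not real search output.
const exampleHits: SampledHit[] = [
  { fields: { message: ['a'], bytes: [1] } },
  { fields: { message: ['b'] } },
  {},
];

// Logs the fields seen with values in the sample: bytes and message.
console.log(getPopulatedFieldsFromHits(exampleHits));
```

Because the result depends on which documents happen to be sampled, a sparsely populated field can be missed, and every matching document has to be scored; both are reasons to prefer the field caps option.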

`plugins/aiops`

x-pack/plugins/aiops/server/routes/log_rate_analysis/queries/get_random_docs_request.ts
[ML] AIOps: Use field caps option include_empty_fields=false instead of custom query. #178699 (8.14)

`plugins/transform`

x-pack/plugins/transform/public/app/hooks/use_index_data.ts
Code that identifies populated fields for data grid. [ML] Transforms: Improve data grid memoization. #195394 (8.16)
Transforms get populated fields via field stats, which still needs to be updated to use `include_empty_fields`.

`plugins/data_visualizer`

x-pack/plugins/data_visualizer/public/application/index_data_visualizer/hooks/use_overall_stats.ts
Code that identifies if fields are empty or not. [ML] Improve performance of Field stats / Index data visualizer by reducing requests for empty fields, make it convenient to add multi-field #178766 (8.14)
x-pack/plugins/data_visualizer/public/application/index_data_visualizer/hooks/esql/use_esql_overall_stats_data.ts
[ML] Adds query history for the ES|QL Data visualizer #179098 (8.14)

`plugins/apm`

[APM] Correlations: Update field candidates request. #186182


elasticmachine · 2024-03-13T09:45:01Z

Pinging @elastic/ml-ui (:ml)

…d of custom query. (#178699)

## Summary

Part of #178606. As of elastic/elasticsearch#103651 there is a new field caps option `include_empty_fields`. This PR updates AIOps Log Rate Analysis to make use of this option instead of a custom query and code that identified populated fields.

### Checklist

- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
- [x] [Flaky Test Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was used on any tests changed https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/5482
- [x] This was checked for breaking API changes and was [labeled appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)

## Summary

Part of #178606 and #151664.

- Removes some unused code related to identifying populated index fields.
- Changes `useIndexData()` to accept just one config options arg instead of individual args.
- Improves data grid memoization.

Improvements tested locally:

#### `many_fields` dataset (no timestamp)

- `main`: `~22s` and 10 data grid rerenders until the `many_fields` dataset loaded. The transform config dropdowns are hardly usable and super slow; each edit causes 3 data grid rerenders.
- This PR: `~4.5s` and 7 data grid rerenders until the `many_fields` dataset loaded. The transform config dropdowns are a bit slow but usable!

#### `kibana_sample_data_logs` dataset (whole dataset in the past to test rerenders on load without data)

- `main`: 5 rerenders.
- This PR: 3 rerenders.

### Checklist

- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
- [ ] [Flaky Test Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was used on any tests changed
- [x] This was checked for breaking API changes and was [labeled appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)

## Summary

Part of elastic#178606 and elastic#151664.

- Removes some unused code related to identifying populated index fields.
- Changes `useIndexData()` to accept just one config options arg instead of individual args.
- Improves data grid memoization.

Improvements tested locally:

#### `many_fields` dataset (no timestamp)

- `main`: `~22s` and 10 data grid rerenders until the `many_fields` dataset loaded. The transform config dropdowns are hardly usable and super slow; each edit causes 3 data grid rerenders.
- This PR: `~4.5s` and 7 data grid rerenders until the `many_fields` dataset loaded. The transform config dropdowns are a bit slow but usable!

#### `kibana_sample_data_logs` dataset (whole dataset in the past to test rerenders on load without data)

- `main`: 5 rerenders.
- This PR: 3 rerenders.

### Checklist

- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
- [ ] [Flaky Test Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was used on any tests changed
- [x] This was checked for breaking API changes and was [labeled appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)

(cherry picked from commit 869ceec)

)

# Backport

This will backport the following commits from `main` to `8.x`:

- [[ML] Transforms: Improve data grid memoization. (#195394)](#195394)

### Questions ?

Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Co-authored-by: Walter Rafelsberger

walterra added :ml technical debt Improvement of the software architecture and operational architecture labels Mar 13, 2024

walterra self-assigned this Mar 13, 2024

walterra changed the title ~~[ML] Use `https://github.com/elastic/elasticsearch/pull/103651~~ [ML] Use field caps options include_fields_with_no_value to identify populated fields. Mar 13, 2024

peteharverson added the v8.14.0 label Mar 13, 2024

walterra changed the title ~~[ML] Use field caps options include_fields_with_no_value to identify populated fields.~~ [ML] Use field caps option include_fields_with_no_value to identify populated fields. Mar 14, 2024

walterra changed the title ~~[ML] Use field caps option include_fields_with_no_value to identify populated fields.~~ [ML] Use field caps option include_empty_fields to identify populated fields. Mar 14, 2024

walterra mentioned this issue Mar 14, 2024

[ML] AIOps: Use field caps option include_empty_fields=false instead of custom query. #178699

Merged

3 tasks

qn895 self-assigned this Mar 14, 2024

peteharverson added the Meta label Mar 27, 2024

peteharverson added v8.15.0 and removed v8.14.0 labels Apr 17, 2024

peteharverson mentioned this issue May 1, 2024

[ML][Meta] Technical debt and maintenance work for 8.15.0 #181603

Closed

qn895 removed their assignment May 9, 2024

walterra mentioned this issue Jul 5, 2024

[APM] Correlations: Update field candidates request. #186182

Merged

2 tasks

walterra added v8.16.0 and removed v8.15.0 labels Jul 5, 2024

peteharverson mentioned this issue Jul 8, 2024

[ML][Meta] Technical debt and maintenance work for 8.16.0 #187772

Closed

walterra mentioned this issue Oct 8, 2024

[ML] Transforms: Improve data grid memoization. #195394

Merged

3 tasks

peteharverson added v8.17.0 and removed v8.16.0 labels Oct 16, 2024

peteharverson mentioned this issue Oct 17, 2024

[ML][Meta] Technical debt and maintenance work for 8.17.0 #196657

Closed

peteharverson added v8.18.0 and removed v8.17.0 labels Nov 13, 2024

peteharverson mentioned this issue Nov 21, 2024

[ML][Meta] Technical debt and maintenance work for 8.18.0 #201131

Open
