[Dev] Add High Cardinality Indexer to Kibana as kbn-data-forge #174559

simianhacker · 2024-01-09T23:55:05Z

Summary

This PR adds the High Cardinality Indexer to Kibana as a new package called kbn-data-forge. It also replaces kbn-infra-forge usage in the test and is the preferred way to generate data for Observability use cases, specifically for SLO testing.

Todo

Replace kbn-infra-forge usage
Create convenience functions for testing (generate and cleanup)
Make the logger (LoggingTool) configurable as an injected dependency
Make the Elasticsearch client (Client) configurable as an injected dependency
Fix the ECS Generate commands
Add CLI options via Commander

CLI Help Screen

Usage: data_forge.js [options]

A data generation tool that will create realistic data with different scenarios.

Options:
  --config <filepath>                  The YAML config file
  --lookback <datemath>                When to start the indexing (default: "now-15m")
  --events-per-cycle <number>          The number of events per cycle (default: 1)
  --payload-size <number>              The size of the ES bulk payload (default: 10000)
  --concurrency <number>               The number of concurrent connections to Elasticsearch (default: 5)
  --index-interval <milliseconds>      The interval of the data in milliseconds (default: 60000)
  --dataset <dataset>                  The name of the dataset to use. Valid options: "fake_logs", "fake_hosts", "fake_stack" (default: "fake_logs")
  --scenario <scenerio>                The scenario to label the events with (default: "good")
  --elasticsearch-host <address>       The address to the Elasticsearch cluster (default: "http://localhost:9200")
  --elasticsearch-username <username>  The username to for the Elasticsearch cluster (default: "elastic")
  --elasticsearch-password <password>  The password for the Elasticsearch cluster (default: "changeme")
  --elasticsearch-api-key <key>        The API key to connect to the Elasticsearch cluster
  --kibana-url <address>               The address to the Kibana server (default: "http://localhost:5601")
  --kibana-username <username>         The username for the Kibana server (default: "elastic")
  --kibana-password <password>         The password for the Kibana server (default: "changeme")
  --install-kibana-assets              This will install index patterns, visualizations, and dashboards for the dataset
  --event-template <template>          The name of the event template (default: "good")
  --reduce-weekend-traffic-by <ratio>  This will reduce the traffic on the weekends by the specified amount. Example: 0.5 will reduce the traffic by half (default: 0)
  --ephemeral-project-ids <number>     The number of ephemeral projects to create. This is only enabled for the "fake_stack" dataset. It will create project IDs that will last 5 to 12 hours. (default: 0)
  -h, --help                           output usage information
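
The help text for --reduce-weekend-traffic-by says a ratio of 0.5 halves weekend traffic. A minimal sketch of that arithmetic follows. It assumes weekends are Saturday and Sunday in UTC and that the result is rounded to a whole number of events. The helper name eventsForTimestamp is hypothetical and is not taken from kbn-data-forge, whose actual implementation may differ.

```typescript
// Hypothetical helper for illustration only; not part of kbn-data-forge.
// Scales the per-cycle event count down on weekends by `ratio`
// (0 = no reduction, 0.5 = half the traffic, 1 = no traffic).
function eventsForTimestamp(eventsPerCycle: number, ratio: number, timestamp: Date): number {
  if (ratio < 0 || ratio > 1) {
    throw new RangeError('ratio must be between 0 and 1');
  }
  const day = timestamp.getUTCDay(); // 0 = Sunday, 6 = Saturday (assumed weekend days)
  const isWeekend = day === 0 || day === 6;
  return isWeekend ? Math.round(eventsPerCycle * (1 - ratio)) : eventsPerCycle;
}

// 2024-01-13 was a Saturday, 2024-01-09 a Tuesday.
console.log(eventsForTimestamp(200, 0.5, new Date('2024-01-13T12:00:00Z'))); // 100
console.log(eventsForTimestamp(200, 0.5, new Date('2024-01-09T12:00:00Z'))); // 200
```

With the default ratio of 0 the function returns eventsPerCycle unchanged on every day, which matches the documented default of no reduction.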

Testing an Example

Run the following command against a clean Kibana development environment:

node x-pack/scripts/data_forge.js --events-per-cycle 200 --lookback now-1h --install-kibana-assets --ephemeral-project-ids 10 --dataset fake_stack

This should install a handful of DataViews (Admin Console, Message Processor, Nginx Logs, Mongodb Logs) along with a few dashboards and visualizations.

apmmachine · 2024-01-09T23:55:20Z

🤖 GitHub comments

Just comment with:

/oblt-deploy : Deploy a Kibana instance using the Observability test environments.
/oblt-deploy-serverless : Deploy a serverless Kibana instance using the Observability test environments.
run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

simianhacker · 2024-01-11T15:01:07Z

/ci

elasticmachine · 2024-01-16T14:11:03Z

Pinging @elastic/obs-ux-management-team (Team:obs-ux-management)

elasticmachine · 2024-01-22T21:33:16Z

Pinging @elastic/obs-ux-infra_services-team (Team:obs-ux-infra_services)

- Remove default logger to allow user to override
- removing HCI index prefixes; adding cleanup; adding generate; overriding client; overriding logger;
- Updating test to replace infra-forge with data-forge
- Adding codeowners for kbn-data-forge
- Fixing the paths for the ECS generate command
- Implementing cli options
- Fixing spelling errors
- Running yarn kbn bootstrap
- Removing deprecated faker.random.numeric
- Shaping the data correctly for the tests
- Fixing config for each test
- Fixing jest.config.js
- second attempt at fixing jest.config.js
- Attempting to fix the document count test
- Attempting to fix the document count test
- Fixing types
- Removing deprecated installTemplate function and corresponding templates
- Fixing tests to be more robust so they don't execute until the source documents are available.
- Fixing typo
- Adding changes to burn rate rule
- Fixing document checks

simianhacker · 2024-01-22T21:55:16Z

Sorry... I had to force push. Somehow, the last merge commit changed a bunch of unrelated files, which in turn triggered a bunch of unnecessary reviews.

kibana-ci · 2024-01-23T00:58:24Z

💛 Build succeeded, but was flaky

Buildkite Build
Commit: 808ca33

Failed CI Steps

FTR Configs #58

Test Failures

[job] [logs] FTR Configs #58 / machine learning - short tests model management trained models for ML power user with imported models deletes the imported model pt_tiny_pass_through

Metrics [docs]

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

id	before	after	diff
`@kbn/data-forge`	-	26	+26

Unknown metric groups

API count

id	before	after	diff
`@kbn/data-forge`	-	26	+26

ESLint disabled line counts

id	before	after	diff
`@kbn/data-forge`	-	1	+1

Total ESLint disabled count

id	before	after	diff
`@kbn/data-forge`	-	1	+1

History

💛 Build #188641 was flaky c523e5b
💛 Build #188585 was flaky c70ae77a9cf1a98423f39bbc7f09037f8acd8a42
💔 Build #188501 failed 999015e91f5e562a1a22425491183b6c30bf5842
💔 Build #188322 failed eadca1666a94a89e5be51dbc778014f768e244ec
💔 Build #188306 failed 42d0c4bc1cb030c35c1a14be925823ade1ba3e9a
💔 Build #188299 failed cf8a7d72fd1a8a331d64613ea2c1793971aa336a

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

kdelemme · 2024-01-23T21:00:46Z

Tested locally and works as expected!

kdelemme

LGTM! Tested locally and reviewed the test changes 👍🏻

kdelemme · 2024-01-23T21:02:37Z

...k/test_serverless/api_integration/test_suites/observability/burn_rate_rule/burn_rate_rule.ts

+      dataForgeConfig = {
+        schedule: [
+          {
+            template: 'good',
+            start: 'now-15m',
+            end: 'now+5m',
+            metrics: [
+              { name: 'system.cpu.user.pct', method: 'linear', start: 2.5, end: 2.5 },
+              { name: 'system.cpu.total.pct', method: 'linear', start: 0.5, end: 0.5 },
+              { name: 'system.cpu.total.norm.pct', method: 'linear', start: 0.8, end: 0.8 },
+            ],
+          },
+        ],
+        indexing: { dataset: 'fake_hosts' as Dataset, eventsPerCycle: 1, interval: 10000 },
+      };
+      dataForgeIndices = await generate({ client: esClient, config: dataForgeConfig, logger });
+      await alertingApi.waitForDocumentInIndex({ indexName: DATA_VIEW, docCountTarget: 360 });
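
The schedule in the excerpt above spans 'now-15m' to 'now+5m' with an indexing interval of 10000 ms. As a rough sketch of how many indexing cycles that window covers, the snippet below resolves the two offsets and divides by the interval. The helpers resolveOffset and cyclesInWindow are hypothetical and are not part of @kbn/data-forge. The parser handles only the plain now, now-N and now+N forms with s, m, h and d units, and the window is assumed to be half-open.

```typescript
// Hypothetical helpers for illustration only; not part of @kbn/data-forge.
const UNIT_MS: Record<string, number> = { s: 1000, m: 60000, h: 3600000, d: 86400000 };

// Resolve "now", "now-15m" or "now+5m" against a fixed "now" in epoch milliseconds.
// Rounding forms such as "now/d" are deliberately not supported in this sketch.
function resolveOffset(expr: string, now: number): number {
  const match = /^now(?:([+-])(\d+)([smhd]))?$/.exec(expr);
  if (!match) {
    throw new Error(`Unsupported expression: ${expr}`);
  }
  const [, sign, amount, unit] = match;
  if (!sign || !amount || !unit) {
    return now; // plain "now"
  }
  const unitMs = UNIT_MS[unit];
  if (unitMs === undefined) {
    throw new Error(`Unsupported unit: ${unit}`);
  }
  return now + (sign === '-' ? -1 : 1) * Number(amount) * unitMs;
}

// Number of cycles in the half-open window [start, end), one cycle per interval.
function cyclesInWindow(start: string, end: string, intervalMs: number, now: number = Date.now()): number {
  const span = resolveOffset(end, now) - resolveOffset(start, now);
  return Math.max(0, Math.ceil(span / intervalMs));
}

console.log(cyclesInWindow('now-15m', 'now+5m', 10000)); // 120
```

The 20-minute window at a 10-second interval gives 120 cycles under these assumptions. The source does not state how many documents each cycle produces for the fake_hosts dataset, so the relation between this figure and docCountTarget: 360 is left open here.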


…ic#174559)

## Summary

This PR adds the [High Cardinality Indexer](https://github.com/elastic/high-cardinality-cluster) to Kibana as a new package called `kbn-data-forge`. It also replaces `kbn-infra-forge` usage in the test and is the preferred way to generate data for Observability use cases, specifically for SLO testing.

### Todo

- [x] Replace `kbn-infra-forge` usage
- [x] Create convenience functions for testing (`generate` and `cleanup`)
- [x] Make the logger (`LoggingTool`) configurable as an injected dependency
- [x] Make the Elasticsearch client (`Client`) configurable as an injected dependency
- [x] Fix the ECS Generate commands
- [x] Add CLI options via Commander

### CLI Help Screen

```
Usage: data_forge.js [options]

A data generation tool that will create realistic data with different scenarios.

Options:
  --config <filepath>                  The YAML config file
  --lookback <datemath>                When to start the indexing (default: "now-15m")
  --events-per-cycle <number>          The number of events per cycle (default: 1)
  --payload-size <number>              The size of the ES bulk payload (default: 10000)
  --concurrency <number>               The number of concurrent connections to Elasticsearch (default: 5)
  --index-interval <milliseconds>      The interval of the data in milliseconds (default: 60000)
  --dataset <dataset>                  The name of the dataset to use. Valid options: "fake_logs", "fake_hosts", "fake_stack" (default: "fake_logs")
  --scenario <scenerio>                The scenario to label the events with (default: "good")
  --elasticsearch-host <address>       The address to the Elasticsearch cluster (default: "http://localhost:9200")
  --elasticsearch-username <username>  The username to for the Elasticsearch cluster (default: "elastic")
  --elasticsearch-password <password>  The password for the Elasticsearch cluster (default: "changeme")
  --elasticsearch-api-key <key>        The API key to connect to the Elasticsearch cluster
  --kibana-url <address>               The address to the Kibana server (default: "http://localhost:5601")
  --kibana-username <username>         The username for the Kibana server (default: "elastic")
  --kibana-password <password>         The password for the Kibana server (default: "changeme")
  --install-kibana-assets              This will install index patterns, visualizations, and dashboards for the dataset
  --event-template <template>          The name of the event template (default: "good")
  --reduce-weekend-traffic-by <ratio>  This will reduce the traffic on the weekends by the specified amount. Example: 0.5 will reduce the traffic by half (default: 0)
  --ephemeral-project-ids <number>     The number of ephemeral projects to create. This is only enabled for the "fake_stack" dataset. It will create project IDs that will last 5 to 12 hours. (default: 0)
  -h, --help                           output usage information
```

### Testing an Example

Run the following command against a clean Kibana development environment:

```
node x-pack/scripts/data_forge.js --events-per-cycle 200 --lookback now-1h --install-kibana-assets --ephemeral-project-ids 10 --dataset fake_stack
```

This should install a handful of DataViews (Admin Console, Message Processor, Nginx Logs, Mongodb Logs) along with a few dashboards and visualizations.

---------

Co-authored-by: kibanamachine <[email protected]>

simianhacker marked this pull request as ready for review January 16, 2024 13:57

simianhacker requested review from a team as code owners January 16, 2024 13:57

simianhacker added the release_note:skip (Skip the PR/issue when compiling release notes), v8.13.0, and Team:obs-ux-management (Observability Management User Experience Team) labels Jan 16, 2024

Ikuni17 approved these changes Jan 17, 2024

simianhacker requested review from a team as code owners January 22, 2024 21:33

simianhacker requested review from e40pud, pzl and ferullo January 22, 2024 21:33

botelastic bot added the Team:obs-ux-infra_services (Observability Infrastructure & Services User Experience Team) label Jan 22, 2024

simianhacker force-pushed the kbn-data-forge branch from 6f92aee to 52562a3 January 22, 2024 21:53

simianhacker removed request for a team and pzl January 22, 2024 21:53

simianhacker removed request for e40pud and ferullo January 22, 2024 21:53

kibanamachine and others added 4 commits January 22, 2024 21:59

[CI] Auto-commit changed files from 'node scripts/lint_ts_projects --…

a9ff7e3

…fix'

[CI] Auto-commit changed files from 'node scripts/lint_packages --fix'

73f646f

[CI] Auto-commit changed files from 'node scripts/generate codeowners'

c523e5b

Indexing data 5m into the future

808ca33

kdelemme approved these changes Jan 23, 2024

simianhacker merged commit 5f72e78 into elastic:main Jan 23, 2024
35 checks passed

kibanamachine added the backport:skip (This commit does not require backporting) label Jan 23, 2024

simianhacker deleted the kbn-data-forge branch April 17, 2024 15:38
