New large-logs-dataset challenge in elastic/logs #632

Open · salvatore-campagna wants to merge 6 commits into master
Conversation

@salvatore-campagna (Contributor) commented Jul 25, 2024

Introduce a new large-logs-dataset challenge in the elastic/logs track which duplicates indexed data by restoring a snapshot multiple times. The number of snapshot restore operations is controlled by the snapshot_restore_counts variable, which defaults to 100.

This results in indexing raw_data_volume_per_day bytes multiplied by snapshot_restore_counts. For example, if raw_data_volume_per_day is 50 GB, the index ends up with about 5 TB of raw data. Note that, as a consequence, the index will contain duplicated data.
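In Jinja terms, the expected total is simply the product of the two parameters. A hypothetical one-liner (the numeric _gb variable names are assumed for illustration and are not actual track parameters):

{% set p_total_raw_volume_gb = (raw_data_volume_per_day_gb | default(50)) * (snapshot_restore_counts | default(100)) %}
{# e.g. 50 GB/day * 100 restores = 5000 GB, roughly 5 TB of raw data #}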

This is meant to be a fast way to increase the amount of data in an index, skipping the expensive data generation and indexing process.
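To make the idea concrete, here is a minimal sketch of how such a challenge could look in the track's Jinja-templated JSON. This is an illustration only, not the exact contents of this PR: the repository and snapshot names, the logs-* index pattern and the rename scheme are assumptions, and it relies on Rally's restore-snapshot operation accepting a restore request body:

{% set p_snapshot_restore_counts = (snapshot_restore_counts | default(100)) %}
{
  "name": "large-logs-dataset",
  "description": "Duplicate indexed data by restoring the same snapshot {{ p_snapshot_restore_counts }} times.",
  "schedule": [
    {% for i in range(p_snapshot_restore_counts) %}
    {
      "operation": {
        "operation-type": "restore-snapshot",
        "repository": "logging",
        "snapshot": "logging-snapshot",
        "wait-for-completion": true,
        "body": {
          "indices": "logs-*",
          "rename_pattern": "(.+)",
          "rename_replacement": "$1-restored-{{ i }}"
        }
      }
    }{{ "," if not loop.last else "" }}
    {% endfor %}
  ]
}

Each restore renames the restored indices so the copies do not collide, which is what multiplies the raw volume.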

Resolves #631

{% set p_dsl_poll_interval = (dsl_poll_interval | default(false) ) %}
{% set p_dsl_default_rollover = (dsl_default_rollover | default(false) ) %}

{% set p_skip_fleet_globals = (skip_fleet_globals | default(false) ) %}
@salvatore-campagna (Contributor, Author) commented:

I just copied this from another PR and will remove it once that PR is merged. It is necessary to avoid an error that occurs when deleting component templates.

@salvatore-campagna (Contributor, Author) commented:

@elastic/es-perf at the moment this challenge is failing with the following error:

esrally.exceptions.RallyAssertionError: Request returned an error. Error type: api, Description: repository_verification_exception ({'error': {'root_cause': [{'type': 'repository_verification_exception', 'reason': '[logging] path [observability/logging] is not accessible on master node'}], 'type': 'repository_verification_exception', 'reason': '[logging] path [observability/logging] is not accessible on master node', 'caused_by': {'type': 'i_o_exception', 'reason': 'Unable to upload object [observability/logging/tests-CcJdFLNpQ0SwoHLxwOkv7w/master.dat] using a single upload', 'caused_by': {'type': 'amazon_s3_exception', 'reason': 'Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: SZ54XD85DV32K3RX; S3 Extended Request ID: WdDrng9Ah0EmndL4q6y58b+vTMxCCmKgdffmUIALUXcSqpMrjAJS7enlwhkzkqWOQf3oYCW7ENg=; Proxy: null)'}}}, 'status': 500}), HTTP Status: 500

Any idea how to fix it? It looks like some S3 configuration or permission might be missing.
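For reference, the Elasticsearch repository-s3 documentation recommends granting the repository's AWS credentials roughly the following policy; the bucket name below is a placeholder, and the actual bucket and base_path used by the track may differ:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation",
        "s3:ListBucketMultipartUploads",
        "s3:ListBucketVersions"
      ],
      "Resource": "arn:aws:s3:::<your-rally-snapshot-bucket>"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:AbortMultipartUpload"
      ],
      "Resource": "arn:aws:s3:::<your-rally-snapshot-bucket>/*"
    }
  ]
}

The 403 AccessDenied on master.dat happens during repository verification, which attempts a test write under the configured base_path.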

@salvatore-campagna (Contributor, Author) commented:

I tried running another challenge in the elastic/logs track that uses snapshots and S3, and I see the same issue. I guess this means my user is lacking some permission.

esrally.exceptions.RallyError: Cannot run task [register-snapshot-repository]: Request returned an error. Error type: api, Description: repository_verification_exception ({'error': {'root_cause': [{'type': 'repository_verification_exception', 'reason': '[logging] path [observability/logging] is not accessible on master node'}], 'type': 'repository_verification_exception', 'reason': '[logging] path [observability/logging] is not accessible on master node', 'caused_by': {'type': 'i_o_exception', 'reason': 'Unable to upload object [observability/logging/tests-R6KKPdvrTTK3FnuBeibL1w/master.dat] using a single upload', 'caused_by': {'type': 'amazon_s3_exception', 'reason': 'Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: SWVHM6P139B4H9J7; S3 Extended Request ID: wsLQpvhYYCqMvFppInjYTLzXO83vxMvpZy3kIqlmKgRyZ88cvUCLOex1tFGOuHcro8zcjFAFzA1/WqO2RGCFwHvHS7s4ODV1; Proxy: null)'}}}, 'status': 500}), HTTP Status: 500
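In case it helps triage, the repository that the register-snapshot-repository task sets up presumably has settings along these lines (the bucket and client names are placeholders; the base_path matches the error message), so the AWS credentials resolved for that client are what need the write permissions:

{
  "type": "s3",
  "settings": {
    "bucket": "<your-rally-snapshot-bucket>",
    "client": "default",
    "base_path": "observability/logging"
  }
}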

@kkrik-es changed the title from "New large-logs-datset challenge in elastic/logs" to "New large-logs-dataset challenge in elastic/logs" on Jul 30, 2024