New large-logs-dataset challenge in elastic/logs #632

Open · salvatore-campagna wants to merge 6 commits into master
Conversation

@salvatore-campagna (Contributor) commented Jul 25, 2024

Introduce a new large-logs-dataset challenge in the elastic/logs track which duplicates indexed data by restoring a snapshot multiple times. The number of snapshot restore operations is controlled by the snapshot_restore_counts variable, which defaults to 100.

This results in indexing raw_data_volume_per_day bytes multiplied by snapshot_restore_counts. For example, if raw_data_volume_per_day is 50 GB, the index ends up with about 5 TB of raw data. Note that, as a consequence, the index will contain duplicated data.
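In Jinja terms, the expected total is simply the product of the two parameters. A hypothetical one-liner (the numeric _gb variable names are assumed for illustration and are not actual track parameters):

{% set p_total_raw_volume_gb = (raw_data_volume_per_day_gb | default(50)) * (snapshot_restore_counts | default(100)) %}
{# e.g. 50 GB/day * 100 restores = 5000 GB, roughly 5 TB of raw data #}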

This is meant to be a fast way to increase the amount of data in an index, skipping the expensive data generation and indexing process.
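To make the idea concrete, here is a minimal sketch of how such a challenge could look in the track's Jinja-templated JSON. This is an illustration only, not the exact contents of this PR: the repository and snapshot names, the logs-* index pattern and the rename scheme are assumptions, and it relies on Rally's restore-snapshot operation accepting a restore request body:

{% set p_snapshot_restore_counts = (snapshot_restore_counts | default(100)) %}
{
  "name": "large-logs-dataset",
  "description": "Duplicate indexed data by restoring the same snapshot {{ p_snapshot_restore_counts }} times.",
  "schedule": [
    {% for i in range(p_snapshot_restore_counts) %}
    {
      "operation": {
        "operation-type": "restore-snapshot",
        "repository": "logging",
        "snapshot": "logging-snapshot",
        "wait-for-completion": true,
        "body": {
          "indices": "logs-*",
          "rename_pattern": "(.+)",
          "rename_replacement": "$1-restored-{{ i }}"
        }
      }
    }{{ "," if not loop.last else "" }}
    {% endfor %}
  ]
}

Each restore renames the restored indices so the copies do not collide, which is what multiplies the raw volume.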

Resolves #631

{% set p_dsl_poll_interval = (dsl_poll_interval | default(false) ) %}
{% set p_dsl_default_rollover = (dsl_default_rollover | default(false) ) %}

{% set p_skip_fleet_globals = (skip_fleet_globals | default(false) ) %}
@salvatore-campagna (Contributor, Author) commented:

I just copied this from another PR and will remove it once that PR is merged. It is necessary to avoid an error that occurs when deleting component templates.

@salvatore-campagna (Contributor, Author) commented:

@elastic/es-perf at the moment this challenge is failing with the following error:

esrally.exceptions.RallyAssertionError: Request returned an error. Error type: api, Description: repository_verification_exception ({'error': {'root_cause': [{'type': 'repository_verification_exception', 'reason': '[logging] path [observability/logging] is not accessible on master node'}], 'type': 'repository_verification_exception', 'reason': '[logging] path [observability/logging] is not accessible on master node', 'caused_by': {'type': 'i_o_exception', 'reason': 'Unable to upload object [observability/logging/tests-CcJdFLNpQ0SwoHLxwOkv7w/master.dat] using a single upload', 'caused_by': {'type': 'amazon_s3_exception', 'reason': 'Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: SZ54XD85DV32K3RX; S3 Extended Request ID: WdDrng9Ah0EmndL4q6y58b+vTMxCCmKgdffmUIALUXcSqpMrjAJS7enlwhkzkqWOQf3oYCW7ENg=; Proxy: null)'}}}, 'status': 500}), HTTP Status: 500

Any idea how to fix it? It looks like some S3 configuration or permission might be missing.
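For reference, the Elasticsearch repository-s3 documentation recommends granting the repository's AWS credentials roughly the following policy; the bucket name below is a placeholder, and the actual bucket and base_path used by the track may differ:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation",
        "s3:ListBucketMultipartUploads",
        "s3:ListBucketVersions"
      ],
      "Resource": "arn:aws:s3:::<your-rally-snapshot-bucket>"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:AbortMultipartUpload"
      ],
      "Resource": "arn:aws:s3:::<your-rally-snapshot-bucket>/*"
    }
  ]
}

The 403 AccessDenied on master.dat happens during repository verification, which attempts a test write under the configured base_path.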

@salvatore-campagna (Contributor, Author) commented:

I tried running another challenge in the elastic/logs track that uses snapshots and S3, and I see the same issue. I guess this means my user is lacking some permission.

esrally.exceptions.RallyError: Cannot run task [register-snapshot-repository]: Request returned an error. Error type: api, Description: repository_verification_exception ({'error': {'root_cause': [{'type': 'repository_verification_exception', 'reason': '[logging] path [observability/logging] is not accessible on master node'}], 'type': 'repository_verification_exception', 'reason': '[logging] path [observability/logging] is not accessible on master node', 'caused_by': {'type': 'i_o_exception', 'reason': 'Unable to upload object [observability/logging/tests-R6KKPdvrTTK3FnuBeibL1w/master.dat] using a single upload', 'caused_by': {'type': 'amazon_s3_exception', 'reason': 'Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: SWVHM6P139B4H9J7; S3 Extended Request ID: wsLQpvhYYCqMvFppInjYTLzXO83vxMvpZy3kIqlmKgRyZ88cvUCLOex1tFGOuHcro8zcjFAFzA1/WqO2RGCFwHvHS7s4ODV1; Proxy: null)'}}}, 'status': 500}), HTTP Status: 500
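In case it helps triage, the repository that the register-snapshot-repository task sets up presumably has settings along these lines (the bucket and client names are placeholders; the base_path matches the error message), so the AWS credentials resolved for that client are what need the write permissions:

{
  "type": "s3",
  "settings": {
    "bucket": "<your-rally-snapshot-bucket>",
    "client": "default",
    "base_path": "observability/logging"
  }
}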

@kkrik-es changed the title from "New large-logs-datset challenge in elastic/logs" to "New large-logs-dataset challenge in elastic/logs" on Jul 30, 2024