
Add rate limiter for bulk request #567

Merged · 9 commits · Aug 16, 2024

Conversation

ykmr1224 (Collaborator)

Description

  • Add rate limiter for bulk request
  • Without a rate limit, it would keep sending requests until the destination OpenSearch throttles them (especially when the NONE refresh policy is used).
  • Added config param spark.datasource.flint.write.bulkRequestRateLimitPerNode to specify the rate limit per node.
  • The default is set to 0, which means no rate limit.
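A minimal sketch of the behavior the description implies: a configured rate of 0 disables limiting entirely, and any positive rate spaces requests out evenly. Class and method names here are illustrative assumptions, not the PR's exact code.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.LockSupport;

// Sketch only: a per-node bulk request rate limiter gated on a config value.
// A rate of 0 means "no rate limit", matching the documented default.
public class BulkRequestRateLimiterSketch {
    private final long minIntervalNanos; // 0 means disabled
    private long nextFreeTime = System.nanoTime();

    public BulkRequestRateLimiterSketch(long requestsPerSecond) {
        this.minIntervalNanos =
            requestsPerSecond <= 0 ? 0 : TimeUnit.SECONDS.toNanos(1) / requestsPerSecond;
    }

    public boolean isEnabled() {
        return minIntervalNanos > 0;
    }

    // Blocks until the next request is allowed; no-op when disabled.
    public synchronized void acquirePermit() {
        if (!isEnabled()) {
            return;
        }
        long now = System.nanoTime();
        long wait = nextFreeTime - now;
        nextFreeTime = Math.max(now, nextFreeTime) + minIntervalNanos;
        if (wait > 0) {
            LockSupport.parkNanos(wait);
        }
    }

    public static void main(String[] args) {
        // rate 0 disables limiting, matching the documented default
        BulkRequestRateLimiterSketch disabled = new BulkRequestRateLimiterSketch(0);
        System.out.println(disabled.isEnabled()); // false

        BulkRequestRateLimiterSketch twoPerSec = new BulkRequestRateLimiterSketch(2);
        long start = System.nanoTime();
        for (int i = 0; i < 3; i++) {
            twoPerSec.acquirePermit();
        }
        // 3 permits at 2/sec should take roughly 1 second after the first permit
        System.out.println(TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start) >= 900);
    }
}
```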


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Tomoyuki Morita <[email protected]>
@ykmr1224 ykmr1224 marked this pull request as ready for review August 15, 2024 16:56

@penghuo penghuo left a comment


During index/MV refreshing, can the user adjust this value if the customer is under pressure?
My concern is that this configuration may be helpful during the testing stage but not in a production environment. We should investigate methods to dynamically control the ingestion rate.

docs/index.md Outdated
@@ -527,6 +527,7 @@ In the index mapping, the `_meta` and `properties`field stores meta and schema i
- `spark.datasource.flint.write.batch_size`: The number of documents written to Flint in a single batch request. Default value is Integer.MAX_VALUE.
- `spark.datasource.flint.write.batch_bytes`: The approximate amount of data in bytes written to Flint in a single batch request. The actual amount of data written to OpenSearch may be more than this. Default value is 1mb. The writing process checks after each document whether the total number of documents (docCount) has reached batch_size or the buffer size has surpassed batch_bytes. If either condition is met, the current batch is flushed and the document count resets to zero.
- `spark.datasource.flint.write.refresh_policy`: Default value is false. Valid values: [NONE(false), IMMEDIATE(true), WAIT_UNTIL(wait_for)].
- `spark.datasource.flint.write.bulkRequestRateLimitPerNode`: Default value is 0, which disables the rate limit.
Collaborator


Does 0.1 mean 1 request per 10 seconds? Could you add more detail to the doc?

ykmr1224 (Collaborator, Author) · Aug 16, 2024


It won't accept decimal values. We could support them if we modified the rate limit period, but I think it would become complicated considering someone might specify a value like 1.23. If we want to reduce the traffic to less than 1 request/sec, we can reduce the batch size instead, which would reduce the actual amount of data per request.
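The trade-off described here can be made concrete with a little arithmetic: at a fixed integer request rate, shrinking batch_size lowers the effective document throughput, which achieves the same effect as a fractional request rate would. A hedged sketch (the numbers are purely illustrative):

```java
// Sketch of the batch-size trade-off: effective document throughput is
// bounded by requestsPerSecond * batchSize, so halving the batch size at
// 1 req/sec is equivalent to 0.5 req/sec at the original batch size.
public class ThroughputSketch {
    static long maxDocsPerSecond(long requestsPerSecond, long batchSize) {
        return requestsPerSecond * batchSize;
    }

    public static void main(String[] args) {
        // 1 req/sec with 500-doc batches ~ 0.5 req/sec with 1000-doc batches
        System.out.println(maxDocsPerSecond(1, 500));  // 500
        System.out.println(maxDocsPerSecond(1, 1000)); // 1000
    }
}
```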


private BulkRequestRateLimiterHolder() {}

public synchronized static BulkRequestRateLimiter getBulkRequestRateLimiter(
Collaborator


Is the Holder necessary? Could this be moved into BulkRequestRateLimiter?

Collaborator Author


As written in the code comment, the Holder is needed so we can share a single instance while keeping BulkRequestRateLimiter testable. If we made BulkRequestRateLimiter itself a singleton, we could not test it with different parameters.

return execute(OS_WRITE_OP_METRIC_PREFIX, () -> client.bulk(bulkRequest, options));
return execute(OS_WRITE_OP_METRIC_PREFIX, () -> {
try {
rateLimiter.acquirePermit();
Collaborator


  1. Each bulk request contains multiple index requests. If the throttle is on each index request, the bulk request limit may not help.
  2. How does an OpenSearch customer configure this parameter?

Collaborator Author


  1. A rate limit on bulk requests limits the overall number of index requests, since each bulk request carries at most one batch of index requests.
  2. They can add it as an extra Spark parameter for now. We might want to add an attribute to Datasource so we can specify it per datasource.
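The change shown in the earlier diff can be sketched end to end. This is a simplified stand-in: `execute`, the metric prefix, and the bulk call are stubbed (the real code wraps `client.bulk(bulkRequest, options)` and handles `InterruptedException` in a try block); only the wrapping pattern is the point.

```java
import java.util.function.Supplier;

// Sketch of the rate-limited bulk path: acquire a permit before each
// bulk request, inside the metric-wrapping execute() helper.
public class RateLimitedBulkSketch {
    interface RateLimiter {
        void acquirePermit();
    }

    // Stub for Flint's metric-wrapping execute(); here it just runs the op.
    static <T> T execute(String metricPrefix, Supplier<T> op) {
        return op.get();
    }

    static String bulkWithRateLimit(RateLimiter rateLimiter, Supplier<String> bulkCall) {
        return execute("OS_WRITE_OP_METRIC_PREFIX", () -> {
            rateLimiter.acquirePermit(); // blocks until a request is allowed
            return bulkCall.get();       // stands in for client.bulk(bulkRequest, options)
        });
    }

    public static void main(String[] args) {
        RateLimiter noop = () -> {}; // disabled limiter: permits are free
        System.out.println(bulkWithRateLimit(noop, () -> "bulk-response"));
    }
}
```

Because the permit is taken per bulk request, the index requests inside each batch are throttled collectively rather than individually.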

@ykmr1224
Collaborator Author

During index/MV refreshing, can the user adjust this value if the customer is under pressure? My concern is that this configuration may be helpful during the testing stage but not in a production environment. We should investigate methods to dynamically control the ingestion rate.

Let me take this as an item for a long-term solution. If we use a token bucket with an external state store (DynamoDB, etc.), we can adjust the limit dynamically.
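A hedged sketch of that long-term idea: a token bucket whose refill rate can be changed at runtime, as it could be if the rate lived in an external store such as DynamoDB. The in-memory rate field here is a stand-in for the external store; nothing below is the PR's code.

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of a dynamically adjustable token bucket. A real implementation
// would read the current rate from an external state store (e.g. DynamoDB)
// so operators can change the limit without restarting the Spark job.
public class DynamicTokenBucketSketch {
    private final AtomicLong refillPerSecond; // stand-in for the external store
    private double tokens;
    private long lastRefillNanos = System.nanoTime();

    public DynamicTokenBucketSketch(long initialRate) {
        this.refillPerSecond = new AtomicLong(initialRate);
        this.tokens = initialRate; // start with a full bucket
    }

    // Operators adjust the limit at runtime; takes effect on the next refill.
    public void setRate(long newRate) {
        refillPerSecond.set(newRate);
    }

    public synchronized boolean tryAcquire() {
        long now = System.nanoTime();
        double elapsedSec = (now - lastRefillNanos) / 1e9;
        long rate = refillPerSecond.get();
        // Refill proportionally to elapsed time, capped at the bucket size.
        tokens = Math.min(rate, tokens + elapsedSec * rate);
        lastRefillNanos = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        DynamicTokenBucketSketch bucket = new DynamicTokenBucketSketch(2);
        System.out.println(bucket.tryAcquire()); // bucket starts full
        System.out.println(bucket.tryAcquire());
        System.out.println(bucket.tryAcquire()); // both tokens already spent
        bucket.setRate(100); // limit raised dynamically, no restart needed
    }
}
```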

@ykmr1224
Collaborator Author

Added [Experimental] to the doc.

@penghuo penghuo merged commit 15ee355 into opensearch-project:main Aug 16, 2024
6 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Aug 16, 2024
Signed-off-by: Tomoyuki Morita <[email protected]>
(cherry picked from commit 15ee355)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
penghuo pushed a commit that referenced this pull request Aug 16, 2024
(cherry picked from commit 15ee355)

Signed-off-by: Tomoyuki Morita <[email protected]>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
penghuo pushed a commit that referenced this pull request Aug 17, 2024
(cherry picked from commit 15ee355)

Signed-off-by: Tomoyuki Morita <[email protected]>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>