[FEATURE] Avoid throttling when writing data to index #640

Open
ykmr1224 opened this issue Sep 10, 2024 · 0 comments

Is your feature request related to a problem?
When Flint executes an indexing query, each worker node calls the OpenSearch bulk API to write data.
Currently, this can cause throttling and execution failure because the request rate is too high.
Although a rate limiter (#567) and bulk request retry (#572) are already implemented, failures are still likely, especially when:

  • There are many worker nodes
  • OpenSearch Serverless is used as the index store (it does not support the WAIT_FOR refresh policy)
  • Other jobs or processes send requests to the same destination

What solution would you like?
To fully address this issue, we want to introduce an adaptive rate limit for bulk requests. One example of an adaptive control algorithm is AIMD (additive-increase/multiplicative-decrease), which is used for TCP congestion control (ref). With this kind of algorithm, each node adjusts its own traffic by gradually increasing the call rate and quickly decreasing it when a request is throttled. (Ideally, each node ends up with a similar call rate.)
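
To make the idea concrete, here is a minimal sketch of an AIMD rate controller. It is only an illustration, not Flint code: the class name `AimdRateLimit`, the helper `simulate`, and all default values (starting rate, bounds, step, and decrease factor) are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AimdRateLimit:
    """Hypothetical AIMD controller for a per-node bulk request rate."""

    rate: float = 10.0            # current allowed requests per second (assumed start)
    min_rate: float = 1.0         # floor so a node never stalls completely
    max_rate: float = 100.0       # ceiling on the per-node rate
    increase_step: float = 1.0    # additive increase after a successful request
    decrease_factor: float = 0.5  # multiplicative decrease after a throttled request

    def on_success(self) -> float:
        """Additive increase: raise the rate by a fixed step, capped at max_rate."""
        self.rate = min(self.max_rate, self.rate + self.increase_step)
        return self.rate

    def on_throttled(self) -> float:
        """Multiplicative decrease: scale the rate down, but not below min_rate."""
        self.rate = max(self.min_rate, self.rate * self.decrease_factor)
        return self.rate


def simulate(limit: AimdRateLimit, outcomes: List[bool]) -> List[float]:
    """Apply a sequence of outcomes (True = success, False = throttled)."""
    return [limit.on_success() if ok else limit.on_throttled() for ok in outcomes]


if __name__ == "__main__":
    limit = AimdRateLimit()
    # Three successes, one throttled request, then two more successes.
    print(simulate(limit, [True, True, True, False, True, True]))
    # prints [11.0, 12.0, 13.0, 6.5, 7.5, 8.5]
```

The slow climb and sharp drop are what let independent nodes back off quickly under contention without any coordination between them. In a real integration the resulting rate would feed whatever limiter gates the bulk calls.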

Netflix/concurrency-limits provides an AIMD-based concurrency limit, and we might be able to use it in our implementation.

What alternatives have you considered?

  • Centralized rate limiter using DynamoDB, Redis, etc.
  • Having the driver node decide the rate limit for each worker node

Do you have any additional context?
n/a
