[META] Remote Routing Table - v2.15 #12995
Labels
Meta
Meta issue, not directly linked to a PR
ShardManagement:Routing
v2.15.0
Issues and PRs related to version 2.15.0
Comments
[Triage - attendees 1 2 3 4 5 6 7 8]
github-project-automation
bot
moved this from 🆕 New
to ✅ Done
in Shard Management Project Board
Apr 3, 2024
github-project-automation
bot
moved this from ✅ Done
to 🏗 In progress
in Shard Management Project Board
Jun 2, 2024
This was referenced Jun 4, 2024
74 tasks
This was referenced Jun 10, 2024
himshikha
changed the title
[META] Remote Routing Table
[META] Remote Routing Table - v2.15
Jun 11, 2024
github-project-automation
bot
moved this from 🏗 In progress
to ✅ Done
in Shard Management Project Board
Jul 9, 2024
github-project-automation
bot
moved this to 2.15.0 (Release window opens on June 10th, 2024 and closes on June 25th, 2024)
in OpenSearch Project Roadmap
Aug 30, 2024
Please describe the end goal of this project
Project Meta: #14164
This Meta tracks issues targeted for v2.15.
Each shard movement results in a cluster state update that must be communicated to all data nodes so that they can route requests effectively. This becomes a scaling problem for larger clusters with many nodes: bigger states and a high volume and frequency of network transfers can swamp the inter-node network.
Proposed solution: Reduce the memory and communication overhead of routing table updates by using a remote store as an intermediate store, so that data transfers go through the remote store and node-to-node network bandwidth is spared.
We will move the routing table to the remote store. The cluster manager node will be responsible for updating the remote store whenever the routing changes. Since the complete table will be in storage, we can optimize what is kept in memory on the nodes and use the remote store to fetch routing information whenever it is required. Data nodes will only need to keep routings for replica shards whose primaries reside on the node.
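A minimal sketch of the idea above, not the OpenSearch implementation: the full routing table is uploaded to a store, and a data node derives a reduced in-memory view holding only replica routings whose primary is on that node. `ShardRouting`, `RemoteStore`, `local_view`, and the `"routing-table"` key are hypothetical names introduced for illustration, and the dict-backed store stands in for a real blob store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShardRouting:
    """Hypothetical, simplified routing entry for one shard copy."""
    index: str
    shard_id: int
    primary: bool
    node_id: str


class RemoteStore:
    """Dict-backed stand-in for a remote blob store (assumption)."""

    def __init__(self):
        self._blobs = {}

    def upload(self, key, routings):
        # The cluster manager would call this on every routing change.
        self._blobs[key] = tuple(routings)

    def download(self, key):
        return self._blobs[key]


def local_view(full_table, node_id):
    """Keep only replica routings whose primary lives on node_id."""
    primaries_here = {
        (r.index, r.shard_id)
        for r in full_table
        if r.primary and r.node_id == node_id
    }
    return [
        r for r in full_table
        if not r.primary and (r.index, r.shard_id) in primaries_here
    ]


store = RemoteStore()
table = [
    ShardRouting("logs", 0, True, "node-a"),
    ShardRouting("logs", 0, False, "node-b"),
    ShardRouting("logs", 1, True, "node-b"),
    ShardRouting("logs", 1, False, "node-c"),
]
store.upload("routing-table", table)

# node-a hosts the primary of logs[0], so it keeps only that shard's replica.
view_a = local_view(store.download("routing-table"), "node-a")
```

In this sketch the node still downloads the full table before filtering; whether the real design fetches per-index or per-shard objects is not stated in this issue.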
To reduce communication overhead, cluster state publication will notify data nodes of the change with the updated cluster state term and version rather than the complete diff. Data nodes will then download the updated routing information from the remote store. This makes publication from the cluster manager faster, and each node can update its local memory independently.
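The publication flow described above can be sketched as follows, under stated assumptions: the publication message carries only a term and a version, and each data node pulls the matching routing table from storage. `Publication`, `RemoteStore`, `DataNode`, and the rule of comparing `(term, version)` tuples to decide whether to download are all hypothetical simplifications, not the actual OpenSearch acceptance logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Publication:
    """Lightweight message: identifies a cluster state, carries no diff."""
    term: int
    version: int


class RemoteStore:
    """Dict-backed stand-in for a remote blob store (assumption)."""

    def __init__(self):
        self._tables = {}

    def upload(self, term, version, table):
        self._tables[(term, version)] = dict(table)

    def download(self, term, version):
        return self._tables[(term, version)]


class DataNode:
    def __init__(self, store):
        self.store = store
        self.applied = (0, 0)  # last applied (term, version)
        self.routing = {}
        self.downloads = 0

    def on_publication(self, pub):
        """Download the routing table only if the publication is newer."""
        incoming = (pub.term, pub.version)
        if incoming <= self.applied:
            return False  # stale or duplicate publication, nothing to do
        self.routing = self.store.download(pub.term, pub.version)
        self.applied = incoming
        self.downloads += 1
        return True


store = RemoteStore()
node = DataNode(store)

# Cluster manager side: upload first, then publish only term and version.
store.upload(1, 5, {"logs[0]": "node-a", "logs[1]": "node-b"})
first = node.on_publication(Publication(term=1, version=5))
repeat = node.on_publication(Publication(term=1, version=5))
```

Uploading before publishing matters in this sketch: a node that receives the term and version must be able to find the corresponding table in storage.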
Issues
Related component
ShardManagement:Routing