[META] Reader/Writer Separation #15306
Comments
For the scale-to-zero implementation, the goal is to scale down the primary and regular replicas, keeping only the search replicas for search traffic, while retaining the ability to bring back the primary and regular replicas for write (index) traffic.
Update Cluster State (Initial)
Store Original Configuration
Prepare for Scale Down
Update Routing Table
Remove Shards
Handle Cluster Health
Scale Up Process (when flag is removed)
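The steps listed above can be sketched as a small in-memory model. This is only an illustration, not OpenSearch code: `IndexState`, `scale_down`, `scale_up`, and the role strings are hypothetical names, and the routing table is reduced to a list of `(shard_id, role)` pairs.

```python
from dataclasses import dataclass, field

# Hypothetical role names for this sketch; not OpenSearch identifiers.
WRITER_ROLES = {"primary", "replica"}


@dataclass
class IndexState:
    routing: list                      # list of (shard_id, role) tuples
    search_only: bool = False          # the index-level flag
    saved_writers: list = field(default_factory=list)


def scale_down(state: IndexState) -> IndexState:
    """Store the writer entries, then drop them from the routing table."""
    if state.search_only:
        return state                   # already scaled down, nothing to do
    state.saved_writers = [e for e in state.routing if e[1] in WRITER_ROLES]
    state.routing = [e for e in state.routing if e[1] not in WRITER_ROLES]
    state.search_only = True
    return state


def scale_up(state: IndexState) -> IndexState:
    """Restore the writer entries when the flag is removed."""
    if not state.search_only:
        return state
    state.routing = state.saved_writers + state.routing
    state.saved_writers = []
    state.search_only = False
    return state
```

In this model the original configuration is kept on the index state itself; whether it needs to be stored at all is one of the open questions in the discussion.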
Thanks to @mch2 for the initial setup! Based on the above points, please share any additional thoughts and let me know if any changes to the implementation order are required. Thanks |
Hey thanks for writing this out @prudhvigodithi, couple thoughts.
I don't think we need to store anything new here other than a flag to indicate that we are deleting writers at an index level, and perhaps the latest synced primary term and SegmentInfos version per shard (so that searchers will sync until they reach that version). The count of pri/replica would remain as is in IndexMetadata.
I think the routing tables should remove the primary and writer replica entries, so that if writers are scaled up again the shards are re-allocated. Alternatively, we could leave those entries in the tables as unassigned shards, but that's unnecessary if they are never reallocated, which would likely be the vast majority of cases.
Lastly, I think we should go for a flag name that better indicates that the writers are what is removed. Scale to zero could cover both reads/writes. Maybe something like |
Thanks @mch2 here is a small POC for adding the index flag
|
Coming from #15306 (comment), having the right allocation rules didn't help to directly remove the primary and replica shards. I had to make a few changes to the segment replication code (main...prudhvigodithi:OpenSearch:searchonly). @mch2, please take a look at the initial approach. With the following updates I can see the shards are actually removed.
The status is
My initial approach is to have a POC to completely scale down (scale to zero) the primary and replicas having just the
The next steps are
Thank you |
Had an offline sync up with @mch2 and yes we should be able to remove the custom allocation decider (
Directly modifies the routing tables and closes the primary and replica shards, leaving only the search replicas to serve search traffic. From
To
When updated to
The next steps are:
|
Coming from #15306 (comment) with the latest change main...prudhvigodithi:OpenSearch:searchonly, a final sync is done by the primary shards to the remote store before closing the shard.
The cluster health and cluster state are also handled; notice the
|
thanks @prudhvigodithi for pushing this fwd, looking good so far! Feel free to raise a draft when ready and we can discuss there.
Let's prioritize the remote case. Users who want this separation likely won't be syncing from primary shards in the first place. As far as your next steps:
This should be taken care of on primary close if we ensure the engine is flushed
Yes, we'll need to ensure these shards have the latest checkpoint before stopping the async task in IndexService. We should store additional metadata with the store itself rather than putting more into cluster state. |
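The catch-up condition discussed in this thread (search replicas keep syncing until they reach the last checkpoint the writers published) could look roughly like the sketch below. `Checkpoint`, `is_caught_up`, and `pending_shards` are hypothetical names, and ordering checkpoints by primary term first and SegmentInfos version second is an assumption of this sketch, not a statement about the OpenSearch implementation.

```python
from typing import Dict, List, NamedTuple


class Checkpoint(NamedTuple):
    # Assumed ordering: primary term first, then SegmentInfos version.
    primary_term: int
    segment_infos_version: int


def is_caught_up(searcher: Checkpoint, target: Checkpoint) -> bool:
    """True once the searcher has reached (or passed) the target checkpoint."""
    return searcher >= target          # tuple comparison, field by field


def pending_shards(targets: Dict[int, Checkpoint],
                   searchers: Dict[int, Checkpoint]) -> List[int]:
    """Shard ids whose search replica still has to sync before polling stops."""
    return sorted(
        shard_id
        for shard_id, target in targets.items()
        if shard_id not in searchers
        or not is_caught_up(searchers[shard_id], target)
    )
```

Under these assumptions, the periodic sync would only be stopped once `pending_shards` returns an empty list.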
Thanks @mch2, I have updated the method AFAIK the last pending POC check is to stop the search replicas (or do a final sync) once |
Ah interesting, so in the normal index closure case we do not guarantee the docs have been indexed before closure, only that they are persisted via the remote translog. We'll need to change this for the scale down case. |
I took a stab at the cluster health in more detail and did a small refactoring of
When all green
With scale down index (scale to zero)
With unhealthy search replicas:
|
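As a rough illustration of how health might be derived for a scaled-down index, the sketch below looks only at search replicas. The rule itself (green when all expected search replicas are active, yellow when only some are, red when none are) is an assumption made for this example, and `search_only_health` is a hypothetical helper, not part of the cluster health API.

```python
def search_only_health(active: int, expected: int) -> str:
    """Health colour for a search-only index, based on search replicas alone.

    Assumed rule for this sketch:
      green  -> every expected search replica is active
      yellow -> some, but not all, are active
      red    -> none are active
    """
    if active < 0 or expected < 0 or active > expected:
        raise ValueError("active must be between 0 and expected")
    if active == 0:
        return "red"
    if active < expected:
        return "yellow"
    return "green"
```

With writers removed, unassigned primaries and regular replicas are deliberately left out of the calculation in this sketch, so a scaled-down index is not reported as red simply because it has no primary.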
Please describe the end goal of this project
High level goal is to separate indexing and search traffic within remote store enabled clusters to achieve independent scalability and failure/workload isolation.
Supporting References
#15237
#7258
#14596
Issues
Experimental feature goals - Achieve workload & failure isolation with read scale to zero.
-- Experimental (target 2.17)
GA feature goals: Achieve write scale to zero, ISM/Clients integration, and reasonable (cat/health) API updates.
Related component
Search:Performance