-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Feature Request] Redefine the computation of segment replication metrics in Node Stats #16801
Comments
I assume "search replicas" term is from Reader-Writer separation proposal #15237 and it would be using pull based replication. Is that correct?
If replica is overloaded and download is slow, search replica would be downloading data from metadata that was polled, let's say 10 mins back. Unless it polls the latest metadata again, will we be able to understand the bytes left to be downloaded? |
But they will still need this information for doing write rejections due to searchers lagging . I like this method as it will help in pin pointing exact node/shard copy which is having issue, as opposed to a convoluted approach today. |
I think we can do something like this: Each replica regardless of type tracks the latest received checkpoint that includes a timestamp of when the primary (writer) refreshed on it. Search replica would have to poll for the latest cp from the store while writer replicas get this already when checkpoints are published. @sachinpkale is correct, the current pull based replication skips the metadata fetch if the search replica is already running replication, so we'd have to add a separate process to fetch the cp or allow the checkpoint step to continue. Node stats should report the state of the replicas on the node, not the state of the replicas of the primaries on the node (as is today). Each shard then compares its currently served checkpoint to the "latest received". If the sis version is different ie the shard is behind the store, lag becomes the diff between current time & the latest received cp's timestamp which is roughly the replication lag - upload time. Bytes behind is already computed by the diff of two checkpoints as metadata is included. Checkpoints behind could be the diff in sis version, although this metric is pretty useless imo. Backpressure: Ultimately we are giving up some precision on lag computation for simplicity, which I think is well worth it. |
Is your feature request related to a problem? Please describe
Currently, during segment replication, primary shards collect stats from each of their replicas. However, search replicas will not report stats to the primary shard.
Node stats return the following metrics, which are based on stats collected by the primary from its replicas:
We want to
Describe the solution you'd like
The idea is to stop comparing replicas with the primary, as is done currently, and instead report stats for each replica based on its ongoing replication events, similar to
cat recovery
. This approach decouples the primary from stats calculation, and the primary shard will no longer need to maintain segment replication checkpoint information for all its replicas in theReplicationTracker
.We can generate metrics to answer the following questions:
This will decouple the behavior so that we can use point in time stats of each shard for the node stats calculation for segment replication.
Please note that we are currently modifying how search replicas compute and return stats in cat segrep. Details are discussed here: #15534.
Related component
Search:Performance
The text was updated successfully, but these errors were encountered: