Indexing flow enhancements for admission controller #8911
Labels
enhancement
Enhancement or improvement to existing feature or request
Indexing
Indexing, Bulk Indexing and anything related to indexing
Overview
The parent RFC #8910 discusses the admission control framework, which limits and restricts the incoming requests early when a node begins to go under stress.
OpenSearch today provides few gating mechanisms to protect a node when under duress, via the concepts of queue rejections and circuit breakers. However, queue sizes are fixed and isolated and do not effectively represent the total work required to be done. Similarly, Circuit breakers act as a last line of defence, are mostly too late to act upon, and do not offer fairness. Some of these gaps create availability issues with the cluster when under duress due to hardware failures, node performance degradation, or traffic bursts.
Shard Indexing Pressure today tries to address this to some extent by rejecting indexing requests based on the memory accounting at the shard level along with other key performance factors like throughput and last successful requests.
Challenges in the Shard Indexing BackPressure
Proposed Solution
The admission controller framework we are suggesting will track the resource utilisation for all the downstream nodes and maintain these stats at the coordinator. Along with the shard indexing backpressure, which tracks the shard level view for downstream nodes, we are proposing a new solution that will give the node-level performance stats for all the downstream nodes.
In the admission controller framework, we will support multiple levels of rejection, such as rejection at the coordinator based on the coordinator's or downstream nodes resource utilisation and rejection at the target node based on the target node's resource utilisation. As part of it, we will enhance the indexing flow to proactively reject requests if the data nodes are stressed.
Reject the Incoming Indexing request based on the resource utilization of downstream nodes
Proposed Indexing Flow
For every bulk shard request, we will evaluate the performance stats of all nodes that are associated with primary or replica shards of the indexing operation and proceed with the request either by allowing it or rejecting it.
Primary Shard Node is in Stress
Replica Shard Node is in Stress
This will be a further extension to node-level rejections based on nodeLimitBreached at the coordinator which now consider rejections based on the CPU/JVM/IO usage of the downstream nodes.
Rejection of Incoming Indexing requests at target node
The coordinator will allow the requests to the stressed target nodes, when it doesn’t have the latest stats for the downstream nodes. In such cases, we will reject requests at target/data nodes.
Primary Shard Node is in Stress
Replica Shard Node is in Stress
Co-authored by @bharath-techie
The text was updated successfully, but these errors were encountered: