During restart, OpenSearch appears to attempt to relocate the Primary-shard of a remote_snapshot type Index and fails.
This might be an instance of the problem mentioned in #11563 (comment).
[2024-02-15T15:43:08,808][INFO ][o.o.c.r.a.a.BalancedShardsAllocator] [10.0.1.146] Swap relocation performed for shard [[index_5][0], node[ECzLzBEhTYmA58qyuEWNaQ], [R], s[STARTED], a[id=WGd5CZOZTf2-qD411BjkoQ]]
[2024-02-15T15:43:09,012][WARN ][o.o.i.c.IndicesClusterStateService] [10.0.1.146] [index_5][0] marking and sending shard failed due to [failed updating shard routing entry]
java.lang.IllegalArgumentException: illegal state: trying to move shard from primary mode to replica mode. Current [index_5][0], node[ECzLzBEhTYmA58qyuEWNaQ], [P], s[STARTED], a[id=WGd5CZOZTf2-qD411BjkoQ], new [index_5][0], node[ECzLzBEhTYmA58qyuEWNaQ], [R], s[STARTED], a[id=WGd5CZOZTf2-qD411BjkoQ]
at org.opensearch.index.shard.IndexShard.updateShardState(IndexShard.java:597) ~[opensearch-2.11.1.jar:2.11.1]
at org.opensearch.indices.cluster.IndicesClusterStateService.updateShard(IndicesClusterStateService.java:710) [opensearch-2.11.1.jar:2.11.1]
at org.opensearch.indices.cluster.IndicesClusterStateService.createOrUpdateShards(IndicesClusterStateService.java:650) [opensearch-2.11.1.jar:2.11.1]
at org.opensearch.indices.cluster.IndicesClusterStateService.applyClusterState(IndicesClusterStateService.java:293) [opensearch-2.11.1.jar:2.11.1]
at org.opensearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:606) [opensearch-2.11.1.jar:2.11.1]
at org.opensearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:593) [opensearch-2.11.1.jar:2.11.1]
at org.opensearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:561) [opensearch-2.11.1.jar:2.11.1]
at org.opensearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:484) [opensearch-2.11.1.jar:2.11.1]
at org.opensearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:186) [opensearch-2.11.1.jar:2.11.1]
at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:849) [opensearch-2.11.1.jar:2.11.1]
at org.opensearch.common.util.concurrent.PrioritizedOpenSearchThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedOpenSearchThreadPoolExecutor.java:282) [opensearch-2.11.1.jar:2.11.1]
at org.opensearch.common.util.concurrent.PrioritizedOpenSearchThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedOpenSearchThreadPoolExecutor.java:245) [opensearch-2.11.1.jar:2.11.1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
at java.lang.Thread.run(Thread.java:833) [?:?]
This appears to leave a Replica shard perpetually in the state of INITIALIZING:
This isn't exactly trivial to reproduce; something else seems to be involved in causing the problem. However, here are the steps taken to arrive at the current state:
The expected behavior is for all shards to be successfully recovered upon restart without operations that result in a Yellow-state (e.g. orphaned replica-shards).
Thanks @etgraylog! This does indeed look like the issue fixed by #11563. That fix is included in 2.12, which will be released in the coming week. Will you be able to pick up that release and test this?
Describe the bug
During restart, OpenSearch appears to attempt to relocate the Primary-shard of a remote_snapshot type Index and fails.
This might be an instance of the problem mentioned in #11563 (comment).
This appears to leave a Replica shard perpetually in the state of INITIALIZING:
Without any obvious causes of why:
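One way to look for a cause is to ask the cluster directly why the replica is not being assigned. The following is a minimal sketch, not the reporter's actual commands: it assumes a node reachable at http://localhost:9200 without TLS or authentication, and the helper names (`shards_request`, `explain_request`, `fetch`, `main`) are made up for illustration. It uses the `_cat/shards` and `_cluster/allocation/explain` endpoints with the index name and shard number taken from the log.

```python
import json
import urllib.request

# Assumption: a local node without TLS or authentication.
BASE_URL = "http://localhost:9200"


def shards_request(index: str) -> urllib.request.Request:
    """Build a request that lists every shard copy of one index as JSON."""
    return urllib.request.Request(f"{BASE_URL}/_cat/shards/{index}?format=json")


def explain_request(index: str, shard: int, primary: bool) -> urllib.request.Request:
    """Build a _cluster/allocation/explain request for a single shard copy."""
    body = json.dumps({"index": index, "shard": shard, "primary": primary})
    return urllib.request.Request(
        f"{BASE_URL}/_cluster/allocation/explain",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch(request: urllib.request.Request):
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def main() -> None:
    # Show the state of each copy of index_5 (primary or replica).
    for row in fetch(shards_request("index_5")):
        print(row["shard"], row["prirep"], row["state"], row.get("node"))
    # Ask why the replica copy of shard 0 is in its current state.
    explanation = fetch(explain_request("index_5", 0, primary=False))
    print(json.dumps(explanation, indent=2))
```

Calling `main()` against a live cluster prints the shard table followed by the allocation explanation for the replica of `[index_5][0]`.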
Related component
Storage:Snapshots
To Reproduce
This isn't exactly trivial to reproduce; something else seems to be involved in causing the problem. However, here are the steps taken to arrive at the current state:
https://opensearch.org/docs/latest/tuning-your-cluster/availability-and-recovery/snapshots/searchable_snapshot/#create-a-searchable-snapshot-index
Expected behavior
The expected behavior is for all shards to be successfully recovered upon restart without operations that result in a Yellow-state (e.g. orphaned replica-shards).
Additional Details
Plugins
Please list all plugins currently enabled.
Screenshots
If applicable, add screenshots to help explain your problem.
Host/Environment (please complete the following information):
OS: Debian Bullseye
OpenSearch version: 2.11.1
Additional context
This might be an instance of the problem mentioned in #11563 (comment).
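For reference, the two routing entries in the IllegalArgumentException message from the log can be compared field by field. The sketch below is only an illustration: the regular expression is written against the entry format seen in this one log line, and the names `ROUTING`, `parse_routings` and `changed_fields` are hypothetical. It shows that the node and allocation ID are identical and only the primary/replica flag differs.

```python
import re

# Pattern for one routing entry as it appears in the exception message.
# Assumption: the format matches the log line in this report.
ROUTING = re.compile(
    r"\[(?P<index>[^\]]+)\]\[(?P<shard>\d+)\], node\[(?P<node>[^\]]+)\], "
    r"\[(?P<role>[PR])\], s\[(?P<state>[A-Z_]+)\], a\[id=(?P<alloc_id>[^\]]+)\]"
)

# Exception message copied from the log in this report.
MESSAGE = (
    "illegal state: trying to move shard from primary mode to replica mode. "
    "Current [index_5][0], node[ECzLzBEhTYmA58qyuEWNaQ], [P], s[STARTED], "
    "a[id=WGd5CZOZTf2-qD411BjkoQ], new [index_5][0], "
    "node[ECzLzBEhTYmA58qyuEWNaQ], [R], s[STARTED], a[id=WGd5CZOZTf2-qD411BjkoQ]"
)


def parse_routings(message: str) -> list[dict[str, str]]:
    """Return every routing entry found in the message, in order."""
    return [match.groupdict() for match in ROUTING.finditer(message)]


def changed_fields(current: dict[str, str], new: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Map each field that differs to its (current, new) pair."""
    return {key: (current[key], new[key]) for key in current if current[key] != new[key]}


current, new = parse_routings(MESSAGE)
print(changed_fields(current, new))  # {'role': ('P', 'R')}
```

The single differing field matches the message text: the same shard copy on the same node is being asked to switch from primary to replica.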