
[Remote Store] Cluster State Applier thread blocked on remote store operations #12026

Open

gbbafna opened this issue Jan 26, 2024 · 2 comments

Labels: bug (Something isn't working), Cluster Manager, Storage:Durability (Issues and PRs related to the durability framework)

Comments

gbbafna (Collaborator) commented Jan 26, 2024

Describe the bug

On remote store clusters, we can see the cluster state applier thread blocked on remote store calls. When those calls take a long time, the node is unable to apply the cluster state, and the LagDetector on the cluster manager kicks it out of the cluster.

[2024-01-24T20:56:24,412][WARN ][o.o.i.c.IndicesClusterStateService] [a] [.index][5] marking and sending shard failed due to [failed to create shard]
java.io.IOException: java.io.IOException: Exception when listing blobs by prefix [x/y/z/metadata]
    at org.opensearch.index.store.RemoteDirectory.listFilesByPrefixInLexicographicOrder(RemoteDirectory.java:138)
    at org.opensearch.index.store.RemoteSegmentStoreDirectory.readLatestMetadataFile(RemoteSegmentStoreDirectory.java:191)
    at org.opensearch.index.store.RemoteSegmentStoreDirectory.init(RemoteSegmentStoreDirectory.java:145)
    at org.opensearch.index.store.RemoteSegmentStoreDirectory.<init>(RemoteSegmentStoreDirectory.java:132)
    at org.opensearch.index.store.RemoteSegmentStoreDirectoryFactory.newDirectory(RemoteSegmentStoreDirectoryFactory.java:74)
    at org.opensearch.index.store.RemoteSegmentStoreDirectoryFactory.newDirectory(RemoteSegmentStoreDirectoryFactory.java:49)
    at org.opensearch.index.IndexService.createShard(IndexService.java:488)
    at org.opensearch.indices.IndicesService.createShard(IndicesService.java:1036)
    at org.opensearch.indices.IndicesService.createShard(IndicesService.java:212)
    at org.opensearch.indices.cluster.IndicesClusterStateService.createShard(IndicesClusterStateService.java:673)
    at org.opensearch.indices.cluster.IndicesClusterStateService.createOrUpdateShards(IndicesClusterStateService.java:650)
    at org.opensearch.indices.cluster.IndicesClusterStateService.applyClusterState(IndicesClusterStateService.java:295)
    at org.opensearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:606)
    at org.opensearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:593)
    at org.opensearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:561)
    at org.opensearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:484)
    at org.opensearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:186)
    at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:858)
    at org.opensearch.common.util.concurrent.PrioritizedOpenSearchThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedOpenSearchThreadPoolExecutor.java:282)
    at org.opensearch.common.util.concurrent.PrioritizedOpenSearchThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedOpenSearchThreadPoolExecutor.java:245)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.io.IOException: Exception when listing blobs by prefix [x/y/z/metadata]
    at org.opensearch.repositories.s3.S3BlobContainer.listBlobsByPrefixInSortedOrder(S3BlobContainer.java:455)
    at org.opensearch.common.blobstore.BlobContainer.listBlobsByPrefixInSortedOrder(BlobContainer.java:234)
    at org.opensearch.common.blobstore.EncryptedBlobContainer.listBlobsByPrefixInSortedOrder(EncryptedBlobContainer.java:207)
    at org.opensearch.index.store.RemoteDirectory.listFilesByPrefixInLexicographicOrder(RemoteDirectory.java:127)
    ... 22 more
Caused by: software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool
    at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:111)
    at software.amazon.awssdk.core.exception.SdkClientException.create(SdkClientException.java:47)

On cluster manager node :

[2024-01-24T20:56:24,339][WARN ][o.o.c.c.LagDetector      ] [0f2] node [{a}{b}{c}{{dir}] is lagging at cluster state version [29651], although publication of cluster state version [29652] completed [1.5m] ago

[2024-01-24T20:56:25,192][INFO ][o.o.c.s.MasterService    ] [0f2] node-left [{a}{b}{c}{{dir}] reason: lagging], term: 14, version: 29656, delta: removed {[{a}{b}{c}{{dir}]}

Related component

Storage:Durability

To Reproduce

We see this when there is a high number of relocations in the cluster.

Expected behavior

The cluster state applier should have a dedicated threadpool for these interactions so that it does not get blocked on any shared resource, be it a threadpool, a connection pool, etc.
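A minimal sketch of the dedicated-pool idea, using plain `java.util.concurrent` rather than OpenSearch's actual `ThreadPool` wiring; the executor, the `listRemoteMetadataFiles` stand-in, and the timeout are hypothetical and only illustrate keeping the applier's wait bounded:

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class RemoteStoreOffloadSketch {

    // Hypothetical dedicated pool for remote store metadata calls; not the actual
    // OpenSearch threadpool wiring, just an illustration of the proposed separation.
    private static final ExecutorService REMOTE_STORE_EXECUTOR =
        Executors.newFixedThreadPool(4);

    // Stand-in for RemoteDirectory.listFilesByPrefixInLexicographicOrder(...),
    // i.e. the repository (S3) call seen in the stack trace above.
    static List<String> listRemoteMetadataFiles(String prefix) {
        return List.of(prefix + "/metadata__1");
    }

    // Called from the cluster state applier path (e.g. during shard creation).
    // Instead of blocking indefinitely on the repository call, wait with a bound
    // and fail the shard creation fast if the remote store is slow.
    static List<String> listWithBoundedWait(String prefix, long timeoutMillis) throws Exception {
        Future<List<String>> future =
            REMOTE_STORE_EXECUTOR.submit(() -> listRemoteMetadataFiles(prefix));
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // don't keep the applier thread waiting
            throw new java.io.IOException("Timed out listing remote metadata for " + prefix, e);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(listWithBoundedWait("x/y/z", 5_000));
        REMOTE_STORE_EXECUTOR.shutdown();
    }
}
```

With something along these lines, a slow repository call fails the shard creation quickly instead of holding the applier thread long enough for the LagDetector to evict the node.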

Additional Details

Plugins
repository-s3

Host/Environment (please complete the following information):

  • Amazon Linux 2


gbbafna added the bug (Something isn't working) and untriaged labels on Jan 26, 2024
github-actions bot added the Storage:Durability (Issues and PRs related to the durability framework) label on Jan 26, 2024
gbbafna changed the title from "[Remote Store] Cluster State Applier thread blocked on remote uploads" to "[Remote Store] Cluster State Applier thread blocked on remote store operations" on Jan 26, 2024
gbbafna removed the untriaged label on Jan 26, 2024
Bukhtawar (Collaborator) commented:

Thanks Gaurav. Ideally the cluster state applier thread should not perform any blocking or networking operation. But given the flow with remote store, we might need a dedicated, prioritized threadpool in remote store for the cluster state applier interactions, while still keeping the blocking behaviour of the cluster state applier thread itself.
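To illustrate the prioritized pool described here, a rough sketch with a `ThreadPoolExecutor` backed by a `PriorityBlockingQueue` (plain JDK classes, not OpenSearch's `PrioritizedOpenSearchThreadPoolExecutor` from the stack trace); the task names and priority values are made up:

```java
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PrioritizedRemoteStorePoolSketch {

    // A runnable with an explicit priority; lower value runs first.
    static class PrioritizedTask implements Runnable, Comparable<PrioritizedTask> {
        final int priority;
        final Runnable delegate;

        PrioritizedTask(int priority, Runnable delegate) {
            this.priority = priority;
            this.delegate = delegate;
        }

        @Override
        public void run() {
            delegate.run();
        }

        @Override
        public int compareTo(PrioritizedTask other) {
            return Integer.compare(this.priority, other.priority);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Dedicated pool for remote store calls; work queued on behalf of the cluster
        // state applier jumps ahead of routine remote store work.
        ThreadPoolExecutor remoteStorePool = new ThreadPoolExecutor(
            1, 1, 60, TimeUnit.SECONDS, new PriorityBlockingQueue<>());

        // Routine work (e.g. background metadata refresh) gets a low priority.
        remoteStorePool.execute(new PrioritizedTask(10, () -> System.out.println("background refresh 1")));
        remoteStorePool.execute(new PrioritizedTask(20, () -> System.out.println("background refresh 2")));

        // Applier-initiated work (e.g. directory init during shard creation) gets the
        // highest priority, so among the queued tasks it is picked up first.
        remoteStorePool.execute(new PrioritizedTask(0, () -> System.out.println("applier-initiated init")));

        remoteStorePool.shutdown();
        remoteStorePool.awaitTermination(10, TimeUnit.SECONDS);
    }
}
```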

shwetathareja (Member) commented:

We shouldn't perform expensive operations in the cluster state applier thread. If you can offload this work to a dedicated thread pool, that would be preferred.

Ideally, appliers are expected to finish before any listeners can be triggered, but we can evaluate whether appliers can execute the part of their task that doesn't depend on the cluster state in parallel, in the background.
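As one possible reading of running the remote-store-dependent work in the background, a hedged sketch where the applier only triggers an async init and returns; `readLatestRemoteMetadata`, the executor, and the callback handling are hypothetical stand-ins for the `RemoteSegmentStoreDirectory.init` path seen in the stack trace:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncShardInitSketch {

    private static final ExecutorService REMOTE_STORE_EXECUTOR = Executors.newFixedThreadPool(2);

    // Stand-in for the remote-store-dependent part of shard creation
    // (RemoteSegmentStoreDirectory.init -> readLatestMetadataFile).
    static List<String> readLatestRemoteMetadata(String shardPath) {
        return List.of(shardPath + "/metadata__latest");
    }

    // Called from applyClusterState: trigger the remote call in the background and
    // return immediately so the applier can move on to the next shard / next state.
    static void createShardAsync(String shardPath) {
        CompletableFuture
            .supplyAsync(() -> readLatestRemoteMetadata(shardPath), REMOTE_STORE_EXECUTOR)
            .whenComplete((metadata, failure) -> {
                if (failure != null) {
                    // Equivalent of "marking and sending shard failed", off the applier thread.
                    System.err.println("shard init failed for " + shardPath + ": " + failure);
                } else {
                    System.out.println("shard initialized with " + metadata);
                }
            });
    }

    public static void main(String[] args) throws InterruptedException {
        createShardAsync("x/y/z/5");
        Thread.sleep(500); // give the background task time to finish in this demo
        REMOTE_STORE_EXECUTOR.shutdown();
    }
}
```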
