Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fetch all the locks for a shard to avoid multiple remote store calls #11987

Merged

Conversation

sachinpkale
Copy link
Member

@sachinpkale sachinpkale commented Jan 23, 2024

Description

  • Currently, snapshot interop with remote store takes a lock on remote segment store metadata files.
  • This lock ensures that the metadata file is not deleted as part of cleanup of stale files.
  • Cleanup of stale metadata files happen at each flush and to check if the lock is acquired or not, a remote store call is made.
  • This can result in too many remote store calls when there are too many snapshots present and frequent flushes.
  • In this change, we list all the lock files to avoid multiple remote store calls.

Related Issues

Resolves #11409

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions github-actions bot added enhancement Enhancement or improvement to existing feature or request Storage:Durability Issues and PRs related to the durability framework labels Jan 23, 2024
Copy link
Contributor

github-actions bot commented Jan 23, 2024

Compatibility status:

Checks if related components are compatible with change 2381cb7

Incompatible components

Incompatible components: [https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/performance-analyzer-rca.git]

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/alerting.git]

Copy link
Contributor

❌ Gradle check result for e5a3d2b: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Member

@ashking94 ashking94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good mostly, will re-review after UTs/ITs.

Signed-off-by: Sachin Kale <[email protected]>
Copy link
Contributor

❌ Gradle check result for 68bd634: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Sachin Kale <[email protected]>
Copy link
Collaborator

@gbbafna gbbafna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good .

Copy link
Contributor

❌ Gradle check result for 9837f28: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Sachin Kale <[email protected]>
Copy link
Contributor

❌ Gradle check result for 26fc4cd: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for 26fc4cd: null

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for 26fc4cd: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❕ Gradle check result for 26fc4cd: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=search.aggregation/20_terms/string profiler via global ordinals}

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

Copy link

codecov bot commented Jan 25, 2024

Codecov Report

Attention: 1 lines in your changes are missing coverage. Please review.

Comparison is base (304042f) 71.37% compared to head (2381cb7) 71.40%.
Report is 16 commits behind head on main.

Files Patch % Lines
...earch/index/store/RemoteSegmentStoreDirectory.java 88.88% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #11987      +/-   ##
============================================
+ Coverage     71.37%   71.40%   +0.02%     
- Complexity    59403    59408       +5     
============================================
  Files          4923     4923              
  Lines        279214   279218       +4     
  Branches      40596    40596              
============================================
+ Hits         199302   199380      +78     
+ Misses        63375    63218     -157     
- Partials      16537    16620      +83     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@sachinpkale sachinpkale requested a review from ashking94 January 25, 2024 07:01
Signed-off-by: Sachin Kale <[email protected]>
Copy link
Member

@ashking94 ashking94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Contributor

❕ Gradle check result for 2381cb7: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.remotestore.RemoteIndexPrimaryRelocationIT.testPrimaryRelocationWhileIndexing
      1 org.opensearch.cluster.allocation.ClusterRerouteIT.testDelayWithALargeAmountOfShards

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

@andrross andrross added the backport 2.x Backport to 2.x branch label Jan 26, 2024
@andrross andrross merged commit 6e744f5 into opensearch-project:main Jan 26, 2024
39 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Jan 26, 2024
…11987)

* Fetch all the locks for a shard to avoid multiple calls

Signed-off-by: Sachin Kale <[email protected]>

* Fix lock file comparison issue

Signed-off-by: Sachin Kale <[email protected]>

* Add unit tests

Signed-off-by: Sachin Kale <[email protected]>

* Add more unit tests

Signed-off-by: Sachin Kale <[email protected]>

* Address PR comments

Signed-off-by: Sachin Kale <[email protected]>

---------

Signed-off-by: Sachin Kale <[email protected]>
Co-authored-by: Sachin Kale <[email protected]>
(cherry picked from commit 6e744f5)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
andrross pushed a commit that referenced this pull request Jan 26, 2024
…11987) (#12037)

* Fetch all the locks for a shard to avoid multiple calls



* Fix lock file comparison issue



* Add unit tests



* Add more unit tests



* Address PR comments



---------



(cherry picked from commit 6e744f5)

Signed-off-by: Sachin Kale <[email protected]>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sachin Kale <[email protected]>
peteralfonsi pushed a commit to peteralfonsi/OpenSearch that referenced this pull request Mar 1, 2024
…pensearch-project#11987)

* Fetch all the locks for a shard to avoid multiple calls

Signed-off-by: Sachin Kale <[email protected]>

* Fix lock file comparison issue

Signed-off-by: Sachin Kale <[email protected]>

* Add unit tests

Signed-off-by: Sachin Kale <[email protected]>

* Add more unit tests

Signed-off-by: Sachin Kale <[email protected]>

* Address PR comments

Signed-off-by: Sachin Kale <[email protected]>

---------

Signed-off-by: Sachin Kale <[email protected]>
Co-authored-by: Sachin Kale <[email protected]>
rayshrey pushed a commit to rayshrey/OpenSearch that referenced this pull request Mar 18, 2024
…pensearch-project#11987)

* Fetch all the locks for a shard to avoid multiple calls

Signed-off-by: Sachin Kale <[email protected]>

* Fix lock file comparison issue

Signed-off-by: Sachin Kale <[email protected]>

* Add unit tests

Signed-off-by: Sachin Kale <[email protected]>

* Add more unit tests

Signed-off-by: Sachin Kale <[email protected]>

* Address PR comments

Signed-off-by: Sachin Kale <[email protected]>

---------

Signed-off-by: Sachin Kale <[email protected]>
Co-authored-by: Sachin Kale <[email protected]>
shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
…pensearch-project#11987)

* Fetch all the locks for a shard to avoid multiple calls

Signed-off-by: Sachin Kale <[email protected]>

* Fix lock file comparison issue

Signed-off-by: Sachin Kale <[email protected]>

* Add unit tests

Signed-off-by: Sachin Kale <[email protected]>

* Add more unit tests

Signed-off-by: Sachin Kale <[email protected]>

* Address PR comments

Signed-off-by: Sachin Kale <[email protected]>

---------

Signed-off-by: Sachin Kale <[email protected]>
Co-authored-by: Sachin Kale <[email protected]>
Signed-off-by: Shivansh Arora <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch enhancement Enhancement or improvement to existing feature or request skip-changelog Storage:Durability Issues and PRs related to the durability framework
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cache metadata lock info to avoid remote store calls on each commit
4 participants