link previous cluster uuid to current cluster uuid even if current cluster uuid is not committed #10832


linuxpi
Collaborator

@linuxpi linuxpi commented Oct 22, 2023

Description

Cluster chaining can break if there are consecutive node replace and restart operations.

This change removes the clusterUUIDCommitted check on the current state when fetching previousClusterUUID from remote. Since clusterUUIDCommitted is already checked while constructing the cluster chain, dropping the check here should be safe:

final List<String> validClusterUUIDs = manifestsByClusterUUID.values()
    .stream()
    .filter(m -> !isInvalidClusterUUID(m) && !clusterUUIDGraph.containsValue(m.getClusterUUID()))
    .map(ClusterMetadataManifest::getClusterUUID)
    .collect(Collectors.toList());
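
To illustrate why the filter above keeps an uncommitted cluster UUID from becoming the head of the chain, here is a minimal, self-contained sketch. It is not the OpenSearch implementation: ClusterUuidChainSketch, Manifest, buildChain and the "_na_" sentinel are hypothetical stand-ins, and the sketch assumes the UUID graph is built from every manifest and that "invalid" simply means "not committed".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class ClusterUuidChainSketch {
    // Placeholder sentinel for "no previous cluster"; an assumption of this sketch.
    static final String UNKNOWN_UUID = "_na_";

    // Hypothetical, trimmed-down stand-in for a manifest: only the fields the chain needs.
    static final class Manifest {
        final String clusterUUID;
        final String previousClusterUUID;
        final boolean committed;

        Manifest(String clusterUUID, String previousClusterUUID, boolean committed) {
            this.clusterUUID = clusterUUID;
            this.previousClusterUUID = previousClusterUUID;
            this.committed = committed;
        }
    }

    // Returns the chain from the newest committed cluster UUID back to the oldest.
    static List<String> buildChain(List<Manifest> latestManifests) {
        // clusterUUID -> previousClusterUUID, built from every manifest, committed or not.
        final Map<String, String> previousByUuid = latestManifests.stream()
            .collect(Collectors.toMap(m -> m.clusterUUID, m -> m.previousClusterUUID));

        // A head must be committed and must not be the predecessor of another manifest.
        final List<String> heads = latestManifests.stream()
            .filter(m -> m.committed && !previousByUuid.containsValue(m.clusterUUID))
            .map(m -> m.clusterUUID)
            .collect(Collectors.toList());

        final List<String> chain = new ArrayList<>();
        if (heads.isEmpty()) {
            return chain; // no committed head, so no chain
        }
        if (heads.size() > 1) {
            throw new IllegalStateException("ambiguous chain heads: " + heads);
        }

        // Walk the previous-UUID links; the contains() check guards against cycles.
        String current = heads.get(0);
        while (current != null && !UNKNOWN_UUID.equals(current) && !chain.contains(current)) {
            chain.add(current);
            current = previousByUuid.get(current);
        }
        return chain;
    }
}
```

In this sketch, two committed manifests A and B (with B pointing back at A) plus a stray uncommitted manifest D yield the chain B, A; D is skipped because it is not committed.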

Note: It might be beneficial to make previousClusterUUID part of Metadata itself in the future.

Related Issues

Resolves #10841

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions
Contributor

github-actions bot commented Oct 22, 2023

Compatibility status:

Checks if related components are compatible with change d048ce7

Incompatible components

Incompatible components: [https://github.com/opensearch-project/cross-cluster-replication.git]

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/neural-search.git]

@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

Signed-off-by: bansvaru <[email protected]>
@linuxpi
Collaborator Author

linuxpi commented Oct 22, 2023

Flaky Test

#10154

org.opensearch.search.aggregations.metrics.CardinalityWithRequestBreakerIT.testRequestBreaker {p0={"search.concurrent_segment_search.enabled":"true"}}
org.opensearch.search.aggregations.metrics.CardinalityWithRequestBreakerIT.testRequestBreaker {p0={"search.concurrent_segment_search.enabled":"false"}}

@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.http.SearchRestCancellationIT.testAutomaticCancellationMultiSearchDuringFetchPhase

@codecov

codecov bot commented Oct 22, 2023

Codecov Report

Merging #10832 (d048ce7) into main (51626d0) will decrease coverage by 0.05%.
Report is 7 commits behind head on main.
The diff coverage is 96.51%.

@@             Coverage Diff              @@
##               main   #10832      +/-   ##
============================================
- Coverage     71.31%   71.27%   -0.05%     
- Complexity    58671    58711      +40     
============================================
  Files          4860     4869       +9     
  Lines        276335   276475     +140     
  Branches      40198    40202       +4     
============================================
- Hits         197068   197046      -22     
- Misses        62803    63031     +228     
+ Partials      16464    16398      -66     
Files Coverage Δ
...upport/replication/TransportReplicationAction.java 76.92% <100.00%> (-3.76%) ⬇️
...ava/org/opensearch/cluster/node/DiscoveryNode.java 91.62% <100.00%> (+0.17%) ⬆️
...a/org/opensearch/common/network/NetworkModule.java 92.20% <100.00%> (+0.20%) ⬆️
...rg/opensearch/common/settings/ClusterSettings.java 92.85% <ø> (ø)
.../java/org/opensearch/gateway/GatewayMetaState.java 70.27% <100.00%> (+1.76%) ⬆️
...earch/index/remote/RemoteStorePressureService.java 100.00% <ø> (ø)
server/src/main/java/org/opensearch/node/Node.java 85.31% <100.00%> (+0.09%) ⬆️
...ting/admissioncontrol/AdmissionControlService.java 100.00% <100.00%> (ø)
...issioncontrol/controllers/AdmissionController.java 100.00% <100.00%> (ø)
...g/admissioncontrol/enums/AdmissionControlMode.java 100.00% <100.00%> (ø)
... and 9 more

... and 488 files with indirect coverage changes

@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.cluster.allocation.ClusterRerouteIT.testDelayWithALargeAmountOfShards

@linuxpi linuxpi self-assigned this Oct 23, 2023
@github-actions github-actions bot added the bug Something isn't working label Oct 23, 2023
@linuxpi linuxpi changed the title link previous cluster uuid to current cluster uuid even if current cl… link previous cluster uuid to current cluster uuid even if current cluster uuid is not committed Oct 23, 2023
@github-actions github-actions bot added Cluster Manager Storage Issues and PRs relating to data and metadata storage Storage:Remote v2.12.0 Issues and PRs related to version 2.12.0 labels Oct 23, 2023
@shwetathareja shwetathareja merged commit 91ac084 into opensearch-project:main Oct 25, 2023
66 of 93 checks passed
@shwetathareja shwetathareja added the backport 2.x Backport to 2.x branch label Oct 25, 2023
opensearch-trigger-bot bot pushed a commit that referenced this pull request Oct 25, 2023
…uster uuid is not committed (#10832)

* link previous cluster uuid to current cluster uuid even if current cluster uuid is not committed

Signed-off-by: bansvaru <[email protected]>
(cherry picked from commit 91ac084)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
@linuxpi linuxpi deleted the fix-remote-cluster-chaining-break branch October 25, 2023 11:18
sachinpkale pushed a commit that referenced this pull request Oct 25, 2023
…uster uuid is not committed (#10832) (#10906)

* link previous cluster uuid to current cluster uuid even if current cluster uuid is not committed


(cherry picked from commit 91ac084)

Signed-off-by: bansvaru <[email protected]>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
…uster uuid is not committed (opensearch-project#10832)

* link previous cluster uuid to current cluster uuid even if current cluster uuid is not committed

Signed-off-by: bansvaru <[email protected]>
Signed-off-by: Shivansh Arora <[email protected]>
Labels
  • backport 2.x: Backport to 2.x branch
  • bug: Something isn't working
  • Cluster Manager
  • skip-changelog
  • Storage:Remote
  • Storage: Issues and PRs relating to data and metadata storage
  • v2.12.0: Issues and PRs related to version 2.12.0
Development

Successfully merging this pull request may close these issues.

[BUG] [Remote State] Cluster chaining can break if there is consecutive node replace and restart
3 participants