Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add to and from XContent to ClusterBlock and ClusterBlocks #13694

Closed

Conversation

shiv0408
Copy link
Member

@shiv0408 shiv0408 commented May 15, 2024

Description

We need to upload the ClusterBlocks object to remote to enable the cluster state publication flow from remote. To do that we need to parse ClusterBlocks object to and from XContent.

Related Issues

Resolves #13692

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
  • Commits are signed per the DCO using --signoff
    - [x] Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Copy link
Contributor

❌ Gradle check result for 75ce997: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@shiv0408 shiv0408 force-pushed the cluster_blocks_xcontent branch from 75ce997 to 215a135 Compare May 15, 2024 22:50
Copy link
Contributor

❌ Gradle check result for 215a135: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@shiv0408
Copy link
Member Author

Created a new issue for unrealted failing test org.opensearch.common.cache.stats.DefaultCacheStatsHolderTests.testConcurrentRemoval - #13695

@shiv0408 shiv0408 force-pushed the cluster_blocks_xcontent branch from 215a135 to 141b86b Compare May 16, 2024 00:17
Copy link
Contributor

❌ Gradle check result for 141b86b: null

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@shiv0408
Copy link
Member Author

Failing test already identified as flaky
org.opensearch.remotestore.RemoteStoreStatsIT.testDownloadStatsCorrectnessSinglePrimaryMultipleReplicaShards - #10152

Copy link
Contributor

❌ Gradle check result for 141b86b: null

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@shiv0408 shiv0408 force-pushed the cluster_blocks_xcontent branch from 141b86b to ee86138 Compare May 30, 2024 11:15
Copy link
Contributor

❌ Gradle check result for ee86138: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@shiv0408 shiv0408 force-pushed the cluster_blocks_xcontent branch from ee86138 to ba27fd1 Compare May 30, 2024 12:45
Copy link
Contributor

✅ Gradle check result for ba27fd1: SUCCESS

Copy link

codecov bot commented May 30, 2024

Codecov Report

Attention: Patch coverage is 83.92857% with 18 lines in your changes missing coverage. Please review.

Project coverage is 71.56%. Comparing base (b15cb0c) to head (ba27fd1).
Report is 348 commits behind head on main.

Current head ba27fd1 differs from pull request most recent head d69ae20

Please upload reports for the commit d69ae20 to get more accurate results.

Files Patch % Lines
...ava/org/opensearch/cluster/block/ClusterBlock.java 79.62% 6 Missing and 5 partials ⚠️
...va/org/opensearch/cluster/block/ClusterBlocks.java 87.71% 1 Missing and 6 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #13694      +/-   ##
============================================
+ Coverage     71.42%   71.56%   +0.14%     
- Complexity    59978    61136    +1158     
============================================
  Files          4985     5059      +74     
  Lines        282275   287628    +5353     
  Branches      40946    41671     +725     
============================================
+ Hits         201603   205829    +4226     
- Misses        63999    64761     +762     
- Partials      16673    17038     +365     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

for (ClusterBlockLevel level : levels) {
builder.value(level.name().toLowerCase(Locale.ROOT));
}
builder.endArray();
if (context == Metadata.XContentContext.GATEWAY) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this cause BWC failures for non-remote cluster?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are only passing GATEWAY as a context when custom metadata should be stored as part of the persistent cluster state. Don't see any reason this could cause any BWC issue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a possibility of mismatch of nodes with old and new version of this code? (say OpenSearch 2.14 upgrading to 2.15). In that case, this additional fields (introduced in 2.15) can't be read by older nodes running in 2.14.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The XContent we are creating is not being sent to the nodes of older version. We will just be uploading this to remote repo. In case of upgrade, we will start publication of cluster state to nodes, when all the nodes have moved to 2.15. So, this would not cause any problem imo

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it not be consumed by the data nodes from remote?

return new ClusterBlock(id, uuid, description, retryable, disableStatePersistence, allowReleaseResources, status, levels);
}

private static String skipBlockID(XContentParser parser) throws IOException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add explanation of why this is needed? Probably an example will help here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have written the fromXContent is a way that if it can parse the ClusterBlock as a XContentFragment as well as a complete Object. This specific method is added to enable the parsing ClusterBlock as a complete XContentObject like given below.

{
  "block-1" : {
    "uuid": ....
  }
}

Although, we might never need to parse such object. But the UT I have written in ClusterBlockTests.java, we are creating a complete XContentObject from cluster block and trying to parse it back to ClusterBlock object, that is the reason this was required.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we differentiate between XContentObject and XConentFragment? Can we separate XContentObject parsing to a separate function and then use the above method to parse the XConentFragment inside that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

XContentObject has a paired start end object. While for fragment that is not true. Sure, we can separate the logic to parse the fragment to a innerParser and use that in fromXContent method which can accept both Object and a Fragment.

@shiv0408 shiv0408 force-pushed the cluster_blocks_xcontent branch from ba27fd1 to dd7babd Compare June 5, 2024 21:39
Copy link
Contributor

github-actions bot commented Jun 5, 2024

❌ Gradle check result for dd7babd: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Shivansh Arora <[email protected]>
@shiv0408 shiv0408 force-pushed the cluster_blocks_xcontent branch from dd7babd to d69ae20 Compare June 5, 2024 22:18
Copy link
Contributor

github-actions bot commented Jun 5, 2024

❌ Gradle check result for d69ae20: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@shiv0408
Copy link
Member Author

shiv0408 commented Jun 6, 2024

❌ Gradle check result for d69ae20: FAILURE

#11979

for (ClusterBlockLevel level : levels) {
builder.value(level.name().toLowerCase(Locale.ROOT));
}
builder.endArray();
if (context == Metadata.XContentContext.GATEWAY) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a possibility of mismatch of nodes with old and new version of this code? (say OpenSearch 2.14 upgrading to 2.15). In that case, this additional fields (introduced in 2.15) can't be read by older nodes running in 2.14.

if (token == XContentParser.Token.FIELD_NAME) {
currentFieldName = parser.currentName();
} else if (token.isValue()) {
switch (Objects.requireNonNull(currentFieldName)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

currentFiledName will always be non null after first field. For this logic to work, currentFieldName should be reset, once the corresponding value is parsed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it will always be non Null. I think I will get rid of this nonNull check rather than resetting the field every time we parse something.

return new ClusterBlock(id, uuid, description, retryable, disableStatePersistence, allowReleaseResources, status, levels);
}

private static String skipBlockID(XContentParser parser) throws IOException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we differentiate between XContentObject and XConentFragment? Can we separate XContentObject parsing to a separate function and then use the above method to parse the XConentFragment inside that?

@shiv0408
Copy link
Member Author

shiv0408 commented Jun 7, 2024

#13576 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Cluster Manager enhancement Enhancement or improvement to existing feature or request skip-changelog
Projects
Status: ✅ Done
Development

Successfully merging this pull request may close these issues.

[Remote Cluster State] Add toXContent and fromXContent for ClusterBlocks
4 participants