Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleanup Unreferenced file on segment merge failure #9483

Closed
wants to merge 1 commit into from

Conversation

RS146BIJAY
Copy link
Contributor

@RS146BIJAY RS146BIJAY commented Aug 22, 2023

Description

Describe the bug
Currently, during segment merge (both during auto merge and force merge), OpenSearch creates multiple temporary files while merging individual segment components. Once individual components of the segments are merged, IndexWriter creates a Compound file (*.cfs) containing all the Lucene components for a segment. At the end, OpenSearch deletes these temporary files.

Now, in case these temporary files consume all the space on the node (because enough amount of space is unavailable for segment merge to go through), segment merge will fail. Also, the files which got created during segment merge do not get cleaned up and it continues to occupy space on the node. This can cause FSHealthService checks to fail for this node, ultimately causing these nodes to be removed from cluster.

Expected behavior
In case Force segment merge fails, OpenSearch should remove the temporary files created during segment merge.

Related Issues

#5710
#8024

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.http.SearchRestCancellationIT.testAutomaticCancellationMultiSearchDuringFetchPhase

@codecov
Copy link

codecov bot commented Aug 22, 2023

Codecov Report

Merging #9483 (e0f15c9) into main (ebdffbb) will decrease coverage by 0.20%.
Report is 3 commits behind head on main.
The diff coverage is 29.09%.

@@             Coverage Diff              @@
##               main    #9483      +/-   ##
============================================
- Coverage     71.20%   71.00%   -0.20%     
+ Complexity    57474    57403      -71     
============================================
  Files          4776     4778       +2     
  Lines        270815   270916     +101     
  Branches      39584    39591       +7     
============================================
- Hits         192835   192370     -465     
- Misses        61741    62330     +589     
+ Partials      16239    16216      -23     
Files Changed Coverage Δ
...earch/script/expression/ExpressionScoreScript.java 0.00% <0.00%> (ø)
...arch/script/expression/ExpressionScriptEngine.java 33.99% <ø> (ø)
...earch/example/expertscript/ExpertScriptPlugin.java 0.00% <0.00%> (ø)
...pensearch/common/settings/IndexScopedSettings.java 100.00% <ø> (ø)
...ry/functionscore/TermFrequencyFunctionFactory.java 0.00% <0.00%> (ø)
...n/java/org/opensearch/script/ScoreScriptUtils.java 1.43% <0.00%> (-0.31%) ⬇️
...nsearch/search/lookup/LeafTermFrequencyLookup.java 35.71% <35.71%> (ø)
...c/main/java/org/opensearch/script/ScoreScript.java 56.92% <50.00%> (-1.15%) ⬇️
.../main/java/org/opensearch/index/engine/Engine.java 74.50% <72.22%> (-0.07%) ⬇️
...nsearch/painless/action/PainlessExecuteAction.java 77.58% <100.00%> (+0.29%) ⬆️
... and 3 more

... and 480 files with indirect coverage changes

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.search.pit.DeletePitMultiNodeIT.testDeleteWhileSearch

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@opensearch-trigger-bot
Copy link
Contributor

Compatibility status:

Checks if related components are compatible with change 5d3633c

Incompatible components

Incompatible components: [https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/security-analytics.git]

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git]

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.smoketest.SmokeTestMultiNodeClientYamlTestSuiteIT.test {yaml=pit/10_basic/Delete all}
      1 org.opensearch.search.pit.DeletePitMultiNodeIT.testDeleteWhileSearch
      1 org.opensearch.repositories.azure.AzureBlobStoreRepositoryTests.testMultipleSnapshotAndRollback

@opensearch-trigger-bot
Copy link
Contributor

Compatibility status:

Checks if related components are compatible with change 5d3633c

Incompatible components

Incompatible components: [https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/security-analytics.git]

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git]

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@RS146BIJAY
Copy link
Contributor Author

Code coverage failure seems to be coming from this PR: #9081

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant