Clean shuffle data #312

Merged (3 commits) on Apr 22, 2024

@penghuo (Collaborator) commented Apr 18, 2024

Description

  • For REPL, clean shuffle data after the query result is consumed.
  • For Index/MV refresh jobs, clean shuffle data after each micro-batch.
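
The two cleanup points above can be illustrated with a minimal, Spark-free sketch. Every name in it (`ShuffleStore`, `Cleaner`, `ShuffleCleaner`, `CleanupSketch`) is hypothetical and is not taken from this PR's diff; the in-memory `ShuffleStore` only stands in for whatever tracks shuffle data in a real Spark application. Running the cleanup in a `finally` block is also an assumption of the sketch, not something the description states.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the component that tracks live shuffle data.
final class ShuffleStore {
  private val shuffleIds = mutable.Set.empty[Int]
  def register(id: Int): Unit = shuffleIds += id
  def activeIds: Set[Int] = shuffleIds.toSet
  def removeAll(): Unit = shuffleIds.clear()
}

// Hypothetical cleanup hook; the PR's real interface may look different.
trait Cleaner {
  def cleanUp(): Unit
}

final class ShuffleCleaner(store: ShuffleStore) extends Cleaner {
  def cleanUp(): Unit = store.removeAll()
}

object CleanupSketch {
  // REPL path: run the query, let the caller consume the result, then clean up.
  def runStatement[A](cleaner: Cleaner)(query: () => A)(consume: A => Unit): Unit =
    try consume(query())
    finally cleaner.cleanUp()

  // Refresh path: clean up after each micro-batch has been processed.
  def runMicroBatches[A](cleaner: Cleaner, batches: Seq[A])(process: A => Unit): Unit =
    batches.foreach { batch =>
      try process(batch)
      finally cleaner.cleanUp()
    }
}
```

In this sketch the shuffle data stays available while the result or batch is being consumed and is released immediately afterwards, so a long-running session or streaming job does not accumulate it.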

Issues Resolved

#302

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@penghuo marked this pull request as ready for review April 18, 2024 19:53
@penghuo self-assigned this Apr 18, 2024
@penghuo added the 0.4 label Apr 18, 2024
@@ -92,6 +89,8 @@ case class JobOperator(
try {
// Wait for streaming job complete if no error and there is streaming job running
if (!exceptionThrown && streaming && spark.streams.active.nonEmpty) {
//
Collaborator

missing comment?

Collaborator Author

done.

Signed-off-by: Peng Huo <[email protected]>
@dai-chen (Collaborator) left a comment

Thanks for the changes!

@penghuo merged commit 222e473 into opensearch-project:main on Apr 22, 2024
4 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Apr 29, 2024
* Cleanup Spark shuffle data after data is consumed

Signed-off-by: Peng Huo <[email protected]>

* update comments

Signed-off-by: Peng Huo <[email protected]>

---------

Signed-off-by: Peng Huo <[email protected]>
(cherry picked from commit 222e473)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
noCharger pushed a commit that referenced this pull request Apr 29, 2024
* Cleanup Spark shuffle data after data is consumed

* update comments

---------

(cherry picked from commit 222e473)

Signed-off-by: Peng Huo <[email protected]>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>