forked from apache/spark
Commit
[SPARK-49048][SS] Add support for reading relevant operator metadata at given batch id

### What changes were proposed in this pull request?
Add support for reading relevant operator metadata at given batch id.

### Why are the changes needed?
Needed to support reading state for operators that allow for schema changes across batch ids.

This change also introduces the location for the state schema format for the v2 version. As part of this version, the operator metadata and the state schema will be written to the following locations:
- for the operator metadata, this will be under the `<checkpoint_loc>/state/<operator_id>/_metadata` directory
- for the state schema, this will be under the `<checkpoint_loc>/state/<operator_id>/_stateSchema` directory

In the older versions for the operator/state schema formats, this would be stored in the following location:
- for the operator metadata, this will be under the `<checkpoint_loc>/state/<operator_id>/_metadata` directory
- for the state schema, this will be under the `<checkpoint_loc>/state/<operator_id>/0/<storeName>/_metadata/schema` directory

### Does this PR introduce _any_ user-facing change?
Yes

Allows the user to specify a batchId while querying the operator metadata. This is a no-op for operators using metadata version 1 and will provide the right metadata from v2 onwards.

```
spark
  .read
  .format("state-metadata")
  .option("batchId", <batchId>)
  .load(<path>)
```

### How was this patch tested?
Added new unit tests

```
===== POSSIBLE THREAD LEAK IN SUITE o.a.s.sql.execution.streaming.state.OperatorStateMetadataSuite, threads: Idle Worker Monitor for python3 (daemon=true), rpc-boss-3-1 (daemon=true), ForkJoinPool.commonPool-worker-3 (daemon=true), ForkJoinPool.commonPool-worker-2 (daemon=true), shuffle-boss-6-1 (daemon=true), ForkJoinPool.commonPool-worker-1 (daemon=true) =====
[info] Run completed in 30 seconds, 859 milliseconds.
[info] Total number of tests run: 11
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 11, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
```

### Was this patch authored or co-authored using generative AI tooling?
No

Closes apache#47528 from anishshri-db/task/SPARK-49048.

Authored-by: Anish Shrigondekar <[email protected]>
Signed-off-by: Jungtaek Lim <[email protected]>
1 parent b477753, commit 2a75210
Showing 10 changed files with 179 additions and 40 deletions.