forked from apache/spark
Commit
[SPARK-40571][SS][TESTS] Construct a new test case for applyInPandasWithState to verify fault-tolerance semantic with random python worker failures

### What changes were proposed in this pull request?

This PR proposes a new test case for applyInPandasWithState to verify that the fault-tolerance semantics are not broken despite random Python worker failures. If the sink provides end-to-end exactly-once, the query should respect that guarantee. Otherwise, the query should respect stateful exactly-once, but at-least-once in terms of outputs.

The test leverages the file stream sink, which is end-to-end exactly-once, but to keep the verification simple, it only verifies that stateful exactly-once is guaranteed despite Python worker failures.

### Why are the changes needed?

This strengthens the test coverage, especially of the fault-tolerance semantics.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

New test added. Manually ran `./python/run-tests --testnames 'pyspark.sql.tests.test_pandas_grouped_map_with_state'` 10 times and all succeeded.

Closes apache#38008 from HeartSaVioR/SPARK-40571.

Authored-by: Jungtaek Lim <[email protected]>
Signed-off-by: Jungtaek Lim <[email protected]>
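The distinction between stateful exactly-once and at-least-once outputs described above can be illustrated with a small pure-Python simulation. This is only a sketch and is not the test added by this commit: it does not use Spark, and `run_with_retries`, `fail_prob`, and the per-key counting logic are hypothetical names and behavior chosen for illustration. It assumes that state is committed only when a batch attempt succeeds, and that outputs may reach a non-idempotent sink before a simulated worker failure.

```python
import random


def run_with_retries(batches, fail_prob, seed):
    """Simulate micro-batches with random worker failures (hypothetical model).

    State is committed only when an attempt succeeds, so each input row
    affects the committed state exactly once. Outputs are appended to the
    sink on every attempt, so a non-idempotent sink may see duplicates.
    """
    rng = random.Random(seed)
    committed_state = {}  # key -> running count, as of the last committed batch
    sink = []             # (batch_id, key, count) rows, possibly duplicated

    for batch_id, batch in enumerate(batches):
        while True:
            # Every attempt restarts from the last committed state.
            state = dict(committed_state)
            for key in batch:
                state[key] = state.get(key, 0) + 1

            # Outputs reach the sink before we know whether the attempt survives.
            sink.extend((batch_id, k, state[k]) for k in sorted(set(batch)))

            if rng.random() < fail_prob:
                # Simulated worker crash: drop the uncommitted state and retry.
                continue

            committed_state = state
            break

    return committed_state, sink


batches = [["a", "b", "a"], ["b", "c"], ["a"]]
state, sink = run_with_retries(batches, fail_prob=0.5, seed=7)
clean_state, clean_sink = run_with_retries(batches, fail_prob=0.0, seed=7)
```

In this model the final `state` matches the failure-free run, while `sink` may contain repeated rows that disappear once duplicates are removed. An end-to-end exactly-once sink would correspond to deduplicating by batch before making the output visible.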
1 parent 311a855, commit 37517df
Showing 1 changed file with 147 additions and 2 deletions.