Add another test for a possible deadlock scenario caused by #8703 #8769

Merged

Conversation

hendrikmakait (Member)

Addendum to #8703, adding a test for another deadlock scenario.

  • Tests added / passed
  • Passes pre-commit run --all-files

github-actions bot (Contributor) commented Jul 15, 2024

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

    29 files (+8)          29 suites (+8)         11h 57m 37s ⏱️ (+4h 30m 44s)
 4 087 tests (+5)         3 971 ✅ (+10)            112 💤 (-2)      4 ❌ (-3)
55 286 runs  (+16 183)   52 844 ✅ (+15 394)      2 437 💤 (+792)    5 ❌ (-3)

For more details on these failures, see this check.

Results for commit ac2c5f5. Comparison against base commit 48eefee.

This pull request removes 1 test and adds 6 tests. Note that renamed tests count towards both.

Removed:
distributed.tests.test_scheduler ‑ test_deadlock_dependency_of_queued_released

Added:
distributed.tests.test_client ‑ test_release_persisted_collection
distributed.tests.test_client ‑ test_release_persisted_collection_sync
distributed.tests.test_scheduler ‑ test_deadlock_dependency_of_queued_released_when_worker_removed[False]
distributed.tests.test_scheduler ‑ test_deadlock_dependency_of_queued_released_when_worker_removed[True]
distributed.tests.test_scheduler ‑ test_deadlock_dependency_of_queued_released_when_worker_replaced[False]
distributed.tests.test_scheduler ‑ test_deadlock_dependency_of_queued_released_when_worker_replaced[True]

♻️ This comment has been updated with latest results.

hendrikmakait changed the title from "Add another test for a possible deadlock scenario caused by 8703" to "Add another test for a possible deadlock scenario caused by #8703" on Jul 15, 2024
Comment on lines 4721 to 4725
await async_poll_for(
lambda: b.state.tasks.get(dep_key) is not None
and b.state.tasks.get(dep_key).state == "memory",
timeout=5,
)
fjetter (Member) commented Jul 16, 2024


This test is a little ambiguous: it could either mean that B computed the result or that it fetched it, and the two take different code paths.

This is a likely source for a flaky test that'll be hard to track down.
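
For illustration, one way to make the intended path explicit could be to follow the poll with a check of how the key actually ended up in memory on b. This is only a rough sketch, not the change adopted in the PR: b, dep_key, and async_poll_for come from the test snippet above, while the use of Worker.story and the worker task states "flight" (fetched) and "executing" (computed) are assumptions about the worker internals.

# Rough sketch: poll for the key on b, then assert which code path produced it.
await async_poll_for(
    lambda: (ts := b.state.tasks.get(dep_key)) is not None
    and ts.state == "memory",
    timeout=5,
)
story = b.story(dep_key)  # worker-side transition log for this key
fetched = any("flight" in str(event) for event in story)      # arrived over the network
computed = any("executing" in str(event) for event in story)  # (re)computed on b
# Assert the single path this test is meant to exercise, e.g. that b fetched it:
assert fetched and not computed, story

Pinning the assertion to one path would at least make an unexpected run fail loudly here rather than silently exercising the other code path.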

hendrikmakait (Member, Author)

Uff, I was already wondering why this test failed on some CI runs but never locally!

for i in range(s.total_nthreads * 2 + 1)
]
dep_key = dep.key
del dep
fjetter (Member)

nit/hint: In tests, I often use dep.release(), which is pretty much the same, but you still have access to the key later on. Also, in case something else holds a reference to it, the future will still be released.

This is style, feel free to ignore
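
For illustration, a minimal sketch of the two patterns side by side, assuming dep is a distributed Future as in the snippet above (the helper names are made up for this sketch):

def release_via_del(dep):
    # Current pattern: remember the key, then drop the local reference.
    # The future is only released once all references to it are gone.
    dep_key = dep.key
    del dep
    return dep_key

def release_explicitly(dep):
    # Suggested pattern: call Future.release() directly. dep and dep.key
    # stay accessible afterwards, and the release is requested even if
    # something else still holds a reference to the same future.
    dep.release()
    return dep.key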

hendrikmakait (Member, Author)

Ah, good point, I didn't think of this!

hendrikmakait merged commit d68a5d9 into dask:main on Jul 16, 2024
27 of 34 checks passed
hendrikmakait deleted the additional-deadlock-test-for-8703 branch on July 16, 2024 at 14:53