Test Failure: failed to read response from monitor process in snapshot_scheduler_test #1794

Closed
linh2931 opened this issue Oct 18, 2023 · 12 comments · Fixed by #1821, #1822, #1827 or #1830
Labels: 👍 lgtm, test-instability (tag issues for flaky tests, high priority to address)

Comments

@linh2931
Member

Seen this again. A core dump somewhere?

https://github.com/AntelopeIO/leap/actions/runs/6555274088/job/17803647521?pr=1792#step:5:25942

@linh2931 linh2931 added the test-instability tag issues for flaky tests, high priority to address label Oct 18, 2023
@enf-ci-bot enf-ci-bot moved this to Todo in Team Backlog Oct 18, 2023
@spoonincode
Member

@bhazzard bhazzard added 👍 lgtm and removed triage labels Oct 19, 2023
@bhazzard bhazzard added this to the Leap v5.0.0-stable milestone Oct 19, 2023
@linh2931 linh2931 self-assigned this Oct 19, 2023
@heifner
Member

heifner commented Oct 19, 2023

@linh2931 linh2931 moved this from Todo to In Progress in Team Backlog Oct 19, 2023
@heifner
Member

heifner commented Oct 20, 2023

https://github.com/AntelopeIO/leap/actions/runs/6587504928/job/17898239176

BTW, I ran plugin_test (tests/plugin_test --catch_system_errors=no) locally in a loop all night and it never failed once.
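
For anyone who wants to repeat this kind of overnight loop, here is a minimal sketch in Python. It is not the script that was actually used; the helper name run_until_failure and the max_runs limit are hypothetical, and it assumes the test binary signals failure through a non-zero exit code.

```python
import subprocess
import sys


def run_until_failure(cmd, max_runs):
    """Run cmd repeatedly; return the 1-based index of the first failing run, or None."""
    for i in range(1, max_runs + 1):
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            # Keep the tail of the output so the failure can be inspected afterwards.
            sys.stderr.write(f"run {i} failed with exit code {result.returncode}\n")
            sys.stderr.write(result.stdout[-2000:])
            sys.stderr.write(result.stderr[-2000:])
            return i
    return None
```

Calling run_until_failure(["tests/plugin_test", "--catch_system_errors=no"], max_runs=1000) from the build directory (the working directory is an assumption) would stop at the first failing run and print the end of its output. A loop like this runs the binary in isolation, so it would not reproduce interference from other tests running in parallel.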

@spoonincode
Member

@bhazzard

Kevin mentioned on 10/24/23 that he ran it all night without a single failure.

@bhazzard

Kevin also mentioned a theory that maybe something else was writing to the same directory, though most tests use a tmp directory.

@bhazzard

move to 5.0.1 milestone

@linh2931
Member Author

The problem has shifted to read_only_trxs:
https://github.com/AntelopeIO/leap/actions/runs/6631880102/job/18016715919?pr=1819

Looks like some interference between tests within plugin_test, or from tests running in parallel.

@spoonincode
Member

imo, until we learn otherwise, let's consider this issue strictly for failures in snapshot_scheduler_test, which was hopefully fixed by #1821 (since there were clearly threading violations in it, crashes/weirdness aren't a stretch), so let's close it. If we need another issue for the read only trx tests failing, let's open one.

@linh2931
Member Author

#1823 opened for read only trx tests.

@linh2931
Member Author

Happened in snapshot_scheduler_test in this CI/CD run: https://github.com/AntelopeIO/leap/actions/runs/6643256190/job/18050335448?pr=1819

@linh2931 linh2931 reopened this Oct 25, 2023