
Investigate flaky tests on CI #700

Closed · wants to merge 9 commits into main from masih/flaky-test-ci-party

Conversation

@masih (Member) commented Oct 7, 2024

Set up a temporary workflow to run all flaky tests repeatedly on CI for further investigation.
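For illustration only (this is not the workflow added by this PR, and the names below are hypothetical), the kind of repetition the workflow performs can be approximated in plain Go by looping a suspect scenario as subtests:

```go
package f3_test

import (
	"fmt"
	"testing"
)

// runScenario stands in for the body of one of the flaky tests in f3_test.go;
// it is a placeholder, not a function from the repository.
func runScenario(t *testing.T) {
	t.Helper()
	// ... the real test logic would drive participants over the mock network here ...
}

// TestScenarioRepeated re-runs the same scenario many times in one process to
// surface intermittent failures.
func TestScenarioRepeated(t *testing.T) {
	for i := 0; i < 100; i++ {
		t.Run(fmt.Sprintf("attempt-%03d", i), func(t *testing.T) {
			runScenario(t)
		})
	}
}
```

The same effect can also be had from the command line with `go test -run <pattern> -count=100`, which is closer to what a CI workflow would invoke.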

codecov bot commented Oct 7, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 64.82%. Comparing base (23ce86a) to head (486b9a8).
Report is 4 commits behind head on main.

❗ There is a different number of reports uploaded between BASE (23ce86a) and HEAD (486b9a8): HEAD has 2 fewer uploads than BASE.

Flag    BASE (23ce86a)    HEAD (486b9a8)
                     4                 2
Additional details and impacted files


@@             Coverage Diff             @@
##             main     #700       +/-   ##
===========================================
- Coverage   76.32%   64.82%   -11.50%     
===========================================
  Files          69       69               
  Lines        5596     5584       -12     
===========================================
- Hits         4271     3620      -651     
- Misses        911     1630      +719     
+ Partials      414      334       -80     

see 26 files with indirect coverage changes

@Stebalien (Member)

Well, this looks like it helped a bit.

@Stebalien (Member)

Hm. Well, I'm going to try increasing the timeout.

@masih (Member, Author) commented Oct 7, 2024

@Stebalien Tests seem to fail only when run in conjunction with other tests. So far this looks to me like some kind of congestion issue deep in the libp2p/pubsub stack.

@Stebalien (Member)

> Tests seem to fail only when run in conjunction with other tests. So far this looks to me like some kind of congestion issue deep in the libp2p/pubsub stack.

Or just a timing issue? We're using a mock network, so at least it's not some kind of port issue.

But yeah, it sounds like the solution is to simply not run them in parallel.

masih added a commit that referenced this pull request Oct 8, 2024
Repeated runs on CI show that there is some contention when `f3_test.go` tests are run in parallel. See #700. Avoid F3 test flakiness by running them serially.

This fixes the symptom (test flakiness) but not the root cause. The root cause may lie within one of the test dependencies.

Fixes #623, #659, #684, #699
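As a rough sketch of the change the commit describes (the test name below is hypothetical, not one of the repository's actual tests), running the `f3_test.go` tests serially amounts to dropping the `t.Parallel()` registration:

```go
package f3_test

import "testing"

// TestF3Scenario is a hypothetical stand-in for one of the tests in f3_test.go.
func TestF3Scenario(t *testing.T) {
	// Previously the test registered itself to run alongside the other tests:
	//
	//     t.Parallel()
	//
	// Removing that call makes the tests in this file run one at a time, which
	// avoids the contention observed on CI at the cost of a longer wall-clock run.

	// ... test body: set up participants over the mock network and wait for progress ...
}
```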
@masih (Member, Author) commented Oct 8, 2024

Findings:

  • Individual tests do not fail when repeated on their own (up to 100 repetitions each).
  • They do fail when run in conjunction with other tests.
    • More specifically, in conjunction with other tests in f3_test.go.
  • These tests are time-based and use an artificial clock. Reviewing the artificial clock code turned up nothing that immediately stands out.
  • The failure mode is waiting for a condition (e.g. reaching a given instance number) that is never satisfied no matter how long we wait (tested with up to a 30m timeout); a sketch of this wait pattern follows below.
  • Tests pass consistently after removing t.Parallel() (repeated 50 times).
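A minimal sketch of the wait-for-condition pattern described above, using a hypothetical in-test fake clock rather than the actual clock dependency the tests use; every name here is illustrative:

```go
package f3_test

import (
	"testing"
	"time"
)

// fakeClock is a toy stand-in for the artificial clock the tests rely on.
type fakeClock struct {
	now time.Time
}

func (c *fakeClock) Now() time.Time          { return c.now }
func (c *fakeClock) Advance(d time.Duration) { c.now = c.now.Add(d) }

// waitForInstance advances the clock and polls until the target instance number
// is reached or the simulated deadline passes. In the flaky runs the observed
// behaviour is that current() never reaches target, no matter how far the clock
// is advanced, so the surrounding test eventually times out.
func waitForInstance(t *testing.T, clk *fakeClock, current func() uint64, target uint64, timeout time.Duration) {
	t.Helper()
	deadline := clk.Now().Add(timeout)
	for current() < target {
		if clk.Now().After(deadline) {
			t.Fatalf("timed out waiting for instance %d; still at %d", target, current())
		}
		clk.Advance(time.Second)
	}
}
```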

@masih closed this Oct 8, 2024
github-merge-queue bot pushed a commit that referenced this pull request Oct 8, 2024
Repeated runs on CI show that there is some contention when `f3_test.go` tests are run in parallel. See #700. Avoid F3 test flakiness by running them serially.

This fixes the symptom (test flakiness) but not the root cause. The root cause may lie within one of the test dependencies.

Fixes #623, #659, #684, #699
@Stebalien deleted the masih/flaky-test-ci-party branch on October 28, 2024 at 22:35