Feedback about Test Analytics and Flaky Test Reporting #304

rohan-at-sentry · 2024-03-14T20:13:09Z

Thanks for dropping by! 👋

We've recently released a new Test Analytics feature and are working on reporting test flakes on your PR ❄️.

We'd love to hear feedback on:

- How your setup experience was.
- How easy/useful the PR comment is

This issue is intended to share and collect feedback about the tool. If you have support needs or questions, please let us know!

houserx-jmcc · 2024-03-29T20:34:48Z

How your setup experience was.
- Very easy, worked on the first go after adding the new GitHub action step.
How easy/useful the PR comment is
- I like that it is combined with the existing coverage comment versus being a separate one.

In the event this action is not used in the context of a pull request (e.g. a scheduled test, pre-release test, etc.), it would be nice to be able to also display the results in GitHub's Job Summary in lieu of a PR comment. We currently use https://github.com/phoenix-actions/test-reporting to do this and it works nicely. When sharing test results, permalinking to the job summary works better because it can occupy more of the screen. If this feature were added, we'd switch over completely!
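For readers wondering what a job-summary output could look like in practice, here is a minimal sketch. It relies on the `GITHUB_STEP_SUMMARY` environment variable that GitHub Actions sets to a file path; Markdown appended to that file appears on the job's summary page. The function names `summarize_junit` and `write_job_summary` are hypothetical, and this is not how the Codecov action or phoenix-actions/test-reporting is implemented.

```python
import os
import xml.etree.ElementTree as ET


def summarize_junit(xml_text: str) -> str:
    """Build a small Markdown table from a JUnit XML report (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    # Accept either a <testsuites> wrapper or a single bare <testsuite>.
    suites = [root] if root.tag == "testsuite" else root.findall("testsuite")
    total = failed = skipped = 0
    for suite in suites:
        for case in suite.iter("testcase"):
            total += 1
            if case.find("failure") is not None or case.find("error") is not None:
                failed += 1
            elif case.find("skipped") is not None:
                skipped += 1
    passed = total - failed - skipped
    return (
        "| Total | Passed | Failed | Skipped |\n"
        "| --- | --- | --- | --- |\n"
        f"| {total} | {passed} | {failed} | {skipped} |\n"
    )


def write_job_summary(markdown: str) -> None:
    """Append Markdown to the file GitHub Actions exposes for job summaries."""
    path = os.environ.get("GITHUB_STEP_SUMMARY")
    if path is None:
        # Not running inside GitHub Actions; fall back to stdout.
        print(markdown)
        return
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(markdown)
```

Because the summary file belongs to a single workflow run, each run keeps its own copy of the results, which is what makes permalinking to past results possible.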

houserx-jmcc · 2024-04-01T20:23:34Z

An update: we're also intermittently seeing the error below, which we have unfortunately also been seeing in v4 of the standard codecov-action (codecov/codecov-action#1280):

Error: write EPIPE
    at afterWriteDispatched (node:internal/stream_base_commons:160:15)
    at writeGeneric (node:internal/stream_base_commons:151:3)
    at Socket._writeGeneric (node:net:952:11)
    at Socket._write (node:net:964:8)
    at writeOrBuffer (node:internal/streams/writable:447:12)
    at _write (node:internal/streams/writable:389:10)
    at Socket.Writable.end (node:internal/streams/writable:665:17)
    at Socket.end (node:net:722:31)
    at module.exports (/runner/_work/_actions/codecov/test-results-action/v1/node_modules/gpg/lib/spawnGPG.js:50:1)
    at Object.call (/runner/_work/_actions/codecov/test-results-action/v1/node_modules/gpg/lib/gpg.js:28:1)

rohan-at-sentry · 2024-04-02T14:07:05Z

Thanks for your feedback @houserx-jmcc - we're looking into adding support for reporting on GH Job summary as well

Re the issue with EPIPE - yeah it's something we've noticed. We're still working on a fix for this

rohan-at-sentry · 2024-04-02T20:45:17Z

@houserx-jmcc out of curiosity, what non-PR use cases are you using https://github.com/phoenix-actions/test-reporting for?

Off the top of my head, I'd guess something along the lines of running tests before deploy (of a production image/release) etc. Are there other use cases that you can share?

houserx-jmcc · 2024-04-02T20:49:25Z

> Off the top of my head, I'd guess something along the lines of running tests before deploy (of a production image/release) etc. Are there other use cases that you can share?

Yup! That is one. Others include: running the tests on a stable branch while investigating suspected flakey tests, browser tests we run after deployments to increase confidence, and scheduled tests that periodically perform data quality checks against our stable environment databases. We use Jest for all these use cases and thus can easily pipe the XML into the test action reporter.

And for viewing historical test results on a pull request, the job summary is tied to each workflow execution. The PR comment is auto-updated though, so checking on past results becomes a bit more obtuse.

rohan-at-sentry · 2024-04-04T19:34:02Z

@houserx-jmcc - is this directionally what you had in mind? (still early days!)

https://github.com/joseph-sentry/codecov-cli/actions/runs/8560172298

houserx-jmcc · 2024-04-04T21:36:37Z

Precisely! Looks great :)

rohan-at-sentry · 2024-04-08T18:11:15Z

@houserx-jmcc - We're currently testing a new version of our test analytics feature internally where we're able to detect and report on Flaky Tests. At this time, I'm reaching out to everyone who has set up Codecov Test Analytics to invite feedback on the UI of our Flaky Test Feature. If this is of interest to you, please let me know and I'll reach out with some scheduling options.

houserx-jmcc · 2024-04-08T18:52:25Z

@rohan-at-sentry Happy to provide some additional feedback 👍

rohan-at-sentry · 2024-04-11T16:02:43Z

thanks @houserx-jmcc. Here are some times that work for us (you can find time beginning next week if that works better) - https://calendar.app.google/7qTT3zeshUnrEHMy5

cc @Adal3n3

Blacksmoke16 · 2024-08-08T05:19:41Z

I read through the docs a few times and I have a question. For context, I have a monorepo that I plan on setting up with codecov via the components feature, whereby I'll run the specs and generate a report for each component into a dedicated directory. Those reports will then be uploaded all at once via the official GHA, which will handle assigning each report to the proper component.

However, what I'd like to confirm is: would doing something similar for the junit reports work in a similar way? Where codecov will just know which component to assign the test results to? Does this feature even work with components at the moment? Or is it global to a repo and as long as I upload all the junit reports it'll just work?

rohan-at-sentry · 2024-08-08T23:29:03Z

@Blacksmoke16 yes, currently the following is an accurate way to describe the system:

> global to a repo and as long as I upload all the junit reports it'll just work

For clarity, right now we'd simply output these results onto the tests tab. You can see this in action on our own repos here
Further,

> codecov will just know which component to assign the test results to?

is not supported currently, but I'd like to learn more.

We're thinking of ways to group around test suites and environments (those are things we've heard from customers we're working with), but I'd be curious to hear what other groupings are top of mind for you. Also, do you envision these "groupings" to be identical to the components you'd set up for coverage? If so, why?

Blacksmoke16 · 2024-08-09T01:52:01Z

> For clarity, right now we'd simply output these results onto the tests tab.

Ah, okay that example helps a lot. Basically it's just a big list of all tests.

> Also, do you envision these "groupings" to be identical to the components you'd set up for coverage? If so, why?

Uhh, I'm still in the reading/research/thinking phase of this, as I have yet to actually get reports uploading or anything, so I don't have a ton of useful insight just yet. However, my initial reasoning was that, in my case at least, a component is an individual, independent project that just happens to be in a monorepo. So it just made sense that test output would be tied to a specific component, to more easily see stats for that component versus all components at once. Then you could more easily answer questions like "what is the most failure-prone component?" Then, as you pointed out, flags would still make sense for filtering purposes, to maybe spot issues like "our integration tests in staging have a much higher failure rate than on prod" or something along those lines.

But honestly it could also make sense to have the tests tab be more like higher-level stats/aggregates and the actual test results be more tightly coupled to a specific commit/PR? I'll be sure to report back if I think of anything else after actually getting things set up and playing with it more.

webknjaz · 2024-08-27T16:45:21Z

@rohan-at-sentry in one repo, test uploads fail with Upload failed: {"service":["This field may not be null."]} on GHA but not locally: codecov/codecov-cli#486 / https://github.com/ansible/awx-plugins/actions/runs/10581340011/job/29318439675#step:23:47. I haven't been able to identify the exact cause. My understanding is that the confusing error is coming from Django in https://github.com/codecov/codecov-api. It would be useful to both improve logging and fix the issue in the uploader.

rohan-at-sentry · 2024-08-27T16:50:23Z

Thanks, @webknjaz I opened codecov/engineering-team#2437 to assess

kevinphamsigma · 2024-09-05T03:45:50Z

> thanks @houserx-jmcc. Here are some times that work for us (you can find time beginning next week if that works better) - https://calendar.app.google/7qTT3zeshUnrEHMy5
>
> cc @Adal3n3

Is the flaky test detection enabled automatically at this point, or is it something we need to enable? Also happy to provide some feedback. This seems to have good potential but is missing some key features that would make it more useful for us.

rohan-at-sentry · 2024-09-05T14:28:34Z

@kevinphamsigma yeah, flaky test detection should be enabled at this point. I'd love to hear feedback. Can I get you to book some time with us using this link?

jluebbe · 2024-09-13T11:18:54Z

The JUnit XML files generated by meson are not parsed: jluebbe/rauc#4 (comment)

In the web dashboard tab for tests, "No test results found for this branch" is shown.

jluebbe · 2024-09-13T12:19:07Z

It would be useful if a wildcard could be used in the files: argument, so that multiple .junit.xml files in the same directory could be uploaded in one go.

rohan-at-sentry · 2024-09-13T14:58:38Z

@jluebbe I'm going to track the parsing issue you're facing in a different issue - codecov/test-results-action#86. I think we were able to root cause it.

Re - multiple files being uploaded from the same directory, we do this already. Can you point me to a run where it didn't work? (is it the same one that happened for jluebbe/rauc#4 (comment))

jluebbe · 2024-09-13T16:04:37Z

> Re - multiple files being uploaded from the same directory, we do this already. Can you point me to a run where it didn't work? (is it the same one that happened for jluebbe/rauc#4 (comment))

At least from the examples and argument documentation, it's not clear how to do this. Perhaps something like files: test-output/*.junit.xml would work?

webknjaz · 2024-09-16T00:47:24Z

@rohan-at-sentry got another couple of bugs in the context of test file uploads:

- upload file discovery/normalization can traceback on unexpected input: [BUG] A files input with a trailing comma causes a traceback test-results-action#89 / [BUG] Empty-string report path passed to CLI causes a traceback codecov-cli#501
- I tried uploading junit files produced by ansible-test sanity (it's a testing tool used in the Ansible Collections ecosystem and ansible-core itself; it is able to produce reports regarding different linter runs) and Codecov bot is not happy about the format but does not provide any details on what it expected: 🧪 Upload ansible-test coverage to Codecov ansible/awx#15527 (comment). It'd be nice to have some specification to check against. Here's the implementation of said junit XML writer: https://github.com/ansible/ansible/blob/b5ae8a3/lib/ansible/utils/_junit_xml.py / https://github.com/ansible/ansible/blob/b5ae8a3/test/lib/ansible_test/_internal/test.py#L110-L127
- note that the above problem is happening in the same project where reporting GH statuses is broken: [BUG] Status checks are not reported in a recently configured project (ansible/awx) #511 (this one is probably not related, but I'm mentioning it just in case)
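To illustrate the first item, here is a minimal sketch of defensively normalizing a comma-separated files value so that a trailing comma or an empty string yields no blank paths. The helper name `normalize_files_input` is hypothetical, and this is not the code used by test-results-action or codecov-cli.

```python
def normalize_files_input(raw: str) -> list[str]:
    """Split a comma-separated `files` value, dropping blank entries.

    Hypothetical helper: a trailing comma ("a.xml,") or an empty string
    produces no empty path that later code would have to handle.
    """
    entries = (part.strip() for part in raw.split(","))
    return [entry for entry in entries if entry]
```

With this approach, `"a.junit.xml, b.junit.xml,"` normalizes to two paths and `""` normalizes to an empty list, so downstream code never receives an empty-string report path.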

Bibo-Joshi · 2024-09-19T19:53:17Z

This feature is a great idea and I'm eager to try it out!

> How your setup experience was.

Pretty easy. What caught me off guard was the naming convention enforcing file names ending in *junit.xml. That could either be mentioned more visibly or loosened.

> How easy/useful the PR comment is

TBH I don't much like automated PR comments. I'd prefer to have that in a job summary. For the coverage reports, one can already disable the comments.

joseph-sentry · 2024-09-19T20:44:59Z

> At least from the examples and argument documentation, it's not clear how to do this. Perhaps something like files: test-output/*.junit.xml would work?

@jluebbe you are correct, it is missing from the documentation but you should be able to use wildcards / globs in the files argument

joseph-sentry · 2024-09-19T20:46:56Z

> Pretty easy. What caught me off guard was the naming convention enforcing file names ending in *junit.xml. That could either be mentioned more visibly or loosened.

@Bibo-Joshi this restriction is being loosened (not by much, to be fair): we're now searching for *junit*xml and *TEST*.xml files.

It should be able to find any file matching that pattern in subdirectories automatically if disable_search is not set to true.
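As a rough illustration of which file names the two patterns above would pick up, here is a sketch using Python's standard `fnmatch` and `os.walk`. The function `find_reports` is hypothetical, and the actual uploader's matching rules (case sensitivity, excluded directories, and so on) may differ.

```python
import fnmatch
import os

# Patterns quoted in the comment above; the real uploader's rules may differ.
DEFAULT_PATTERNS = ("*junit*xml", "*TEST*.xml")


def find_reports(root: str, patterns=DEFAULT_PATTERNS) -> list[str]:
    """Return paths under `root` whose file name matches any pattern."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # fnmatchcase compares case-sensitively on every platform.
            if any(fnmatch.fnmatchcase(name, pattern) for pattern in patterns):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

Under these assumptions, names such as `results.junit.xml` or `TEST-foo.xml` would match, while `coverage.xml` would not.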

rohan-at-sentry · 2024-09-19T22:59:20Z

@jluebbe - I've updated our docs to catch this, thanks for writing in https://docs.codecov.com/docs/test-result-ingestion-beta#troubleshooting

millotp · 2024-09-20T13:03:15Z

Hey, thanks for this feature, it's really helpful for tracking down flaky tests. It would be useful to be able to reset the statistics, like the number of commits failed, once a test has been fixed.

It would be nice to see the statistics for all branches at once, to also see which tests are flaky on pull requests, not just on main.

rohan-at-sentry · 2024-09-20T14:37:19Z

@millotp - Thanks for the feedback

> It would be nice to see the statistics for all branches at once, to also see which tests are flaky on pull requests, not just on main.

You're right, and we've heard this from others as well.

It's something we're tracking internally, and it's likely we'll pick it up sometime in mid-Q4 this year. https://github.com/codecov/feedback/issues/516

Re:

> it would be useful to be able to reset the statistics once a test has been fixed, like the number of commits failed.

We haven't really documented this in our docs just yet (and we should, I'll address that later today), but we "expire" the flaky tag for a test after ~30 days of "non-flake like activity".

Effectively, if a flaky test (that we detected) hasn't failed for 30 days (maybe you fixed it, maybe it doesn't flake very often) then we no longer report it as flaky, because it isn't demonstrably slowing your team down. Hopefully that helps.
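The expiry rule described above can be sketched as a simple time comparison. This is only an illustration of the stated behaviour: the name `still_reported_as_flaky`, the exact 30-day constant, and the use of the last failure time as the only input are assumptions, not Codecov's implementation.

```python
from datetime import datetime, timedelta, timezone

# "~30 days" per the comment above; the exact threshold is an assumption.
FLAKE_EXPIRY = timedelta(days=30)


def still_reported_as_flaky(
    last_failure: datetime,
    now: datetime,
    expiry: timedelta = FLAKE_EXPIRY,
) -> bool:
    """A detected flaky test keeps its tag until it has gone `expiry` without failing."""
    return now - last_failure < expiry
```

For example, a test whose last failure was 5 days ago would still be reported as flaky, while one that last failed 31 days ago would not.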

cosmith · 2024-09-20T15:37:25Z

Hello, my team and I are interested in migrating to Codecov Test Analytics from Datadog CI, but our main reason for wanting test analytics is to detect flaky tests at the repo level rather than at the branch level. We have a periodic reminder to look at the most flaky tests in our repo and fix them. So +1 for https://github.com/codecov/feedback/issues/516

Blacksmoke16 · 2024-09-30T13:17:08Z

Okay, I finally got around to getting this implemented, but am running into a bit of an issue:

> ❌ We are unable to process any of the uploaded JUnit XML files. Please ensure your files are in the right format.

The CI job says it found all 13 reports and uploaded them. I also pasted the XML from one of them into https://lotterfriends.github.io/online-junit-parser and it parsed fine. Is there a way to get more details on what the problem is? It's possible some of the JUnit files from the other CI job don't have any tests in them, but those files are still valid.

EDIT: Here's an excerpt from the junit files if that helps:

<?xml version="1.0"?>
<testsuite tests="116" skipped="0" errors="0" failures="0" time="0.005031332" timestamp="2024-09-29T19:55:54Z" hostname="theStone">
  <testcase file="/home/george/dev/git/athena-framework/athena/src/components/negotiation/spec/base_accept_spec.cr" classname="src.components.negotiation.spec.base_accept_spec" name="BaseAcceptTest parse parameters 3" line="18" time="6.2057e-5"/>
  <testcase file="/home/george/dev/git/athena-framework/athena/src/components/negotiation/spec/base_accept_spec.cr" classname="src.components.negotiation.spec.base_accept_spec" name="BaseAcceptTest parse parameters 1" line="18" time="4.9572e-5"/>
  <testcase file="/home/george/dev/git/athena-framework/athena/src/components/negotiation/spec/base_accept_spec.cr" classname="src.components.negotiation.spec.base_accept_spec" name="BaseAcceptTest parse parameters 0" line="18" time="6.358e-6"/>
  <testcase file="/home/george/dev/git/athena-framework/athena/src/components/negotiation/spec/base_accept_spec.cr" classname="src.components.negotiation.spec.base_accept_spec" name="BaseAcceptTest build parameters string 0" line="7" time="3.832e-6"/>
</testsuite>

EDIT2: I tried making the file attribute in the junit.xml files relative to the project directory, but that also failed: athena-framework/athena#462 (comment)

Blacksmoke16 · 2024-10-12T13:37:40Z

Okay, based on what I read in https://docs.codecov.com/docs/test-result-ingestion-beta#troubleshooting, I'm pretty sure this is because codecov is expecting the outermost testsuites tag, and possibly the name attribute.

Should I make a separate issue for this? The warning in https://docs.codecov.com/docs/test-result-ingestion-beta points to this issue, so wasn't sure.
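If the guess above is right, one possible workaround is to post-process a report whose root is a bare testsuite element before uploading it. The sketch below uses Python's standard `xml.etree.ElementTree`; the helper `wrap_in_testsuites` and the default suite name are hypothetical, and whether this is enough for Codecov to accept the file is an assumption that the thread does not confirm.

```python
import xml.etree.ElementTree as ET


def wrap_in_testsuites(xml_text: str, suite_name: str = "default") -> str:
    """Wrap a bare <testsuite> root in <testsuites>, adding a name if missing."""
    root = ET.fromstring(xml_text)
    if root.tag == "testsuites":
        # Already has the outer wrapper; return it unchanged.
        return ET.tostring(root, encoding="unicode")
    if root.tag != "testsuite":
        raise ValueError(f"unexpected root element: {root.tag!r}")
    if "name" not in root.attrib:
        # Hypothetical default; pick something meaningful per report.
        root.set("name", suite_name)
    wrapper = ET.Element("testsuites")
    wrapper.append(root)
    return ET.tostring(wrapper, encoding="unicode")
```

Applied to the excerpt shown earlier in the thread, this would keep every testcase element intact and only add the outer element and the name attribute.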

ggivo · 2024-10-15T10:51:47Z

Once a test is fixed, we should see that there are no recent failures. But my understanding is that the "Failure rate" in the "Test" dashboard is calculated based on all runs, so a flaky test with lots of failures will stay at the top of the list even after it is fixed.

Can I check a trend of flaky tests or reset the failure rate?
Or, if possible, could the failure rate be shown based on the last N test runs?

rohan-at-sentry · 2024-10-15T15:47:57Z

@Blacksmoke16 if you could raise an issue, please 🙏🏽

rohan-at-sentry · 2024-10-15T16:05:40Z

@ggivo we have filters coming soon (like early nov-ish) that should help with this. You can filter based on time and that should help adjust the rate to the window after your fix
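To make the effect of a time filter concrete, here is a small sketch of computing a failure rate only over runs that happened on or after a chosen point in time. The function `failure_rate` and the `(timestamp, passed)` tuple shape are hypothetical; this is not how the Codecov dashboard computes its numbers.

```python
from datetime import datetime, timezone
from typing import Iterable, Optional, Tuple


def failure_rate(
    runs: Iterable[Tuple[datetime, bool]],
    since: datetime,
) -> Optional[float]:
    """Failure rate over runs at or after `since`.

    Each run is (timestamp, passed). Returns None when no run falls in the window.
    """
    recent = [passed for timestamp, passed in runs if timestamp >= since]
    if not recent:
        return None
    failures = sum(1 for passed in recent if not passed)
    return failures / len(recent)
```

With a window that starts at the time of the fix, earlier failures no longer count, so a fixed test drops to a rate of zero instead of staying at the top of the list.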

rohan-at-sentry mentioned this issue Apr 2, 2024

Display results in Github Job Summary codecov/test-results-action#34

Open

rohan-at-sentry mentioned this issue Aug 27, 2024

Error uploading test result report to Codecov codecov/engineering-team#2437

Open

This was referenced Sep 4, 2024

Testing only codecov/engineering-team#299

Open

Add TA feedback link in the comment codecov/engineering-team#2470

Open

rohan-at-sentry mentioned this issue Sep 13, 2024

JUnit XML files generated by meson are not parsed codecov/test-results-action#86

Open

Bibo-Joshi mentioned this issue Sep 19, 2024

Introduce Codecov's Test Analysis python-telegram-bot/python-telegram-bot#4487

Merged

rohan-at-sentry mentioned this issue Sep 20, 2024

Expand test analytics to report on test analytics for "ALL Branches" codecov/engineering-team#2542

Open

Blacksmoke16 mentioned this issue Sep 28, 2024

Spec Coverage Reports athena-framework/athena#450

Closed

This was referenced Oct 4, 2024

Rewrite to the same comment for offline test analytics codecov/codecov-cli#522

Merged

Update test-results-parser codecov/codecov-cli#524

Merged

Better formatting of options for the empty-upload command codecov/codecov-cli#525

Merged

This was referenced Oct 11, 2024

Release 0.7.6 codecov/codecov-cli#526

Merged

Adding small ci job codecov/codecov-cli#527

Merged

Adding basic network upload command codecov/codecov-cli#528

Open

Blacksmoke16 mentioned this issue Oct 15, 2024

JUnit XML report fails to process #543

Open

github-actions bot mentioned this issue Oct 15, 2024

Remove token enforcement for true tokenless endpoints codecov/codecov-cli#533

Open
