Reduce flakey codecov job failures #323

Closed
zzak opened this issue Aug 14, 2019 · 2 comments
Labels
bug (Something isn't working), good first issue (Good for newcomers)

Comments

@zzak
Contributor

zzak commented Aug 14, 2019

Our coverage job occasionally fails due to a network timeout.

We should add a retry or something similar to this job, so it is less likely to fail and force a developer to manually re-run the workflow.

zzak added the bug and good first issue labels on Aug 14, 2019
@brenol

brenol commented Oct 4, 2019

Hi @zzak!

This is odd. I followed the trail, and here is what I understood (I'm new to CircleCI, so I'll lay out the steps that led me to this flow):

  • Saw the config.yml you mentioned
  • Saw it imported codecov: circleci/[email protected] from clojure
  • Noticed that it uses the codecov-clojure upload command (is my understanding correct here?)

If that's it, then there are already 3 retries for this request:

commands:
  upload:
    parameters:
      token:
        description: The API token to use for uploading to Codecov.
        type: string
        default: $CODECOV_TOKEN
      path:
        description: Path to the code coverage data file to upload.
        type: string
    steps:
      - run:
          name: Upload Coverage Results
          command: |
            curl --request POST --retry 3 --silent --show-error --fail \
            --data-binary @<< parameters.path >> \
            "https://codecov.io/upload/v2?service=circleci\
            &token=<< parameters.token >>\
            &commit=$CIRCLE_SHA1\
            &branch=$CIRCLE_BRANCH\
            &build=$CIRCLE_BUILD_NUM\
            &job=$CIRCLE_NODE_INDEX\
            &build_url=$CIRCLE_BUILD_URL\
            &slug=$CIRCLE_PROJECT_USERNAME/$CIRCLE_PROJECT_REPONAME\
            &pr=$CIRCLE_PR_NUMBER"

And we can also see from the output that it's already retrying.

While looking for fixes, I stumbled on this issue in mapbox-gl-native: mapbox/mapbox-gl-native#15248, which mentions that the official Codecov docs suggest you do not fail builds:

Exit 0
Codecov will exit 0 to prevent failing the build, if there are issues. If you would like Codecov to exit with 1, use bash <(curl -s https://codecov.io/bash) -Z.

exit 0 is not full proof. Please use this command to always exit with 0: bash <(curl -s https://codecov.io/bash) || echo 'Codecov failed to upload'.
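As a rough illustration of the `|| echo ...` pattern quoted above, here is a minimal shell sketch. The helper name `run_non_fatal` is hypothetical (it is not part of Codecov or the orb); it only shows how a failing step can be reported without failing the job.

```shell
# Hypothetical helper: run any command, warn on failure, never fail the step.
run_non_fatal() {
  "$@" || echo "warning: '$*' failed with a non-zero exit; continuing" >&2
  return 0
}

# Example with a command that always fails; the overall status is still 0.
run_non_fatal false
echo "exit status: $?"
```

The trade-off is that a broken upload no longer shows up as a red build, so the warning in the job log is the only signal.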

So, what needs to be done here is to pull that logic out of codecov-clojure so we can override the retry parameter, right? I'm not sure it'll fix the problem, though.

Also, I noticed that they're using HTTP/2. Perhaps we should experiment with HTTP/1.1 and see if it keeps happening? I'm really not sure what to do here.
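If we wanted to try the HTTP/1.1 idea, curl has an `--http1.1` flag that asks it to use HTTP/1.1. A small sketch, assuming we own the upload command; the `upload_curl_opts` helper and the `FORCE_HTTP1` switch are hypothetical names for illustration only:

```shell
# Hypothetical helper: print the curl options for the upload request.
# Setting FORCE_HTTP1=1 appends --http1.1 so curl uses HTTP/1.1.
upload_curl_opts() {
  opts="--request POST --retry 3 --silent --show-error --fail"
  if [ "${FORCE_HTTP1:-0}" = "1" ]; then
    opts="$opts --http1.1"
  fi
  echo "$opts"
}

# Show both variants so they can be compared in the job output.
upload_curl_opts
FORCE_HTTP1=1 upload_curl_opts
```

The printed options would then be passed to curl unquoted (so they split into separate arguments), followed by the same `--data-binary` and URL arguments as in the orb command above.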

To move this forward in a PR, as with any of the other things I mentioned, I'll stop using codecov-clojure and instead call Codecov directly, copying the upload command from codecov-clojure into this repo so we can control how the upload is run in this project.
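For reference, once the upload command lives in this repo, an outer retry could look roughly like the sketch below. This is only an assumption about how it might be done, not what the orb does; the `retry` helper is hypothetical, and it retries on any non-zero exit, unlike curl's own `--retry`, which only covers errors curl treats as transient.

```shell
# Hypothetical helper: retry a command up to $1 times, waiting $2 seconds
# between attempts. Usage: retry MAX_ATTEMPTS DELAY_SECONDS command [args...]
retry() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while true; do
    # Stop as soon as the command succeeds.
    "$@" && return 0
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
  done
}

# Example: a command that succeeds immediately is run exactly once.
retry 3 1 true && echo "upload step succeeded"
```

In the job, the wrapped command would be the curl upload itself, for example `retry 5 10 curl ...` with the same arguments as the orb command above.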

@hannahhenderson
Contributor

@zzak looking back over four months of tests in this repo, it doesn't look like codecov has failed unless test jobs were also failing. I'm going to go ahead and close this issue. Please open a new issue if this problem occurs again.
