Merge pull request #106 from puppetlabs/cat_1820
(CAT-1820) Move workflow-restarter to cat_github_actions
gavindidrichsen committed Sep 27, 2024
2 parents c6a80f3 + 395b3cb commit cf8e994
Showing 9 changed files with 269 additions and 0 deletions.
56 changes: 56 additions & 0 deletions .github/actions/workflow-restarter-proxy/action.yml
@@ -0,0 +1,56 @@
---
name: 'Workflow Restarter Proxy'
description: |
  This custom action acts as a proxy to trigger the reusable workflow that restarts a failed job.
  NOTE: This action cannot itself do the restart because, in effect, its composite steps get folded
  into the source workflow, the one that "uses" this custom action. Since GitHub does not allow a workflow
  to retrigger itself, the source workflow must be retriggered not by this action but by another workflow.
  Therefore, this custom action triggers that other workflow.
inputs:
  repository:
    description: 'Should be set to github.repository via the calling workflow'
    required: true
  run_id:
    description: 'Should be set to github.run_id via the calling workflow'
    required: true
runs:
  using: 'composite'
  steps:
    # ABORT if the SOURCE_GITHUB_TOKEN environment variable is not set
    - name: Check for presence of SOURCE_GITHUB_TOKEN environment variable
      shell: bash
      run: |
        if [[ -z "${{ env.SOURCE_GITHUB_TOKEN }}" ]]; then
          echo "ERROR: \$SOURCE_GITHUB_TOKEN must be set by the calling workflow" 1>&2 && exit 1
        fi
    # checkout the repository because I want bundler to have access to my Gemfile
    - name: Checkout repository
      uses: actions/checkout@v4

    # setup ruby including a bundle install of my Gemfile
    - name: Set up Ruby and install Octokit
      uses: ruby/setup-ruby@v1
      with:
        ruby-version: '3'
        bundler-cache: true # 'bundle install' will be run and gems cached for faster workflow runs

    # Trigger the reusable workflow
    - name: Trigger reusable workflow
      shell: bash
      run: |
        gem install octokit
        ruby -e "
          require 'octokit'
          client = Octokit::Client.new(:access_token => '${{ env.SOURCE_GITHUB_TOKEN }}')
          client.post(
            '/repos/${{ inputs.repository }}/actions/workflows/workflow-restarter.yml/dispatches',
            {
              ref: 'main',
              inputs: {
                repo: '${{ inputs.repository }}',
                run_id: '${{ inputs.run_id }}'
              }
            }
          )
        "
59 changes: 59 additions & 0 deletions .github/workflows/workflow-restarter-test.yml
@@ -0,0 +1,59 @@
name: Workflow Restarter TEST

on:
  workflow_dispatch:
    inputs:
      fail:
        description: >
          For (acceptance, unit) jobs:
          'true' = (fail, succeed) and
          'false' = (succeed, fail)
        required: true
        default: 'true'
env:
  SOURCE_GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

jobs:
  unit:
    runs-on: ubuntu-latest
    steps:
      - name: Check outcome
        run: |
          if [ "${{ github.event.inputs.fail }}" = "true" ]; then
            echo "'unit' job succeeded"
            exit 0
          else
            echo "'unit' job failed"
            exit 1
          fi
  acceptance:
    runs-on: ubuntu-latest
    steps:
      - name: Check outcome
        run: |
          if [ "${{ github.event.inputs.fail }}" = "true" ]; then
            echo "'acceptance' job failed"
            exit 1
          else
            echo "'acceptance' job succeeded"
            exit 0
          fi
  on-failure-workflow-restarter-proxy:
    # (1) run this job after the "acceptance" and "unit" jobs and...
    needs: [acceptance, unit]
    # (2) continue ONLY IF "acceptance" or "unit" fails
    if: always() && (needs.acceptance.result == 'failure' || needs.unit.result == 'failure')
    runs-on: ubuntu-latest
    steps:
      # (3) checkout this repository in order to "see" the following custom action
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Trigger reusable workflow
        uses: "puppetlabs/cat-github-actions/.github/actions/workflow-restarter-proxy@main"
        env:
          SOURCE_GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        with:
          repository: ${{ github.repository }}
          run_id: ${{ github.run_id }}
48 changes: 48 additions & 0 deletions .github/workflows/workflow-restarter.yml
@@ -0,0 +1,48 @@
---
name: Workflow Restarter
on:
  workflow_call:
    inputs:
      repo:
        description: "GitHub repository name."
        required: true
        type: string
      run_id:
        description: "The ID of the workflow run to rerun."
        required: true
        type: string
      retries:
        description: "The number of times to retry the workflow run."
        required: false
        type: string
        default: "3"

jobs:
  rerun:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Check retry count
        id: check-retry
        run: |
          # IF `--attempt` returns a non-zero exit code (that attempt does not exist yet), then keep retrying
          status_code=$(gh run view ${{ inputs.run_id }} --repo ${{ inputs.repo }} --attempt ${{ inputs.retries }} --json status) || {
            echo "Retry count is within limit"
            echo "should_retry=true" >> "$GITHUB_OUTPUT"
            exit 0
          }
          # ELSE `--attempt` returns a zero exit code, so stop retrying
          echo "Retry count has reached the limit"
          echo "should_retry=false" >> "$GITHUB_OUTPUT"
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

      - name: Re-run failed jobs
        if: ${{ steps.check-retry.outputs.should_retry == 'true' }}
        run: gh run rerun --failed ${{ inputs.run_id }} --repo ${{ inputs.repo }}
        continue-on-error: true
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
Binary file added docs/image-1.png
Binary file added docs/image-2.png
Binary file added docs/image-3.png
Binary file added docs/image-4.png
Binary file added docs/image.png
106 changes: 106 additions & 0 deletions docs/workflow-restarter.md
@@ -0,0 +1,106 @@
# workflow-restarter

## Description

Although GitHub provides built-in programmatic mechanisms for retrying individual steps within a workflow, it doesn't provide one for retrying entire workflows. One possible reason for this limitation may be to prevent accidental infinite retry loops around failing workflows. Any failed workflow can, however, be manually restarted from the `Actions` tab of the repository. For more information on restarting GitHub workflows, see [Re-running workflows and jobs](https://docs.github.com/en/actions/managing-workflow-runs/re-running-workflows-and-jobs).

Nevertheless, it is possible to programmatically restart a workflow after it fails. The section below shows how to restart a failing workflow 3 times using the `workflow-restarter` re-usable workflow.

## Usage

If you are setting up the `workflow-restarter` for the first time, initialize it first and then configure another workflow to restart programmatically on failure.

### Initialize the `Workflow Restarter`

First, configure the `workflow-restarter-proxy` custom action by creating a `workflow-restarter.yml` file beneath the `.github/workflows` directory in your repository.

Second, configure the `workflow-restarter` re-usable workflow:

```yaml
name: Workflow Restarter
on:
  workflow_dispatch:
    inputs:
      repo:
        description: "GitHub repository name."
        required: true
        type: string
      run_id:
        description: "The ID of the workflow run to rerun."
        required: true
        type: string
      retries:
        description: "The number of times to retry the workflow run."
        required: false
        type: string
        default: "3"

jobs:
  call-reusable-workflow:
    uses: puppetlabs/cat-github-actions/.github/workflows/workflow-restarter.yml@main
    with:
      repo: ${{ inputs.repo }}
      run_id: ${{ inputs.run_id }}
      retries: ${{ inputs.retries }}
```

Finally, verify that the `workflow-restarter.yml` performs as expected:

1. Add a `workflow-restarter-test.yml` file to `.github/workflows` and copy into it the contents of `.github/workflows/workflow-restarter-test.yml` from this repository.
2. Kick off the `workflow-restarter-test`; it should fail and be re-started 3 times. For example output, see the [appendix below](#verify-workflow-restarter-with-workflow-restarter-test).
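
Step 2 can also be done from a terminal. The following is a minimal sketch, assuming the GitHub CLI (`gh`) is installed and authenticated; the helper names and the `OWNER/REPO` placeholder are illustrative and not part of this repository.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical; `gh` must be installed and authenticated.

# Dispatch the test workflow; its `fail` input picks which job fails.
kick_off_restarter_test() {
  local repo="$1" fail="${2:-true}"
  gh workflow run workflow-restarter-test.yml --repo "$repo" -f "fail=$fail"
}

# List recent runs of the restarter to confirm it was triggered.
list_restarter_runs() {
  local repo="$1"
  gh run list --repo "$repo" --workflow workflow-restarter.yml --limit 5
}
```

For example, run `kick_off_restarter_test OWNER/REPO` and, a little later, `list_restarter_runs OWNER/REPO`.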

### Configure an existing workflow to use `on-failure-workflow-restarter`

Now add something like the following `yaml` job at the end of your workflow, changing only the `needs` section to suit.

For example, the following job triggers a restart if either the `acceptance` or the `unit` job preceding it fails. A restart of the failing jobs is attempted up to 3 times; if they still fail after that, the workflow is marked as failed. If, however, `acceptance` and `unit` both pass at any point, the restarted workflow is marked as successful.

```yaml
  on-failure-workflow-restarter-proxy:
    # (1) run this job after the "acceptance" and "unit" jobs and...
    needs: [acceptance, unit]
    # (2) continue ONLY IF "acceptance" or "unit" fails
    if: always() && (needs.acceptance.result == 'failure' || needs.unit.result == 'failure')
    runs-on: ubuntu-latest
    steps:
      # (3) checkout this repository in order to "see" the following custom action
      - name: Checkout repository
        uses: actions/checkout@v4
      # (4) "use" the custom action to retrigger the failed jobs above
      # NOTE: pass the SOURCE_GITHUB_TOKEN to the custom action because (a) it must have
      # this to trigger the reusable workflow that restarts the failed job; and
      # (b) custom actions do not have access to the calling workflow's secrets
      - name: Trigger reusable workflow
        uses: ./.github/actions/workflow-restarter-proxy
        env:
          SOURCE_GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        with:
          repository: ${{ github.repository }}
          run_id: ${{ github.run_id }}
```
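
The retry limit described above is enforced inside the `workflow-restarter` workflow by probing for a given attempt number. The sketch below mirrors that check as standalone shell functions. It assumes `gh` is installed and authenticated and relies, as the workflow does, on `gh run view --attempt N` failing when attempt `N` does not exist; `should_retry` and `rerun_if_allowed` are hypothetical names.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical and mirror the "Check retry count" step.

# Exit 0 while attempt number <retries> does not exist yet, 1 once it does.
should_retry() {
  local run_id="$1" repo="$2" retries="${3:-3}"
  if gh run view "$run_id" --repo "$repo" --attempt "$retries" --json status >/dev/null 2>&1; then
    return 1  # attempt <retries> exists: limit reached
  fi
  return 0    # attempt <retries> is missing: still within the limit
}

# Re-run only the failed jobs of the given run, if the limit allows it.
rerun_if_allowed() {
  local run_id="$1" repo="$2" retries="${3:-3}"
  if should_retry "$run_id" "$repo" "$retries"; then
    gh run rerun "$run_id" --repo "$repo" --failed
  else
    echo "Retry count has reached the limit"
  fi
}
```

Calling `rerun_if_allowed RUN_ID OWNER/REPO 3` would then either re-run the failed jobs or print the limit message.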

## Appendix

### Verify `Workflow Restarter` with `Workflow Restarter TEST`

The following shows 3 `Workflow Restarter` runs occurring after the `Workflow Restarter TEST`, which is set to fail continuously.

![alt text](image.png)

Looking closer at the `Workflow Restarter TEST` reveals

* that the workflow includes 2 jobs `unit` and `acceptance`; and
* that the workflow has been re-run 3 times, e.g.,

![alt text](image-1.png)

Further, the following sequence of screenshots shows that only failed jobs are re-run.

* The `on-failure-workflow-restarter` job **(1)** is triggered by the failure of the `unit` job and **(2)** successfully calls the `workflow-restarter` workflow
* The `workflow-restarter` in turn triggers a re-run of the `unit` job **(3)** and the `Workflow Restarter TEST` shows this as an incremented attempt count at **(4)**.

![alt text](image-2.png)

![alt text](image-3.png)

![alt text](image-4.png)
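
The attempt count shown in the screenshots can also be read from the command line. This is a minimal sketch, assuming `gh` is installed and authenticated; `current_attempt` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch only: `current_attempt` is a hypothetical helper.

# Print the latest attempt number of a workflow run.
current_attempt() {
  local run_id="$1" repo="$2"
  gh run view "$run_id" --repo "$repo" --json attempt --jq '.attempt'
}
```

For example, `current_attempt RUN_ID OWNER/REPO` prints the number that GitHub displays next to the run after each re-run.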
