Fix failed run count logic for alerting test failures #7554

youngbupark · 2024-04-27T00:47:37Z

Description

When counting consecutive failed runs, the old code treated every run whose conclusion was not `failure`, including an `in_progress` run, as a success. This change counts failed runs correctly by skipping runs that have neither failed nor succeeded. I validated this logic in this action run.
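As a rough illustration of the difference, the sketch below replays both variants of the branch against a hand-written list of runs. The helper name `countConsecutiveFailures`, the `skipUnfinished` flag, and the sample data are hypothetical, and the sketch assumes that a successful run ends the scan, which the diff in this PR does not show.

```javascript
// Hypothetical helper, not the code from this PR. The run objects mimic the
// `conclusion` field of a GitHub workflow run (null while it is in progress).
function countConsecutiveFailures(runs, { skipUnfinished }) {
  let failureCount = 0;
  for (const run of runs) {
    if (run.conclusion === 'failure') {
      failureCount++;
    } else if (!skipUnfinished || run.conclusion === 'success') {
      // Assumption: a run treated as a success ends the failure streak.
      break;
    }
    // Otherwise (in progress, cancelled, ...) the run is skipped.
  }
  return failureCount;
}

// Newest first: one run in progress, two failures, then a success.
const runs = [
  { status: 'in_progress', conclusion: null },
  { status: 'completed', conclusion: 'failure' },
  { status: 'completed', conclusion: 'failure' },
  { status: 'completed', conclusion: 'success' },
];

// Old behavior: the in-progress run is treated like a success.
const oldCount = countConsecutiveFailures(runs, { skipUnfinished: false });
// New behavior: the in-progress run is skipped.
const newCount = countConsecutiveFailures(runs, { skipUnfinished: true });
console.log(oldCount, newCount);
```

With this sample data the old variant stops at the in-progress run and reports no failures, while the new variant reaches the two failed runs behind it.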

Type of change

This pull request is a minor refactor, code cleanup, test improvement, or other maintenance task and doesn't change the functionality of Radius (issue link optional).

Fixes: #issue_number

codecov · 2024-04-27T00:56:41Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 61.08%. Comparing base (00a3092) to head (e9b3484).
Report is 95 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #7554   +/-   ##
=======================================
  Coverage   61.08%   61.08%           
=======================================
  Files         520      520           
  Lines       27125    27125           
=======================================
+ Hits        16568    16570    +2     
+ Misses       9093     9092    -1     
+ Partials     1464     1463    -1

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

radius-functional-tests · 2024-04-27T02:20:58Z

Radius functional test overview

🔍 Go to test action run

Name	Value
Repository	youngbupark/radius
Commit ref	`47d1814`
Unique ID	func679852845b
Image tag	pr-func679852845b

Click here to see the list of tools in the current test run

gotestsum 1.10.0
KinD: v0.20.0
Dapr: 1.12.0
Azure KeyVault CSI driver: 1.4.2
Azure Workload identity webhook: 1.1.0
Bicep recipe location ghcr.io/radius-project/dev/test/testrecipes/test-bicep-recipes/<name>:pr-func679852845b
Terraform recipe location http://tf-module-server.radius-test-tf-module-server.svc.cluster.local/<name>.zip (in cluster)
applications-rp test image location: ghcr.io/radius-project/dev/applications-rp:pr-func679852845b
controller test image location: ghcr.io/radius-project/dev/controller:pr-func679852845b
ucp test image location: ghcr.io/radius-project/dev/ucpd:pr-func679852845b
deployment-engine test image location: ghcr.io/radius-project/deployment-engine:latest

Test Status

⌛ Building Radius and pushing container images for functional tests...
✅ Container images build succeeded
⌛ Publishing Bicep Recipes for functional tests...
✅ Recipe publishing succeeded
⌛ Starting ucp functional tests...
⌛ Starting cli functional tests...
⌛ Starting samples functional tests...
⌛ Starting kubernetes functional tests...
⌛ Starting daprrp functional tests...
⌛ Starting msgrp functional tests...
⌛ Starting shared functional tests...
⌛ Starting datastoresrp functional tests...
✅ ucp functional tests succeeded
✅ msgrp functional tests succeeded
✅ kubernetes functional tests succeeded
✅ daprrp functional tests succeeded
✅ cli functional tests succeeded
✅ samples functional tests succeeded
✅ datastoresrp functional tests succeeded
✅ shared functional tests succeeded

ytimocin · 2024-04-27T02:21:40Z

.github/workflows/functional-test.yaml

@@ -715,10 +715,14 @@ jobs:
            for (const run of response.data.workflow_runs) {
              if (run.conclusion === 'failure') {
                failureCount++;
-              } else {
+              } else if (run.conclusion === 'success') {


I am trying to understand this better. The `listWorkflowRuns` call above fetches 10 workflow runs. I think we should instead look at the outcome of the last completed workflow run and, if it failed, create an issue.

If we check the last 10 all the time, we may end up creating multiple issues for the same failures. I think adding the completed status check to the call above and setting per_page to 1 would be sufficient. If the last completed workflow has failed, then just set the failureCount to 2 and create the issue.

Suggestions:

We can add a status COMPLETED check to the GitHub call above. We can see different statuses here.
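A minimal sketch of what this suggestion might look like follows. The function name `lastCompletedRunFailed`, the stub `fakeListWorkflowRuns`, and the owner, repo, and workflow values are hypothetical. The REST call is injected so the example runs without network access; inside actions/github-script it would presumably be `github.rest.actions.listWorkflowRuns`.

```javascript
// Hypothetical sketch of the reviewer's suggestion, not code from this PR.
async function lastCompletedRunFailed(listWorkflowRuns, workflow) {
  const response = await listWorkflowRuns({
    ...workflow,
    status: 'completed', // only runs that have finished
    per_page: 1,         // the newest completed run only
  });
  const [latest] = response.data.workflow_runs;
  return latest !== undefined && latest.conclusion === 'failure';
}

// Stub standing in for the REST call; it returns canned data, newest first.
const fakeListWorkflowRuns = async ({ per_page }) => ({
  data: {
    workflow_runs: [
      { status: 'completed', conclusion: 'failure' },
      { status: 'completed', conclusion: 'success' },
    ].slice(0, per_page),
  },
});

// Placeholder identifiers, assumed for illustration only.
const pending = lastCompletedRunFailed(fakeListWorkflowRuns, {
  owner: 'example-owner',
  repo: 'example-repo',
  workflow_id: 'functional-test.yaml',
});
pending.then((failed) => console.log('create issue:', failed));
```

Note that a `completed` run is not necessarily a failed or successful one, so a check like this still has to inspect `conclusion`.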

Please see this screenshot. When I cancelled a workflow, the cancelled run also had the `completed` status.

Because we want to treat only the `failure` conclusion as a failed run, we need to fetch more than one action run and then search the array linearly for the most recent failures. That is why I fetch 10 action runs and scan their conclusions, counting the failures and ignoring other outcomes such as `cancelled`.

Also, I want us to be able to adjust the number of acceptable failures. That's why I added the ISSUE_CREATE_THRESHOLD variable. I can add more comments for clarification.
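The scan-and-threshold idea can be sketched roughly as follows. The function `shouldCreateIssue` and the sample runs are hypothetical, and the thread does not state the actual value of ISSUE_CREATE_THRESHOLD, so the thresholds used here are assumptions. As before, the sketch assumes a successful run ends the streak.

```javascript
// Hypothetical sketch, not the composite action from this PR.
function shouldCreateIssue(runs, threshold) {
  let failureCount = 0;
  for (const run of runs) {
    if (run.conclusion === 'failure') {
      failureCount++;
    } else if (run.conclusion === 'success') {
      break; // assumption: a success ends the failure streak
    }
    // Any other conclusion (for example 'cancelled') is ignored.
  }
  return failureCount >= threshold;
}

// Newest first: a cancelled run sits between two failures.
const recentRuns = [
  { conclusion: 'failure' },
  { conclusion: 'cancelled' },
  { conclusion: 'failure' },
  { conclusion: 'success' },
  { conclusion: 'failure' },
];

const createAtTwo = shouldCreateIssue(recentRuns, 2);   // two failures reach 2
const createAtThree = shouldCreateIssue(recentRuns, 3); // but do not reach 3
console.log(createAtTwo, createAtThree);
```

Keeping the threshold as a parameter means the alerting sensitivity can change without touching the scan itself.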

I created a new composite action to be shared. Please review the PR again.

youngbupark · 2024-04-28T07:23:41Z

.github/actions/count-failed-runs/action.yaml

@@ -0,0 +1,65 @@
+name: "Count completed failed runs"


@ytimocin I created a composite action to be shared by two workflows.

youngbupark · 2024-04-28T07:24:10Z

.github/actions/count-failed-runs/action.yaml

+            per_page: perpage
+          });
+
+          // Scan `failure` conclusion runs to find the consecutive failures while


@ytimocin I added a comment explaining why it scans the workflow run payloads.

youngbupark · 2024-04-28T07:25:10Z

.github/actions/count-failed-runs/action.yaml

@@ -0,0 +1,65 @@
+name: "Count completed failed runs"
+description: This actions counts the number of consecutive failed runs for a given workflow.
+inputs:


This is validated by this workflow run - https://github.com/radius-project/radius/actions/runs/8865909289/job/24342699250

radius-functional-tests · 2024-04-28T09:51:39Z

Radius functional test overview

🔍 Go to test action run

Name	Value
Repository	youngbupark/radius
Commit ref	`0a93e98`
Unique ID	func91f364417c
Image tag	pr-func91f364417c

Click here to see the list of tools in the current test run

gotestsum 1.10.0
KinD: v0.20.0
Dapr: 1.12.0
Azure KeyVault CSI driver: 1.4.2
Azure Workload identity webhook: 1.1.0
Bicep recipe location ghcr.io/radius-project/dev/test/testrecipes/test-bicep-recipes/<name>:pr-func91f364417c
Terraform recipe location http://tf-module-server.radius-test-tf-module-server.svc.cluster.local/<name>.zip (in cluster)
applications-rp test image location: ghcr.io/radius-project/dev/applications-rp:pr-func91f364417c
controller test image location: ghcr.io/radius-project/dev/controller:pr-func91f364417c
ucp test image location: ghcr.io/radius-project/dev/ucpd:pr-func91f364417c
deployment-engine test image location: ghcr.io/radius-project/deployment-engine:latest

Test Status

⌛ Building Radius and pushing container images for functional tests...
✅ Container images build succeeded
⌛ Publishing Bicep Recipes for functional tests...
✅ Recipe publishing succeeded
⌛ Starting daprrp functional tests...
⌛ Starting kubernetes functional tests...
⌛ Starting cli functional tests...
⌛ Starting datastoresrp functional tests...
⌛ Starting ucp functional tests...
⌛ Starting msgrp functional tests...
⌛ Starting samples functional tests...
✅ kubernetes functional tests succeeded
✅ ucp functional tests succeeded
✅ msgrp functional tests succeeded
✅ daprrp functional tests succeeded
✅ samples functional tests succeeded
✅ cli functional tests succeeded
✅ datastoresrp functional tests succeeded
✅ shared functional tests succeeded

radius-functional-tests · 2024-05-13T17:48:08Z

Radius functional test overview

🔍 Go to test action run

Name	Value
Repository	youngbupark/radius
Commit ref	`eeae94e`
Unique ID	funcf5b947d0f5
Image tag	pr-funcf5b947d0f5

Click here to see the list of tools in the current test run

gotestsum 1.10.0
KinD: v0.20.0
Dapr: 1.12.0
Azure KeyVault CSI driver: 1.4.2
Azure Workload identity webhook: 1.1.0
Bicep recipe location ghcr.io/radius-project/dev/test/testrecipes/test-bicep-recipes/<name>:pr-funcf5b947d0f5
Terraform recipe location http://tf-module-server.radius-test-tf-module-server.svc.cluster.local/<name>.zip (in cluster)
applications-rp test image location: ghcr.io/radius-project/dev/applications-rp:pr-funcf5b947d0f5
controller test image location: ghcr.io/radius-project/dev/controller:pr-funcf5b947d0f5
ucp test image location: ghcr.io/radius-project/dev/ucpd:pr-funcf5b947d0f5
deployment-engine test image location: ghcr.io/radius-project/deployment-engine:latest

Test Status

⌛ Building Radius and pushing container images for functional tests...
✅ Container images build succeeded
⌛ Publishing Bicep Recipes for functional tests...
✅ Recipe publishing succeeded
⌛ Starting msgrp functional tests...
⌛ Starting samples functional tests...
⌛ Starting shared functional tests...
⌛ Starting daprrp functional tests...
⌛ Starting kubernetes functional tests...
⌛ Starting cli functional tests...
⌛ Starting ucp functional tests...
⌛ Starting datastoresrp functional tests...
✅ samples functional tests succeeded
✅ msgrp functional tests succeeded
✅ kubernetes functional tests succeeded
✅ daprrp functional tests succeeded
✅ ucp functional tests succeeded
✅ cli functional tests succeeded
✅ shared functional tests succeeded
✅ datastoresrp functional tests succeeded

radius-functional-tests · 2024-06-03T13:05:23Z

Radius functional test overview

🔍 Go to test action run

Name	Value
Repository	youngbupark/radius
Commit ref	`77a0a74`
Unique ID	funcf4584b92ce
Image tag	pr-funcf4584b92ce

Click here to see the list of tools in the current test run

gotestsum 1.10.0
KinD: v0.20.0
Dapr: 1.12.0
Azure KeyVault CSI driver: 1.4.2
Azure Workload identity webhook: 1.1.0
Bicep recipe location ghcr.io/radius-project/dev/test/testrecipes/test-bicep-recipes/<name>:pr-funcf4584b92ce
Terraform recipe location http://tf-module-server.radius-test-tf-module-server.svc.cluster.local/<name>.zip (in cluster)
applications-rp test image location: ghcr.io/radius-project/dev/applications-rp:pr-funcf4584b92ce
controller test image location: ghcr.io/radius-project/dev/controller:pr-funcf4584b92ce
ucp test image location: ghcr.io/radius-project/dev/ucpd:pr-funcf4584b92ce
deployment-engine test image location: ghcr.io/radius-project/deployment-engine:latest

Test Status

⌛ Building Radius and pushing container images for functional tests...
✅ Container images build succeeded
⌛ Publishing Bicep Recipes for functional tests...
✅ Recipe publishing succeeded
⌛ Starting samples functional tests...
⌛ Starting cli functional tests...
⌛ Starting datastoresrp functional tests...
⌛ Starting msgrp functional tests...
⌛ Starting daprrp functional tests...
⌛ Starting shared functional tests...
⌛ Starting ucp functional tests...
⌛ Starting kubernetes functional tests...
✅ samples functional tests succeeded
✅ msgrp functional tests succeeded
✅ datastoresrp functional tests succeeded
✅ cli functional tests succeeded
✅ ucp functional tests succeeded
✅ daprrp functional tests succeeded
✅ shared functional tests succeeded

radius-functional-tests · 2024-06-13T21:05:13Z

Radius functional test overview

🔍 Go to test action run

Name	Value
Repository	youngbupark/radius
Commit ref	`a2f6a99`
Unique ID	funcf0cd4eb039
Image tag	pr-funcf0cd4eb039

Click here to see the list of tools in the current test run

gotestsum 1.10.0
KinD: v0.20.0
Dapr: 1.12.0
Azure KeyVault CSI driver: 1.4.2
Azure Workload identity webhook: 1.1.0
Bicep recipe location ghcr.io/radius-project/dev/test/testrecipes/test-bicep-recipes/<name>:pr-funcf0cd4eb039
Terraform recipe location http://tf-module-server.radius-test-tf-module-server.svc.cluster.local/<name>.zip (in cluster)
applications-rp test image location: ghcr.io/radius-project/dev/applications-rp:pr-funcf0cd4eb039
controller test image location: ghcr.io/radius-project/dev/controller:pr-funcf0cd4eb039
ucp test image location: ghcr.io/radius-project/dev/ucpd:pr-funcf0cd4eb039
deployment-engine test image location: ghcr.io/radius-project/deployment-engine:latest

Test Status

⌛ Building Radius and pushing container images for functional tests...
✅ Container images build succeeded
⌛ Publishing Bicep Recipes for functional tests...
✅ Recipe publishing succeeded
⌛ Starting daprrp functional tests...
⌛ Starting ucp functional tests...
⌛ Starting shared functional tests...
⌛ Starting samples functional tests...
⌛ Starting kubernetes functional tests...
⌛ Starting msgrp functional tests...
⌛ Starting datastoresrp functional tests...
✅ ucp functional tests succeeded
✅ samples functional tests succeeded
✅ kubernetes functional tests succeeded
✅ daprrp functional tests succeeded
✅ datastoresrp functional tests succeeded
✅ cli functional tests succeeded
✅ shared functional tests succeeded

kachawla

@ytimocin are you familiar with what caused the reporting to break in the first place?

github-actions · 2024-10-12T18:06:55Z

This pull request has been automatically marked as stale because it has been inactive for 90 days. Remove stale label or comment or this PR will be closed in 7 days.

youngbupark requested review from a team as code owners April 27, 2024 00:47

youngbupark changed the title ~~Fix functional test notification logic~~ Fix failed run count logic for alerting test failures Apr 27, 2024

youngbupark temporarily deployed to functional-tests April 27, 2024 02:20 — with GitHub Actions Inactive

ytimocin reviewed Apr 27, 2024

youngbupark commented Apr 28, 2024

youngbupark temporarily deployed to functional-tests April 28, 2024 09:51 — with GitHub Actions Inactive

ytimocin approved these changes Apr 29, 2024

ytimocin force-pushed the youngp/fix-condition branch from 0a93e98 to eeae94e Compare May 13, 2024 17:46

ytimocin temporarily deployed to functional-tests May 13, 2024 17:47 — with GitHub Actions Inactive

ytimocin force-pushed the youngp/fix-condition branch from eeae94e to 77a0a74 Compare June 3, 2024 13:03

ytimocin temporarily deployed to functional-tests June 3, 2024 13:04 — with GitHub Actions Inactive

ytimocin mentioned this pull request Jun 3, 2024

Fix failed run count logic for alerting test failures #7657

Open

1 task

ytimocin force-pushed the youngp/fix-condition branch from 77a0a74 to a2f6a99 Compare June 13, 2024 21:04

ytimocin temporarily deployed to functional-tests June 13, 2024 21:04 — with GitHub Actions Inactive

kachawla reviewed Jun 26, 2024

youngbupark added 6 commits July 13, 2024 19:51

do not count in_process run

fc095c6

Signed-off-by: Young Bu Park <[email protected]>

wip

2ea0a81

Signed-off-by: Young Bu Park <[email protected]>

wip

012d724

Signed-off-by: Young Bu Park <[email protected]>

revert

54582c7

Signed-off-by: Young Bu Park <[email protected]>

use composite pattern

d6ee198

Signed-off-by: Young Bu Park <[email protected]>

wip

d0cdf40

Signed-off-by: Young Bu Park <[email protected]>

youngbupark added 7 commits July 13, 2024 19:51

wip

b147b9c

Signed-off-by: Young Bu Park <[email protected]>

wip

79c1000

Signed-off-by: Young Bu Park <[email protected]>

wip

7e2a07a

Signed-off-by: Young Bu Park <[email protected]>

wip

7cfc2d0

Signed-off-by: Young Bu Park <[email protected]>

test

4c4ee62

Signed-off-by: Young Bu Park <[email protected]>

fix all

98852b2

Signed-off-by: Young Bu Park <[email protected]>

revert

e9b3484

Signed-off-by: Young Bu Park <[email protected]>

ytimocin force-pushed the youngp/fix-condition branch from a2f6a99 to e9b3484 Compare July 14, 2024 02:51

ytimocin had a problem deploying to functional-tests August 13, 2024 02:51 — with GitHub Actions Failure

github-actions bot added the stale label Oct 12, 2024

github-actions bot closed this Oct 20, 2024
