Fix failed run count logic for alerting test failures #7554

Closed

Conversation

youngbupark

Description

When counting consecutive failed runs, the old code treated an in_progress run as a success. This change counts failed runs correctly by skipping runs that are neither a failure nor a success. I validated this logic in this action run.
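
To make the difference concrete, here is a minimal sketch that compares the two loops on a hypothetical list of runs, ordered newest first. It assumes the `else` branch breaks out of the loop, which the diff excerpt in this PR does not show, and the function names and sample data are illustrative only.

```javascript
// Old logic: any run that is not a failure ends the scan.
// ASSUMPTION: the `else` branch breaks out of the loop; the diff does not show its body.
function countFailuresOld(runs) {
  let failureCount = 0;
  for (const run of runs) {
    if (run.conclusion === 'failure') {
      failureCount++;
    } else {
      break;
    }
  }
  return failureCount;
}

// New logic: only a `success` conclusion ends the scan; other runs are skipped.
function countFailuresNew(runs) {
  let failureCount = 0;
  for (const run of runs) {
    if (run.conclusion === 'failure') {
      failureCount++;
    } else if (run.conclusion === 'success') {
      break;
    }
  }
  return failureCount;
}

// Hypothetical sample, newest run first. An in-progress run has a null conclusion.
const sampleRuns = [
  { status: 'in_progress', conclusion: null },
  { status: 'completed', conclusion: 'failure' },
  { status: 'completed', conclusion: 'failure' },
  { status: 'completed', conclusion: 'success' },
  { status: 'completed', conclusion: 'failure' },
];

console.log(countFailuresOld(sampleRuns)); // 0
console.log(countFailuresNew(sampleRuns)); // 2
```

With the old loop, the in-progress run at the head of the list ends the scan before any failure is counted.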

Type of change

  • This pull request is a minor refactor, code cleanup, test improvement, or other maintenance task and doesn't change the functionality of Radius (issue link optional).

Fixes: #issue_number

@youngbupark youngbupark requested review from a team as code owners April 27, 2024 00:47

codecov bot commented Apr 27, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 61.08%. Comparing base (00a3092) to head (e9b3484).
Report is 95 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #7554   +/-   ##
=======================================
  Coverage   61.08%   61.08%           
=======================================
  Files         520      520           
  Lines       27125    27125           
=======================================
+ Hits        16568    16570    +2     
+ Misses       9093     9092    -1     
+ Partials     1464     1463    -1     


@youngbupark youngbupark changed the title Fix functional test notification logic Fix failed run count logic for alerting test failures Apr 27, 2024

radius-functional-tests bot commented Apr 27, 2024

Radius functional test overview

🔍 Go to test action run

Name Value
Repository youngbupark/radius
Commit ref 47d1814
Unique ID func679852845b
Image tag pr-func679852845b
Click here to see the list of tools in the current test run
  • gotestsum 1.10.0
  • KinD: v0.20.0
  • Dapr: 1.12.0
  • Azure KeyVault CSI driver: 1.4.2
  • Azure Workload identity webhook: 1.1.0
  • Bicep recipe location ghcr.io/radius-project/dev/test/testrecipes/test-bicep-recipes/<name>:pr-func679852845b
  • Terraform recipe location http://tf-module-server.radius-test-tf-module-server.svc.cluster.local/<name>.zip (in cluster)
  • applications-rp test image location: ghcr.io/radius-project/dev/applications-rp:pr-func679852845b
  • controller test image location: ghcr.io/radius-project/dev/controller:pr-func679852845b
  • ucp test image location: ghcr.io/radius-project/dev/ucpd:pr-func679852845b
  • deployment-engine test image location: ghcr.io/radius-project/deployment-engine:latest

Test Status

⌛ Building Radius and pushing container images for functional tests...
✅ Container images build succeeded
⌛ Publishing Bicep Recipes for functional tests...
✅ Recipe publishing succeeded
⌛ Starting ucp functional tests...
⌛ Starting cli functional tests...
⌛ Starting samples functional tests...
⌛ Starting kubernetes functional tests...
⌛ Starting daprrp functional tests...
⌛ Starting msgrp functional tests...
⌛ Starting shared functional tests...
⌛ Starting datastoresrp functional tests...
✅ ucp functional tests succeeded
✅ msgrp functional tests succeeded
✅ kubernetes functional tests succeeded
✅ daprrp functional tests succeeded
✅ cli functional tests succeeded
✅ samples functional tests succeeded
✅ datastoresrp functional tests succeeded
✅ shared functional tests succeeded

@@ -715,10 +715,14 @@ jobs:
   for (const run of response.data.workflow_runs) {
     if (run.conclusion === 'failure') {
       failureCount++;
-    } else {
+    } else if (run.conclusion === 'success') {
Contributor

I am trying to understand this better. listWorkflowRuns above returns 10 workflow runs. I think what we should do instead is look at the outcome of the last completed workflow run and, if it is a failure, create an issue.

If we check the last 10 every time, we may end up creating multiple issues for the same failures. I think adding the completed status check to the call above and setting per_page to 1 would be sufficient. If the last completed workflow run has failed, then just set failureCount to 2 and create the issue.

Suggestions:

  • We can add a status COMPLETED check to the GitHub call above. We can see the different statuses here.
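
As a rough sketch of this suggestion, the call could pass `status: 'completed'` and a configurable `per_page`. The example assumes the Octokit client that `actions/github-script` exposes as `github`. The helper names and the owner, repository, and workflow arguments are hypothetical.

```javascript
// Build the request parameters for listing workflow runs.
// `owner`, `repo`, and `workflowId` are placeholders supplied by the caller.
function buildListRunsParams(owner, repo, workflowId, perPage) {
  return {
    owner,
    repo,
    workflow_id: workflowId,
    status: 'completed', // only return runs that have finished
    per_page: perPage,
  };
}

// Fetch the most recent completed runs, newest first.
// `github` is assumed to be an Octokit client, as provided by actions/github-script.
async function fetchCompletedRuns(github, owner, repo, workflowId, perPage = 10) {
  const response = await github.rest.actions.listWorkflowRuns(
    buildListRunsParams(owner, repo, workflowId, perPage)
  );
  return response.data.workflow_runs;
}
```

Runs returned with a completed status can still carry different conclusions, so the `conclusion` field is worth checking as well.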

Author

@youngbupark youngbupark Apr 28, 2024

Please see this screenshot. When I cancelled a workflow, the cancelled run was also in completed status.
image

Because we want to treat only the failure conclusion as a failed run, we need to fetch one or more action runs and then search linearly for the recent failures in the array. So I fetch 10 action runs and scan their conclusions to find the failures, ignoring other outcomes such as cancelled.

Also, I want to let us adjust the number of acceptable failures. That's why I added the ISSUE_CREATE_THRESHOLD variable. I can add more comments for clarification.
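
The threshold check described here might look like the following sketch. It assumes runs are ordered newest first and that an issue is created once the failure count reaches the threshold. The comparison operator, the threshold value, and the helper names are assumptions and are not taken from the PR.

```javascript
// Count failures that occurred since the most recent successful run.
// `runs` is ordered newest first; cancelled and other conclusions are ignored.
function consecutiveFailures(runs) {
  const firstSuccess = runs.findIndex((run) => run.conclusion === 'success');
  const sinceSuccess = firstSuccess === -1 ? runs : runs.slice(0, firstSuccess);
  return sinceSuccess.filter((run) => run.conclusion === 'failure').length;
}

// ASSUMPTION: an issue is created once the count reaches the threshold (>=).
function shouldCreateIssue(runs, threshold) {
  return consecutiveFailures(runs) >= threshold;
}

const ISSUE_CREATE_THRESHOLD = 2; // hypothetical value

// Hypothetical sample: cancelled runs report a completed status, as noted above.
const runs = [
  { status: 'completed', conclusion: 'cancelled' },
  { status: 'completed', conclusion: 'failure' },
  { status: 'completed', conclusion: 'cancelled' },
  { status: 'completed', conclusion: 'failure' },
  { status: 'completed', conclusion: 'success' },
];

console.log(shouldCreateIssue(runs, ISSUE_CREATE_THRESHOLD)); // true
```

Keeping the threshold as a separate value lets the number of tolerated failures change without touching the scan itself.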

Author

I created a new composite action to be shared. Please review the PR again.

@@ -0,0 +1,65 @@
name: "Count completed failed runs"
Author

@ytimocin I created the composite action so that it can be shared by two workflows.

per_page: perpage
});

// Scan `failure` conclusion runs to find the consecutive failures while
Author

@youngbupark youngbupark Apr 28, 2024

@ytimocin I added a comment explaining why it scans the workflow run payloads.

@@ -0,0 +1,65 @@
name: "Count completed failed runs"
description: This action counts the number of consecutive failed runs for a given workflow.
inputs:


radius-functional-tests bot commented Apr 28, 2024

Radius functional test overview

🔍 Go to test action run

Name Value
Repository youngbupark/radius
Commit ref 0a93e98
Unique ID func91f364417c
Image tag pr-func91f364417c
Click here to see the list of tools in the current test run
  • gotestsum 1.10.0
  • KinD: v0.20.0
  • Dapr: 1.12.0
  • Azure KeyVault CSI driver: 1.4.2
  • Azure Workload identity webhook: 1.1.0
  • Bicep recipe location ghcr.io/radius-project/dev/test/testrecipes/test-bicep-recipes/<name>:pr-func91f364417c
  • Terraform recipe location http://tf-module-server.radius-test-tf-module-server.svc.cluster.local/<name>.zip (in cluster)
  • applications-rp test image location: ghcr.io/radius-project/dev/applications-rp:pr-func91f364417c
  • controller test image location: ghcr.io/radius-project/dev/controller:pr-func91f364417c
  • ucp test image location: ghcr.io/radius-project/dev/ucpd:pr-func91f364417c
  • deployment-engine test image location: ghcr.io/radius-project/deployment-engine:latest

Test Status

⌛ Building Radius and pushing container images for functional tests...
✅ Container images build succeeded
⌛ Publishing Bicep Recipes for functional tests...
✅ Recipe publishing succeeded
⌛ Starting daprrp functional tests...
⌛ Starting kubernetes functional tests...
⌛ Starting cli functional tests...
⌛ Starting datastoresrp functional tests...
⌛ Starting ucp functional tests...
⌛ Starting msgrp functional tests...
⌛ Starting samples functional tests...
✅ kubernetes functional tests succeeded
✅ ucp functional tests succeeded
✅ msgrp functional tests succeeded
✅ daprrp functional tests succeeded
✅ samples functional tests succeeded
✅ cli functional tests succeeded
✅ datastoresrp functional tests succeeded
✅ shared functional tests succeeded


radius-functional-tests bot commented May 13, 2024

Radius functional test overview

🔍 Go to test action run

Name Value
Repository youngbupark/radius
Commit ref eeae94e
Unique ID funcf5b947d0f5
Image tag pr-funcf5b947d0f5
Click here to see the list of tools in the current test run
  • gotestsum 1.10.0
  • KinD: v0.20.0
  • Dapr: 1.12.0
  • Azure KeyVault CSI driver: 1.4.2
  • Azure Workload identity webhook: 1.1.0
  • Bicep recipe location ghcr.io/radius-project/dev/test/testrecipes/test-bicep-recipes/<name>:pr-funcf5b947d0f5
  • Terraform recipe location http://tf-module-server.radius-test-tf-module-server.svc.cluster.local/<name>.zip (in cluster)
  • applications-rp test image location: ghcr.io/radius-project/dev/applications-rp:pr-funcf5b947d0f5
  • controller test image location: ghcr.io/radius-project/dev/controller:pr-funcf5b947d0f5
  • ucp test image location: ghcr.io/radius-project/dev/ucpd:pr-funcf5b947d0f5
  • deployment-engine test image location: ghcr.io/radius-project/deployment-engine:latest

Test Status

⌛ Building Radius and pushing container images for functional tests...
✅ Container images build succeeded
⌛ Publishing Bicep Recipes for functional tests...
✅ Recipe publishing succeeded
⌛ Starting msgrp functional tests...
⌛ Starting samples functional tests...
⌛ Starting shared functional tests...
⌛ Starting daprrp functional tests...
⌛ Starting kubernetes functional tests...
⌛ Starting cli functional tests...
⌛ Starting ucp functional tests...
⌛ Starting datastoresrp functional tests...
✅ samples functional tests succeeded
✅ msgrp functional tests succeeded
✅ kubernetes functional tests succeeded
✅ daprrp functional tests succeeded
✅ ucp functional tests succeeded
✅ cli functional tests succeeded
✅ shared functional tests succeeded
✅ datastoresrp functional tests succeeded


radius-functional-tests bot commented Jun 3, 2024

Radius functional test overview

🔍 Go to test action run

Name Value
Repository youngbupark/radius
Commit ref 77a0a74
Unique ID funcf4584b92ce
Image tag pr-funcf4584b92ce
Click here to see the list of tools in the current test run
  • gotestsum 1.10.0
  • KinD: v0.20.0
  • Dapr: 1.12.0
  • Azure KeyVault CSI driver: 1.4.2
  • Azure Workload identity webhook: 1.1.0
  • Bicep recipe location ghcr.io/radius-project/dev/test/testrecipes/test-bicep-recipes/<name>:pr-funcf4584b92ce
  • Terraform recipe location http://tf-module-server.radius-test-tf-module-server.svc.cluster.local/<name>.zip (in cluster)
  • applications-rp test image location: ghcr.io/radius-project/dev/applications-rp:pr-funcf4584b92ce
  • controller test image location: ghcr.io/radius-project/dev/controller:pr-funcf4584b92ce
  • ucp test image location: ghcr.io/radius-project/dev/ucpd:pr-funcf4584b92ce
  • deployment-engine test image location: ghcr.io/radius-project/deployment-engine:latest

Test Status

⌛ Building Radius and pushing container images for functional tests...
✅ Container images build succeeded
⌛ Publishing Bicep Recipes for functional tests...
✅ Recipe publishing succeeded
⌛ Starting samples functional tests...
⌛ Starting cli functional tests...
⌛ Starting datastoresrp functional tests...
⌛ Starting msgrp functional tests...
⌛ Starting daprrp functional tests...
⌛ Starting shared functional tests...
⌛ Starting ucp functional tests...
⌛ Starting kubernetes functional tests...
✅ samples functional tests succeeded
✅ msgrp functional tests succeeded
✅ datastoresrp functional tests succeeded
✅ cli functional tests succeeded
✅ ucp functional tests succeeded
✅ daprrp functional tests succeeded
✅ shared functional tests succeeded


radius-functional-tests bot commented Jun 13, 2024

Radius functional test overview

🔍 Go to test action run

Name Value
Repository youngbupark/radius
Commit ref a2f6a99
Unique ID funcf0cd4eb039
Image tag pr-funcf0cd4eb039
Click here to see the list of tools in the current test run
  • gotestsum 1.10.0
  • KinD: v0.20.0
  • Dapr: 1.12.0
  • Azure KeyVault CSI driver: 1.4.2
  • Azure Workload identity webhook: 1.1.0
  • Bicep recipe location ghcr.io/radius-project/dev/test/testrecipes/test-bicep-recipes/<name>:pr-funcf0cd4eb039
  • Terraform recipe location http://tf-module-server.radius-test-tf-module-server.svc.cluster.local/<name>.zip (in cluster)
  • applications-rp test image location: ghcr.io/radius-project/dev/applications-rp:pr-funcf0cd4eb039
  • controller test image location: ghcr.io/radius-project/dev/controller:pr-funcf0cd4eb039
  • ucp test image location: ghcr.io/radius-project/dev/ucpd:pr-funcf0cd4eb039
  • deployment-engine test image location: ghcr.io/radius-project/deployment-engine:latest

Test Status

⌛ Building Radius and pushing container images for functional tests...
✅ Container images build succeeded
⌛ Publishing Bicep Recipes for functional tests...
✅ Recipe publishing succeeded
⌛ Starting daprrp functional tests...
⌛ Starting ucp functional tests...
⌛ Starting shared functional tests...
⌛ Starting samples functional tests...
⌛ Starting kubernetes functional tests...
⌛ Starting msgrp functional tests...
⌛ Starting datastoresrp functional tests...
✅ ucp functional tests succeeded
✅ samples functional tests succeeded
✅ kubernetes functional tests succeeded
✅ daprrp functional tests succeeded
✅ datastoresrp functional tests succeeded
✅ cli functional tests succeeded
✅ shared functional tests succeeded

Contributor

@kachawla kachawla left a comment

@ytimocin are you familiar with what caused the reporting to break in the first place?

Signed-off-by: Young Bu Park <[email protected]>

This pull request has been automatically marked as stale because it has been inactive for 90 days. Remove stale label or comment or this PR will be closed in 7 days.

@github-actions github-actions bot added the stale label Oct 12, 2024
@github-actions github-actions bot closed this Oct 20, 2024