Issue with CRDs installed by testkube-operator via ArgoCD #5280

hinskii · 2024-04-08T11:28:59Z

Title: Pods Instantly Terminating When Running Tests Deployed as Code via Dashboard

Description

We've encountered an issue where pods are instantly terminating when attempting to run tests that have been deployed as code and triggered through the Testkube dashboard. Tests run manually via the GUI do not exhibit this issue.
Environment

Testkube Version: 1.17.15
Testkube Dashboard Version: 1.16.7
Deployment Method: Deployed as code via ArgoCD and run through the dashboard.

Observations

1.The problem does not seem to originate from the Testkube dashboard itself, as scheduling tests defined in the code also does not work.
2.When using apiVersion: executor.testkube.io/v1 for the Executor, pods are instantly terminated. However, oddly, using a non-existent apiVersion (e.g., executor.testkube.io/v3) seems to bypass the issue.
3.This behavior suggests it might not be directly related to the Testkube versions but could involve some configuration or compatibility issue.
4. Error messages and logs suggest that the executor pod fails to find the specified pods, indicating a possible issue with pod creation or lifecycle management.

Logs and Errors:
1.Logs indicate that tests start but then fail due to the executor pod not finding the specified pods (e.g., "error":"pods "661036dd23f9ee494fb15ddd-l5tct" not found").
2. error related to the update of results from the job's pod also points to the pods not being found.
3. Despite specifying the correct API version for the Executor (apiVersion: executor.testkube.io/v1), pods are instantly terminated, while an incorrect API version doesn't produce this issue.

Unique Condition: Argo CD Manifest Out of Sync

Interestingly, when the Argo CD manifest was out of sync with the cluster state, the tests were able to run successfully. This anomaly suggests that the synchronization state of Argo CD, or the application of the manifests through Argo CD, plays a significant role in the issue. It raises questions about how Argo CD's sync process might affect the creation and lifecycle of the test execution pods.

Steps to Reproduce:

1. Deploy tests as code with the Testkube dashboard using apiVersion: executor.testkube.io/v1. via ArgoCD
2. Trigger the tests through the dashboard.
3. Observe that the pods are instantly terminated.
4. Then change apiVersion to: executor.testkube.io/v3
5. Try to run test

Additional Information
The issue may involve ArgoCD behavior, as manually applying CRDs with kubectl and bypassing ArgoCD results in successful test execution. It suggests a potential interaction issue between ArgoCD deployments and the Testkube setup.

The text was updated successfully, but these errors were encountered:

vsukhin · 2024-04-08T16:53:34Z

thank you @hinskii for prioritization to @TheBrunoLopes and @jmorante-ks

frederikb · 2024-04-12T07:13:57Z

Hi @hinskii, did you check out the hints in kubeshop/helm-charts#711 ?

Basically you should be adding the following annotations to all templates (job-container-template.yaml.tmpl, job-template.yml.tmpl, ...):

annotations:
    argocd.argoproj.io/compare-options: IgnoreExtraneous
    argocd.argoproj.io/sync-options: Prune=false

otherwise ArgoCD will kill your test execution pods, which may seem to be at random times due to the internal sync intervals of ArgoCD. I strongly assume that this is what you are seeing as well.

vsukhin · 2024-04-12T09:10:37Z

great suggestion @frederikb !

hinskii · 2024-04-15T05:56:21Z

That was the point! Ive just deleted prune from argo manifest and it works! Thank you guys!

vsukhin · 2024-04-15T06:23:33Z

@ypoplavs fyi

hinskii added the bug 🐛 Something is not working as should be label Apr 8, 2024

vsukhin closed this as completed Apr 15, 2024

henrik-farre mentioned this issue May 22, 2024

Stable releases? kubeshop/helm-charts#843

Closed

frederikb mentioned this issue Jun 5, 2024

feat(testkube-api): ability to specify annotations on job templates kubeshop/helm-charts#873

Merged

6 tasks

linear bot added Bug Created by Linear-GitHub Sync and removed bug 🐛 Something is not working as should be labels Jul 2, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Issue with CRDs installed by testkube-operator via ArgoCD #5280

Issue with CRDs installed by testkube-operator via ArgoCD #5280

hinskii commented Apr 8, 2024

vsukhin commented Apr 8, 2024

frederikb commented Apr 12, 2024

vsukhin commented Apr 12, 2024

hinskii commented Apr 15, 2024

vsukhin commented Apr 15, 2024

Issue with CRDs installed by testkube-operator via ArgoCD #5280

Issue with CRDs installed by testkube-operator via ArgoCD #5280

Comments

hinskii commented Apr 8, 2024

vsukhin commented Apr 8, 2024

frederikb commented Apr 12, 2024

vsukhin commented Apr 12, 2024

hinskii commented Apr 15, 2024

vsukhin commented Apr 15, 2024