Issue with CRDs installed by testkube-operator via ArgoCD #5280

Closed
hinskii opened this issue Apr 8, 2024 · 5 comments
Labels
Bug Created by Linear-GitHub Sync

Comments

@hinskii

hinskii commented Apr 8, 2024

Title: Pods Instantly Terminating When Running Tests Deployed as Code via Dashboard

Description

We've encountered an issue where pods are instantly terminating when attempting to run tests that have been deployed as code and triggered through the Testkube dashboard. Tests run manually via the GUI do not exhibit this issue.

Environment

Testkube Version: 1.17.15
Testkube Dashboard Version: 1.16.7
Deployment Method: Deployed as code via ArgoCD and run through the dashboard.

Observations

1. The problem does not seem to originate from the Testkube dashboard itself, because scheduled runs of tests defined as code fail as well.
2. When the Executor uses apiVersion: executor.testkube.io/v1, pods are instantly terminated. Oddly, using a non-existent apiVersion (e.g., executor.testkube.io/v3) seems to bypass the issue.
3. This behavior suggests the problem might not be directly related to the Testkube versions but could involve a configuration or compatibility issue.
4. Error messages and logs suggest that the executor pod fails to find the specified pods, indicating a possible issue with pod creation or lifecycle management.

Logs and Errors:
1. Logs indicate that tests start but then fail because the executor pod cannot find the specified pods (e.g., "error":"pods "661036dd23f9ee494fb15ddd-l5tct" not found").
2. An error related to updating results from the job's pod also points to the pods not being found.
3. Despite specifying the correct API version for the Executor (apiVersion: executor.testkube.io/v1), pods are instantly terminated, while an incorrect API version doesn't produce this issue.

Unique Condition: Argo CD Manifest Out of Sync

Interestingly, when the Argo CD manifest was out of sync with the cluster state, the tests were able to run successfully. This anomaly suggests that the synchronization state of Argo CD, or the application of the manifests through Argo CD, plays a significant role in the issue. It raises questions about how Argo CD's sync process might affect the creation and lifecycle of the test execution pods.

Steps to Reproduce:

1. Deploy tests as code via ArgoCD, using apiVersion: executor.testkube.io/v1 for the Executor.
2. Trigger the tests through the dashboard.
3. Observe that the pods are instantly terminated.
4. Then change the apiVersion to executor.testkube.io/v3.
5. Try to run the test again.
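
For readers unfamiliar with the resource in step 1, an Executor deployed as code might look roughly like the sketch below. This is not the manifest from this report; the name, namespace, image, and test type are placeholder values, and the field set is assumed from the container-executor form of the Testkube Executor CRD.

```yaml
# Illustrative sketch only: all values below are placeholders, not the reporter's config.
apiVersion: executor.testkube.io/v1   # the version that exists; v3 in step 4 does not
kind: Executor
metadata:
  name: example-container-executor    # hypothetical name
  namespace: testkube                 # assumed namespace
spec:
  executor_type: container            # assumed: container-based executor
  image: curlimages/curl:7.85.0       # placeholder image
  command: ["curl"]
  types:
    - example-container/test          # hypothetical test type handled by this executor
```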

Additional Information
The issue may involve ArgoCD behavior, as manually applying CRDs with kubectl and bypassing ArgoCD results in successful test execution. It suggests a potential interaction issue between ArgoCD deployments and the Testkube setup.

@hinskii hinskii added the bug 🐛 Something is not working as should be label Apr 8, 2024
@vsukhin
Collaborator

vsukhin commented Apr 8, 2024

Thank you, @hinskii. Passing this to @TheBrunoLopes and @jmorante-ks for prioritization.

@frederikb
Contributor

Hi @hinskii, did you check out the hints in kubeshop/helm-charts#711 ?

Basically you should be adding the following annotations to all templates (job-container-template.yaml.tmpl, job-template.yml.tmpl, ...):

annotations:
    argocd.argoproj.io/compare-options: IgnoreExtraneous
    argocd.argoproj.io/sync-options: Prune=false

Otherwise ArgoCD will kill your test execution pods, seemingly at random times because of ArgoCD's internal sync intervals. I strongly suspect that this is what you are seeing as well.
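
To make the placement of those annotations concrete, here is a minimal sketch of a Job manifest carrying them. It is a plain-YAML stand-in for the templated files named above, not a copy of them; the job name, container name, and image are hypothetical, and putting the annotations on both the Job and its pod template is an assumption.

```yaml
# Sketch under stated assumptions: names and image are hypothetical placeholders.
apiVersion: batch/v1
kind: Job
metadata:
  name: example-test-execution        # hypothetical
  annotations:
    # Do not report this resource as out of sync when it is absent from Git
    argocd.argoproj.io/compare-options: IgnoreExtraneous
    # Do not delete this resource during a sync with pruning enabled
    argocd.argoproj.io/sync-options: Prune=false
spec:
  template:
    metadata:
      annotations:                    # assumed: repeated so the pod carries them too
        argocd.argoproj.io/compare-options: IgnoreExtraneous
        argocd.argoproj.io/sync-options: Prune=false
    spec:
      restartPolicy: Never
      containers:
        - name: runner                # hypothetical
          image: example/runner:latest  # placeholder image
```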

@vsukhin
Collaborator

vsukhin commented Apr 12, 2024

Great suggestion, @frederikb!

@hinskii
Author

hinskii commented Apr 15, 2024

That was the point! I've just removed prune from the Argo CD manifest and it works! Thank you, guys!
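
The exact manifest change is not shown in this thread. One possible reading is that pruning was turned off in the Argo CD Application's automated sync policy, which could look like the sketch below. The application name, repository URL, path, and namespaces are placeholders.

```yaml
# Sketch of one possible fix, not the reporter's actual manifest.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: testkube-tests                # hypothetical
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://example.com/your-org/your-repo.git  # placeholder
    targetRevision: HEAD
    path: testkube                    # placeholder path
  destination:
    server: https://kubernetes.default.svc
    namespace: testkube               # assumed namespace
  syncPolicy:
    automated:
      prune: false                    # was presumably true; false (or omitted) disables pruning
```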

@vsukhin
Collaborator

vsukhin commented Apr 15, 2024

@ypoplavs fyi

@vsukhin vsukhin closed this as completed Apr 15, 2024
@linear linear bot added Bug Created by Linear-GitHub Sync and removed bug 🐛 Something is not working as should be labels Jul 2, 2024
3 participants