Bug: Controller doesn't requeue ScheduledQueryRules #4293
When ASO encounters a [...]
I expect that what's happening here is that the error needs to be reclassified as non-fatal so that our existing retry mechanisms will kick in. Fortunately, we have an extension mechanism (the interface [...]
To do this, we need the entire error that's returned from Azure - this should be prominent in your logs, just prior to the logs you quoted above.
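To make the reclassification idea concrete, here is a minimal, self-contained sketch in Go. It is not ASO's actual extension interface: the types `CloudError` and `Classification`, the functions `classifyDefault` and `classifyScheduledQueryRule`, and the `missingDependencyHint` text are all hypothetical, and the real error message returned by Azure is still unknown at this point in the thread. The sketch only shows the general shape: wrap a default classifier and downgrade one specific Bad Request from fatal to retryable.

```go
package main

import "strings"

// Classification is a hypothetical label for how a controller treats an error.
type Classification string

const (
	Retryable Classification = "retryable"
	Fatal     Classification = "fatal"
)

// CloudError is a hypothetical, simplified stand-in for an error returned by Azure.
type CloudError struct {
	Code    string
	Message string
}

// missingDependencyHint is a placeholder. The real text has to be taken from
// the full Azure error in the controller logs.
const missingDependencyHint = "not found"

// classifyDefault is an assumed default policy for this sketch only:
// BadRequest is fatal, everything else is retried.
func classifyDefault(err CloudError) Classification {
	if err.Code == "BadRequest" {
		return Fatal
	}
	return Retryable
}

// classifyScheduledQueryRule calls the next classifier first, then downgrades
// a fatal BadRequest to retryable when the message suggests that a referenced
// resource does not exist yet.
func classifyScheduledQueryRule(err CloudError, next func(CloudError) Classification) Classification {
	result := next(err)
	if result != Fatal || err.Code != "BadRequest" {
		return result
	}
	if strings.Contains(strings.ToLower(err.Message), missingDependencyHint) {
		return Retryable
	}
	return result
}
```

Calling `classifyScheduledQueryRule(err, classifyDefault)` leaves every other error untouched, so the override stays narrow and unrelated Bad Requests (for example an invalid query) remain fatal.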
OK, let me rerun the tests once more to get accurate data; I'll come back with the errors.
So, this is the sequence of the logs that we get:
Here, it's waiting a bit for the Owner RG to be created, 1-2 mins, then sends the resource to Azure.
A couple of minutes later (2-3 minutes at most), the LAW is created, but the creation of this resource is never retried. Even after a couple of hours, there are no more logs about it.
Version of Azure Service Operator
v2.8.0
Describe the bug
Our use-case is to deploy an Azure AKS scale-unit, consisting of multiple resources, including:
When we create a new scale-unit, we introduce a direct dependency: the LAW must be created before the ScheduledQueryRule.
The first time ASO tries to create the resource, it fails because it is waiting for the RG object to be created. After that, it should keep failing and retrying until the LAW object is created (eventual consistency).
But after the first failure caused by the missing LAW, ASO doesn't requeue the object. We think the Bad Request response received from Azure is not handled properly and is treated as a permanent failure.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
The controller should be able to retry missing resource requests, coded as Bad Requests.
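As an illustration of the behavior we expect, here is a small simulation in Go of "retry until the dependency exists". It does not use ASO or controller-runtime code; `reconcileUntilReady`, `nextDelay`, and all parameters are hypothetical names for this sketch, and the delays are only recorded rather than slept on.

```go
package main

import "time"

// nextDelay doubles the current delay, capped at max (exponential backoff).
func nextDelay(current, max time.Duration) time.Duration {
	next := current * 2
	if next > max {
		return max
	}
	return next
}

// reconcileUntilReady calls attempt until it succeeds or maxAttempts is reached.
// It returns the number of attempts made, the delay recorded after each failed
// attempt, and whether an attempt eventually succeeded. A real controller would
// requeue the object after each delay instead of looping.
func reconcileUntilReady(attempt func() bool, initial, max time.Duration, maxAttempts int) (int, []time.Duration, bool) {
	attempts := 0
	var delays []time.Duration
	delay := initial
	for attempts < maxAttempts {
		attempts++
		if attempt() {
			return attempts, delays, true
		}
		delays = append(delays, delay)
		delay = nextDelay(delay, max)
	}
	return attempts, delays, false
}
```

Here `attempt` stands for "send the ScheduledQueryRule to Azure" and returns false while the LAW is still missing. The point is that a missing dependency leads to another attempt after a bounded delay, not to a permanent failure.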
Screenshots
These are the logs that we saw in the controller, but the object will not be reconciled.
Additional context
We are also using CAPZ along with ASO, and we also have these kinds of resources there, which reference LAW via ArmId, and they will be eventually reconciled when the LAW is created. This should behave the same.