-
Notifications
You must be signed in to change notification settings - Fork 103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update status readiness #654
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1142,15 +1142,10 @@ func (r *Reconciler) Reconcile(ctx context.Context, req reconcile.Request) (resu | |
} | ||
} | ||
|
||
if observation.ResourceUpToDate { | ||
// We did not need to create, update, or delete our external resource. | ||
// Per the below issue nothing will notify us if and when the external | ||
// resource we manage changes, so we requeue a speculative reconcile | ||
// after the specified poll interval in order to observe it and react | ||
// accordingly. | ||
// https://github.com/crossplane/crossplane/issues/289 | ||
// skip the update if the management policy is set to ignore updates | ||
if !policy.ShouldUpdate() { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This was originally lower in the reconciler, but we moved it here to catch cases where the management policy does not allow updates early. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not convinced this is an improvement on the previous behavior. At this point in the reconcile loop we know no update is needed, regardless of whether the policy would or wouldn't allow an update. I feel it's more useful to log that no update is needed than to log that no update will be performed due to policy. Logging that no update will be performed due to policy leaves it ambiguous as to whether an update is needed. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That is a good point! But I think I would rather rewrite the
And then I can immediately on the line below check if Would this maybe create spam for the users, though? I am not very familiar with how the |
||
reconcileAfter := r.pollIntervalHook(managed, r.pollInterval) | ||
log.Debug("External resource is up to date", "requeue-after", time.Now().Add(reconcileAfter)) | ||
log.Debug("Skipping update due to managementPolicies. Reconciliation succeeded", "requeue-after", time.Now().Add(reconcileAfter)) | ||
managed.SetConditions(xpv1.ReconcileSuccess()) | ||
return reconcile.Result{RequeueAfter: reconcileAfter}, errors.Wrap(r.client.Status().Update(ctx, managed), errUpdateManagedStatus) | ||
} | ||
|
@@ -1159,45 +1154,48 @@ func (r *Reconciler) Reconcile(ctx context.Context, req reconcile.Request) (resu | |
log.Debug("External resource differs from desired state", "diff", observation.Diff) | ||
} | ||
|
||
// skip the update if the management policy is set to ignore updates | ||
if !policy.ShouldUpdate() { | ||
reconcileAfter := r.pollIntervalHook(managed, r.pollInterval) | ||
log.Debug("Skipping update due to managementPolicies. Reconciliation succeeded", "requeue-after", time.Now().Add(reconcileAfter)) | ||
managed.SetConditions(xpv1.ReconcileSuccess()) | ||
return reconcile.Result{RequeueAfter: reconcileAfter}, errors.Wrap(r.client.Status().Update(ctx, managed), errUpdateManagedStatus) | ||
} | ||
if !observation.ResourceUpToDate && policy.ShouldUpdate() { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This if condition is what allows us to treat the This is as close to the implementation for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not sure To me:
It would potentially make sense to set I believe There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Note that this reconciler sets the The reconciler can know that it triggered the creation of of an external resource, but it has no way to know whether creation is done. The Will There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I don't really agree with this. You seem to focus a lot on what the cloud provider reports for the resource, so if they say it is ready, we set it to ready, but is this really what we want? When I first read this in the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
That is a good point. My desire was always to just make Update behave the same as Create, to always give the most up to date status on if the resource is as I desire. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I wasn't actually able to find these definitions in the docs, only this from the troubleshooting part:
You acknowledge that there is a gap here, and I feel like the issues you linked confirm the gap and that is a big enough problem that workarounds have been attempted to fix it. I understand that definitions and such can be scary to change, as they often introduce some level of a breaking change, but I don't really see the negative side of changing this. All providers already implement a way of checking if the resource is actually compliant with the desired state, I only suggest that we actually use it for What is the argument for not changing this? Another issue with the current implementation: There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I'm not convinced the way you're addressing the problem in this PR is the best way to address the problem. If we're going to change the semantics of
I think a change like this warrants at least a one-pager to get us all aligned on what the problem is, and on which solution is best. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sounds good! Would you want me to write it, or is this something you would do internally? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sure, I can give it a try 😄 |
||
update, err := external.Update(externalCtx, managed) | ||
if err != nil { | ||
// We'll hit this condition if we can't update our external resource, | ||
// for example if our provider credentials don't have access to update | ||
// it. If this is the first time we encounter this issue we'll be | ||
// requeued implicitly when we update our status with the new error | ||
// condition. If not, we requeue explicitly, which will trigger backoff. | ||
log.Debug("Cannot update external resource") | ||
record.Event(managed, event.Warning(reasonCannotUpdate, err)) | ||
managed.SetConditions(xpv1.Updating(), xpv1.ReconcileError(errors.Wrap(err, errReconcileUpdate))) | ||
return reconcile.Result{Requeue: true}, errors.Wrap(r.client.Status().Update(ctx, managed), errUpdateManagedStatus) | ||
Comment on lines
+1167
to
+1168
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is mostly the point of the PR. This is what gives us more consistent behaviour for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also similar additions on line 1177, for example. |
||
} | ||
|
||
update, err := external.Update(externalCtx, managed) | ||
if err != nil { | ||
// We'll hit this condition if we can't update our external resource, | ||
// for example if our provider credentials don't have access to update | ||
// it. If this is the first time we encounter this issue we'll be | ||
// requeued implicitly when we update our status with the new error | ||
// condition. If not, we requeue explicitly, which will trigger backoff. | ||
log.Debug("Cannot update external resource") | ||
record.Event(managed, event.Warning(reasonCannotUpdate, err)) | ||
managed.SetConditions(xpv1.ReconcileError(errors.Wrap(err, errReconcileUpdate))) | ||
return reconcile.Result{Requeue: true}, errors.Wrap(r.client.Status().Update(ctx, managed), errUpdateManagedStatus) | ||
} | ||
if _, err := r.managed.PublishConnection(ctx, managed, update.ConnectionDetails); err != nil { | ||
// If this is the first time we encounter this issue we'll be requeued | ||
// implicitly when we update our status with the new error condition. If | ||
// not, we requeue explicitly, which will trigger backoff. | ||
log.Debug("Cannot publish connection details", "error", err) | ||
record.Event(managed, event.Warning(reasonCannotPublish, err)) | ||
managed.SetConditions(xpv1.Updating(), xpv1.ReconcileError(err)) | ||
return reconcile.Result{Requeue: true}, errors.Wrap(r.client.Status().Update(ctx, managed), errUpdateManagedStatus) | ||
} | ||
|
||
if _, err := r.managed.PublishConnection(ctx, managed, update.ConnectionDetails); err != nil { | ||
// If this is the first time we encounter this issue we'll be requeued | ||
// implicitly when we update our status with the new error condition. If | ||
// not, we requeue explicitly, which will trigger backoff. | ||
log.Debug("Cannot publish connection details", "error", err) | ||
record.Event(managed, event.Warning(reasonCannotPublish, err)) | ||
managed.SetConditions(xpv1.ReconcileError(err)) | ||
// We've successfully updated our external resource. In many cases the | ||
// updating process takes a little time to finish. We requeue explicitly | ||
// order to observe the external resource to determine whether it's | ||
// ready for use. | ||
Comment on lines
+1181
to
+1184
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I am not sure about the "We requeue explicitly order to observe..." sentence. It seems a bit odd to me. This comment was copied from the |
||
log.Debug("Successfully requested update of external resource") | ||
record.Event(managed, event.Normal(reasonUpdated, "Successfully requested update of external resource")) | ||
managed.SetConditions(xpv1.Updating(), xpv1.ReconcileSuccess()) | ||
return reconcile.Result{Requeue: true}, errors.Wrap(r.client.Status().Update(ctx, managed), errUpdateManagedStatus) | ||
} | ||
|
||
// We've successfully updated our external resource. Per the below issue | ||
// nothing will notify us if and when the external resource we manage | ||
// changes, so we requeue a speculative reconcile after the specified poll | ||
// interval in order to observe it and react accordingly. | ||
// We did not need to create, update, or delete our external resource. | ||
// Per the below issue nothing will notify us if and when the external | ||
// resource we manage changes, so we requeue a speculative reconcile | ||
// after the specified poll interval in order to observe it and react | ||
// accordingly. | ||
Comment on lines
+1191
to
+1195
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is the comment that was used for the We moved it to the bottom as it seems to work well for the last resort case, which is when there is nothing we need to do. |
||
// https://github.com/crossplane/crossplane/issues/289 | ||
reconcileAfter := r.pollIntervalHook(managed, r.pollInterval) | ||
log.Debug("Successfully requested update of external resource", "requeue-after", time.Now().Add(reconcileAfter)) | ||
record.Event(managed, event.Normal(reasonUpdated, "Successfully requested update of external resource")) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We chose to not emit any events here, but we are not sure. Emitting here would emit every reconcile that just observed and everything is okay, so it doesn't seem right. |
||
log.Debug("External resource is up to date", "requeue-after", time.Now().Add(reconcileAfter)) | ||
managed.SetConditions(xpv1.ReconcileSuccess()) | ||
return reconcile.Result{RequeueAfter: reconcileAfter}, errors.Wrap(r.client.Status().Update(ctx, managed), errUpdateManagedStatus) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per #583 (comment), I don't think an updating resource is necessarily an unready resource.