Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(controller): copy ownerreference from oldNs on namespace UPDATE admission requests #768

Merged
merged 1 commit into from
Nov 9, 2023

Conversation

oliverbaehler
Copy link
Collaborator

Fixes behaviour mentioned in thread:

Since the flux kustomize-controller runs a dry-run operation with it's server-side apply, we can skip these. Since Dry-Run operations are not persisted into etcd.

I have change the sideEffect of the hook because it matches it's description:

  • NoneOnDryRun: calling the webhook will possibly have side effects, but if a request with dryRun: true is sent to the webhook, the webhook will suppress the side effects (the webhook is dryRun-aware).

Not really sure what that even does, but the hook is now dry-run aware.

@netlify
Copy link

netlify bot commented May 26, 2023

Deploy Preview for capsule-documentation canceled.

Name Link
🔨 Latest commit c16fa89
🔍 Latest deploy log https://app.netlify.com/sites/capsule-documentation/deploys/650489efc972070008e6b03f

@oliverbaehler oliverbaehler changed the title fix: skip owernref validation on dry-run admissions requests fix: skip ownerrf validation on dry-run admissions requests May 26, 2023
@oliverbaehler oliverbaehler changed the title fix: skip ownerrf validation on dry-run admissions requests fix: skip owner validation on dry-run admissions requests May 26, 2023
@prometherion prometherion added this to the v0.3.2 milestone May 26, 2023
Copy link
Member

@prometherion prometherion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I would like just to have as a sort of double-check a review from @MaxFedotov.

@prometherion prometherion requested a review from MaxFedotov May 26, 2023 14:16
@MaxFedotov
Copy link
Collaborator

@oliverbaehler Hey! And why use NoneOnDryRun instead of None for sideEffect? As I understand, this validating webhook doesn't have any sideEffect at all, and according to flux docs None should be OK.

And the other thing I'm thinking about is that we are breaking a bit an idea of dry-run, as it is mostly used to validate and test requests without actually applying it. With this change, we are actually skipping all validation, so even if something will be broken\invalid in dry-run request, capsule will return OK in all cases.

@prometherion, and do we actually need this check at all:
https://github.com/clastix/capsule/blob/7becdbaf799582426e4b5c5bc06753e7ae867a7b/pkg/webhook/namespace/owner_reference.go#L54-L58

What if we add here: https://github.com/clastix/capsule/blob/7becdbaf799582426e4b5c5bc06753e7ae867a7b/pkg/webhook/ownerreference/patching.go#L51-L55

The code to set ownerReference back? Even if the user will modify it, it will be always set back to the correct value by Capsule.

@prometherion
Copy link
Member

As I understand, this validating webhook doesn't have any sideEffect at all, and according to flux docs None should be OK.

It seems Max is correct here.

And the other thing I'm thinking about is that we are breaking a bit an idea of dry-run, as it is mostly used to validate and test requests without actually applying it. With this change, we are actually skipping all validation, so even if something will be broken\invalid in dry-run request, capsule will return OK in all cases.

Same here, although wondering why Flux started misbehaving correctly. @oliverbaehler it would be helpful to understand what's going on before getting this merged.

I'm gonna remove this from the v0.3.2 milestone since we have to release some security fixes.

@prometherion prometherion modified the milestones: v0.3.2, v0.3.3 Jun 1, 2023
@oliverbaehler
Copy link
Collaborator Author

oliverbaehler commented Jun 3, 2023

@MaxFedotov Thanks for your inputs.

I wasn't sure on sideEffects, what it really meant. The dynamic admission controller doc seemed a bit vague to me. But you are right, since the webhook only operates on the admission request, the side-effect of None should be fine.

Regarding the behaviour, my assumption would be that somehow ownerreferences do not resolve the same way (if they do, when executing a dry-run). That's at least why the code blocked any update. I will have to do some testing and dig some code to verify that suspicion.

Kustomize-Controller:

Here the Apply is made (differs based on resources to create). For namespace the apply is made here:

The apply operation is from the common pkg ssa, this invokes at some point a dryRun Operation before the actual apply:

@prometherion
Copy link
Member

@oliverbaehler @MaxFedotov it seems to me this issue is stale.

Please, share if we should refactor this PR or if it can be closed.

@oliverbaehler
Copy link
Collaborator Author

@prometherion one of our production clusters only works with that fix. However we are currently redesgning flux etc with the tenancy. Lets just kept open. Maybe we will discover the same issues. If so we ll build the suggested fix from max

@oliverbaehler
Copy link
Collaborator Author

The current approach would patch all Ownerreferences from the oldObject to the newObject. From my point of view this prevents ownerreferences from being removed or tampered (except if you are not part of capsule-users, which is now also the case).

What we noticed, even with the DryRun patch, the flux kustomize/helm controller kept failing. The reconciles which would attempt to UPDATE namespaces, would always fail here: https://github.com/clastix/capsule/blob/master/pkg/webhook/namespace/owner_reference.go#L54

While i can't make an accurate statement about what's happening, one thing is sure: On Reconcile Operations these two controllers try to apply the namespaces without ownerreferences.

Anyway with this patch everything works fine for. What do you guys think? @prometherion @MaxFedotov

@oliverbaehler oliverbaehler changed the title fix: skip owner validation on dry-run admissions requests fix: always copy ownerreference on namespace UPDATE admission events Sep 15, 2023
@prometherion
Copy link
Member

I reviewed it, and it looks safe to me.

The proposed handler was primarily implemented to prevent privilege escalation, which has been then covered with a specific e2e feature test and it's not failing, besides the flakiness we're used to with GH actions.

@MaxFedotov I'm for merging it, please, let me know if it's ok for you too.

@oliverbaehler oliverbaehler modified the milestones: v0.3.4, v0.4.1 Nov 7, 2023
Copy link

netlify bot commented Nov 7, 2023

Deploy Preview for capsule-documentation canceled.

Name Link
🔨 Latest commit 3ffaa1f
🔍 Latest deploy log https://app.netlify.com/sites/capsule-documentation/deploys/654ca5d4fa9378000840db08

@oliverbaehler oliverbaehler changed the title fix: always copy ownerreference on namespace UPDATE admission events fix(controller): always copy ownerreference on namespace update admission events Nov 9, 2023
@oliverbaehler oliverbaehler changed the title fix(controller): always copy ownerreference on namespace update admission events fix(controller): copy ownerreference from oldNs on namespace UPDATE admission requests Nov 9, 2023
@prometherion prometherion merged commit cde44ba into projectcapsule:main Nov 9, 2023
16 of 22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants