Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OCPBUGS-17157: label non-OLM resources #3017

Conversation

stevekuznetsov
Copy link
Member

HACK: add a round-tripper to ensure we label non-OLM resources

We can either flag-gate this to dev only, or delete it before merge.

Signed-off-by: Steve Kuznetsov [email protected]


*: label non-OLM resources

Today, our controllers use un-filtered LIST+WATCH calls to monitor the
state of the cluster. For OLM-specific resource types, that's fine,
since we need to know (for instance) about every CSV. For non-OLM
resource groups, though, that is needlessly wasteful in memory
consumption and makes our controller's footprint scale with the size of
the cluster itself, irrespective of the usage of OLM. Adding a label to
every resource we create is the first step in being able to filter down
all of those requests to only those objects with our label.

Signed-off-by: Steve Kuznetsov [email protected]


@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Aug 22, 2023
@openshift-ci-robot
Copy link
Collaborator

@stevekuznetsov: This pull request references Jira Issue OCPBUGS-17157, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.14.0) matches configured target version for branch (4.14.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @jianzhangbjz

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

HACK: add a round-tripper to ensure we label non-OLM resources

We can either flag-gate this to dev only, or delete it before merge.

Signed-off-by: Steve Kuznetsov [email protected]


*: label non-OLM resources

Today, our controllers use un-filtered LIST+WATCH calls to monitor the
state of the cluster. For OLM-specific resource types, that's fine,
since we need to know (for instance) about every CSV. For non-OLM
resource groups, though, that is needlessly wasteful in memory
consumption and makes our controller's footprint scale with the size of
the cluster itself, irrespective of the usage of OLM. Adding a label to
every resource we create is the first step in being able to filter down
all of those requests to only those objects with our label.

Signed-off-by: Steve Kuznetsov [email protected]


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. label Aug 22, 2023
@stevekuznetsov stevekuznetsov force-pushed the skuznets/label-for-rbac branch 2 times, most recently from f1b5e49 to de75e19 Compare August 22, 2023 19:50
@ncdc
Copy link
Member

ncdc commented Aug 22, 2023

Was the hack commit for your local testing, to confirm things were doing what you expected, and that you didn't miss adding labels anywhere in the code base?

cmd/olm/main.go Outdated Show resolved Hide resolved
perdasilva
perdasilva previously approved these changes Aug 23, 2023
Copy link
Collaborator

@perdasilva perdasilva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm - just a couple of non-blocking questions. If, in your judgement, they are not worth addressing, I'm happy to merge

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 23, 2023
@perdasilva perdasilva dismissed their stale review August 23, 2023 09:51

I didn't notice CI is failing - code lgtm, get CI to green and we're good =D

@openshift-ci openshift-ci bot removed the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 23, 2023
@perdasilva
Copy link
Collaborator

/approve cancel (see review dismissal for details)

@stevekuznetsov
Copy link
Member Author

@ncdc yes, but I could see it being generally useful moving forward to not mistakenly re-introduce the bug. Once the informers are filtered, you'd end up knowing anyway since your controller would not do the right thing but the round-tripper failure would be much more direct.

@stevekuznetsov stevekuznetsov force-pushed the skuznets/label-for-rbac branch 3 times, most recently from 64345e9 to 7e445f6 Compare August 23, 2023 16:19
@stevekuznetsov
Copy link
Member Author

@ncdc @perdasilva this should be ready for review

@@ -442,34 +442,38 @@ var _ = Describe("Operator Group", func() {
}(ns)
}

// Generate operatorGroupA - OwnNamespace
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're going to have concepts in the code like "operatorGroupA" but the value in the cluster will be auto-generated, it's imperative to print that relationship for folks that debug your test.

@@ -426,8 +426,8 @@ var _ = Describe("Operator Group", func() {
// Create crd so csv succeeds
// Ensure clusterroles created and aggregated for access provided APIs

// Generate namespaceA
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're going to have phases of a test, use a logger to make them appear in the output to make debugging failures easier.

@@ -346,7 +346,7 @@ func catalogSourceRegistryPodSynced(catalog *operatorsv1alpha1.CatalogSource) bo
if connState != nil {
state = connState.LastObservedState
}
fmt.Printf("waiting for catalog pod %v to be available (for sync) - %s\n", catalog.GetName(), state)
fmt.Printf("waiting for catalog pod %s/%s to be available (for sync) - %s\n", catalog.GetNamespace(), catalog.GetName(), state)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Always fully qualify the object you're waiting on so that folks debugging a failing test know where to look.

@@ -4432,17 +4432,18 @@ func fetchCSV(c versioned.Interface, name, namespace string, checker csvConditio
var fetchedCSV *operatorsv1alpha1.ClusterServiceVersion
var err error

ctx.Ctx().Logf("waiting for CSV %s/%s to reach condition", namespace, name)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a function like fetchCSV is going to validate conditions, at least log something to make it less surprising.

@ncdc
Copy link
Member

ncdc commented Aug 23, 2023

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 23, 2023
Copy link
Collaborator

@perdasilva perdasilva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 24, 2023
This round-tripper is added to our *rest.Config when it's possible to
detect that we're in a CI environment. Developers should set $CI=true to
get this behavior locally.

Signed-off-by: Steve Kuznetsov <[email protected]>
Today, our controllers use un-filtered LIST+WATCH calls to monitor the
state of the cluster. For OLM-specific resource types, that's fine,
since we need to know (for instance) about every CSV. For non-OLM
resource groups, though, that is needlessly wasteful in memory
consumption and makes our controller's footprint scale with the size of
the cluster itself, irrespective of the usage of OLM. Adding a label to
every resource we create is the first step in being able to filter down
all of those requests to only those objects with our label.

Signed-off-by: Steve Kuznetsov <[email protected]>
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Aug 24, 2023
@openshift-ci
Copy link

openshift-ci bot commented Aug 24, 2023

New changes are detected. LGTM label has been removed.

Copy link
Member

@awgreene awgreene left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me and I wish that we had done this earlier. @perdasilva and I were recently discussing how OLM "adopts/overwrites" existing k8s resources if there is a name collision. Had we considered using labels in the past we could have written OLM such that it wouldn't delete/update resources missing these labels. Now that OLM is out in the wild there isn't a good migration path.

@stevekuznetsov stevekuznetsov added the lgtm Indicates that a PR is ready to be merged. label Aug 24, 2023
@openshift-merge-robot openshift-merge-robot merged commit 9e7031f into operator-framework:master Aug 24, 2023
15 checks passed
@openshift-ci-robot
Copy link
Collaborator

@stevekuznetsov: Jira Issue OCPBUGS-17157: Some pull requests linked via external trackers have merged:

The following pull requests linked via external trackers have not merged:

These pull request must merge or be unlinked from the Jira bug in order for it to move to the next state. Once unlinked, request a bug refresh with /jira refresh.

Jira Issue OCPBUGS-17157 has not been moved to the MODIFIED state.

In response to this:

HACK: add a round-tripper to ensure we label non-OLM resources

We can either flag-gate this to dev only, or delete it before merge.

Signed-off-by: Steve Kuznetsov [email protected]


*: label non-OLM resources

Today, our controllers use un-filtered LIST+WATCH calls to monitor the
state of the cluster. For OLM-specific resource types, that's fine,
since we need to know (for instance) about every CSV. For non-OLM
resource groups, though, that is needlessly wasteful in memory
consumption and makes our controller's footprint scale with the size of
the cluster itself, irrespective of the usage of OLM. Adding a label to
every resource we create is the first step in being able to filter down
all of those requests to only those objects with our label.

Signed-off-by: Steve Kuznetsov [email protected]


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Copy link
Contributor

@tmshort tmshort left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
Some minor comments that don't really need changes.
EDIT: Didn't notice this already merged!

Comment on lines +70 to +74
return i.createOrUpdateValidatingWebhook(ogNamespacelabelSelector, caPEM, desc)
case v1alpha1.MutatingAdmissionWebhook:
i.createOrUpdateMutatingWebhook(ogNamespacelabelSelector, caPEM, desc)
return i.createOrUpdateMutatingWebhook(ogNamespacelabelSelector, caPEM, desc)
case v1alpha1.ConversionWebhook:
i.createOrUpdateConversionWebhook(caPEM, desc)
return i.createOrUpdateConversionWebhook(caPEM, desc)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possible change in behavior? Is this to return a possible error in these handlers?

Comment on lines +2284 to +2287
if rb.Labels == nil {
rb.Labels = map[string]string{}
}
rb.Labels[install.OLMManagedLabelKey] = install.OLMManagedLabelValue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am seeing this idiom so much, it almost makes sense to create a utility API to initialize the labels and then add the label.

e.g.

addlabel(&rbLabels, install.OLMManagedLabelKey, install.OLMManagedLabelValue)

@openshift-ci
Copy link

openshift-ci bot commented Aug 24, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: awgreene, ncdc, perdasilva, stevekuznetsov, tmshort

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [awgreene,perdasilva,tmshort]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants