[Provider mode] MetalLB TLS issue. Clients can't install ODF operator #10802

Status: Open
DanielOsypenko opened this issue Nov 5, 2024 · 6 comments
Labels: provider-client (Provider-client solution), Squad/Yellow (All tests related to managerd-services)

Comments

@DanielOsypenko (Contributor):

Behavior: Clients (all deployed hosted client clusters) cannot install ODF operators; the image-pull timeout expires with the message:

error using catalogsource openshift-marketplace/ocs-catalogsource: error encountered while listing bundles: rpc error: code = DeadlineExceeded desc = context deadline exceeded, error using catalogsource openshift-marketplace/community-operators: error encountered while listing bundles: rpc error: code = DeadlineExceeded desc = context deadline exceeded, error using catalogsource openshift-marketplace/redhat-operators: error encountered while listing bundles: rpc error: code = DeadlineExceeded desc = context deadline exceeded

Initial investigation:
catalog sources on the Client are in 'READY' status

MultiClusterHub is running

oc get MultiClusterHub -A
NAMESPACE                 NAME              STATUS    AGE
open-cluster-management   multiclusterhub   Running   12d
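The DeadlineExceeded errors above come from OLM timing out while talking to the catalog-source gRPC endpoints. A quick sketch for re-checking this on the client cluster (standard openshift-marketplace resource names assumed):

```shell
# Sketch: health-check the catalog sources OLM depends on.

# Connection state of each catalog source (should report READY)
oc -n openshift-marketplace get catalogsource \
  -o custom-columns=NAME:.metadata.name,STATE:.status.connectionState.lastObservedState

# The registry pods OLM queries over gRPC (ListBundles) — look for
# crash-looping or unscheduled pods
oc -n openshift-marketplace get pods -o wide
```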

IPAddressPool created.
TODO: check addresses are still reserved for us
cc @dahorak - https://ibm-systems-storage.slack.com/archives/C06E08SNVC7/p1730810762549349?thread_ts=1730792170.488219&cid=C06E08SNVC7
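To follow up on the TODO, the pool reservation can be checked directly: confirm the pool still exists and see whether any LoadBalancer service is stuck without an assigned address (a sketch; pool/namespace names taken from the l2advertisement output below):

```shell
# Sketch: verify the MetalLB address pool and its consumers.

# The pool referenced by the L2Advertisement
oc -n metallb-system get ipaddresspool metallb-addresspool -o yaml

# Any LoadBalancer service showing <pending> under EXTERNAL-IP means
# MetalLB never assigned an address from the pool
oc get svc -A | grep LoadBalancer
```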

l2advertisement looks OK.

oc get l2advertisement -o yaml
apiVersion: v1
items:
- apiVersion: metallb.io/v1beta1
  kind: L2Advertisement
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"metallb.io/v1beta1","kind":"L2Advertisement","metadata":{"annotations":{},"name":"l2advertisement","namespace":"metallb-system"},"spec":{"ipAddressPools":["metallb-addresspool"]}}
    creationTimestamp: "2024-10-29T15:56:13Z"
    generation: 1
    name: l2advertisement
    namespace: metallb-system
    resourceVersion: "9216725"
    uid: efce4034-8dce-48be-b45c-f2fbcb05693a
  spec:
    ipAddressPools:
    - metallb-addresspool
kind: List
metadata:
    resourceVersion: ""

Logs from metallb-operator-webhook-server:

{"level":"info","ts":"2024-10-30T05:53:51Z","logger":"controller-runtime.certwatcher","msg":"Updated current TLS certificate","stacktrace":"sigs.k8s.io/controller-runtime/pkg/certwatcher.(*CertWatcher).ReadCertificate\n\t/metallb/vendor/sigs.k8s.io/controller-runtime/pkg/certwatcher/certwatcher.go:161\nsigs.k8s.io/controller-runtime/pkg/certwatcher.New\n\t/metallb/vendor/sigs.k8s.io/controller-runtime/pkg/certwatcher/certwatcher.go:62\nsigs.k8s.io/controller-runtime/pkg/webhook.(*DefaultServer).Start\n\t/metallb/vendor/sigs.k8s.io/controller-runtime/pkg/webhook/server.go:207\nsigs.k8s.io/controller-runtime/pkg/manager.(*runnableGroup).reconcile.func1\n\t/metallb/vendor/sigs.k8s.io/controller-runtime/pkg/manager/runnable_group.go:223"}
{"level":"info","ts":"2024-10-30T05:53:51Z","logger":"controller-runtime.webhook","msg":"Registering webhook","path":"/validate-metallb-io-v1beta1-bfdprofile","stacktrace":"sigs.k8s.io/controller-runtime/pkg/webhook.(*DefaultServer).Register\n\t/metallb/vendor/sigs.k8s.io/controller-runtime/pkg/webhook/server.go:183\ngo.universe.tf/metallb/internal/k8s/webhooks/webhookv1beta1.(*BFDProfileValidator).SetupWebhookWithManager\n\t/metallb/internal/k8s/webhooks/webhookv1beta1/bfdprofile_webhook.go:40\ngo.universe.tf/metallb/internal/k8s.enableWebhook\n\t/metallb/internal/k8s/webhook.go:90\ngo.universe.tf/metallb/internal/k8s.New.func3\n\t/metallb/internal/k8s/k8s.go:300"}
{"level":"info","ts":"2024-10-30T05:53:51Z","logger":"controller-runtime.webhook","msg":"Registering webhook","path":"/convert","stacktrace":"sigs.k8s.io/controller-runtime/pkg/webhook.(*DefaultServer).Register\n\t/metallb/vendor/sigs.k8s.io/controller-runtime/pkg/webhook/server.go:183\ngo.universe.tf/metallb/internal/k8s.enableWebhook\n\t/metallb/internal/k8s/webhook.go:96\ngo.universe.tf/metallb/internal/k8s.New.func3\n\t/metallb/internal/k8s/k8s.go:300"}
{"level":"info","ts":"2024-10-30T05:53:51Z","logger":"controller-runtime.certwatcher","msg":"Starting certificate watcher","stacktrace":"sigs.k8s.io/controller-runtime/pkg/certwatcher.(*CertWatcher).Start\n\t/metallb/vendor/sigs.k8s.io/controller-runtime/pkg/certwatcher/certwatcher.go:115\nsigs.k8s.io/controller-runtime/pkg/webhook.(*DefaultServer).Start.func1\n\t/metallb/vendor/sigs.k8s.io/controller-runtime/pkg/webhook/server.go:214"}
{"level":"info","ts":"2024-10-30T05:53:51Z","logger":"controller-runtime.webhook","msg":"Serving webhook server","host":"","port":9443,"stacktrace":"sigs.k8s.io/controller-runtime/pkg/webhook.(*DefaultServer).Start\n\t/metallb/vendor/sigs.k8s.io/controller-runtime/pkg/webhook/server.go:242\nsigs.k8s.io/controller-runtime/pkg/manager.(*runnableGroup).reconcile.func1\n\t/metallb/vendor/sigs.k8s.io/controller-runtime/pkg/manager/runnable_group.go:223"}
2024/10/30 05:54:02 http: TLS handshake error from 10.130.0.41:48190: remote error: tls: bad certificate
2024/10/30 05:54:03 http: TLS handshake error from 10.130.0.41:48200: remote error: tls: bad certificate
2024/10/30 05:54:05 http: TLS handshake error from 10.130.0.41:48206: remote error: tls: bad certificate
2024/10/30 05:54:05 http: TLS handshake error from 10.130.0.41:48208: remote error: tls: bad certificate
2024/10/30 05:54:06 http: TLS handshake error from 10.130.0.41:48218: remote error: tls: bad certificate
2024/10/30 05:54:08 http: TLS handshake error from 10.130.0.41:58904: remote error: tls: bad certificate
2024/10/30 05:54:08 http: TLS handshake error from 10.130.0.41:58916: remote error: tls: bad certificate
2024/10/30 05:54:09 http: TLS handshake error from 10.130.0.41:58918: remote error: tls: bad certificate
2024/10/30 05:54:11 http: TLS handshake error from 10.130.0.41:58928: remote error: tls: bad certificate
2024/10/30 05:54:14 http: TLS handshake error from 10.130.0.41:58940: remote error: tls: bad certificate
2024/10/30 05:54:15 http: TLS handshake error from 10.130.0.41:58954: remote error: tls: bad certificate
2024/10/30 05:54:17 http: TLS handshake error from 10.130.0.41:58964: remote error: tls: bad certificate
2024/10/30 05:54:17 http: TLS handshake error from 10.130.0.41:58978: remote error: tls: bad certificate
2024/10/30 05:54:18 http: TLS handshake error from 10.130.0.41:44000: remote error: tls: bad certificate
2024/10/30 05:54:20 http: TLS handshake error from 10.130.0.41:44014: remote error: tls: bad certificate
2024/10/30 05:54:23 http: TLS handshake error from 10.130.0.41:44024: remote error: tls: bad certificate
2024/10/30 05:54:24 http: TLS handshake error from 10.130.0.41:44038: remote error: tls: bad certificate
2024/10/30 05:54:26 http: TLS handshake error from 10.130.0.41:44052: remote error: tls: bad certificate
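`remote error: tls: bad certificate` means the remote peer (here the kube-apiserver calling the webhook) rejected the certificate the webhook server presented — the classic symptom when the caBundle injected into the webhook configuration no longer matches a rotated serving certificate. The failure mode can be reproduced standalone with openssl (a local sketch, not cluster commands: a client trusting CA "A" rejects a cert issued by an unrelated CA "B"):

```shell
# Sketch: emulate the stale-caBundle mismatch locally.
tmp=$(mktemp -d)

# Two unrelated self-signed certs, standing in for the old CA bundle (a)
# and the rotated serving certificate (b)
openssl req -x509 -newkey rsa:2048 -nodes -subj "/CN=ca-a" \
  -keyout "$tmp/a.key" -out "$tmp/a.crt" -days 1 2>/dev/null
openssl req -x509 -newkey rsa:2048 -nodes -subj "/CN=ca-b" \
  -keyout "$tmp/b.key" -out "$tmp/b.crt" -days 1 2>/dev/null

# Verifying b against a's trust anchor fails, just as the apiserver's
# handshake does when its caBundle is stale
openssl verify -CAfile "$tmp/a.crt" "$tmp/b.crt" || echo "mismatch detected"
```

On the cluster the equivalent check is to base64-decode the `caBundle` from the MetalLB webhook configuration and compare it (subject/dates) against the serving cert the webhook pod mounts; exact secret and configuration names vary by install, so they are not guessed here.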
@DanielOsypenko added the Squad/Yellow (All tests related to managerd-services) and provider-client (Provider-client solution) labels on Nov 5, 2024
@DanielOsypenko (Contributor, Author) commented Nov 5, 2024:

Also, the frr-k8s monitor and webhook services are missing from the metallb namespace; on a healthy cluster (taken from BM2) they look like this:

frr-k8s-monitor-service                       ClusterIP   None             <none>        9140/TCP,9141/TCP   23h
frr-k8s-webhook-service                       ClusterIP   172.30.21.2      <none>        443/TCP             23h

Only the following services are available:

oc get service
NAME                                          TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
controller-monitor-service                    ClusterIP   None             <none>        9120/TCP            6d21h
metallb-operator-controller-manager-service   ClusterIP   172.30.223.229   <none>        443/TCP             6d21h
metallb-operator-webhook-server-service       ClusterIP   172.30.64.33     <none>        443/TCP             6d21h
metallb-operator-webhook-service              ClusterIP   172.30.27.207    <none>        443/TCP             6d21h
speaker-monitor-service                       ClusterIP   None             <none>        9120/TCP,9121/TCP   6d21h
webhook-service                               ClusterIP   172.30.179.29    <none>        443/TCP             6d21h
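A Service that exists but has no endpoints fails webhook calls just as visibly as a missing Service, so it is worth confirming the surviving webhook services have backing pods. A sketch (the `-l` label selector is an assumption, not taken from this cluster):

```shell
# Sketch: check that the webhook services resolve to live pods.
oc -n metallb-system get endpoints webhook-service metallb-operator-webhook-service

# Assumed label selector — adjust to whatever labels the webhook-server
# pod actually carries; the logs show it serving on port 9443
oc -n metallb-system get pods -o wide
```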

@DanielOsypenko (Contributor, Author) commented:

This issue is related to, or matches, metallb/metallb-operator#494.

@fedepaol commented Nov 5, 2024:

Is this the upstream or the OpenShift version?

@DanielOsypenko (Contributor, Author) commented:

@fedepaol the full version of the MetalLB operator is metallb-operator.v4.16.0-202410292005; tbh I thought it was a community-only operator:

oc get csv
NAME                                         DISPLAY                          VERSION               REPLACES                                     PHASE
ingress-node-firewall.v4.16.0-202409051837   Ingress Node Firewall Operator   4.16.0-202409051837   ingress-node-firewall.v4.16.0-202410011135   Succeeded
metallb-operator.v4.16.0-202410292005        MetalLB Operator                 4.16.0-202410292005   metallb-operator.v4.16.0-202410251707        Succeeded 

@fedepaol commented Nov 5, 2024:

That's Red Hat's build; the version matches the cluster.

@DanielOsypenko (Contributor, Author) commented:

Thanks @fedepaol, looking for answers on the RH Slack channel #forum-ocp-metallb.
