Handle custom cluster-domain values doesn't work without certmanager #627

Open
marandalucas opened this issue Apr 11, 2024 · 5 comments
Labels
bug Something isn't working

Comments

@marandalucas

@lucchmielowski Hi! Thank you so much for this fix. #399

Unfortunately, it doesn't work for us.

  • We noticed that cert-manager has to be installed.
  • metrics-service-address is hardcoded. So, we can't update it from values.yaml.
  • Certificates and secure connections are mandatory.

HELM CONFIG
clusterDomain: gcp-prod-pv-na1-a.company.cluster.local

ERROR:
W0314 15:03:14.706154 1 logging.go:59] [core] [Channel #1 SubChannel #2] grpc: addrConn.createTransport failed to connect to {Addr: "keda-operator.keda.svc.gcp-prod-pv-na1-a.company.cluster.local:9666", ServerName: "keda-operator.keda.svc.gcp-prod-pv-na1-a.company.cluster.local:9666", }. Err: connection error: desc = "transport: authentication handshake failed: tls: failed to verify certificate: x509: certificate is valid for keda-operator, keda-operator, keda-operator.keda, keda-operator.keda.svc, keda-operator.keda.svc.cluster.local, keda-admission-webhooks, keda-admission-webhooks.keda, keda-admission-webhooks.keda.svc, keda-admission-webhooks.keda.svc.cluster.local, keda-operator-metrics-apiserver, keda-operator-metrics-apiserver.keda, keda-operator-metrics-apiserver.keda.svc, keda-operator-metrics-apiserver.keda.svc.cluster.local, not keda-operator.keda.svc.gcp-prod-pv-na1-a.company.cluster.local"

We wonder if you could fix it. We don't need cert-manager in our clusters.

Thanks in advance

@marandalucas marandalucas added the bug Something isn't working label Apr 11, 2024
@lucchmielowski
Contributor

Hello @marandalucas 👋

I just had a look and tried to recreate the issue, but it seems to work fine on my side.

Just so I understand:

  • You installed cert-manager
  • You installed the chart with a custom clusterDomain
  • Your metrics-server pods throw the error you shared?

I'm wondering: what version of the chart are you using, and are you using the certificates created by the chart? (certificates.certManager.enabled: true)

Also, could you share the content of your keda-operator-tls-certificates certificate?

@marandalucas
Author

Hello @lucchmielowski 👍

If you want to recreate the issue you have to:

  1. Create a GKE cluster without the cert-manager tool.
  2. Install KEDA (2.13.0) with a custom clusterDomain.
  3. Check the metrics-apiserver pod.
ERROR:
W0314 15:03:14.706154       1 logging.go:59] [core] [Channel #1 SubChannel #2] grpc: addrConn.createTransport failed to connect to {Addr: "keda-operator.keda.svc.gcp-prod-pv-na1-a.company.cluster.local:9666", ServerName: "keda-operator.keda.svc.gcp-prod-pv-na1-a.company.cluster.local:9666", }. 

We'd like to avoid installing cert-manager for the following reasons:

  • We don't want to install cert-manager in our infrastructure for the sole purpose of creating a certificate for KEDA.
  • cert-manager consumes resources in the cluster, and we would have to monitor it.
  • We hit a race condition when adding this component to our Terraform (the cert-manager CRDs take a while to be created):
    Error: [resource mapping not found for name: "keda-operator-tls-certificates" namespace: "keda" from "": no matches for kind "Certificate" in version "cert-manager.io/v1" │ ensure CRDs are installed first, resource mapping not found for name: "keda-operator-ca" namespace: "keda" from "": no matches for kind "Certificate" in version "cert-manager.io/v1"

Is there another way to fix this, for example by parameterizing metrics-service-address?

Thank you so much for this project

@lucchmielowski
Contributor

lucchmielowski commented Apr 15, 2024

Hi @marandalucas, sorry, but I won't have time to test in GKE in the next few days. Both issues you shared look to be linked to a mismatch between the cluster domain of your cluster and your configuration, not to an issue with the chart itself (I might have misunderstood something, though).

What makes me think so is this part of the log you shared earlier:

certificate is valid for keda-operator, keda-operator, keda-operator.keda, keda-operator.keda.svc, keda-operator.keda.svc.cluster.local ... not keda-operator.keda.svc.gcp-prod-pv-na1-a.company.cluster.local

as well as the

addrConn.createTransport failed to connect to {Addr: "keda-operator.keda.svc.gcp-prod-pv-na1-a.company.cluster.local:9666...

That does not seem to be a cert issue but rather an addressing issue.
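To make the comparison in that log line concrete, here is a minimal sketch that checks whether a requested hostname appears in a list of certificate names. The helper name san_covers is hypothetical, the name list is shortened from the log above, and wildcard entries are deliberately not handled.

```shell
# Hypothetical helper: succeeds if HOST is an exact entry in a
# comma-separated list of certificate names (no wildcard matching).
san_covers() {
  host=$1
  sans=$2
  # Split on commas, trim surrounding spaces, then look for an exact line match.
  printf '%s\n' "$sans" | tr ',' '\n' | sed 's/^ *//; s/ *$//' | grep -Fxq -- "$host"
}

# Names shortened from the error message shared in this issue.
sans='keda-operator, keda-operator.keda, keda-operator.keda.svc, keda-operator.keda.svc.cluster.local'
wanted='keda-operator.keda.svc.gcp-prod-pv-na1-a.company.cluster.local'

if san_covers "$wanted" "$sans"; then
  echo "covered: $wanted"
else
  echo "not covered: $wanted"
fi
```

With the values from the log, the address the metrics server dials is not among the names the certificate was issued for, which is the condition the TLS handshake reports.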

Could it be that your GKE cluster is using the default svc.cluster.local FQDN? (In that case you wouldn't need to set a clusterDomain.)
One way to check the correct value is to run the following command, which creates a pod and does an nslookup:

kubectl run -it --image=ubuntu --restart=Never shell -- \
sh -c 'apt-get update > /dev/null && apt-get install -y dnsutils > /dev/null && \
nslookup kubernetes.default | grep Name | sed "s/Name:\skubernetes.default//"'
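A lighter variant of the same check, as a sketch: pods that use the default DNS policy normally get a resolv.conf whose search list contains an entry of the form svc.<cluster-domain>, so the domain can be read from there without installing dnsutils. The function name cluster_domain_from_resolv, the pod name dns-check, and the busybox image are assumptions for illustration.

```shell
# Hypothetical helper: read resolv.conf content on stdin and print the
# cluster domain, taken from the first search entry that starts with "svc.".
cluster_domain_from_resolv() {
  awk '$1 == "search" {
    for (i = 2; i <= NF; i++) {
      if ($i ~ /^svc\./) {
        # Drop the leading "svc." (4 characters) to keep only the domain.
        print substr($i, 5)
        exit
      }
    }
  }'
}

# Possible usage against a live cluster (not run here):
#   kubectl run dns-check --rm -i --restart=Never --image=busybox \
#     -- cat /etc/resolv.conf | cluster_domain_from_resolv
```

If this prints cluster.local, the cluster uses the default domain and no clusterDomain override should be needed.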

Also, I understand that you don't want to set up cert-manager. By default, the chart enables the operator to create a kedaorg-certs secret, which is used for TLS communication between KEDA's components.

@lucchmielowski
Contributor

Also, feel free to message me on the Kubernetes slack directly if you find it easier to have a "live" discussion about the issue.

@JorTurFer
Member

Hello @marandalucas,
You don't need cert-manager, but the internal cert system has to be updated for the custom domain too (you can use either cert-manager or the self-generated certs).
You have to pass an extra arg to the operator, k8s-cluster-domain: your-domain, so that your domain is taken into account during certificate generation.

extraArgs:
  # -- Additional KEDA Operator container arguments
  keda:
    k8s-cluster-domain: your-domain
clusterDomain: your-domain

I guess we could automatically set the arg from the clusterDomain value? 🤔 @lucchmielowski WDYT?

In any case, by setting both you will be able to use KEDA without cert-manager.
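As a sketch of how the two settings can be kept in sync, the helper below prints a values fragment in which both keys carry the same domain. The function name render_keda_values, the output file name, the release name keda, and the kedacore repository alias are assumptions, not something the chart prescribes.

```shell
# Hypothetical helper: print a Helm values fragment that sets clusterDomain
# and the operator's k8s-cluster-domain argument to the same domain.
render_keda_values() {
  domain=$1
  cat <<EOF
clusterDomain: ${domain}
extraArgs:
  keda:
    k8s-cluster-domain: ${domain}
EOF
}

# Possible usage (not run here; release and repo names are assumptions):
#   render_keda_values gcp-prod-pv-na1-a.company.cluster.local > keda-domain-values.yaml
#   helm upgrade --install keda kedacore/keda --namespace keda -f keda-domain-values.yaml
```

Generating both keys from one argument avoids the situation in this issue, where the chart's domain and the operator's certificate domain diverge.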
