
level=fatal msg="etcd cluster join failed: cannot fetch cluster info from peer urls: could not retrieve cluster information from the given URLs" #11331

Closed
abdulsalamalmahdi opened this issue Nov 16, 2024 · 3 comments

Comments

@abdulsalamalmahdi

abdulsalamalmahdi commented Nov 16, 2024

Trying k3s in libvirt VMs.

OS: AlmaLinux 9

Firewall disabled
SELinux disabled

1st server node IP: 192.168.122.11
2nd server node (joining node) IP: 192.168.122.12
Ping between the two IPs works.

Commands (first on server 1, then on the joining server):

curl -sfL https://get.k3s.io | K3S_TOKEN=SECRET sh -s - server \
    --cluster-init 
curl -sfL https://get.k3s.io | K3S_TOKEN=SECRET sh -s - server \
    --server https://<ip or hostname of server1>:6443 

I tried to debug and rule out every cause I could think of, but nothing worked.

Logs from the joining node:

Nov 16 20:35:38 master-2 sh[7084]: + /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service
Nov 16 20:35:38 master-2 k3s[7089]: time="2024-11-16T20:35:38+01:00" level=info msg="Starting k3s v1.30.6+k3s1 (1829eaae)"
Nov 16 20:35:38 master-2 k3s[7089]: time="2024-11-16T20:35:38+01:00" level=info msg="Managed etcd cluster not yet initialized"
Nov 16 20:35:38 master-2 k3s[7089]: time="2024-11-16T20:35:38+01:00" level=info msg="Reconciling bootstrap data between datastore and disk"
Nov 16 20:35:38 master-2 k3s[7089]: time="2024-11-16T20:35:38+01:00" level=info msg=start
Nov 16 20:35:38 master-2 k3s[7089]: time="2024-11-16T20:35:38+01:00" level=info msg="schedule, now=2024-11-16T20:35:38+01:00, entry=1, next=2024-11-17T00:00:00+01:00"
Nov 16 20:35:38 master-2 k3s[7089]: time="2024-11-16T20:35:38+01:00" level=info msg="Running kube-apiserver --advertise-port=6443 --allow-privileged=true --anonymous-auth=false --api-audiences=https://kubernetes.default.svc.cluster.local,k3s --authorization-mode=Node,RBAC --bind-address=127.0.0.1 --cert-dir=/var/lib/rancher/k3s/server/tls/temporary-certs --client-ca-file=/var/lib/rancher/k3s/server/tls/client-ca.crt --egress-selector-config-file=/var/lib/rancher/k3s/server/etc/egress-selector-config.yaml --enable-admission-plugins=NodeRestriction --enable-aggregator-routing=true --enable-bootstrap-token-auth=true --etcd-cafile=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt --etcd-certfile=/var/lib/rancher/k3s/server/tls/etcd/client.crt --etcd-keyfile=/var/lib/rancher/k3s/server/tls/etcd/client.key --etcd-servers=https://127.0.0.1:2379 --kubelet-certificate-authority=/var/lib/rancher/k3s/server/tls/server-ca.crt --kubelet-client-certificate=/var/lib/rancher/k3s/server/tls/client-kube-apiserver.crt --kubelet-client-key=/var/lib/rancher/k3s/server/tls/client-kube-apiserver.key --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname --profiling=false --proxy-client-cert-file=/var/lib/rancher/k3s/server/tls/client-auth-proxy.crt --proxy-client-key-file=/var/lib/rancher/k3s/server/tls/client-auth-proxy.key --requestheader-allowed-names=system:auth-proxy --requestheader-client-ca-file=/var/lib/rancher/k3s/server/tls/request-header-ca.crt --requestheader-extra-headers-prefix=X-Remote-Extra- --requestheader-group-headers=X-Remote-Group --requestheader-username-headers=X-Remote-User --secure-port=6444 --service-account-issuer=https://kubernetes.default.svc.cluster.local --service-account-key-file=/var/lib/rancher/k3s/server/tls/service.key --service-account-signing-key-file=/var/lib/rancher/k3s/server/tls/service.current.key --service-cluster-ip-range=10.43.0.0/16 --ser>
Nov 16 20:35:38 master-2 k3s[7089]: time="2024-11-16T20:35:38+01:00" level=info msg="Running kube-scheduler --authentication-kubeconfig=/var/lib/rancher/k3s/server/cred/scheduler.kubeconfig --authorization-kubeconfig=/var/lib/rancher/k3s/server/cred/scheduler.kubeconfig --bind-address=127.0.0.1 --kubeconfig=/var/lib/rancher/k3s/server/cred/scheduler.kubeconfig --profiling=false --secure-port=10259"
Nov 16 20:35:38 master-2 k3s[7089]: time="2024-11-16T20:35:38+01:00" level=info msg="Running kube-controller-manager --allocate-node-cidrs=true --authentication-kubeconfig=/var/lib/rancher/k3s/server/cred/controller.kubeconfig --authorization-kubeconfig=/var/lib/rancher/k3s/server/cred/controller.kubeconfig --bind-address=127.0.0.1 --cluster-cidr=10.42.0.0/16 --cluster-signing-kube-apiserver-client-cert-file=/var/lib/rancher/k3s/server/tls/client-ca.nochain.crt --cluster-signing-kube-apiserver-client-key-file=/var/lib/rancher/k3s/server/tls/client-ca.key --cluster-signing-kubelet-client-cert-file=/var/lib/rancher/k3s/server/tls/client-ca.nochain.crt --cluster-signing-kubelet-client-key-file=/var/lib/rancher/k3s/server/tls/client-ca.key --cluster-signing-kubelet-serving-cert-file=/var/lib/rancher/k3s/server/tls/server-ca.nochain.crt --cluster-signing-kubelet-serving-key-file=/var/lib/rancher/k3s/server/tls/server-ca.key --cluster-signing-legacy-unknown-cert-file=/var/lib/rancher/k3s/server/tls/server-ca.nochain.crt --cluster-signing-legacy-unknown-key-file=/var/lib/rancher/k3s/server/tls/server-ca.key --configure-cloud-routes=false --controllers=*,tokencleaner,-service,-route,-cloud-node-lifecycle --kubeconfig=/var/lib/rancher/k3s/server/cred/controller.kubeconfig --profiling=false --root-ca-file=/var/lib/rancher/k3s/server/tls/server-ca.crt --secure-port=10257 --service-account-private-key-file=/var/lib/rancher/k3s/server/tls/service.current.key --service-cluster-ip-range=10.43.0.0/16 --use-service-account-credentials=true"
Nov 16 20:35:38 master-2 k3s[7089]: time="2024-11-16T20:35:38+01:00" level=info msg="Running cloud-controller-manager --allocate-node-cidrs=true --authentication-kubeconfig=/var/lib/rancher/k3s/server/cred/cloud-controller.kubeconfig --authorization-kubeconfig=/var/lib/rancher/k3s/server/cred/cloud-controller.kubeconfig --bind-address=127.0.0.1 --cloud-config=/var/lib/rancher/k3s/server/etc/cloud-config.yaml --cloud-provider=k3s --cluster-cidr=10.42.0.0/16 --configure-cloud-routes=false --controllers=*,-route --feature-gates=CloudDualStackNodeIPs=true --kubeconfig=/var/lib/rancher/k3s/server/cred/cloud-controller.kubeconfig --leader-elect-resource-name=k3s-cloud-controller-manager --node-status-update-frequency=1m0s --profiling=false"
Nov 16 20:35:38 master-2 k3s[7089]: time="2024-11-16T20:35:38+01:00" level=info msg="Server node token is available at /var/lib/rancher/k3s/server/token"
Nov 16 20:35:38 master-2 k3s[7089]: time="2024-11-16T20:35:38+01:00" level=info msg="To join server node to cluster: k3s server -s https://192.168.122.12:6443 -t ${SERVER_NODE_TOKEN}"
Nov 16 20:35:38 master-2 k3s[7089]: time="2024-11-16T20:35:38+01:00" level=info msg="Agent node token is available at /var/lib/rancher/k3s/server/agent-token"
Nov 16 20:35:38 master-2 k3s[7089]: time="2024-11-16T20:35:38+01:00" level=info msg="To join agent node to cluster: k3s agent -s https://192.168.122.12:6443 -t ${AGENT_NODE_TOKEN}"
Nov 16 20:35:38 master-2 k3s[7089]: time="2024-11-16T20:35:38+01:00" level=info msg="Wrote kubeconfig /etc/rancher/k3s/k3s.yaml"
Nov 16 20:35:38 master-2 k3s[7089]: time="2024-11-16T20:35:38+01:00" level=info msg="Run: k3s kubectl"
Nov 16 20:35:39 master-2 k3s[7089]: time="2024-11-16T20:35:39+01:00" level=info msg="Password verified locally for node master-2"
Nov 16 20:35:39 master-2 k3s[7089]: time="2024-11-16T20:35:39+01:00" level=info msg="certificate CN=master-2 signed by CN=k3s-server-ca@1731784293: notBefore=2024-11-16 19:11:33 +0000 UTC notAfter=2025-11-16 19:35:39 +0000 UTC"
Nov 16 20:35:39 master-2 k3s[7089]: time="2024-11-16T20:35:39+01:00" level=info msg="certificate CN=system:node:master-2,O=system:nodes signed by CN=k3s-client-ca@1731784293: notBefore=2024-11-16 19:11:33 +0000 UTC notAfter=2025-11-16 19:35:39 +0000 UTC"
Nov 16 20:35:39 master-2 k3s[7089]: time="2024-11-16T20:35:39+01:00" level=info msg="Module overlay was already loaded"
Nov 16 20:35:39 master-2 k3s[7089]: time="2024-11-16T20:35:39+01:00" level=info msg="Module nf_conntrack was already loaded"
Nov 16 20:35:39 master-2 k3s[7089]: time="2024-11-16T20:35:39+01:00" level=info msg="Module br_netfilter was already loaded"
Nov 16 20:35:39 master-2 k3s[7089]: time="2024-11-16T20:35:39+01:00" level=info msg="Module iptable_nat was already loaded"
Nov 16 20:35:39 master-2 k3s[7089]: time="2024-11-16T20:35:39+01:00" level=info msg="Module iptable_filter was already loaded"
Nov 16 20:35:39 master-2 k3s[7089]: time="2024-11-16T20:35:39+01:00" level=info msg="Logging containerd to /var/lib/rancher/k3s/agent/containerd/containerd.log"
Nov 16 20:35:39 master-2 k3s[7089]: time="2024-11-16T20:35:39+01:00" level=info msg="Running containerd -c /var/lib/rancher/k3s/agent/etc/containerd/config.toml -a /run/k3s/containerd/containerd.sock --state /run/k3s/containerd --root /var/lib/rancher/k3s/agent/containerd"
Nov 16 20:35:40 master-2 k3s[7089]: time="2024-11-16T20:35:40+01:00" level=info msg="containerd is now running"
Nov 16 20:35:40 master-2 k3s[7089]: time="2024-11-16T20:35:40+01:00" level=info msg="Connecting to proxy" url="wss://127.0.0.1:6443/v1-k3s/connect"
Nov 16 20:35:40 master-2 k3s[7089]: time="2024-11-16T20:35:40+01:00" level=info msg="Creating k3s-cert-monitor event broadcaster"
Nov 16 20:35:40 master-2 k3s[7089]: time="2024-11-16T20:35:40+01:00" level=info msg="Running kubelet --address=0.0.0.0 --allowed-unsafe-sysctls=net.ipv4.ip_forward,net.ipv6.conf.all.forwarding --anonymous-auth=false --authentication-token-webhook=true --authorization-mode=Webhook --cgroup-driver=systemd --client-ca-file=/var/lib/rancher/k3s/agent/client-ca.crt --cloud-provider=external --cluster-dns=10.43.0.10 --cluster-domain=cluster.local --container-runtime-endpoint=unix:///run/k3s/containerd/containerd.sock --containerd=/run/k3s/containerd/containerd.sock --eviction-hard=imagefs.available<5%,nodefs.available<5% --eviction-minimum-reclaim=imagefs.available=10%,nodefs.available=10% --fail-swap-on=false --feature-gates=CloudDualStackNodeIPs=true --healthz-bind-address=127.0.0.1 --hostname-override=master-2 --kubeconfig=/var/lib/rancher/k3s/agent/kubelet.kubeconfig --node-ip=192.168.122.12 --node-labels= --pod-manifest-path=/var/lib/rancher/k3s/agent/pod-manifests --read-only-port=0 --resolv-conf=/etc/resolv.conf --serialize-image-pulls=false --tls-cert-file=/var/lib/rancher/k3s/agent/serving-kubelet.crt --tls-private-key-file=/var/lib/rancher/k3s/agent/serving-kubelet.key"
Nov 16 20:35:40 master-2 k3s[7089]: time="2024-11-16T20:35:40+01:00" level=info msg="Handling backend connection request [master-2]"
Nov 16 20:35:40 master-2 k3s[7089]: time="2024-11-16T20:35:40+01:00" level=info msg="Remotedialer connected to proxy" url="wss://127.0.0.1:6443/v1-k3s/connect"
Nov 16 20:35:40 master-2 k3s[7089]: time="2024-11-16T20:35:40+01:00" level=info msg="Adding member master-2-4d7ec752=https://192.168.122.12:2380 to etcd cluster [master-1-dcfdee70=https://192.168.122.11:2380]"
Nov 16 20:35:40 master-2 k3s[7089]: time="2024-11-16T20:35:40+01:00" level=info msg="Starting etcd to join cluster with members [master-1-dcfdee70=https://192.168.122.11:2380 master-2-4d7ec752=https://192.168.122.12:2380]"
Nov 16 20:35:40 master-2 k3s[7089]: {"level":"info","ts":"2024-11-16T20:35:40.59814+0100","caller":"embed/etcd.go:127","msg":"configuring peer listeners","listen-peer-urls":["https://127.0.0.1:2380","https://192.168.122.12:2380"]}
Nov 16 20:35:40 master-2 k3s[7089]: {"level":"info","ts":"2024-11-16T20:35:40.598229+0100","caller":"embed/etcd.go:494","msg":"starting with peer TLS","tls-info":"cert = /var/lib/rancher/k3s/server/tls/etcd/peer-server-client.crt, key = /var/lib/rancher/k3s/server/tls/etcd/peer-server-client.key, client-cert=, client-key=, trusted-ca = /var/lib/rancher/k3s/server/tls/etcd/peer-ca.crt, client-cert-auth = true, crl-file = ","cipher-suites":[]}
Nov 16 20:35:40 master-2 k3s[7089]: {"level":"info","ts":"2024-11-16T20:35:40.598422+0100","caller":"embed/etcd.go:135","msg":"configuring client listeners","listen-client-urls":["https://127.0.0.1:2379","https://192.168.122.12:2379"]}
Nov 16 20:35:40 master-2 k3s[7089]: {"level":"info","ts":"2024-11-16T20:35:40.598484+0100","caller":"embed/etcd.go:308","msg":"starting an etcd server","etcd-version":"3.5.13","git-sha":"Not provided (use ./build instead of go build)","go-version":"go1.22.8","go-os":"linux","go-arch":"amd64","max-cpu-set":4,"max-cpu-available":4,"member-initialized":false,"name":"master-2-4d7ec752","data-dir":"/var/lib/rancher/k3s/server/db/etcd","wal-dir":"","wal-dir-dedicated":"","member-dir":"/var/lib/rancher/k3s/server/db/etcd/member","force-new-cluster":false,"heartbeat-interval":"500ms","election-timeout":"5s","initial-election-tick-advance":true,"snapshot-count":10000,"max-wals":5,"max-snapshots":5,"snapshot-catchup-entries":5000,"initial-advertise-peer-urls":["http://localhost:2380"],"listen-peer-urls":["https://127.0.0.1:2380","https://192.168.122.12:2380"],"advertise-client-urls":["https://192.168.122.12:2379"],"listen-client-urls":["https://127.0.0.1:2379","https://192.168.122.12:2379"],"listen-metrics-urls":["http://127.0.0.1:2381"],"cors":["*"],"host-whitelist":["*"],"initial-cluster":"master-1-dcfdee70=https://192.168.122.11:2380,master-2-4d7ec752=https://192.168.122.12:2380","initial-cluster-state":"existing","initial-cluster-token":"etcd-cluster","quota-backend-bytes":2147483648,"max-request-bytes":1572864,"max-concurrent-streams":4294967295,"pre-vote":true,"initial-corrupt-check":true,"corrupt-check-time-interval":"0s","compact-check-time-enabled":false,"compact-check-time-interval":"1m0s","auto-compaction-mode":"","auto-compaction-retention":"0s","auto-compaction-interval":"0s","discovery-url":"","discovery-proxy":"","downgrade-check-interval":"5s"}
Nov 16 20:35:40 master-2 k3s[7089]: {"level":"info","ts":"2024-11-16T20:35:40.599061+0100","caller":"etcdserver/backend.go:81","msg":"opened backend db","path":"/var/lib/rancher/k3s/server/db/etcd/member/snap/db","took":"408.319µs"}
Nov 16 20:35:40 master-2 k3s[7089]: {"level":"warn","ts":"2024-11-16T20:35:40.642575+0100","caller":"etcdserver/cluster_util.go:82","msg":"failed to get cluster response","address":"https://192.168.122.11:2380/members","error":"Get \"https://192.168.122.11:2380/members\": EOF"}
Nov 16 20:35:40 master-2 k3s[7089]: time="2024-11-16T20:35:40+01:00" level=error msg="Sending HTTP 503 response to 127.0.0.1:53558: runtime core not ready"
Nov 16 20:35:40 master-2 k3s[7089]: {"level":"info","ts":"2024-11-16T20:35:40.64511+0100","caller":"embed/etcd.go:375","msg":"closing etcd server","name":"master-2-4d7ec752","data-dir":"/var/lib/rancher/k3s/server/db/etcd","advertise-peer-urls":["http://localhost:2380"],"advertise-client-urls":["https://192.168.122.12:2379"]}
Nov 16 20:35:40 master-2 k3s[7089]: {"level":"info","ts":"2024-11-16T20:35:40.645188+0100","caller":"embed/etcd.go:377","msg":"closed etcd server","name":"master-2-4d7ec752","data-dir":"/var/lib/rancher/k3s/server/db/etcd","advertise-peer-urls":["http://localhost:2380"],"advertise-client-urls":["https://192.168.122.12:2379"]}
Nov 16 20:35:40 master-2 k3s[7089]: time="2024-11-16T20:35:40+01:00" level=fatal msg="etcd cluster join failed: cannot fetch cluster info from peer urls: could not retrieve cluster information from the given URLs"
Nov 16 20:35:40 master-2 systemd[1]: k3s.service: Main process exited, code=exited, status=1/FAILURE
@brandond
Member

brandond commented Nov 17, 2024

Nov 16 20:35:40 master-2 k3s[7089]: {"level":"warn","ts":"2024-11-16T20:35:40.642575+0100","caller":"etcdserver/cluster_util.go:82","msg":"failed to get cluster response","address":"https://192.168.122.11:2380/members","error":"Get \"https://192.168.122.11:2380/members\": EOF"}

The etcd cluster members cannot connect to each other. Make sure the etcd ports are actually open. I suspect they are not.
https://docs.k3s.io/installation/requirements#inbound-rules-for-k3s-nodes
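As a quick check of those inbound rules, here is a small bash sketch (the ports and the server IP are taken from this thread; the `/dev/tcp` pseudo-device requires bash):

```shell
#!/usr/bin/env bash
# Sketch: probe the inbound ports a joining k3s server needs to reach on the
# first server: 6443 (API/supervisor), 2379 (etcd client), 2380 (etcd peer).
check_ports() {
  local host=$1 port
  for port in 6443 2379 2380; do
    # bash's /dev/tcp redirection attempts a plain TCP connect
    if timeout 2 bash -c ">/dev/tcp/$host/$port" 2>/dev/null; then
      echo "$port open"
    else
      echo "$port closed or filtered"
    fi
  done
}
check_ports 192.168.122.11
```

Note that if 2380 shows open yet the join still fails with EOF, something is accepting the connection and then closing it (for example an intercepting proxy or a stale etcd member), which a verbose `curl -vks https://192.168.122.11:2380/members` would help distinguish.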

@abdulsalamalmahdi
Author

@brandond thank you very much for your reply.

I disabled the firewalls completely, so I don't see why the ports would not be open.

Maybe there is something I am missing.

@brandond
Member

Confirm that you can curl -vks https://192.168.122.11:2380/members from the second node.
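A sketch for interpreting what that curl reports (exit codes per the curl man page; the IP is the one from this thread):

```shell
# Sketch: classify the result of probing the etcd peer endpoint.
# Exit 7 = connection refused (nothing listening, or actively blocked),
# exit 28 = timeout (packets silently dropped); other non-zero codes
# (e.g. 35/56) are TLS or transfer errors like the EOF seen in the logs.
probe() {
  curl -ksf --max-time 5 "https://$1:2380/members" >/dev/null 2>&1
  case $? in
    0)  echo "reachable: members list returned" ;;
    7)  echo "refused: port closed on the peer" ;;
    28) echo "timeout: likely firewall or routing" ;;
    *)  echo "connected but failed: inspect with curl -vks" ;;
  esac
}
probe 192.168.122.11
```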
