Failed to deploy K3s #36

robert-sanfeliu · 2024-11-15T15:30:38Z

On DEV environment (K3s)

I was deploying a cluster with 1 master and 2 workers. The master and one of the workers came up fine: I could log in to the master and see both it and that worker as part of the cluster. However, the other worker was missing.

The VM for the faulty worker was created and its jobs were executed, but I found this in the output:

[INFO]  systemd: Starting k3s-agent
Job for k3s-agent.service failed because the control process exited with error code.
See "systemctl status k3s-agent.service" and "journalctl -xeu k3s-agent.service" for details.
INFO [Fri Nov 15 12:58:28 UTC 2024]: Configuration done successfully in 3 seconds

Then, looking at k3s-agent.service:

INFO [Fri Nov 15 12:58:28 UTC 2024]: Configuration done successfully in 3 seconds
ubuntu@n13342-2-dummy-app-worker-1-1-13342-2:~$ systemctl status k3s-agent.service
● k3s-agent.service - Lightweight Kubernetes
     Loaded: loaded (/etc/systemd/system/k3s-agent.service; enabled; vendor preset: enabled)
     Active: activating (auto-restart) (Result: exit-code) since Fri 2024-11-15 14:21:42 UTC; 2s ago
       Docs: https://k3s.io
    Process: 135541 ExecStartPre=/bin/sh -xc ! /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service 2>/dev/null (code=exited, status=0/SUCCESS)
    Process: 135543 ExecStartPre=/sbin/modprobe br_netfilter (code=exited, status=0/SUCCESS)
    Process: 135544 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
    Process: 135545 ExecStart=/usr/local/bin/k3s agent --node-ip (code=exited, status=1/FAILURE)
   Main PID: 135545 (code=exited, status=1/FAILURE)
        CPU: 16ms

Nov 15 14:21:42 n13342-2-dummy-app-worker-1-1-13342-2 systemd[1]: k3s-agent.service: Main process exited, code=exited, status=1/FAILURE
Nov 15 14:21:42 n13342-2-dummy-app-worker-1-1-13342-2 systemd[1]: k3s-agent.service: Failed with result 'exit-code'.
Nov 15 14:21:42 n13342-2-dummy-app-worker-1-1-13342-2 systemd[1]: Failed to start Lightweight Kubernetes.
ubuntu@n13342-2-dummy-app-worker-1-1-13342-2:~$ journalctl -xeu k3s-agent.service
Nov 15 14:33:07 n13342-2-dummy-app-worker-1-1-13342-2 sh[147410]: + /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service
Nov 15 14:33:07 n13342-2-dummy-app-worker-1-1-13342-2 k3s[147414]: Incorrect Usage: flag needs an argument: -node-ip

The error is:

Incorrect Usage: flag needs an argument: -node-ip

That error comes from the worker install script. More precisely:

echo "Installing K3s Agent"
K3S_DEP_PATH=$HOME/k3s

sudo -H -u ubuntu bash -c "wget -P ${K3S_DEP_PATH} https://raw.githubusercontent.com/eu-nebulous/sal-scripts/dev/k3s/install-kube-k3s-agent-u22-wg.sh && chmod +x $K3S_DEP_PATH/install-kube-k3s-agent-u22-wg.sh && $K3S_DEP_PATH/install-kube-k3s-agent-u22-wg.sh $APPLICATION_ID"

Inside https://raw.githubusercontent.com/eu-nebulous/sal-scripts/dev/k3s/install-kube-k3s-agent-u22-wg.sh

This command is executed:

WIREGUARD_VPN_IP=`ip a | grep wg | grep inet | awk '{print $2}' | cut -d'/' -f1`

curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=${K3S_VERSION} K3S_URL="https://${WIREGUARD_SERVER}:6443" K3S_TOKEN=${NODE_TOKEN} INSTALL_K3S_EXEC="--node-ip $WIREGUARD_VPN_IP" sh -

The likely cause is that WIREGUARD_VPN_IP was empty when this command ran, which leaves --node-ip without an argument.
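A possible guard, sketched under the assumption that the WireGuard address was not yet visible when the agent script ran (nothing in the logs confirms this). The function names, the IP_CMD override, and the retry defaults are hypothetical and are not part of sal-scripts:

```shell
# Sketch only. Function names, IP_CMD, and retry defaults are hypothetical;
# none of them exist in sal-scripts.

# Read `ip a`-style output on stdin and print the first IPv4 address found
# on a line that mentions a wg interface (same pipeline as the script,
# limited to IPv4 and to a single result).
extract_wg_ip() {
  grep wg | grep 'inet ' | awk '{print $2}' | cut -d'/' -f1 | head -n 1
}

# Poll until a WireGuard IP is visible, then print it.
# $1 = max attempts (default 30), $2 = seconds between attempts (default 2).
# Returns 1 if no address appeared in time.
wait_for_wg_ip() {
  _attempts="${1:-30}"
  _delay="${2:-2}"
  _i=0
  while [ "$_i" -lt "$_attempts" ]; do
    # IP_CMD lets a test substitute canned output; it defaults to `ip a`.
    _ip="$(${IP_CMD:-ip a} 2>/dev/null | extract_wg_ip)"
    if [ -n "$_ip" ]; then
      printf '%s\n' "$_ip"
      return 0
    fi
    _i=$((_i + 1))
    sleep "$_delay"
  done
  return 1
}

# Build the agent argument, refusing to emit a bare --node-ip.
node_ip_arg() {
  if [ -z "$1" ]; then
    echo "WIREGUARD_VPN_IP is empty, refusing to pass --node-ip" >&2
    return 1
  fi
  printf '%s %s\n' "--node-ip" "$1"
}

# Possible use in the agent script (not executed here):
#   WIREGUARD_VPN_IP="$(wait_for_wg_ip)" || exit 1
#   INSTALL_K3S_EXEC="$(node_ip_arg "$WIREGUARD_VPN_IP")" || exit 1
```

With a check like this, the install step would stop with a clear message instead of starting k3s-agent with an incomplete --node-ip flag.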

However, the logs from earlier steps of the installation on the same node show that the IP had already been assigned:

[[email protected];n13342-2-dummy-app-worker-1-1-13342-2_Task_install_0;12:51:40] INFO [Fri Nov 15 12:51:40 UTC 2024]: onm-bootstrap(2482): onm-bootstrap.sh: Configuration done successfully in 42 seconds
[[email protected];n13342-2-dummy-app-worker-1-1-13342-2_Task_install_0;12:52:40] WIREGUARD_VPN_IP= 192.168.55.3
[[email protected];n13342-2-dummy-app-worker-1-1-13342-2_Task_install_0;12:52:40] Executing k3s-preinstall script\n

Running the same command manually:

K3S_DEP_PATH=$HOME/k3s

sudo -H -u ubuntu bash -c "wget -P ${K3S_DEP_PATH} https://raw.githubusercontent.com/eu-nebulous/sal-scripts/dev/k3s/install-kube-k3s-agent-u22-wg.sh && chmod +x $K3S_DEP_PATH/install-kube-k3s-agent-u22-wg.sh && $K3S_DEP_PATH/install-kube-k3s-agent-u22-wg.sh $APPLICATION_ID"

worked fine, and the node joined the cluster.


robert-sanfeliu added the bug label Nov 15, 2024

robert-sanfeliu assigned robert-sanfeliu and gkitsos01 and unassigned robert-sanfeliu Nov 15, 2024
