templates: ensure RHCOS node hostnames are mostly valid
This aims to fix the edge cases around invalid hostnames, which include:
 - localhost as the hostname
 - a static hostname longer than 63 characters
 - slow reverse DNS hostname discovery
 - a hostname truncated during initramfs discovery that is still 64 characters long

Previously, there was a node-valid-hostname.service, but it only ensured
that the hostname was not localhost. When running in the initramfs,
NetworkManager truncates the hostname at 64 characters, so the
`node-valid-hostname.service` check succeeds (the name is not localhost)
even though the name is still invalid for the cluster. This change
consolidates the logic used by the dispatcher script _and_ the systemd
unit into a library script that is used by both.
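
To make the gap in the old check concrete, here is a minimal sketch of the two rules described above (not localhost, and no longer than 63 characters). The function name `check_node_hostname` is hypothetical and is not part of this commit; it only classifies a name and never changes the system hostname.

```shell
#!/bin/bash
# Hypothetical helper, not part of this commit: reports why a name would
# be rejected, using the two rules described in the commit message.
check_node_hostname() {
    local name="${1}"
    case "${name}" in
        localhost|localhost.localdomain)
            echo "invalid: localhost name"
            return 1 ;;
    esac
    if [ "${#name}" -gt 63 ]; then
        echo "invalid: ${#name} characters (limit is 63)"
        return 1
    fi
    echo "valid"
}

# A 64-character name, like one truncated in the initramfs, is not
# localhost but still fails the length rule.
truncated="$(printf '%064d' 0 | tr '0' 'a')"
check_node_hostname "${truncated}" || true
# prints: invalid: 64 characters (limit is 63)
```

A check that only tests for localhost would accept the 64-character name above, which is the case the old unit missed.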

This fix is not ideal, but it seems like the best path that can be
backported to 4.3 and 4.5.

Fixes BZs 1844613, 1845885, 1853400, 1853584 (and probably more)

Signed-off-by: Ben Howard <[email protected]>
Ben Howard committed Jul 9, 2020
1 parent ab4d38c commit 9ddd20d
Showing 3 changed files with 104 additions and 28 deletions.
@@ -4,36 +4,29 @@ contents:
inline: |
#!/bin/bash
#
# On Google Compute Platform (GCP) the hostname may be too long (>63 chars).
# During firstboot the hostname is set in the initramfs before NetworkManager
# runs; on reboot affect nodes use 'localhost'. This hook is a simple work
# around: if the host name is longer than 63 characters, then the hostname
# is truncated at the _first_ dot.
#
# Additionally, this hook does not break DNS or cluster DNS resolution,
# since NetworkManager sets the appropriate /etc/resolv.conf settings.
# 90-long-hostname is a wrapper around /usr/local/sbin/set-valid-hostname.sh,
# which ensures that a node has a valid hostname.
IF=$1
STATUS=$2
log() { logger --tag "network-manager/$(basename $0)" "${@}"; }
# capture all eligible hostnames
if [[ ! "$(/bin/hostname)" =~ (localhost|localhost.local) ]]; then
log "hostname is already set"
if [[ ! "$STATUS" =~ (up|hostname|dhcp4-change|dhcp6-change) ]]; then
exit 0
fi
if [[ ! "$STATUS" =~ (up|hostname|dhcp4-change|dhcp6-change) ]]; then
if [[ ! "$(< /proc/sys/kernel/hostname)" =~ (localhost|localhost.localdomain) ]]; then
log "hostname is already set"
exit 0
fi
default_host="${DHCP4_HOST_NAME:-$DHCP6_HOST_NAME}"
# truncate the hostname to the first dot and than 64 characters.
host=$(printf ${default_host} | cut -f1 -d'.' | cut -c -63)
# source the script since NetworkManager execution rules do
# allow sourcing from /usr/local. RHCOS has an read-only rootfs
# which limits where this can be stashed.
source /usr/local/sbin/set-valid-hostname.sh
host_name="${DHCP4_HOST_NAME:-$DHCP6_HOST_NAME}"
if [ "${#default_host}" -gt 63 ]; then
log "discovered hostname is longer than than 63 characters"
log "truncating ${default_host} => ${host}"
/bin/hostnamectl --transient set-hostname "${host}"
if [ -n "${host_name}" ]; then
set_valid_hostname "${host_name}"
fi
@@ -0,0 +1,83 @@
mode: 0755
path: "/usr/local/sbin/set-valid-hostname.sh"
contents:
inline: |
#!/bin/bash
# On some platforms the hostname may be too long (>63 chars).
# - On firstboot the hostname is set in the initramfs before NetworkManager
# runs, and it may be truncated at 64 characters (still too long).
# - On reboot, affected nodes use 'localhost'.
#
# This script is a simple workaround for hostname woes, including
# - NOT a localhost name
# - NOT longer than 63 characters. Names will be truncated at the
# first dot, and then capped at 63 chars (whichever is less).
# - Race conditions between truncated hostnames by the dhclient
# and NetworkManager.
#
# Finally, this script is invoked via:
# - /etc/NetworkManager/dispatcher.d/90-long-hostnames
# - on boot via node-valid-hostname.service
export PATH="/usr/bin:/usr/local/bin:/sbin:/usr/local/sbin:/bin:${PATH}"
log() { logger --tag "$(basename $0)" "${@}"; }
# wait_localhost waits until the host gets a real hostname.
# This will wait indefinitely. node-valid-hostname.service will terminate
# this after 5m.
wait_localhost() {
log "waiting for non-localhost hostname to be assigned"
while [[ "$(< /proc/sys/kernel/hostname)" =~ (localhost|localhost.localdomain) ]];
do
sleep 1
done
log "node identified as $(</proc/sys/kernel/hostname)"
exit 0
}
set_valid_hostname() {
local host_name=${1}
local type_arg="transient"
# /etc/hostname is used for static hostnames and is authoritative.
# This will check to make sure that the static hostname is
# less than or equal to 63 characters in length.
if [ -f /etc/hostname ] && [ "$(cat /etc/hostname | wc -m)" -gt 0 ]; then
etc_name="$(< /etc/hostname)"
type_arg="static"
if [ "${etc_name}" != "${host_name}" ]; then
log "/etc/hostname is set to ${etc_name} but does not match ${host_name}"
log "using /etc/hostname as the authorative name"
host_name="${etc_name}"
fi
fi
# Only mutate the hostname if it is longer than 63 characters. The
# hostname becomes the first label of the FQDN (the text before the
# first dot), capped at 63 characters.
if [ "${#host_name}" -gt 63 ]; then
alt_name=$(printf "${host_name}" | cut -f1 -d'.' | cut -c -63)
log "${host_name} is longer than 63 characters, using trunacated hostname"
host_name="${alt_name}"
fi
log "setting ${type_arg} hostname to ${host_name}"
/bin/hostnamectl "--${type_arg}" set-hostname "${host_name}"
exit 0
}
cli_run() {
mode="${1:?mode must be the first argument}"; shift;
case "${mode}" in
wait_localhost) wait_localhost;;
set_valid_hostname) hname="${1:?hostname is a required last argument}";
set_valid_hostname "${hname}";;
*) log "unknown mode ${mode}"; exit 1;;
esac
}
# Allow the functions to be sourced. This can be run either as a
# standalone command or in systemd or part of NetworkManager.
if [[ "${BASH_SOURCE[0]}" == "${0}" ]]; then
cli_run ${@}
fi
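
The truncation rule in set_valid_hostname can be tried in isolation. The sketch below is an illustration only: `truncated_hostname` is a hypothetical name, not part of this commit, and it prints the chosen name instead of calling hostnamectl. It reuses the same `cut` pipeline as the script, but passes the name to printf through a `%s` format so the name itself is never interpreted as a format string.

```shell
#!/bin/bash
# Hypothetical helper, not part of this commit: prints the name the
# truncation rule would choose, without calling hostnamectl.
truncated_hostname() {
    local name="${1}"
    if [ "${#name}" -gt 63 ]; then
        # keep the first DNS label, then cap it at 63 characters
        name="$(printf '%s' "${name}" | cut -f1 -d'.' | cut -c -63)"
    fi
    printf '%s\n' "${name}"
}

# Names of 63 characters or fewer pass through unchanged.
truncated_hostname "worker-0.example.com"
# prints: worker-0.example.com
```

Because only names longer than 63 characters are touched, a short FQDN keeps its domain part, while a long one is reduced to its first label.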
16 changes: 8 additions & 8 deletions templates/common/_base/units/node-valid-hostname.service
@@ -2,23 +2,23 @@ name: node-valid-hostname.service
enabled: true
contents: |
[Unit]
Description=Ensure hostname is not localhost
# Only run when the host has a localhost name.
ConditionHost=|localhost
ConditionHost=|localhost.localdomain
Description=Ensure the node hostname is valid for the cluster
Before=network-online.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStartPre=/bin/echo "Node has localhost hostname. Waiting for new hostname."
# Get the short hostname. This is more reliable than a regex.
ExecStartPre=/bin/bash -c 'while [ `hostname -s` == "localhost" ]; do sleep 1; done;'
ExecStart=/bin/sh -c "echo Node changed hostname to `hostname`"
User=root

# SystemD prevents direct execution of the script in /usr/local/sbin,
# so it is sourced. See the script for functionality.
ExecStart=/bin/bash -c "source /usr/local/sbin/set-valid-hostname.sh; wait_localhost; set_valid_hostname `hostname`"

# Wait up to 5min for the node to get a real hostname.
TimeoutSec=300

[Install]
WantedBy=multi-user.target
# Ensure that network-online.target will not complete until the node has a real hostname.
RequiredBy=network-online.target
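
The unit relies on TimeoutSec=300 to stop a wait that would otherwise never end. For experimenting outside systemd, a bounded version of that wait might look like the sketch below. Everything in it is an assumption for illustration: `wait_for_real_hostname` is a hypothetical name, and both the file argument and the timeout argument are additions that do not exist in the commit.

```shell
#!/bin/bash
# Hypothetical helper, not part of this commit. Polls a file holding the
# hostname until it stops being a localhost name or the deadline passes.
#   $1: file to read (default: /proc/sys/kernel/hostname)
#   $2: maximum number of seconds to wait (default: 300, like TimeoutSec)
wait_for_real_hostname() {
    local source_file="${1:-/proc/sys/kernel/hostname}"
    local remaining="${2:-300}"
    local current
    while :; do
        current="$(cat "${source_file}")"
        case "${current}" in
            localhost|localhost.localdomain) ;;   # keep waiting
            *) echo "node identified as ${current}"; return 0 ;;
        esac
        if [ "${remaining}" -le 0 ]; then
            echo "timed out waiting for a hostname" >&2
            return 1
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
}
```

The sketch uses `return` rather than `exit`, so a shell that sources it keeps running after the wait finishes and can go on to call further functions. Calling `wait_for_real_hostname` with no arguments reads the kernel hostname and waits up to 300 seconds.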
