Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Boot time for ISO image 3975.2.0 release has increased to ~1m30s #1514

Open
AverageMarcus opened this issue Aug 9, 2024 · 25 comments · Fixed by flatcar/bootengine#98
Open
Assignees
Labels
kind/bug Something isn't working

Comments

@AverageMarcus
Copy link

AverageMarcus commented Aug 9, 2024

Description

The latest stable release, 3975.2.0, is taking a considerably longer time to boot compared to previous releases.

E.g.

core@localhost ~ $ systemd-analyze
Startup finished in 1.787s (kernel) + 1min 31.064s (initrd) + 4.103s (userspace) = 1min 36.956s
multi-user.target reached after 4.103s in userspace.

vs. the previous release:

Startup finished in 1.309s (kernel) + 2.089s (initrd) + 4.278s (userspace) = 7.677s
multi-user.target reached after 4.254s in userspace.

This has resulted in a delay at boot of roughly 1min 30sec compared to the previous release.

Impact

When building images with image-builder the default boot wait time is 60s after which Packer will start typing out the boot command. As this happens before the OS has finished booting the flatcar-install script isn't actually run correctly and can result in the OS not being correctly set up in an environment such as VSphere (it doesn't install the vmtools for example). With image-builder this then results in it hanging at Waiting for IP... as vcenter is unable to retrieve the IP from the guest OS as no vmtools have been installed.

Environment and steps to reproduce

  1. Set-up: Flatcar 3975.2.0 on VSphere via image-builder
  2. Task: Packer stuck trying to get the IP address of the VM
  3. Action(s):
    a. With image-builder run: make build-node-ova-vsphere-flatcar
    b. The 3975.2.0 image is used to launch a new VM (https://stable.release.flatcar-linux.net/amd64-usr/3975.2.0/flatcar_production_iso_image.iso)
    c. The VM boots but vcenter fails to retrieve the IP address due to lack of vmtools on the guest OS
  4. Error: Waiting for IP...

Expected behavior

The OS boots in under 60 seconds. Ideally back down below 10s like the previous release.

Additional information

This issue was previously about open-vm-tools not being installed but after debugging with @tormath1 on Slack (see thread) we learnt that it was actually due to the delay in boot that was causing the problem so I updated the issue to reflect that.

@AverageMarcus AverageMarcus added the kind/bug Something isn't working label Aug 9, 2024
@tormath1
Copy link
Contributor

tormath1 commented Aug 9, 2024

I just booted an instance on VSphere and the open-vm-tools and friends are available:

Flatcar Container Linux by Kinvolk stable 3975.2.0 for VMware
core@localhost ~ $ systemd-sysext status
HIERARCHY EXTENSIONS         SINCE
/opt      none               -
/usr      containerd-flatcar Fri 2024-08-09 13:41:10 UTC
          docker-flatcar
          oem-vmware
core@localhost ~ $ vmtoolsd --version
VMware Tools daemon, version 12.4.0.48309 (build-23259341)

The ISO image is a generic image (without OEM tools), so that's normal if there is no sign of OEM VMWare tools.

I need to investigate on how the image-builder is building Flatcar VSphere image, I guess it boots the Flatcar ISO and from the ISO it runs the flatcar-install script (similar to OpenStack).

@AverageMarcus
Copy link
Author

Here's the Packer vars that image-builder is using when building Flatcar for VSphere - https://github.com/kubernetes-sigs/image-builder/blob/main/images/capi/packer/ova/flatcar.json

@AverageMarcus
Copy link
Author

After some debugging on Slack we discovered that the boot time for 3975.2.0 has slowed down quite considerably to roughly 1m30s which is longer than the 60s that image-builder is configured to wait by default. Updating the boot_wait time to 5min allowed me to workaround this problem but a 1m30s boot time isn't ideal so I'm going to update / change this issue to focus on investigating that regression.

@AverageMarcus AverageMarcus changed the title open-vm-tools missing from 3975.2.0 release Boot time for 3975.2.0 release has increased to ~1m30s (on VSphere) Aug 9, 2024
@tormath1 tormath1 changed the title Boot time for 3975.2.0 release has increased to ~1m30s (on VSphere) Boot time for ISO image 3975.2.0 release has increased to ~1m30s Aug 9, 2024
@tormath1
Copy link
Contributor

tormath1 commented Aug 9, 2024

Some logs on local ISO with qemu:

$ wget https://stable.release.flatcar-linux.net/amd64-usr/current/flatcar_production_iso_image.iso
$ wget https://stable.release.flatcar-linux.net/amd64-usr/current/flatcar_production_iso.sh
$ ./flatcar_production_iso.sh -- -display curses
...
Aug 09 14:25:14 localhost systemd[1]: dev-disk-by\x2dpartlabel-ROOT.device: Job dev-disk-by\x2dpartlabel-ROOT.device/start timed out.
Aug 09 14:25:14 localhost systemd[1]: Timed out waiting for device dev-disk-by\x2dpartlabel-ROOT.device - /dev/disk/by-partlabel/ROOT.
Aug 09 14:25:14 localhost systemd[1]: dev-disk-by\x2dpartlabel-ROOT.device: Job dev-disk-by\x2dpartlabel-ROOT.device/start failed with result 'timeout'.

It did not investigated further.

@jepio
Copy link
Member

jepio commented Aug 12, 2024

alpha-3913.0.0 boots ok, this issue first occurred with alpha 3941.0.0.

@jepio
Copy link
Member

jepio commented Aug 12, 2024

This is the guilty file: https://github.com/flatcar/bootengine/blob/flatcar-master/dracut/31decrypt-root/decrypt-root.service
and this also affect ipxe/pxe boot, but only when no ignition file is supplied.

@ader1990
Copy link

Maybe use ConditionKernelCommandLine=!flatcar.oem.id=pxe for pxe? But for ISO, I did not see any kernel cmdline that can be used to do not run the service.

@ader1990
Copy link

ader1990 commented Aug 12, 2024

Maybe use ConditionKernelCommandLine=!flatcar.oem.id=pxe for pxe? But for ISO, I did not see any kernel cmdline that can be used to do not run the service.

which means that if there is not flatcar.oem.id parameter, maybe we can skip it. Still, a better and sounder approach should be used.

Leaving here the cmdline in case of normal boot on qemu:

rootflags=rw mount.usrflags=ro BOOT_IMAGE=/flatcar/vmlinuz-a 
 mount.usr=/dev/mapper/usr verity.usr=PARTUUID=7130c94a-213a-4e5a-8e26-6cce9662f132
 rootflags=rw mount.usrflags=ro
 consoleblank=0 root=LABEL=ROOT console=ttyS0,115200n8 console=tty0 flatcar.oem.id=qemu flatcar.autologin 
 verity.usrhash=2d5fc89636c5bc9515e5e98dfdf370f69a0c963b25db9a883bb16a221845a3e0

For ISO:

rootflags=rw mount.usrflags=ro initrd=/flatcar/cpio.gz flatcar.autologin BOOT_IMAGE=/flatcar/vmlinuz

@AverageMarcus
Copy link
Author

FYI - I encountered another issue with this release and image-builder after increasing the boot timeout. It seems like the boot command configured in image-builder is now failing because the first command is returning a warning that seems to be causing the c of the curl command to be lost.

Flatcar

@chewi
Copy link
Contributor

chewi commented Aug 16, 2024

I have reproduced this on 4054.0.0 with QEMU, so it hasn't been magically fixed since. I'll keep investigating.

@tormath1
Copy link
Contributor

Since the ISO configuration is identical to qemu, it has the same OEM ID - we can't really set condition on this.
As it timeouts on getting the ROOT device (which does not exist on ISO):

Aug 09 14:25:14 localhost systemd[1]: Timed out waiting for device dev-disk-by\x2dpartlabel-ROOT.device - /dev/disk/by-partlabel/ROOT.
Aug 09 14:25:14 localhost systemd[1]: dev-disk-by\x2dpartlabel-ROOT.device: Job dev-disk-by\x2dpartlabel-ROOT.device/start failed with result 'timeout'.
$ ls /dev/disk/by-label/
CDROM

I would add a condition to the unit (https://github.com/flatcar/bootengine/blob/flatcar-master/dracut/31decrypt-root/decrypt-root.service) to check for CDROM:

ConditionKernelCommandLine=!flatcar.oem.id=pxe
ConditionPathExists=!/dev/disk/by-label/CDROM

I'm starting to think that some Equinix Metal failures (PXE boot) might be explained by this timeout.

@chewi
Copy link
Contributor

chewi commented Aug 16, 2024

I've now set this up in a way that I can very quickly test changes. Dropping that unit entirely does indeed fix the problem.

Our other Dracut modules check for this situation with this. Perhaps it would be best to do something similar here?

usr=$(cmdline_arg mount.usr)
if [[ -z "${usr}" && -f /usr.squashfs ]]; then

@chewi
Copy link
Contributor

chewi commented Aug 16, 2024

This works:

ConditionKernelCommandLine=!mount.usr
ConditionPathExists=!/usr.squashfs

But only if I remove this line:

Wants=dev-disk-by\x2dpartlabel-ROOT.device

I think that device is dynamic, so I don't know whether removing that is a good idea.

@chewi
Copy link
Contributor

chewi commented Aug 16, 2024

Changing the Wants to Requisite is enough to make this work without even adding the other conditions. I just need to check that it still activates when actually needed.

@chewi
Copy link
Contributor

chewi commented Aug 19, 2024

Bleh, it doesn't seem to work. If I drop to a shell before it switches root, I find that it hasn't decrypted the root partition, but doing it manually works. It works automatically if I change it back to Wants.

That does make sense. The by-partlabel symlink takes time to appear, so we want this to wait for that, but only if we're fairly sure it will appear eventually. I'm having a similar issue with the other boot stuff I'm looking at, so I will keep experimenting.

@chewi
Copy link
Contributor

chewi commented Aug 19, 2024

I've been doing some more thinking. I don't think we should assume that a user would not want to decrypt root when booting from the ISO or via PXE. Serving Flatcar via PXE but storing data on local encrypted disks is a key use case, right?

The point of this decrypt-root service is to make the decryption magically work without explicit configuration. Either we have to wait a reasonable amount of time for the root device to show up, or we have to make it work more reactively via a udev rule.

The question is whether this decrypted disk needs to be ready before some other boot time component. Do we do anything special with these "root" disks? What happens if it's not encrypted? If some other part of the boot process depends on this, doesn't that also need to wait for the symlink to appear?

@jepio
Copy link
Member

jepio commented Aug 19, 2024

There may be a usecase that uses an encrypted root disk with PXE, but then there would need to be a root= flag on the kernel command line so that we know where the root (stateful) partition is located. So in that sense there are three cases:

a) no root= argument and not (PXE or ISO): wait for PARTLABEL=ROOT and try to decrypt
b) no root= argument and (PXE or ISO): don't wait for disk
c) root= argument and (PXE or ISO): wait for root= disk and try to decrypt.

b) is what we want to fix here. a) works. c) is not supported (because of no root= handling) but is also not terribly interesting at this time.

The question is whether this decrypted disk needs to be ready before some other boot time component. Do we do anything special with these "root" disks? What happens if it's not encrypted? If some other part of the boot process depends on this, doesn't that also need to wait for the symlink to appear?

Decrypt-root runs from the initrd, and we need to have the decrypted root mounted at /sysroot before we can proceed with switch-root. So yes, for non-PXE, non-ISO boot we need to wait for PARTLABEL=ROOT for sure.

There is the added fun that decrypt-root.service only runs when ignition isn't provided. The first boot (ignition) has different services and ordering becacuse that's the boot that takes care of creating the encrypted root partition.

@chewi
Copy link
Contributor

chewi commented Aug 19, 2024

I wanted to allow for (c), and I had an idea, so I reworked this around a udev rule.

ACTION=="add", SUBSYSTEM=="block", ENV{ID_PART_ENTRY_NAME}=="ROOT", ENV{ID_FS_TYPE}=="crypto_LUKS", TAG+="systemd", ENV{SYSTEMD_WANTS}+="decrypt-root.service"

I also made decrypt-root.service a bit simpler. I haven't tried (a) yet, but (b) and (c) work really well. This also decrypts root even if you don't actually specify root=/dev/mapper/rootencrypted. If you do specify it, then systemd knows to wait. I'll make a PR tomorrow.

@jepio
Copy link
Member

jepio commented Aug 19, 2024 via email

@chewi
Copy link
Contributor

chewi commented Aug 20, 2024

Good news all round after more testing.

The rule works correctly on first boot and subsequently for (a) and (c) because of ENV{ID_FS_TYPE}=="crypto_LUKS". Even if ROOT already exists, the rule won't kick in until it's recreated as LUKS.

I also tried with flatcar.first_boot=1. This initially failed, as it decrypted ROOT, which was then wiped by Ignition. Adding After=ignition-disks.service fixed this, and now it wipes ROOT and decrypts it reliably every time.

In theory, this could be used to automatically decrypt any LUKS partitions but that's probably taking it too far.

chewi added a commit to flatcar/bootengine that referenced this issue Aug 20, 2024
PXE and ISO boots were taking over 1m30s because decrypt-root.service
was waiting for the ROOT partition to appear just in case it needed to
be decrypted.

Making this service conditional on non-PXE/ISO boots was considered, but
users may well want an encrypted ROOT when booting using these methods.
Adding conditions didn't avoid the delay caused by the `Wants` directive
in any case.

This change makes decrypt-root.service more reactive by having it
triggered by a udev rule matching a LUKS-encrypted partition named ROOT.
systemd knows to wait for `/dev/mapper/rootencrypted` to appear before
attempting to switch root.

Adding `After=ignition-disks.service` ensures that the service does not
try to decrypt ROOT before Ignition wipes it again due to
`flatcar.first_boot=1` being given.

The service itself has been simplified. crypttab is just as happy with a
block device path, so there is no need to determine the UUID. The udev
rule only matches LUKS-encrypted partitions, so there is no need to
check the header either. The rest is simple enough to include in the
service unit directly, removing the need for a separate shell script.
systemd-cryptsetup-generator can read the crypttab from stdin, avoiding
the creation of a temporary file. It is also important to call
daemon-reload before attempting to start the generated unit.

Closes: flatcar/Flatcar#1514
Signed-off-by: James Le Cuirot <[email protected]>
chewi added a commit to flatcar/scripts that referenced this issue Aug 20, 2024
chewi added a commit to flatcar/scripts that referenced this issue Aug 21, 2024
chewi added a commit to flatcar/scripts that referenced this issue Aug 21, 2024
chewi added a commit to flatcar/scripts that referenced this issue Aug 22, 2024
chewi added a commit to flatcar/scripts that referenced this issue Aug 22, 2024
chewi added a commit to flatcar/scripts that referenced this issue Aug 22, 2024
@chewi
Copy link
Contributor

chewi commented Aug 22, 2024

Sorry, this was closed a little prematurely. I'll close it when the fix is in the stable release.

@chewi chewi reopened this Aug 22, 2024
chewi added a commit to flatcar/scripts that referenced this issue Aug 29, 2024
@chewi
Copy link
Contributor

chewi commented Aug 29, 2024

I've backported this to the 4012 Beta.

@tormath1
Copy link
Contributor

Hello @AverageMarcus, the fix has been ported to latest Beta release (4054.1.0) - do you think you could confirm that it solves the initial issue with the image-builder (i.e by removing the boot_wait override) ? That could help to promote this fix to Stable. Thanks!

@AverageMarcus
Copy link
Author

Sure, not sure when I'll get the chance though as catching up on stuff right after being away for the past week. I'll report back when I can.

@tormath1
Copy link
Contributor

@AverageMarcus nevermind, I see this change is not available yet on Beta. It will be available in the next Beta. Sorry about that 🙏 but it's available on Alpha if you want to test it.

tormath1 pushed a commit to flatcar/scripts that referenced this issue Sep 16, 2024
tormath1 added a commit to flatcar/scripts that referenced this issue Sep 20, 2024
 udev rule

See: flatcar/bootengine#98
See: flatcar/Flatcar#1514
Signed-off-by: James Le Cuirot <[email protected]>
Signed-off-by: Mathieu Tortuyaux <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Something isn't working
Projects
Status: ⚒️ In Progress
Development

Successfully merging a pull request may close this issue.

5 participants