Custom Ubuntu cloudimg deployment on bare metal fails due to missing kernel firmware #195

Open
wilkmar opened this issue Jan 19, 2024 · 2 comments
Labels
triaged Triaged to be addressed in a given cycle


wilkmar commented Jan 19, 2024

To deploy a custom Ubuntu image (ubuntu-cloudimg.pkr.hcl) with MAAS, one must build the image with a custom kernel. Otherwise the deployment fails at a very early stage, as MAAS is not able to choose a kernel for a custom Ubuntu image. However, when an image is built with the custom kernel specified, the deployment continues but the bare metal machine won't boot because the linux-firmware needed by the network drivers is missing. The following are snippets from two different HP machine MAAS deployments:
1:
Reading state information...
linux-image-5.4.0-167-generic is already the newest version (5.4.0-167.184).
0 upgraded, 0 newly installed, 0 to remove and 23 not upgraded.
Setting up swapspace version 1, size = 8 GiB (8589930496 bytes)
no label, UUID=1ab8fe33-2939-43e6-928e-ce616cfe6d93
Removing 'local diversion of /usr/sbin/update-initramfs to /usr/sbin/update-initramfs.curtin-disabled'
update-initramfs: Generating /boot/initrd.img-5.4.0-167-generic
W: Possible missing firmware /lib/firmware/bnx2x/bnx2x-e2-7.13.11.0.fw for module bnx2x
W: Possible missing firmware /lib/firmware/bnx2x/bnx2x-e1h-7.13.11.0.fw for module bnx2x
W: Possible missing firmware /lib/firmware/bnx2x/bnx2x-e1-7.13.11.0.fw for module bnx2x
finish: cmd-install/stage-curthooks/builtin/cmd-curthooks: SUCCESS: curtin command curthooks

2:
Reading state information...
linux-image-5.4.0-153-generic is already the newest version (5.4.0-153.170).
The following packages were automatically installed and are no longer required:
grub-pc-bin linux-headers-5.4.0-152 linux-headers-5.4.0-152-generic
linux-modules-5.4.0-152-generic
Use 'apt autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 9 not upgraded.
chattr: Operation not supported while setting flags on /tmp/tmpeeu7gg9o/target//swap.img
Setting up swapspace version 1, size = 83.8 GiB (89999273984 bytes)
no label, UUID=fce29163-5728-45e2-bbb9-c189d91a028e
Removing 'local diversion of /usr/sbin/update-initramfs to /usr/sbin/update-initramfs.curtin-disabled'
update-initramfs: Generating /boot/initrd.img-5.4.0-153-generic
W: Possible missing firmware /lib/firmware/tigon/tg3_tso5.bin for module tg3
W: Possible missing firmware /lib/firmware/tigon/tg3_tso.bin for module tg3
W: Possible missing firmware /lib/firmware/tigon/tg3.bin for module tg3
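
The warnings in both snippets share one format, so they can be pulled out of a saved installation log mechanically. A minimal sketch, assuming the log text is available on stdin; the function name `missing_firmware` is made up for illustration and is not part of MAAS or curtin:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<module> <firmware path>" for every
# "Possible missing firmware" warning found in the log text read from stdin.
# Lines that are not such warnings are ignored.
missing_firmware() {
  sed -n 's|^W: Possible missing firmware \([^ ]*\) for module \([^ ]*\)$|\2 \1|p'
}
```

Fed the second snippet, this would print three lines starting with `tg3`, one per firmware file, which makes it easy to see which driver is affected on a given machine.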

When MAAS deploys official Ubuntu images, it installs linux-generic, which depends on linux-image-generic, which in turn depends on linux-firmware. It then generates initrd.img, which contains the kernel firmware. This doesn't happen for custom images with custom kernels: the linux-firmware package never gets installed.

Machine 1 is an HP ProLiant BL460c G6 with a Broadcom Inc. and subsidiaries NetXtreme II BCM57711E 10-Gigabit PCIe adapter, which requires the bnx2x driver.

I did a few more tests and found out that just installing linux-firmware as an extra package is not enough; see below.

Steps to recreate
I have tested the latest packer-maas:
commit ef85e2d1b1efa06096e3e9efee00a40aec0a8967 (HEAD -> main, origin/main, origin/HEAD)
MAAS version: 2.7.2 (8283-g.72f2ee59d-0ubuntu1~18.04.1)

Image build command: sudo packer build -var ubuntu_series=focal --var kernel=linux-image-5.4.0-169-generic -only='cloudimg.*' .
Image upload command: maas maas-lab boot-resources create name='custom/focal4' title='UbuntuFocal4' architecture='amd64/generic' filetype='tgz' content@=focal4.tar.gz

I managed to successfully deploy this bare metal machine when using an image built like this:
Image build command: sudo packer build -var ubuntu_series=focal --var kernel=linux-image-5.4.0-169-generic -var customize_script=custom-script.sh -only='cloudimg.*' .
and:
cat custom-script.sh
#!/usr/bin/env bash
apt-get install -y linux-generic
Image upload command: maas maas-lab boot-resources create name='custom/focal5' title='UbuntuFocal5' architecture='amd64/generic' filetype='tgz' content@=focal5.tar.gz
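
Before uploading, it may be worth checking that the tarball actually ships the firmware a machine needs. A rough sketch, assuming the image tarball lists the root filesystem with paths that contain `lib/firmware/<driver directory>/` (the exact path prefix inside the tarball is an assumption, and `has_firmware` is a hypothetical name):

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the tar listing read from stdin contains
# at least one entry under lib/firmware/<dir>/, where <dir> is the first argument.
has_firmware() {
  grep -qF "lib/firmware/$1/"
}

# Example invocation (not run here):
#   tar -tzf focal5.tar.gz | has_firmware bnx2x && echo "bnx2x firmware present"
```

This only inspects file names in the archive; it does not prove the firmware ends up in the initrd generated at deploy time.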

In my opinion, if the ubuntu-cloudimg.pkr.hcl build is run WITHOUT the 'kernel' variable, it should NOT uninstall all 'linux-image-*' packages [1]; it should leave in place what currently comes with the base image [2].
If the build is run with a custom kernel specified, it should install additional packages that are needed to support bare metal hardware: linux-firmware, intel-microcode, amd64-microcode, linux-image--generic, linux-modules-extra--generic, linux-headers--generic. These get installed by default with the official Ubuntu images.
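
As a rough illustration of that proposal, a customization script could derive the companion packages from the kernel package name. This is only a sketch: `extra_packages` is a hypothetical helper, and it assumes the version-specific packages follow the kernel passed to the build (for example `linux-image-5.4.0-169-generic`), which the issue does not spell out:

```shell
#!/usr/bin/env bash
# Hypothetical helper: given a versioned kernel image package name,
# print the extra packages suggested above, one per line.
extra_packages() {
  case "$1" in
    linux-image-*) ;;          # expected form, e.g. linux-image-5.4.0-169-generic
    *) return 1 ;;             # anything else is rejected
  esac
  suffix="${1#linux-image-}"   # e.g. 5.4.0-169-generic
  printf '%s\n' linux-firmware intel-microcode amd64-microcode \
    "linux-modules-extra-${suffix}" "linux-headers-${suffix}"
}

# Example invocation inside a customize script (not run here):
#   extra_packages "linux-image-5.4.0-169-generic" | xargs apt-get install -y
```

The kernel image package itself is left out of the list because the build already installs it when the `kernel` variable is set.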

[1] https://github.com/canonical/packer-maas/blob/main/ubuntu/scripts/cloudimg/setup-boot.sh#L40
[2] https://github.com/canonical/packer-maas/blob/main/ubuntu/ubuntu-cloudimg.pkr.hcl#L39


github-actions bot commented Mar 1, 2024

This issue is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale label Mar 1, 2024
billwear (Contributor) commented

including linux-generic in the custom image via a customization script (custom-script.sh) that installs linux-generic (and consequently, linux-firmware) is practical. it equips the custom image with a bunch of drivers, covering most hardware configurations, including the network drivers needed for your HP machines.

what we probably need to do on our end:

  • include linux-firmware and maybe more microcode packages (intel-microcode, amd64-microcode) explicitly in the image; this can be done through a customization script as you've figured out, but we could make this part of the deal.
  • maybe also scan the build process in ubuntu-cloudimg.pkr.hcl (and scripts) to make sure they don't clobber essential packages, or on the other side, just tweak the process to include packages based on the custom kernel being built.
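
One way to guard against a build clobbering essential packages is a post-build check that compares the installed package list with a required set. A minimal sketch, assuming the installed package names arrive one per line on stdin; `missing_packages` is a hypothetical name, not an existing packer-maas script:

```shell
#!/usr/bin/env bash
# Hypothetical helper: read installed package names from stdin (one per line)
# and print each required package (given as arguments) that is not among them.
missing_packages() {
  installed=$(cat)
  for pkg in "$@"; do
    # -x matches the whole line, -F treats the name as a literal string
    printf '%s\n' "$installed" | grep -qxF -- "$pkg" || printf '%s\n' "$pkg"
  done
}

# Example invocation inside the image (not run here):
#   dpkg-query -W -f='${Package}\n' | missing_packages linux-firmware intel-microcode amd64-microcode
```

Note that `dpkg-query -W` can also list packages that were removed but not purged, so a stricter check may want to filter on package status as well.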

triaging this, since it looks like we have work to do.

@billwear billwear added triaged Triaged to be addressed in a given cycle and removed stale labels Mar 13, 2024