Upgrade LXD VM from jammy to noble stops during "do-release-upgrade" when updating the lxd-agent-loader #14033

Closed
toabctl opened this issue Sep 3, 2024 · 13 comments

@toabctl
Member

toabctl commented Sep 3, 2024

ProblemType
Bug

Date
Tue Sep 3 16:14:22 2024

CurrentDesktop
ubuntu:GNOME

ProcEnviron
LANG=en_US.UTF-8
PATH=(custom, no user)
SHELL=/usr/bin/zsh
TERM=xterm-256color
XDG_RUNTIME_DIR=

DistroRelease
Ubuntu 24.10

Uname
Linux 6.8.0-41-generic x86_64

Architecture
amd64

Snap
lxd 5.21.2-22f93f4 (5.21/stable)

SnapChanges
no changes found

SnapConnections
Interface Plug Slot Notes
content lxd:ceph-conf - -
content lxd:ovn-certificates - -
content lxd:ovn-chassis - -
lxd multipass:lxd lxd:lxd -
lxd-support lxd:lxd-support :lxd-support -
network lxd:network :network -
network-bind lxd:network-bind :network-bind -
system-observe lxd:system-observe :system-observe -

SnapInfo.lxd
name: lxd
summary: LXD - container and VM manager
publisher: Canonical**
store-url: https://snapcraft.io/lxd
contact: https://github.com/canonical/lxd/issues
license: AGPL-3.0
description: |
LXD is a system container and virtual machine manager.

It offers a simple CLI and REST API to manage local or remote instances,
uses an image based workflow and support for a variety of advanced
features.

Images are available for all Ubuntu releases and architectures as well
as for a wide number of other Linux distributions. Existing
integrations with many deployment and operation tools, makes it work
just like a public cloud, except everything is under your control.

LXD containers are lightweight, secure by default and a great
alternative to virtual machines when running Linux on Linux.

LXD virtual machines are modern and secure, using UEFI and secure-boot
by default and a great choice when a different kernel or operating
system is needed.

With clustering, up to 50 LXD servers can be easily joined and managed
together with the same tools and APIs and without needing any external
dependencies.

Supported configuration options for the snap (snap set lxd
[<key>=<value>...]):

- ceph.builtin: Use snap-specific Ceph configuration [default=false]
- ceph.external: Use the system's ceph tools (ignores ceph.builtin)
[default=false]
- criu.enable: Enable experimental live-migration support [default=false]
- daemon.debug: Increase logging to debug level [default=false]
- daemon.group: Set group of users that have full control over LXD
[default=lxd]
- daemon.user.group: Set group of users that have restricted LXD access
[default=lxd]
- daemon.preseed: Pass a YAML configuration to `lxd init` on initial
start
- daemon.syslog: Send LXD log events to syslog [default=false]
- daemon.verbose: Increase logging to verbose level [default=false]
- lvm.external: Use the system's LVM tools [default=false]
- lxcfs.pidfd: Start per-container process tracking [default=false]
- lxcfs.loadavg: Start tracking per-container load average
[default=false]
- lxcfs.cfs: Consider CPU shares for CPU usage [default=false]
- lxcfs.debug: Increase logging to debug level [default=false]
- openvswitch.builtin: Run a snap-specific OVS daemon [default=false]
- openvswitch.external: Use the system's OVS tools (ignores
openvswitch.builtin) [default=false]
- ovn.builtin: Use snap-specific OVN configuration [default=false]
- ui.enable: Enable the web interface [default=false]

For system-wide configuration of the CLI, place your configuration in
/var/snap/lxd/common/global-conf/ (config.yml and servercerts)
commands:
- lxd.buginfo
- lxd.check-kernel
- lxd.lxc
- lxd
services:
lxd.activate: oneshot, enabled, inactive
lxd.daemon: simple, enabled, active
lxd.user-daemon: simple, enabled, inactive
snap-id: J60k4JY0HppjwOjW8dZdYc8obXKxujRu
tracking: 5.21/stable
refresh-date: 2024-08-23T19:41:04+02:00
channels:
5.21/stable: 5.21.2-22f93f4 2024-08-22T19:10:03Z (29948) 109MB -
5.21/candidate: 5.21.2-2f4ba6b 2024-09-02T10:05:06Z (30131) 109MB -
5.21/beta: ^
5.21/edge: git-75a87af 2024-09-03T13:53:07Z (30149) 112MB -
latest/stable: 6.1-efad198 2024-08-22T19:10:03Z (29943) 109MB -
latest/candidate: 6.1-78a3d8f 2024-09-02T10:03:03Z (30130) 109MB -
latest/beta: ^
latest/edge: git-8708ed1 2024-09-02T14:11:38Z (30143) 118MB -
6.1/stable: 6.1-78a3d8f 2024-09-03T08:27:37Z (30130) 109MB -
6.1/candidate: ^
6.1/beta: ^
6.1/edge: ^
5.20/stable: 5.20-f3dd836 2024-02-09T14:19:17Z (27049) 155MB -
5.20/candidate: ^
5.20/beta: ^
5.20/edge: ^
5.19/stable: 5.19-8635f82 2024-01-29T14:24:34Z (26200) 159MB -
5.19/candidate: ^
5.19/beta: ^
5.19/edge: ^
5.0/stable: 5.0.3-80aeff7 2024-07-19T05:10:03Z (29351) 91MB -
5.0/candidate: 5.0.3-80aeff7 2024-07-11T10:56:11Z (29351) 91MB -
5.0/beta: ^
5.0/edge: git-f595ea1 2024-08-29T16:13:08Z (30092) 119MB -
4.0/stable: 4.0.10-e664786 2024-07-31T02:10:05Z (29619) 96MB -
4.0/candidate: 4.0.10-e664786 2024-07-24T16:10:15Z (29619) 96MB -
4.0/beta: ^
4.0/edge: git-64f0709 2024-07-24T15:56:32Z (29617) 96MB -
3.0/stable: 3.0.4 2019-10-10T21:18:29Z (11348) 55MB -
3.0/candidate: 3.0.4 2019-10-10T21:18:29Z (11348) 55MB -
3.0/beta: ^
3.0/edge: git-81b81b9 2019-10-10T21:18:29Z (11362) 55MB -
installed: 5.21.2-22f93f4 (29948) 109MB -

SnapInfo.core22
name: core22
summary: Snap runtime environment
publisher: Canonical**
store-url: https://snapcraft.io/core22
license: unset
description: |
Base snaps are a specific type of snap that include libraries and
dependencies common to many applications. They provide a consistent and
reliable execution environment for the snap packages that use them.

The core22 base snap provides a runtime environment based on Ubuntu 22.04
LTS (Jammy Jellyfish).

Other Ubuntu environment base snaps include:

Using a base snap

Base snaps are installed automatically when a snap package requires them.
Only one of each type of base snap is ever installed.

Manually removing a base snap may affect the stability of your system.

Building snaps with core22

Snap developers can use this base in their own snaps by adding the
following to the snap's snapcraft.yaml:

  base: core22

Additional Information

For more details, and guidance on using base snaps, see our documentation:
https://snapcraft.io/docs/base-snaps
type: base
snap-id: amcUKQILKXHHTlmSa7NMdnXSx02dNeeT
tracking: latest/stable
refresh-date: 2024-08-27T20:17:21+02:00
channels:
latest/stable: 20240809 2024-08-29T23:10:03Z (1586) 77MB -
latest/candidate: 20240823 2024-09-03T08:30:12Z (1612) 77MB -
latest/beta: 20240823 2024-08-26T16:30:25Z (1612) 77MB -
latest/edge: 20240830 2024-09-01T23:22:46Z (1615) 77MB -
fips-updates/stable: 20231019 2023-11-03T07:27:10Z (952) 78MB -
fips-updates/candidate: ^
fips-updates/beta: ^
fips-updates/edge: ^
fips-preview/stable: 20231019 2023-11-03T07:24:05Z (952) 78MB -
fips-preview/candidate: ^
fips-preview/beta: ^
fips-preview/edge: ^
fips/stable: --
fips/candidate: --
fips/beta: --
fips/edge: 20231019 2023-11-03T07:27:35Z (952) 78MB -
installed: 20240809 (1586) 77MB base

SnapGitOwner
canonical

SnapGitName
lxd

CrashDB
snap-github

NonfreeKernelModules
zfs

InstallationMedia
Ubuntu 24.04 LTS "Noble Numbat" - Release amd64 (20240424)

Tags
oracular wayland-session

InstallationDate
Installed on 2024-07-18 (47 days ago)

UpgradeStatus
Upgraded to oracular on 2024-08-26 (8 days ago)

ProcVersionSignature
Ubuntu 6.8.0-41.41-generic 6.8.12

ProcCpuinfoMinimal
processor : 23
vendor_id : AuthenticAMD
cpu family : 25
model : 97
model name : AMD Ryzen 9 7900 12-Core Processor
stepping : 2
microcode : 0xa601206
cpu MHz : 2840.654
cache size : 1024 KB
physical id : 0
siblings : 24
core id : 13
cpu cores : 12
apicid : 27
initial apicid : 27
fpu : yes
fpu_exception : yes
cpuid level : 16
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good amd_lbr_v2 nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba perfmon_v2 ibrs ibpb stibp ibrs_enhanced vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local user_shstk avx512_bf16 clzero irperf xsaveerptr rdpru wbnoinvd cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif x2avic v_spec_ctrl vnmi avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid overflow_recov succor smca fsrm flush_l1d
bugs : sysret_ss_attrs spectre_v1 spectre_v2 spec_store_bypass srso
bogomips : 7386.17
TLB size : 3584 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management: ts ttp tm hwpstate cpb eff_freq_ro [13] [14]

ApportVersion
2.30.0-0ubuntu1

CasperMD5CheckResult
pass

@toabctl toabctl changed the title Issue submitted via apport Upgrade LXD VM from jammy to noble stops in between Sep 3, 2024
@toabctl
Member Author

toabctl commented Sep 3, 2024

steps to reproduce:

lxc launch ubuntu:jammy jammy-to-noble --vm
lxc shell jammy-to-noble
# now inside the VM
do-release-upgrade

[snipped]
Setting up libkeyutils1:amd64 (1.6.3-3build1) ...
Setting up lxd-agent-loader (0.7) ...
Error: websocket: close 1006 (abnormal closure): unexpected EOF

Now I'm out of the VM.

@toabctl toabctl changed the title Upgrade LXD VM from jammy to noble stops in between Upgrade LXD VM from jammy to noble stops during "do-release-upgrade" when updating the ldg-agent-loader Sep 3, 2024
@toabctl toabctl changed the title Upgrade LXD VM from jammy to noble stops during "do-release-upgrade" when updating the ldg-agent-loader Upgrade LXD VM from jammy to noble stops during "do-release-upgrade" when updating the lxd-agent-loader Sep 3, 2024
@tomponline
Member

@simondeziel where are we at with updating the lxd-installer to not restart on upgrade?

@tomponline
Member

@simondeziel also do you have an existing issue for this?

@tomponline tomponline added the Bug Confirmed to be a bug label Sep 3, 2024
@tomponline tomponline added this to the lxd-6.2 milestone Sep 3, 2024
@simondeziel
Member

@toabctl I'm running your reproducer (thanks!) but it's not finished yet. IIRC, d-r-u spawns a screen or tmux session. If that's the case, can you re-attach to it and have it continue where it disconnected you?

@simondeziel
Member

After the lxd-agent gets restarted causing a disconnect, waiting a little makes lxc shell work again. At that point, the upgrade process can be picked up with screen -r ubuntu-release-upgrade-screen-window.
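
The waiting step in this workaround could be scripted from the host roughly as follows. This is only a sketch under stated assumptions, not something shipped with LXD: the helper name `retry_until_ok`, the attempt count and the delay are made up for illustration, and the instance name `jammy-to-noble` is taken from the reproducer.

```shell
#!/bin/sh
# Hypothetical helper (not part of LXD): re-run a command until it succeeds,
# giving up after a maximum number of attempts.
# Usage: retry_until_ok <max_attempts> <delay_seconds> <command> [args...]
retry_until_ok() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # Run the command exactly as given; stop at the first success.
        if "$@"; then
            return 0
        fi
        # Only sleep when another attempt will follow.
        if [ "$attempt" -lt "$max_attempts" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}

# Example (assumed values): poll until the restarted lxd-agent answers again.
#   retry_until_ok 30 2 lxc exec jammy-to-noble -- true
```

Once the helper returns successfully, `lxc shell jammy-to-noble` should work again and the upgrade can be resumed inside the VM with `screen -r ubuntu-release-upgrade-screen-window`, as described above.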

@toabctl
Member Author

toabctl commented Sep 4, 2024

After the lxd-agent gets restarted causing a disconnect, waiting a little makes lxc shell work again. At that point, the upgrade process can be picked up with screen -r ubuntu-release-upgrade-screen-window.

Thanks for the answer. I reconnected and ran dpkg --configure -a afterwards, which kicked me out of the session again. Then I reconnected once more and it worked.
It's good to have that screen session from do-release-upgrade, but the user experience is really bad. I know how to work around it, but a lot of people very likely won't, so IMO this needs fixing.

@tomponline
Member

@toabctl yes, it does need to be fixed so that lxd-agent doesn't restart when its package is upgraded.

@simondeziel is there a bug you're tracking for this? I can't see it at https://launchpad.net/ubuntu/+source/lxd-agent-loader

@simondeziel
Member

@toabctl indeed, reconnecting is a workaround at best. I'll open a bug and work on a fix ASAP, but due to SRU delays this will unfortunately take some time to land in Noble.

@simondeziel
Member

Here's the LP bug: https://bugs.launchpad.net/ubuntu/+source/lxd-agent-loader/+bug/2078936

@tomponline
Member

@simondeziel can we close this now?

@simondeziel
Member

@tomponline lxd-agent-loader version 0.7ubuntu0.1 has still not landed officially in Noble so I think that we should keep this one open for visibility.

@tomponline
Member

OK please close when it lands in Noble. Thanks

@simondeziel
Member

The fixed lxd-agent-loader version (0.7ubuntu0.1) landed in Noble:

$ rmadison lxd-agent-loader
 lxd-agent-loader | 0.4          | focal          | source, all
 lxd-agent-loader | 0.5          | jammy          | source, all
 lxd-agent-loader | 0.7          | noble          | source, all
 lxd-agent-loader | 0.7ubuntu0.1 | noble-proposed | source, all
 lxd-agent-loader | 0.7ubuntu0.1 | noble-updates  | source, all
 lxd-agent-loader | 0.8          | oracular       | source, all

I confirmed that do-release-upgrade'ing a Jammy VM to Noble results in no lxd-agent disconnection.
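
To check whether a given lxd-agent-loader version is at least the fixed 0.7ubuntu0.1, something like the following could be used on a Debian-based system. It is a sketch only: the function name `has_agent_loader_fix` is hypothetical, and it assumes that every version sorting after 0.7ubuntu0.1 (such as 0.8 in oracular) also carries the fix, which this thread does not confirm.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the version passed as $1 is greater than or
# equal to 0.7ubuntu0.1, using dpkg's own version ordering.
# Assumption: later versions also contain the fix.
has_agent_loader_fix() {
    dpkg --compare-versions "$1" ge 0.7ubuntu0.1
}

# Example: test the version currently installed inside the VM.
#   has_agent_loader_fix "$(dpkg-query -W -f='${Version}' lxd-agent-loader)"
```

With the versions listed by rmadison above, the check fails for 0.5 and 0.7 and succeeds for 0.7ubuntu0.1 and 0.8.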
