Describe the bug
When deploying an RT and isolated VM, if the core chosen to isolate the VM is the first of the machine-rt slice, the VM will never boot.
The associated qemu-system-x86 thread will take 100% of one CPU forever.
To Reproduce
1. Deploy a Debian SEAPATH machine (standalone or cluster).
2. Configure the machine-rt and machine-nort allowed CPUs (see my configuration below).
3. Deploy an RT and isolated VM using the first allowed CPU of the machine-rt slice. In my case, I used the first two (see the pinning sketch after these steps).
4. Try to access the machine with virsh console rtVM.
Nothing appears on the console.
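For reference, this is roughly what the resulting pinning looks like in the domain XML in my case (the XML is generated by the SEAPATH playbooks; the VM name rtVM and CPU numbers 4 and 5 come from my inventory below, so treat this as an illustrative sketch rather than the exact generated file):
<vcpu placement='static'>2</vcpu>
<cputune>
  <!-- vCPUs pinned to the first two CPUs of the machine-rt slice -->
  <vcpupin vcpu='0' cpuset='4'/>
  <vcpupin vcpu='1' cpuset='5'/>
  <!-- no emulatorpin here: the emulator thread also ends up on CPU 4 -->
</cputune>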
Allowed CPUs in my Ansible inventory:
isolcpus: "2-7" # CPUs to isolate (isolcpus, irqbalance on Debian 12)
workqueuemask: "0003" # workqueue mask; here it means CPUs 0 and 1 are the only allowed CPUs
cpusystem: "0-1" # CPUs reserved for the system
cpuuser: "0-1" # CPUs reserved for user applications
cpumachines: "2-7" # CPUs reserved for VMs
cpumachinesrt: "4-7" # CPUs reserved for real-time VMs
cpumachinesnort: "2-3" # CPUs reserved for non-real-time VMs
cpuovs: "0-1" # CPUs reserved for OVS
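Assuming the playbooks translate these variables into machine-rt.slice and machine-nort.slice units (the slice names used above), the effective cpusets can be read back on the hypervisor with systemd:
# AllowedCPUs is the cgroup v2 cpuset applied to each slice
systemctl show machine-rt.slice --property=AllowedCPUs
systemctl show machine-nort.slice --property=AllowedCPUs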
Expected behavior
The VM must boot. The qemu-system-x86 process will take 100% of one CPU, but only for a few seconds.
Additional context
On the hypervisor, the qemu-system-x86 thread responsible for managing the VM always runs on the first allowed CPU (here, CPU 4). The VM's vCPU is also pinned to this CPU. I think the two threads interrupt each other and prevent the VM from booting.
Also, here are the first lines of the top command on the hypervisor:
top - 15:17:15 up 17 min, 2 users, load average: 10.32, 7.45, 4.92
Tasks: 542 total, 2 running, 540 sleeping, 0 stopped, 0 zombie
%Cpu(s): 2.6 us, 5.3 sy, 0.0 ni, 89.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 63624.4 total, 59392.4 free, 9856.9 used, 1700.3 buff/cache
MiB Swap: 2048.0 total, 2048.0 free, 0.0 used. 53767.5 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
28143 libvirt+ 20 0 3703876 441004 41664 S 100.0 0.7 1:27.42 qemu-system-x86
154 root -11 0 0 0 0 S 6.2 0.0 0:00.69 rcuc/13
1763 ceph 20 0 1217808 300224 36096 S 6.2 0.5 0:04.02 ceph-mgr
3160 haclust+ 20 0 81552 25628 15644 S 6.2 0.0 0:00.60 pacemaker-based
31122 root 20 0 11640 5376 3264 R 6.2 0.0 0:00.02 top
1 root 20 0 169984 13788 8796 S 0.0 0.0 0:05.89 systemd
The qemu-system-x86 thread is taking 100% of the CPU.
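To confirm that the emulator thread and the vCPU thread are fighting over the same core, the per-thread placement can be checked on the hypervisor (PID 28143 is the qemu-system-x86 process from the top output above):
# psr shows the processor each qemu thread last ran on
ps -Lo tid,psr,comm -p 28143
# vCPU to physical CPU mapping as seen by libvirt
virsh vcpuinfo rtVM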
The affinity list allows it to move, so why does the scheduler not put it on another core?
I don't know if it's a libvirt bug or a SEAPATH configuration problem.
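For what it's worth, this is how I would read back the affinity list each qemu thread is actually given (again using PID 28143 from the top output; if the list really contains several CPUs, the scheduler is free to move the emulator thread, which makes the behaviour above surprising):
# print the allowed CPU list of every thread of the qemu process
for tid in /proc/28143/task/*; do taskset -cp "${tid##*/}"; done
# what libvirt believes the emulator thread is pinned to
virsh emulatorpin rtVM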
Workaround
We can control the placement of the VM's management (emulator) thread with emulatorpin in libvirt.
This can be done either in the domain XML:
<emulatorpin cpuset='6,7'/>
or directly on the target with the command virsh emulatorpin rtVM 6,7.
Both of these approaches technically solve the problem.
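For completeness, a sketch of the cputune section with the workaround applied (CPU numbers are from my setup: 4 and 5 carry the vCPUs, 6 and 7 are other cores of the machine-rt slice):
<cputune>
  <vcpupin vcpu='0' cpuset='4'/>
  <vcpupin vcpu='1' cpuset='5'/>
  <!-- keep the emulator thread off the RT vCPU cores -->
  <emulatorpin cpuset='6,7'/>
</cputune>
Running virsh emulatorpin rtVM with no CPU list afterwards shows the new pinning, and in my case the guest then boots.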