SMT pinning is broken/wrong #9
Comments
How do you output the CPU-to-cache map?
I used |
By the way, where do I set topoext and l3-cache=on,host-cache-info=on?
Issues are for directing the author of this project to a problem with their software, not for providing you with support.
Thank you @gnif. This is known. However, as the docs say, you should only pass physical threads, not virtual ones: https://github.com/ayufan/pve-helpers#21-cpu_taskset. The mapping also differs depending on the CPU. One thing that may be missing is documentation on how to handle the L3, since when this was written there was no need to support NUMA or many-complex scenarios. Technically it is possible to replicate the whole SMT topology, but I at least did not find it useful, or necessary, to do physical-to-virtual CPU pinning of everything. Doing that is theoretically possible, but only
@ayufan if I am understanding you correctly, you're saying to put two VMs on the same set of cores, but on separate threads? If so, this is a very, very bad idea: the VMs will stall each other and invalidate each other's caches. According to your own documentation:
Based on the configuration there, VM 1 would be on thread 1 of cores 1-5, and VM 2 would be on thread 2 of cores 1-5. There is no such thing as a "virtual core" on the host system. Both threads of a core are equal in every way: they are two identical pipelines that share some hardware, which can cause them to stall each other. There is no "primary" thread, or "real" vs "virtual" thread. If the guest OS knows about the SMT model, the guest scheduler can ensure that high-priority threads, like those that service interrupts for GPUs, are put onto cores that can guarantee the best possible latency. Note I am not stating this because I think it's a problem, I am stating this because it is a problem. We have people coming into our Discord reporting issues with Looking Glass that result from very poor configuration caused by this script. Looking Glass relies on low-latency servicing of its threads and of the GPU's driver, as its goal is to be as low latency as possible.
This is just it: you did not, due to your use case, but I am stating for a fact that it makes a huge difference under certain workloads, and you need to fix your scripts for those running such workloads, or stop promoting them.
You are fully correct. Of course they will. I can imagine this being a problem in the case of Looking Glass, which effectively requires two systems to have low latency. In my case, I don't use Looking Glass and only use a single VM at a time while keeping all of them running, so latency was not a problem, since the other VM is mostly idle. How do you advise users to handle many VMs? Presumably in this setup you expect VMs not to share physical cores, but rather to be given full SMT cores. Anyway, I see this being a problem and am happy to document those caveats. Do you have a link where it is best to redirect people?
In this case I would suggest you halve the number of CPUs you give to your VMs and give them both threads of each core. You will see a general performance uplift due to better management of your hardware.
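As a rough illustration of that suggestion, here is a minimal Python sketch that hands each VM whole cores (every SMT thread of a core) so that no two VMs ever share a core. The function name `split_cores`, the VM names, and the sibling layout (logical CPUs n and n+8 sharing a core) are all hypothetical; the real sibling pairs depend on the host.

```python
def split_cores(host_cores, cores_per_vm):
    """Assign whole physical cores to VMs so no core is shared between VMs.

    host_cores: list of tuples, each holding the logical CPUs of one core.
    cores_per_vm: dict of VM name -> number of physical cores wanted.
    """
    needed = sum(cores_per_vm.values())
    if needed > len(host_cores):
        raise ValueError(f"need {needed} cores, only {len(host_cores)} available")
    remaining = iter(host_cores)
    # Each VM receives complete sibling tuples, never a lone thread.
    return {vm: [next(remaining) for _ in range(count)]
            for vm, count in cores_per_vm.items()}


# Assumption: a host where logical CPUs n and n+8 are threads of the same core.
host_cores = [(n, n + 8) for n in range(2, 8)]
assignment = split_cores(host_cores, {"vm1": 3, "vm2": 3})
for vm, cores in assignment.items():
    print(vm, cores)
```

Each VM ends up with half as many cores as a one-thread-per-core layout would give it, but with both threads of every core it owns.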
Not really, as we are just supporting people reporting issues with LG. Perhaps the VFIO discord/reddit?
Hi, I do not use your scripts, but we are seeing users in the Looking Glass Discord who are having latency-related issues due to how your script assigns CPUs to the VM.
The issue is that you are not replicating the host topology into the guest. If done properly, the guest can know that the extra vCPU is sharing a core, and even the L1/2/3 cache arrangement.
Here is how a guest sees a properly configured VM on an SMT host (using Coreinfo).
With this in place, the guest scheduler can make sensible decisions about where to run each thread. Obviously you need to pin each vCPU to the matching thread of each host core to make this work well. If done correctly, your cache mapping will also align with the physical hardware, as the host topology and guest cache map that follow show:
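Before the topology itself, a minimal Python sketch of the pinning rule just described. It assumes the guest is started with two threads per core and that consecutive vCPU indices are the SMT siblings of one guest core (how QEMU's x86 topology normally enumerates them, though worth verifying on your setup). The function name `pin_vcpus` and the sibling layout (logical CPUs n and n+16 sharing a core) are hypothetical.

```python
def pin_vcpus(host_cores):
    """Map guest vCPU index -> host logical CPU.

    host_cores: list of tuples, each holding the logical CPUs (SMT siblings)
    of one physical host core given to the guest. Guest core k receives
    host core k, and guest thread t of that core receives sibling t.
    """
    pinning = {}
    for core_index, siblings in enumerate(host_cores):
        for thread_index, host_cpu in enumerate(siblings):
            vcpu = core_index * len(siblings) + thread_index
            pinning[vcpu] = host_cpu
    return pinning


# Assumption: logical CPUs n and n+16 are the two threads of one core.
host_cores = [(n, n + 16) for n in range(8, 12)]
pinning = pin_vcpus(host_cores)
for vcpu, host_cpu in pinning.items():
    print(f"vCPU {vcpu} -> host CPU {host_cpu}")
```

With this mapping, two vCPUs that the guest believes are siblings really do share a physical core on the host.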
Here is my host topology (AMD EPYC 7343):
My guest is pinned to CPU cores 8-15, which means
When done correctly you can see that my pinning aligns with the cache map, and allows the guest to make proper use of SMT.
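One way to sanity-check that alignment on a Linux host is to compare the pinned CPUs against the L3 sharing lists the kernel exposes. This is a sketch under assumptions: the strings below are made up, and on a real host they would come from `/sys/devices/system/cpu/cpuN/cache/index3/shared_cpu_list` (`index3` is usually the L3 on x86, but the `level` file next to it is the authority). The helper names are hypothetical.

```python
def parse_cpu_list(text):
    """Parse a kernel CPU list such as "0-3,16-19" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def l3_domains_used(pinned_cpus, l3_shared_lists):
    """Return the L3 domains (as frozensets of CPUs) the pinned CPUs touch."""
    domains = {frozenset(parse_cpu_list(s)) for s in l3_shared_lists}
    pinned = set(pinned_cpus)
    return {domain for domain in domains if domain & pinned}


# Assumption: two L3 domains, each covering four cores and their siblings.
l3_lists = ["0-3,16-19", "4-7,20-23"]
used = l3_domains_used({4, 5, 20, 21}, l3_lists)
print(f"pinned CPUs span {len(used)} L3 domain(s)")
```

If the pinned set spans more L3 domains than the guest topology describes, the guest's view of the cache will not match the hardware.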
Note: AMD processors require the QEMU CPU flag `topoext` so they can use SMT.
Note 2: To get the cache to align you also have to set the QEMU CPU flags `l3-cache=on,host-cache-info=on`
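To make the two notes concrete, here is a small Python sketch that assembles the corresponding QEMU arguments. The `host` CPU model, the single-socket `-smp` layout, and the function name `qemu_topology_args` are assumptions for illustration. How these arguments reach QEMU depends on the front end in use and is not covered here.

```python
def qemu_topology_args(cores, threads=2, amd=True):
    """Build -cpu/-smp arguments exposing SMT and host cache info to a guest."""
    cpu_opts = ["host"]  # assumption: host CPU passthrough model
    if amd:
        # AMD guests need topoext to see SMT siblings.
        cpu_opts.append("topoext=on")
    cpu_opts += ["l3-cache=on", "host-cache-info=on"]
    # Assumption: one socket; total vCPUs = cores * threads.
    smp = f"{cores * threads},sockets=1,cores={cores},threads={threads}"
    return ["-cpu", ",".join(cpu_opts), "-smp", smp]


args = qemu_topology_args(cores=8)
print(" ".join(args))
```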