having trouble passing a physical disk through to a container #74
i'm trying to set up something like "passthrough" for a physical disk, so that it's completely dedicated to a single container and it's possible to do things like SMART administration on the disk from within that container.

i've followed the guides for devices and mounts, but so far i haven't been able to get it to work, and i'm not sure whether i'm missing something or whether this just isn't going to work.

if this is feasible but requires implementation work, i'll happily go do that; i'm just hoping to get a "gut check" first from the experts here before i go on a fool's errand trying to make something work that is ill-conceived.

thanks for your consideration!

Comments
Hi. It should be possible to access a physical disk; it's a matter of adding the device using osctl ct devices add. However, because of the user namespace, you also cannot mount any filesystems that would be on said disk. So while you could get the disk device inside the container, it wouldn't be of much practical use, if any.
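for illustration, a rough sketch of what that looks like (assuming a container named myct01 and the disk at /dev/sdb, major 8 / minor 16; check your actual numbers with ls -l /dev/sdb):

```sh
# add the block device to the container's device access list
# (syntax per the vpsAdminOS devices guide: id, type, major, minor, mode, [device])
osctl ct devices add myct01 block 8 16 rwm /dev/sdb

# the node now exists inside the container, and raw reads work:
#   dd if=/dev/sdb of=/dev/null bs=1M count=1
# but mounting a filesystem from it is refused inside the user namespace:
#   mount /dev/sdb /mnt   -> EPERM
```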
ah yes, 🎯 i think you hit the key point -- thanks for the quick response!
the whole point of starting vpsAdminOS was to control the design so that we can be sure it's all properly wrapped in all available namespaces & cgroups, while knowing that will be too limiting on its own, so we'd always go and patch the kernel to enable what we can in the user namespace.

that said, filesystems are a special beast, as it's really easy to craft an image which will pwn the kernel on mount or access. I wasn't able to find any in-kernel filesystem implementation that we could allow from the inside in a way that would guarantee this can't happen... honestly the closest of all implementations is the out-of-tree OpenZFS, when compiled as a debug build, because that turns on so many ASSERTs that they would be hard to work around :D

we have this boundary that everything hardware-management related is a task for the init user namespace, i.e. the one where root is "the root", so all the monitoring tools are expected to run as a service under runit within vpsAdminOS. mounting the filesystem would then also be a responsibility of the host; the container could get a bind mount of the mounted fs.

which approach makes the most sense depends on your original use-case and what you're trying to do, as you can also run that one workload within QEMU, to which you can pass the whole block device. that's the other way we do things, if it needs privileges :)
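for the QEMU route, a rough sketch of handing the whole disk to a guest as a raw virtio drive (the by-id path and the memory/CPU settings here are just examples):

```sh
# pass the entire disk (not a partition) to the guest as a raw virtio drive;
# /dev/disk/by-id/... is preferred over /dev/sdX so the mapping survives reboots
qemu-system-x86_64 \
  -machine q35,accel=kvm -m 2048 -cpu host \
  -drive file=/dev/disk/by-id/ata-EXAMPLE_SERIAL,format=raw,if=virtio,cache=none
```

cache=none there avoids double-caching the raw device between host and guest.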
though as I'm thinking about it further, it would rather require passing the whole PCI device where the disk is located to QEMU, if you'd like SMART access at the same level that is going to access the data
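roughly, that would be VFIO passthrough of the disk's controller; a sketch, assuming the controller sits at PCI address 0000:03:00.0 (an example) and the IOMMU is enabled (intel_iommu=on / amd_iommu=on on the kernel command line):

```sh
modprobe vfio-pci

# detach the controller from its host driver and hand it to vfio-pci
echo 0000:03:00.0 > /sys/bus/pci/devices/0000:03:00.0/driver/unbind
echo vfio-pci > /sys/bus/pci/devices/0000:03:00.0/driver_override
echo 0000:03:00.0 > /sys/bus/pci/drivers_probe

# the guest then drives the real controller, so smartctl inside it sees the
# actual drive, SMART data included
qemu-system-x86_64 -machine q35,accel=kvm -m 2048 -cpu host \
  -device vfio-pci,host=0000:03:00.0
```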
a third possibility, if it's something like a backup disk, is to use the vpsadmin API/webUI full stack with its automated backups... do backups to a centralized machine, and have NAS mounts via NFS, also from a centralized machine. but the whole stack currently isn't exactly ready for use outside of vpsFree; you'd have to work with us on that one, we'd have to make it ready together :) if you're planning on more than one node with vpsAdminOS, I think that would make the most sense even :)
so many helpful thoughts, thanks! it's clear your system is well thought-through, which is definitely what attracted me to it :)

my immediate use-case is something like "multi-tenant hosts with isolated single-tenant disks," if that makes sense. more concretely: co-locating disks owned by different users on one machine. i'm not sure exactly how best to implement this, but i had hoped to be able to provide full hardware access for the disk(s) (which is why i mention SMART) to each user, while keeping the users as isolated from one another as is possible (or feasible, since i know isolation on shared hardware is tricky business 😉).

if i could have used kernel-level isolation on the "outer" layer, then run a hardware vm inside each container and provide those VMs to the users, i thought that might be good enough... but it's not sounding like this is a particularly viable strategy, at least not on vpsAdminOS, and maybe not on Linux more generally given its current facilities. (for reference, the approach is inspired by SmartOS/Triton, in case it sounds like a wild idea. though my mistake may well have been assuming an approach fit for another kernel would apply directly here.)

i suppose i could still do that, if i drop the "full hardware pass-through" requirement, but i'm not sure i'm ready to do that... still, something to explore. in the meantime, it sounds like i can start by messing around with QEMU on the host. i appreciate the note you added about the caveat of needing to pass the whole PCI device, so i'll keep that in mind. lots for me to learn and think through here, thanks again!