vhost-net: create vhost based Net backend #4461

Open
wants to merge 3 commits into
base: main


@majek majek commented Feb 19, 2024

Changes

vhost-net: create vhost based Net backend

This patch adds a second backend to Net devices. It can be enabled with the
'vhost' boolean in the network interface config, like this:

"network-interfaces": [
    {
        "iface_id": "eth0",
        "host_dev_name": "tap0",
        "vhost": true
    }
],
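
As a rough illustration of how such a flag could select the backend, here is a minimal sketch. The type names (`NetBackendKind`, `NetworkInterfaceConfig`) and the rule that an omitted "vhost" key means `false` are assumptions made for this example, not necessarily what the patch does.

```rust
/// Which implementation serves the data plane of a Net device.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NetBackendKind {
    /// The VMM itself moves packets between the tap device and the guest.
    Virtio,
    /// The host kernel moves packets through /dev/vhost-net.
    Vhost,
}

/// Hypothetical mirror of one "network-interfaces" entry.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct NetworkInterfaceConfig {
    pub iface_id: String,
    pub host_dev_name: String,
    /// `None` models a config that omits the "vhost" key.
    pub vhost: Option<bool>,
}

impl NetworkInterfaceConfig {
    /// Assumption: a missing flag keeps the existing Virtio backend, so
    /// configs written before this change behave exactly as before.
    pub fn backend_kind(&self) -> NetBackendKind {
        if self.vhost.unwrap_or(false) {
            NetBackendKind::Vhost
        } else {
            NetBackendKind::Virtio
        }
    }
}
```

With the config shown above, `backend_kind()` would return `NetBackendKind::Vhost`; dropping the "vhost" key would fall back to `NetBackendKind::Virtio`.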

The vhost backend opens the host kernel's /dev/vhost-net interface and performs
a setup dance to attach the vhost device to the relevant tap
interface. The effect is that the whole data plane runs directly
between the host kernel and the guest; the data doesn't pass through the
firecracker VMM at all. This drastically reduces packet latency
and increases throughput, especially in high-pps scenarios, for
example UDP and TCP without offloads.
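
For readers unfamiliar with the "setup dance", the sketch below lists the ioctl sequence that a kernel vhost-net setup typically follows, with request numbers computed from the Linux asm-generic ioctl encoding (as used on x86_64 and aarch64). The order shown is the commonly used one and an assumption here; the patch may order or group the calls differently. The helper `setup_plan` is hypothetical and only builds the list; it issues no ioctls.

```rust
// Direction bits of the asm-generic ioctl encoding.
const IOC_NONE: u64 = 0;
const IOC_WRITE: u64 = 1;
const IOC_READ: u64 = 2;

/// Encode an ioctl request number: dir[31:30] size[29:16] type[15:8] nr[7:0].
const fn ioc(dir: u64, ty: u64, nr: u64, size: u64) -> u64 {
    (dir << 30) | (size << 16) | (ty << 8) | nr
}

/// ioctl "type" byte shared by all vhost requests.
const VHOST_VIRTIO: u64 = 0xAF;

// Argument sizes in bytes, taken from the uapi struct layouts.
const U64_ARG: u64 = 8; // __u64 feature bitmap
const VHOST_MEMORY_HDR: u64 = 8; // struct vhost_memory without its region array
const VRING_STATE: u64 = 8; // struct vhost_vring_state { index, num }
const VRING_ADDR: u64 = 40; // struct vhost_vring_addr
const VRING_FILE: u64 = 8; // struct vhost_vring_file { index, fd }

/// Build the list of (name, request number) pairs for a device with
/// `num_queues` virtqueues (a net device has at least RX and TX).
pub fn setup_plan(num_queues: usize) -> Vec<(&'static str, u64)> {
    // Device-wide steps: claim the fd, agree on features, share guest memory.
    let mut plan = vec![
        ("VHOST_SET_OWNER", ioc(IOC_NONE, VHOST_VIRTIO, 0x01, 0)),
        ("VHOST_GET_FEATURES", ioc(IOC_READ, VHOST_VIRTIO, 0x00, U64_ARG)),
        ("VHOST_SET_FEATURES", ioc(IOC_WRITE, VHOST_VIRTIO, 0x00, U64_ARG)),
        ("VHOST_SET_MEM_TABLE", ioc(IOC_WRITE, VHOST_VIRTIO, 0x03, VHOST_MEMORY_HDR)),
    ];
    // Per-queue steps: describe the ring, wire up the eventfds, attach the tap.
    for _ in 0..num_queues {
        plan.extend([
            ("VHOST_SET_VRING_NUM", ioc(IOC_WRITE, VHOST_VIRTIO, 0x10, VRING_STATE)),
            ("VHOST_SET_VRING_ADDR", ioc(IOC_WRITE, VHOST_VIRTIO, 0x11, VRING_ADDR)),
            ("VHOST_SET_VRING_BASE", ioc(IOC_WRITE, VHOST_VIRTIO, 0x12, VRING_STATE)),
            ("VHOST_SET_VRING_KICK", ioc(IOC_WRITE, VHOST_VIRTIO, 0x20, VRING_FILE)),
            ("VHOST_SET_VRING_CALL", ioc(IOC_WRITE, VHOST_VIRTIO, 0x21, VRING_FILE)),
            ("VHOST_NET_SET_BACKEND", ioc(IOC_WRITE, VHOST_VIRTIO, 0x30, VRING_FILE)),
        ]);
    }
    plan
}
```

VHOST_NET_SET_BACKEND is the step that hands the tap file descriptor to the kernel for a given queue; once it succeeds for both queues, packets no longer need to pass through the VMM.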

The control plane is somewhat hacky. Technically, the interrupts from
host to guest should go through firecracker VMM, but this is avoidable
by splicing the host eventfd into the guest interruptfd, and
force-returning VIRTIO_MMIO_INT_VRING in the relevant virtio register.
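
The interrupt trick can be pictured with the following minimal model. It is a sketch under stated assumptions, not the patch's code: `IrqPath` and `InterruptStatus` are hypothetical names, and only the two status bit values come from the virtio-mmio definition. The point is that when the kernel signals the guest directly, the VMM never gets a chance to set the "used buffer" bit, so a read of the status register has to report it unconditionally or the guest driver would ignore the interrupt.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Bit 0 of the virtio-mmio InterruptStatus register: used buffer notification.
pub const VIRTIO_MMIO_INT_VRING: u32 = 0x01;
/// Bit 1 of the virtio-mmio InterruptStatus register: configuration change.
pub const VIRTIO_MMIO_INT_CONFIG: u32 = 0x02;

/// How queue interrupts reach the guest (hypothetical type for this sketch).
pub enum IrqPath {
    /// The VMM records the status bit, then signals the guest interrupt fd.
    ThroughVmm,
    /// The vhost "call" eventfd is used directly as the guest interrupt fd,
    /// so the VMM never observes queue interrupts.
    Spliced,
}

/// Model of the interrupt status register as seen by the guest.
pub struct InterruptStatus {
    path: IrqPath,
    recorded: AtomicU32,
}

impl InterruptStatus {
    pub fn new(path: IrqPath) -> Self {
        Self { path, recorded: AtomicU32::new(0) }
    }

    /// Called by the VMM when it raises an interrupt itself.
    pub fn raise(&self, bits: u32) {
        self.recorded.fetch_or(bits, Ordering::SeqCst);
    }

    /// Guest write to the interrupt acknowledge register.
    pub fn ack(&self, bits: u32) {
        self.recorded.fetch_and(!bits, Ordering::SeqCst);
    }

    /// Guest read of the interrupt status register.
    pub fn read(&self) -> u32 {
        let recorded = self.recorded.load(Ordering::SeqCst);
        match self.path {
            IrqPath::ThroughVmm => recorded,
            // Always claim that a queue may have used buffers.
            IrqPath::Spliced => recorded | VIRTIO_MMIO_INT_VRING,
        }
    }
}
```

The cost of this design is that the guest may occasionally inspect a queue that has nothing new in it, which is harmless because the used ring index tells it so.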

There are a couple of missing features:

  • persist (no blockers, just work)
  • mmds (no obvious way to do it, perhaps possible with ebpf)
  • rate_limiting (no obvious way to implement it, perhaps with ebpf)
  • tap/vhost feature negotiation

On the last point, it would be nice to negotiate some more advanced
tap/vhost features, like USO (UDP segmentation offload), TCP offloads
(a flag is needed if the guest wants to use XDP), and VIRTIO_NET_F_MRG_RXBUF
(this might be useful for performance, but benchmarks are needed first). Right
now there is no way to express these toggles in the net config, but
this can be done in the future.
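
To make the idea of config-driven toggles concrete, here is one possible shape, offered as a sketch only. The `NetFeatureToggles` struct and `offered_features` function are hypothetical and not part of the patch; the feature bit positions are the ones defined for virtio-net. The function covers only the toggled features and ignores every other bit.

```rust
// virtio-net feature bits (bit positions as defined for virtio-net).
pub const VIRTIO_NET_F_GUEST_TSO4: u64 = 1 << 7;
pub const VIRTIO_NET_F_GUEST_TSO6: u64 = 1 << 8;
pub const VIRTIO_NET_F_MRG_RXBUF: u64 = 1 << 15;
pub const VIRTIO_NET_F_GUEST_USO4: u64 = 1 << 54;
pub const VIRTIO_NET_F_GUEST_USO6: u64 = 1 << 55;
pub const VIRTIO_NET_F_HOST_USO: u64 = 1 << 56;

/// Hypothetical per-interface toggles that a future net config could carry.
pub struct NetFeatureToggles {
    /// Turn off when the guest wants to attach XDP programs.
    pub tcp_offloads: bool,
    pub uso: bool,
    pub mrg_rxbuf: bool,
}

/// Return the subset of the toggled features that should be offered to the
/// guest: a bit survives only if the backend supports it AND the config
/// asks for it.
pub fn offered_features(backend_features: u64, toggles: &NetFeatureToggles) -> u64 {
    let mut wanted = 0u64;
    if toggles.tcp_offloads {
        wanted |= VIRTIO_NET_F_GUEST_TSO4 | VIRTIO_NET_F_GUEST_TSO6;
    }
    if toggles.uso {
        wanted |= VIRTIO_NET_F_GUEST_USO4 | VIRTIO_NET_F_GUEST_USO6 | VIRTIO_NET_F_HOST_USO;
    }
    if toggles.mrg_rxbuf {
        wanted |= VIRTIO_NET_F_MRG_RXBUF;
    }
    backend_features & wanted
}
```

Intersecting with what the backend reports keeps the config declarative: asking for USO on a host kernel that lacks it simply results in the feature not being offered.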

Reason

Discussion #3707

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.

PR Checklist

  • If a specific issue led to this PR, this PR closes the issue.
  • The description of changes is clear and encompassing.
  • Any required documentation changes (code and docs) are included in this
    PR.
  • API changes follow the Runbook for Firecracker API changes.
  • User-facing changes are mentioned in CHANGELOG.md.
  • All added/changed functionality is tested.
  • New TODOs link to an issue.
  • Commits meet
    contribution quality standards.

  • This functionality cannot be added in rust-vmm.

…cing enum

Right now we only have the Virtio implementation; we want to introduce a
Vhost variant in the future. This patch introduces no functional changes.

This patch adds a new backend to Net devices. It can be enabled with the
'vhost' boolean in the network interface config, like this:

    "network-interfaces": [
        {
            "iface_id": "eth0",
            "host_dev_name": "tap0",
            "vhost": true
        }
    ],

The vhost backend opens the host kernel's /dev/vhost-net interface and performs
a setup dance to attach the vhost device to the relevant tap
interface. The effect is that the whole data plane runs directly
between the host kernel and the guest, skipping the firecracker VMM.
This drastically reduces packet latency and increases throughput,
especially in high-pps scenarios, for example UDP and TCP without
offloads.

The control plane does go through firecracker, due to MMIO limitations.
The exception is interrupts from host to guest: technically they should
go through the firecracker VMM, but this is avoidable
by splicing the host eventfd into the guest interruptfd and
force-returning VIRTIO_MMIO_INT_VRING in the relevant virtio register.
This is the same trick the block-vhost-user device uses.

There are a couple of missing features:

 - persist (no blockers, just work)
 - mmds (no obvious way to do it, perhaps possible with ebpf)
 - rate_limiting (no obvious way to implement it, perhaps with ebpf)
 - tap/vhost feature negotiation

On the last point, it would be nice to negotiate some more advanced
tap/vhost features, like USO (UDP segmentation offload), TCP offloads
(a flag is needed if the guest wants to use XDP), and VIRTIO_NET_F_MRG_RXBUF
(this might be useful for performance, but benchmarks are needed first). Right
now there is no way to express these toggles in the net config, but
this can be done in the future.
@majek majek changed the title from "Marek/enum net device feb" to "vhost-net: create vhost based Net backend" on Feb 19, 2024

majek commented Feb 19, 2024

Up for discussion: testing; error wrapping, which is simpler here than in block-vhost-user; benchmarks; and regenerating the bindings via bindgen.sh to avoid declaring things like VIRTIO_NET_F_GUEST_USO4 by hand.
