One IO app to rule them all #1043

Closed · wants to merge 26 commits

Conversation

@eugeneia (Member) commented Oct 18, 2016

This diabolical reinvention of virtual_ether_mux pretends to be a regular app, and it is indeed transparent. A little change makes it possible:

 function app (config, name, class, arg)
    arg = arg or "nil"
    assert(type(name) == "string", "name must be a string")
    assert(type(class) == "table", "class must be a table")
-   config.apps[name] = { class = class, arg = arg}
+   if class.configure then
+      class:configure(config, name, arg)
+   else
+      config.apps[name] = { class = class, arg = arg}
+   end
 end

The IO app doesn’t actually implement the usual new/push/pull/… methods, only a configure method. When the IO app is configured, it in turn configures other apps to implement the functionality specified by its arguments.

-- Usage:
--   config.app("VqNIC", IO, {pciaddr="03:00.1", macaddr="00:00:..."})
--   config.app("RssNIC", IO, {pciaddr="04:00.0", txq=1, rxq=1})
--   config.app("SoftIO", IO, {vlan=42})
--   config.app("BenchIO", IO, {bench={sizes={60}}})

This scheme depends on a standard interface shared among NIC drivers: they need to use standard port names, and validate their respective configurations. See #1025 #1019. In cases where implicit apps are created (e.g. for emulation), their generated names begin with an underscore to avoid accidental collisions.
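To make the mechanism concrete, here is a minimal sketch (the class, app, and port names are hypothetical and not part of this PR) of an app class that implements only configure(): it registers a real app under the requested name and adds an implicit helper whose name begins with an underscore, using the standard basic_apps.

local config     = require("core.config")
local basic_apps = require("apps.basic.basic_apps")

local LoopIO = {}

-- Called as class:configure(c, name, arg) by the patched config.app()
-- instead of registering LoopIO itself.
function LoopIO:configure (c, name, arg)
   -- The app visible under `name` is a plain Tee: packets pushed into
   -- name.rx are copied to every configured output, so they come back out
   -- on the standard name.tx port and are also mirrored into an implicit
   -- Sink whose name starts with "_" to avoid collisions.
   config.app(c, name, basic_apps.Tee)
   config.app(c, "_"..name.."Sink", basic_apps.Sink)
   config.link(c, name..".mirror -> _"..name.."Sink.rx")
end

return LoopIO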

An app network created by

config.app("SoftIO", IO, {vlan=42})
config.app("BenchIO", IO, {bench={sizes={60}}})

might look like this:

                   0 sent on SoftIO.trunk -> _SoftIOBridge0.SoftIO (loss rate: 0%)
                   0 sent on BenchIO.output -> _SoftIOBridge0.BenchIO (loss rate: 0%)
                   0 sent on _SoftIOBridge0.SoftIO -> SoftIO.trunk (loss rate: 0%)
                   0 sent on _SoftIOBridge0.BenchIO -> _Sink_BenchIO.rx (loss rate: 0%)

And the standard links SoftIO.rx and SoftIO.tx can be used as if they came from a network driver.
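For instance (a hypothetical snippet using the standard basic_apps, with `c` being the config under construction):

local basic_apps = require("apps.basic.basic_apps")
config.app(c, "source", basic_apps.Source)
config.app(c, "sink", basic_apps.Sink)
config.link(c, "source.output -> SoftIO.rx") -- packets to "transmit"
config.link(c, "SoftIO.tx -> sink.input")    -- packets "received"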

The meat of this PR is f38a8c4; the remaining commits pull in @petebristow’s intel_mp driver. Comments are welcome!

TODO:

  • needs to be tested with intel_mp; we need Intel i350 NICs in the lab. @lukego, can I have some? AFAIK the ones installed in the luganos are used for the uplink.
  • maybe we want to skip the Emu app when possible; for this we would need a way to create dummy apps and alias links into them, so that Foo.rx is actually _SoftIOBridge0.Foo

@lukego (Member) commented Oct 21, 2016

Exciting stuff!

This is of course something that we sorely need, i.e. a uniform way to incorporate I/O sources into an app network when your requirements are fairly generic (e.g. send and receive packets, tag/untag, switch on L2 header, hash on L3/L4 header). I suspect this would suit most Snabb applications, and so this IO app has the potential to make them all very flexible: connect to hardware NICs, kernel interfaces, PCAP files, etc.

I have two topics that I want to discuss in this context. I will start with the simple one :-).

URI

Should we create a uniform syntax for selecting IO interfaces?

As a user I would really like a concise way to tell a Snabb application what IO mechanism to use. Examples:

snabb nfv traffic pci:01:00.0
snabb nfv traffic tap:tap0
snabb nfv traffic raw:eth0
snabb nfv traffic replay:imix.pcap

and so on. I know that people have already done work in this direction e.g. @dpino for lwAFTR.

It could also be that the syntax needs to support more elaborate information, like the IO app parameters of MAC address + VLAN tag + RSS hash bucket, in which case we would need to choose a flexible syntax (URI?).
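A minimal sketch of what parsing such a spec could look like (purely illustrative; the scheme names just follow the examples above and nothing here is from this PR):

-- Split an IO spec like "pci:01:00.0" or "replay:imix.pcap" into its
-- scheme and the driver-specific remainder.
local function parse_io_spec (spec)
   local scheme, rest = spec:match("^(%a+):(.*)$")
   assert(scheme, "invalid IO spec: "..tostring(spec))
   return scheme, rest
end
-- parse_io_spec("tap:tap0") --> "tap", "tap0"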

Control vs Traffic

I have a feeling that we should make a separation between control (defining queues, setting up switching rules, choosing hash functions, etc.) versus traffic (sending and receiving packets on a specific queue pair). I say this because I think certain NICs are designed with the expectation that a single entity (traditionally the kernel) will handle all control, and that it could be awkward to support these NICs in a strictly peer-to-peer fashion.

This would mandate having two kinds of app:

  • IO: Many instances that each perform transmit/receive on a queue pair.
  • IOControl: Single instance that handles all queue setup.

For example the usage could be like this:

-- Define the queues.
app("ctrl", IOControl, {queues = {{id="a", mac="10:10:10:10:10:10", vlan=42},
                                  {id="b", mac="20:20:20:20:20:20", vlan=43}},
                        rssbuckets = 2})
-- Attach to named queues.
app("a1", IO, {queue = "a", rssbucket = 1}
app("a2", IO, {queue = "a", rssbucket = 2}
app("b1", IO, {queue = "b", rssbucket = 1}
app("b2", IO, {queue = "b", rssbucket = 2}

It could be that I am not being sufficiently imaginative and that we can support all NICs without needing this IOControl app. I suspect this will be an implementation challenge though. Let's dig in and find out.

The intel_mp driver on this branch does not need an IO controller. However, this seems to be an easy case: Intel NICs have a fixed set of hardware queues (0..n) and RSS hashing does not require much state.

The intel10g app currently does keep state inside the app for controlling VMDq i.e. for maintaining the mapping between hardware queues (fixed) and MAC addresses (dynamic). Could be a challenge to support in intel_mp?

The new Mellanox ConnectX driver currently does use a separate app for control as shown in the example above. The reason is that controlling the NIC is fairly stateful: the Mellanox firmware API is based on CRUD operations on objects with opaque IDs assigned by the NIC. There are a bunch of different objects to manage: Send Queue, Work Queue, Completion Queue, Receive Queue, Receive Queue Table, Flow Table, Flow Group, etc, etc, etc, and each instance is assigned an ID by the card that needs to be remembered by the controller. So, on the one hand it is possible for each IO app to lock the NIC and perform updates (e.g. recreate Flow Tables to update MAC dispatching), but on the other hand this means defining a shared representation of all the relevant state.

I also think that in the future some Snabb hacker somewhere will decide they want to interface with a NIC using a vendor-supplied software stack (as we have with Solarflare in the past) and that this software may not lend itself well to distributing control of the NIC between processes in a peer-to-peer style, e.g. if the state we are managing is opaque data structures allocated by the vendor library that we have no way to transport between processes.

So, what do we think? How awkward will it be to separately define the IOControl from the IO? Does this awkwardness pay for itself in terms of simplifying implementation?

cc @petebristow

@lukego mentioned this pull request Oct 28, 2016
@eugeneia (Member, Author) commented:

I believe having IOControl is a good conservative choice. From my experience tinkering with the software emulation of the IO layer, “incrementally declarative” IO instance management gets complicated even with no hardware involved. I.e., this PR would be much simpler if there were an IOControl.

@eugeneia mentioned this pull request Nov 10, 2016
@eugeneia (Member, Author) commented:

Superseded by #1068.

@eugeneia closed this Nov 10, 2016