diff --git a/.gitignore b/.gitignore index 6422517199..c3051f3928 100644 --- a/.gitignore +++ b/.gitignore @@ -16,3 +16,4 @@ obj /deps/*.vsn /src/programs.inc .images +/lib/luajit/usr diff --git a/README.md b/README.md index b699edd4f8..a562a24cce 100644 --- a/README.md +++ b/README.md @@ -10,10 +10,10 @@ ways to do things. The Snabb community are active in [applying modern programming techniques](http://blog.ipspace.net/2014/09/snabb-switch-deep-dive-on-software-gone.html), [do-it-yourself operator networking](http://blog.ipspace.net/2014/12/l2vpn-over-ipv6-with-snabb-switch-on.html), -[high-level device drivers](https://github.com/SnabbCo/snabbswitch/blob/master/src/apps/intel/intel10g.lua), +[high-level device drivers](https://github.com/snabbco/snabb/blob/master/src/apps/intel/intel10g.lua), [fast userspace virtio networking](http://www.virtualopensystems.com/en/solutions/guides/snabbswitch-qemu/), [universal SIMD protocol offloads](https://groups.google.com/d/msg/snabb-devel/aez4pEnd4ow/WrXi5N7nxfkJ), and -[applying compiler technology to networking](https://fosdem.org/2015/schedule/event/packet_filtering_pflua/). +[applying compiler technology to networking](https://archive.fosdem.org/2015/schedule/event/packet_filtering_pflua/). You are welcome to join our community. If you have an application that you want to build, or you want to use one that we are already @@ -32,8 +32,9 @@ Snabb is written using these main techniques: Snabb compiles into a stand-alone executable called `snabb`. This single binary includes multiple applications and runs on -any modern Linux distribution. (You could think of it as a -[busybox](http://en.wikipedia.org/wiki/BusyBox#Single_binary) for +any modern [Linux/x86-64](src/doc/porting.md) distribution. (You could +think of it as a +[busybox](https://en.wikipedia.org/wiki/BusyBox#Single_binary) for networking.) ## How is it being used? @@ -67,14 +68,14 @@ documentation](src/program/lwaftr/doc/) for more details. 
### VPWS VPWS (Virtual Private Wire Service) is a Layer-2 VPN application being -developed by Alexander Gall at [SWITCH](http://switch.ch). His Github +developed by Alexander Gall at [SWITCH](http://www.switch.ch/). His Github [`vpn` branch](https://github.com/alexandergall/snabbswitch/tree/vpn) is the master line of development. ### packetblaster [packetblaster](src/program/packetblaster/) generates load by -replaying a [pcap format](http://en.wikipedia.org/wiki/Pcap) trace +replaying a [pcap format](https://en.wikipedia.org/wiki/Pcap) trace file or synthesizing customizable packets onto any number of Intel 82599 10-Gigabit network interfaces. This is very efficient: only a small % of one core per CPU is required even for hundreds of Gbps of traffic. Because so little diff --git a/default.nix b/default.nix new file mode 100644 index 0000000000..a3932d5eab --- /dev/null +++ b/default.nix @@ -0,0 +1,39 @@ +# Run like this: +# nix-build /path/to/this/directory +# ... and the files are produced in ./result/ + +{ pkgs ? (import <nixpkgs> {}) +}: + +with pkgs; + +stdenv.mkDerivation rec { + # TODO: get the version from somewhere? + name = "snabb"; + + src = ./.; + + buildInputs = [ makeWrapper ]; + + patchPhase = '' + patchShebangs . 
+ + # some hardcodeism + for f in $(find src/program/snabbnfv/ -type f); do + substituteInPlace $f --replace "/bin/bash" "${bash}/bin/bash" + done + + # We need a way to pass $PATH to the scripts + sed -i '2iexport PATH=${git}/bin:${mariadb}/bin:${which}/bin:${procps}/bin:${coreutils}/bin' src/program/snabbnfv/neutron_sync_master/neutron_sync_master.sh.inc + sed -i '2iexport PATH=${git}/bin:${coreutils}/bin:${diffutils}/bin:${nettools}/bin' src/program/snabbnfv/neutron_sync_agent/neutron_sync_agent.sh.inc + ''; + + preBuild = '' + make clean + ''; + + installPhase = '' + mkdir -p $out/bin + cp src/snabb $out/bin + ''; +} diff --git a/release.nix b/release.nix new file mode 100644 index 0000000000..8501d4a488 --- /dev/null +++ b/release.nix @@ -0,0 +1,67 @@ +# Run like this: +# nix-build /path/to/this/directory/release.nix -A manual +# ... and the files are produced in ./result/ + +{ pkgs ? (import <nixpkgs> {}) +}: + +with pkgs; + +let + # see https://github.com/snabbco/snabb/blob/master/src/doc/testing.md + test_env = fetchurl { + url = "http://lab1.snabb.co:2008/~max/assets/vm-ubuntu-trusty-14.04-dpdk-snabb.tar.gz"; + sha256 = "0323591i925jhd6wv8h268wc3ildjpa6j57n4p9yg9d6ikwkw06j"; + }; + optionalGetEnv = first: default: let + maybeEnv = builtins.getEnv first; + in if (maybeEnv != "") then maybeEnv else default; +in rec { + manual = import ./src/doc {}; + snabb = import ./default.nix {}; + tests = stdenv.mkDerivation rec { + name = "snabb-tests"; + + src = snabb.src; + + # allow sudo + __noChroot = true; + requiredSystemFeatures = [ "performance" ]; + + buildInputs = [ git telnet tmux numactl bc iproute which qemu ]; + + buildPhase = '' + export PATH=$PATH:/var/setuid-wrappers/ + export HOME=$TMPDIR + + # make sure we reuse the snabb built in another derivation + ln -s ${snabb}/bin/snabb src/snabb + sed -i 's/testlog snabb/testlog/' src/Makefile + + # setup the environment + mkdir ~/.test_env + tar xvzf ${test_env} -C ~/.test_env/ + ''; + + doCheck = true; + checkPhase = 
'' + export SNABB_PCI0=${ optionalGetEnv "SNABB_PCI0" "0000:01:00.0"} + export SNABB_PCI_INTEL0=${ optionalGetEnv "SNABB_PCI_INTEL0" "0000:01:00.0"} + export SNABB_PCI_INTEL1=${ optionalGetEnv "SNABB_PCI_INTEL1" "0000:01:00.1"} + export FAIL_ON_FIRST=true + + # run tests + sudo -E make test -C src/ + ''; + + installPhase = '' + mkdir -p $out/nix-support + + # keep the logs + cp src/testlog/* $out/ + for f in $(ls $out/* | sort); do + echo "file log $f" >> $out/nix-support/hydra-build-products + done + ''; + }; +} diff --git a/src/Makefile b/src/Makefile index 831d6fc60c..610186ce0b 100644 --- a/src/Makefile +++ b/src/Makefile @@ -5,6 +5,7 @@ E= @echo #E= @: export TEST_SKIPPED=43 +CC ?= gcc SRCDIR = $(shell find . -type d -not -regex './obj.*' -printf '%P ') OBJDIR = $(patsubst %,obj/%,$(SRCDIR)) @@ -50,7 +51,7 @@ PATH := ../lib/luajit/usr/local/bin:$(PATH) snabb: $(LUAOBJ) $(PFLUAOBJ) $(HOBJ) $(COBJ) $(ARCHOBJ) $(ASMOBJ) $(INCOBJ) $(LUAJIT_A) $(E) "LINK $@" - $(Q) gcc $(DEBUG) -Wl,--no-as-needed -Wl,-E -Werror -Wall -o $@ $^ \ + $(Q) $(CC) $(DEBUG) -Wl,--no-as-needed -Wl,-E -Werror -Wall -o $@ $^ \ ../lib/luajit/src/libluajit.a \ -lrt -lc -ldl -lm -lpthread @echo -n "BINARY " @@ -124,15 +125,15 @@ $(PFLUAOBJ): obj/%_lua.o: ../lib/pflua/src/%.lua Makefile $(COBJ): obj/%_c.o: %.c $(CHDR) Makefile | $(OBJDIR) $(E) "C $@" - $(Q) gcc $(DEBUG) -Wl,-E -I ../lib/luajit/src -I . -include $(CURDIR)/../gcc-preinclude.h -c -Wall -Werror -o $@ $< + $(Q) $(CC) $(DEBUG) -Wl,-E -I ../lib/luajit/src -I . -include $(CURDIR)/../gcc-preinclude.h -c -Wall -Werror -o $@ $< obj/arch/avx2_c.o: arch/avx2.c Makefile $(E) "C(AVX2) $@" - $(Q) gcc -O2 -mavx2 $(DEBUG) -Wl,-E -I ../lib/luajit/src -I . -include $(CURDIR)/../gcc-preinclude.h -c -Wall -Werror -o $@ $< + $(Q) $(CC) -O2 -mavx2 $(DEBUG) -Wl,-E -I ../lib/luajit/src -I . 
-include $(CURDIR)/../gcc-preinclude.h -c -Wall -Werror -o $@ $< obj/arch/sse2_c.o: arch/sse2.c Makefile $(E) "C(SSE2) $@" - $(Q) gcc -O2 -msse2 $(DEBUG) -Wl,-E -I ../lib/luajit/src -I . -include $(CURDIR)/../gcc-preinclude.h -c -Wall -Werror -o $@ $< + $(Q) $(CC) -O2 -msse2 $(DEBUG) -Wl,-E -I ../lib/luajit/src -I . -include $(CURDIR)/../gcc-preinclude.h -c -Wall -Werror -o $@ $< $(HOBJ): obj/%_h.o: %.h Makefile | $(OBJDIR) $(E) "H $@" @@ -178,7 +179,7 @@ obj/jit_tprof.o: extra/tprof.lua | $(OBJDIR) obj/jit_vmprof.o: extra/vmprof.c | $(OBJDIR) $(E) "C $@" - $(Q) gcc $(DEBUG) -Wl,-E -O2 -I ../lib/luajit/src -c -Wall -Werror -o $@ $< + $(Q) $(CC) $(DEBUG) -Wl,-E -O2 -I ../lib/luajit/src -c -Wall -Werror -o $@ $< book: obj/doc/snabb.pdf obj/doc/snabb.html obj/doc/snabb.epub @@ -187,7 +188,7 @@ obj/doc/snabb.markdown: markdown Makefile doc/genbook.sh obj/doc/snabb.pdf: obj/doc/snabb.markdown $(E) "PANDOC $@" - $(Q) (cd obj/doc; pandoc --template=template.latex --latex-engine=lualatex -V fontsize=10pt -V monofont=droidsansmono -V monoscale=.70 -V verbatimspacing=.85 -V mainfont=droidserif -V sansfont=droidsans -V documentclass:book -V geometry:top=1.0in -V geometry:bottom=0.75in -S --toc --chapters -o ../../$@ ../../$<) + $(Q) (cd obj/doc; pandoc --template=../../doc/template.latex --latex-engine=lualatex -V fontsize=10pt -V monofont=droidsansmono -V monoscale=.70 -V verbatimspacing=.85 -V mainfont=droidserif -V sansfont=droidsans -V documentclass:book -V geometry:top=1.0in -V geometry:bottom=0.75in -S --toc --chapters -o ../../$@ ../../$<) obj/doc/snabb.html: obj/doc/snabb.markdown $(E) "PANDOC $@" diff --git a/src/apps/basic/README.md b/src/apps/basic/README.md index 7473b17abf..0e0215ff14 100644 --- a/src/apps/basic/README.md +++ b/src/apps/basic/README.md @@ -6,8 +6,10 @@ functionality for use in you app networks. ## Source The `Source` app is a synthetic packet generator. On each breath it fills -each attached output link with new packets. 
The packet data is -uninitialized garbage and each packet is 60 bytes long. +each attached output link with new packets. It accepts a number as its +configuration argument which is the byte size of the generated packets. By +default, each packet is 60 bytes long. The packet data is initialized with +zero bytes. DIAGRAM: Source +--------+ diff --git a/src/apps/basic/basic_apps.lua b/src/apps/basic/basic_apps.lua index ca2cb35093..7f689c4794 100644 --- a/src/apps/basic/basic_apps.lua +++ b/src/apps/basic/basic_apps.lua @@ -3,7 +3,6 @@ module(...,package.seeall) local app = require("core.app") -local freelist = require("core.freelist") local packet = require("core.packet") local link = require("core.link") local transmit, receive = link.transmit, link.receive diff --git a/src/apps/intel/intel10g.lua b/src/apps/intel/intel10g.lua index ab3d91acbc..51e2d640fa 100644 --- a/src/apps/intel/intel10g.lua +++ b/src/apps/intel/intel10g.lua @@ -23,8 +23,7 @@ local timer = require("core.timer") local bits, bitset = lib.bits, lib.bitset local band, bor, lshift = bit.band, bit.bor, bit.lshift -num_descriptors = 512 ---num_descriptors = 32 +num_descriptors = 1024 -- Defaults for configurable items local default = { diff --git a/src/apps/intel/intel_app.lua b/src/apps/intel/intel_app.lua index 80cace3d9f..b1f3449aa1 100644 --- a/src/apps/intel/intel_app.lua +++ b/src/apps/intel/intel_app.lua @@ -9,7 +9,6 @@ local lib = require("core.lib") local pci = require("lib.hardware.pci") local register = require("lib.hardware.register") local intel10g = require("apps.intel.intel10g") -local freelist = require("core.freelist") local receive, transmit, full, empty = link.receive, link.transmit, link.full, link.empty Intel82599 = {} Intel82599.__index = Intel82599 diff --git a/src/apps/keyed_ipv6_tunnel/selftest.cap.input b/src/apps/keyed_ipv6_tunnel/selftest.cap.input deleted file mode 100644 index 22fc1e478e..0000000000 Binary files a/src/apps/keyed_ipv6_tunnel/selftest.cap.input and /dev/null 
differ diff --git a/src/apps/keyed_ipv6_tunnel/tunnel.lua b/src/apps/keyed_ipv6_tunnel/tunnel.lua index afd2aa0612..9e394cfcfc 100644 --- a/src/apps/keyed_ipv6_tunnel/tunnel.lua +++ b/src/apps/keyed_ipv6_tunnel/tunnel.lua @@ -243,9 +243,8 @@ prepare_header_template() function selftest () print("Keyed IPv6 tunnel selftest") local ok = true - - local input_file = "apps/keyed_ipv6_tunnel/selftest.cap.input" - local output_file = "apps/keyed_ipv6_tunnel/selftest.cap.output" + local Synth = require("apps.test.synth").Synth + local Match = require("apps.test.match").Match local tunnel_config = { local_address = "00::2:1", remote_address = "00::2:1", @@ -255,19 +254,19 @@ function selftest () } -- should be symmetric for local "loop-back" test local c = config.new() - config.app(c, "source", pcap.PcapReader, input_file) config.app(c, "tunnel", SimpleKeyedTunnel, tunnel_config) - config.app(c, "sink", pcap.PcapWriter, output_file) + config.app(c, "match", Match) + config.app(c, "comparator", Synth) + config.app(c, "source", Synth) config.link(c, "source.output -> tunnel.decapsulated") + config.link(c, "comparator.output -> match.comparator") config.link(c, "tunnel.encapsulated -> tunnel.encapsulated") - config.link(c, "tunnel.decapsulated -> sink.input") + config.link(c, "tunnel.decapsulated -> match.rx") app.configure(c) - app.main({duration = 0.25}) -- should be long enough... 
+ app.main({duration = 0.0001, report = {showapps=true,showlinks=true}}) -- Check results - if io.open(input_file):read('*a') ~= - io.open(output_file):read('*a') - then + if #engine.app_table.match:errors() ~= 0 then ok = false end diff --git a/src/apps/lwaftr/vlan.lua b/src/apps/lwaftr/vlan.lua deleted file mode 100644 index ec1fd1f6ed..0000000000 --- a/src/apps/lwaftr/vlan.lua +++ /dev/null @@ -1,65 +0,0 @@ -module(..., package.seeall) - -local packet = require("core.packet") -local bit = require("bit") -local ffi = require("ffi") - -local C = ffi.C -local receive, transmit = link.receive, link.transmit -local cast = ffi.cast - -Tagger = {} -Untagger = {} - --- 802.1q -local dotq_tpid = 0x8100 -local o_ethernet_ethertype = 12 -local uint32_ptr_t = ffi.typeof('uint32_t*') - -local function make_vlan_tag(tag) - return ffi.C.htonl(bit.bor(bit.lshift(dotq_tpid, 16), tag)) -end - -function Tagger:new(conf) - local o = setmetatable({}, {__index=Tagger}) - o.tag = make_vlan_tag(assert(conf.tag)) - return o -end - -function Tagger:push () - local input, output = self.input.input, self.output.output - local tag = self.tag - for _=1,link.nreadable(input) do - local pkt = receive(input) - local payload = pkt.data + o_ethernet_ethertype - local length = pkt.length - pkt.length = length + 4 - C.memmove(payload + 4, payload, length - o_ethernet_ethertype) - cast(uint32_ptr_t, payload)[0] = tag - transmit(output, pkt) - end -end - -function Untagger:new(conf) - local o = setmetatable({}, {__index=Untagger}) - o.tag = make_vlan_tag(assert(conf.tag)) - return o -end - -function Untagger:push () - local input, output = self.input.input, self.output.output - local tag = self.tag - for _=1,link.nreadable(input) do - local pkt = receive(input) - local payload = pkt.data + o_ethernet_ethertype - if cast(uint32_ptr_t, payload)[0] ~= tag then - -- Incorrect VLAN tag; drop. 
- packet.free(pkt) - else - local length = pkt.length - pkt.length = length - 4 - C.memmove(payload, payload + 4, length - o_ethernet_ethertype - 4) - transmit(output, pkt) - end - end -end diff --git a/src/apps/socket/raw.lua b/src/apps/socket/raw.lua index f7e6dc706a..a6066b338b 100644 --- a/src/apps/socket/raw.lua +++ b/src/apps/socket/raw.lua @@ -80,16 +80,20 @@ end function selftest () -- Send a packet over the loopback device and check -- that it is received correctly. - -- XXX Beware of a race condition with unrelated traffic over the - -- loopback device. local datagram = require("lib.protocol.datagram") local ethernet = require("lib.protocol.ethernet") local ipv6 = require("lib.protocol.ipv6") - - -- Initialize RawSocket. - local lo = RawSocket:new("lo") - lo.input, lo.output = {}, {} - lo.input.rx, lo.output.tx = link.new("test1"), link.new("test2") + local Match = require("apps.test.match").Match + + -- Initialize RawSocket and Match. + local c = config.new() + config.app(c, "lo", RawSocket, "lo") + config.app(c, "match", Match, {fuzzy=true}) + config.link(c, "lo.tx->match.rx") + engine.configure(c) + local link_in, link_cmp = link.new("test_in"), link.new("test_cmp") + engine.app_table.lo.input.rx = link_in + engine.app_table.match.input.comparator = link_cmp -- Construct packet. local dg_tx = datagram:new() local src = ethernet:pton("02:00:00:00:00:01") @@ -99,22 +103,16 @@ function selftest () dst = localhost, next_header = 59, -- No next header. hop_limit = 1})) - dg_tx:push(ethernet:new({src = src, - dst = dst, + dg_tx:push(ethernet:new({src = src, + dst = dst, type = 0x86dd})) - -- Transmit packet. - link.transmit(lo.input.rx, dg_tx:packet()) - lo:push() - -- Receive packet. - lo:pull() - local dg_rx = datagram:new(link.receive(lo.output.tx), ethernet) - -- Assert packet was received OK. 
- assert(dg_rx:parse({{ethernet, function(eth) - return(eth:src_eq(src) and eth:dst_eq(dst) and eth:type() == 0x86dd) - end }, { ipv6, function(ipv6) - return(ipv6:src_eq(localhost) and ipv6:dst_eq(localhost)) - end } }), "loopback test failed") - lo:stop() + -- Transmit packets. + link.transmit(link_in, dg_tx:packet()) + link.transmit(link_cmp, packet.clone(dg_tx:packet())) + engine.app_table.lo:push() + -- Run engine. + engine.main({duration = 0.01, report = {showapps=true,showlinks=true}}) + assert(#engine.app_table.match:errors() == 0) print("selftest passed") -- XXX Another useful test would be to feed a pcap file with diff --git a/src/apps/solarflare/solarflare.lua b/src/apps/solarflare/solarflare.lua index e1e54a56c2..b410bc7241 100644 --- a/src/apps/solarflare/solarflare.lua +++ b/src/apps/solarflare/solarflare.lua @@ -3,7 +3,6 @@ module(...,package.seeall) local lib = require("core.lib") -local freelist = require("core.freelist") local packet = require("core.packet") require("apps.solarflare.ef_vi_h") local pci = require("lib.hardware.pci") diff --git a/src/apps/tap/selftest.cap b/src/apps/tap/selftest.cap deleted file mode 100644 index d4c2b2b100..0000000000 Binary files a/src/apps/tap/selftest.cap and /dev/null differ diff --git a/src/apps/tap/selftest.sh b/src/apps/tap/selftest.sh index 0d807d489a..56a658be43 100755 --- a/src/apps/tap/selftest.sh +++ b/src/apps/tap/selftest.sh @@ -1,15 +1,15 @@ #!/usr/bin/env bash -sudo ip netns add snabbtest || exit $TEST_SKIPPED -sudo ip netns exec snabbtest ip link add name snabbtest type bridge -sudo ip netns exec snabbtest ip link set up dev snabbtest +ip netns add snabbtest || exit $TEST_SKIPPED +ip netns exec snabbtest ip link add name snabbtest type bridge +ip netns exec snabbtest ip link set up dev snabbtest -sudo ip netns exec snabbtest ip tuntap add tapsrc mode tap -sudo ip netns exec snabbtest ip link set up dev tapsrc -sudo ip netns exec snabbtest ip link set master snabbtest dev tapsrc -sudo ip netns exec 
snabbtest ip tuntap add tapdst mode tap -sudo ip netns exec snabbtest ip link set up dev tapdst -sudo ip netns exec snabbtest ip link set master snabbtest dev tapdst +ip netns exec snabbtest ip tuntap add tapsrc mode tap +ip netns exec snabbtest ip link set up dev tapsrc +ip netns exec snabbtest ip link set master snabbtest dev tapsrc +ip netns exec snabbtest ip tuntap add tapdst mode tap +ip netns exec snabbtest ip link set up dev tapdst +ip netns exec snabbtest ip link set master snabbtest dev tapdst -sudo SNABB_TAPTEST=yes ip netns exec snabbtest ./snabb snsh -t apps.tap.tap +SNABB_TAPTEST=yes ip netns exec snabbtest ./snabb snsh -t apps.tap.tap -sudo ip netns delete snabbtest +ip netns delete snabbtest diff --git a/src/apps/tap/tap.lua b/src/apps/tap/tap.lua index 98c561a25d..5ae3f47a4e 100644 --- a/src/apps/tap/tap.lua +++ b/src/apps/tap/tap.lua @@ -78,44 +78,27 @@ end function selftest() -- tapsrc and tapdst are bridged together in linux. Packets are sent out of tapsrc and they are expected - -- to arrive back on tapdst. Linux may create other control-plane packets so to avoid races if a packet doesn't - -- match the one we just sent keep looking until it does match. + -- to arrive back on tapdst. 
-- The linux bridge does mac address learning so some care must be taken with the preparation of selftest.cap -- A mac address should appear only as the source address or destination address -- This test should only be run from inside apps/tap/selftest.sh if not os.getenv("SNABB_TAPTEST") then os.exit(engine.test_skipped_code) end - local pcap = require("lib.pcap.pcap") - local tapsrc = Tap:new("tapsrc") - local tapdst = Tap:new("tapdst") - local linksrc = link.new("linksrc") - local linkreturn = link.new("linkreturn") - tapsrc.input = { input = linksrc } - tapdst.output = { output = linkreturn } - local records = pcap.records("apps/tap/selftest.cap") - local i = 0 - repeat - i = i + 1 - local data, record, extra = records() - if data then - local p = packet.from_string(data) - link.transmit(linksrc, packet.clone(p)) - tapsrc:push() - while true do - local ok, err = S.select({readfds = {tapdst.sock}}, 10) - if err then error("Select error: " .. tostring(err)) end - if ok.count == 0 then error("select timed out or packet " .. tostring(i) .. 
" didn't match") end - - tapdst:pull() - local pret = link.receive(linkreturn) - if packet.length(pret) == packet.length(p) and C.memcmp(packet.data(pret), packet.data(p), packet.length(pret)) then - packet.free(pret) - break - end - packet.free(pret) - end - packet.free(p) - end - until not data + local Synth = require("apps.test.synth").Synth + local Match = require("apps.test.match").Match + local c = config.new() + config.app(c, "tap_in", Tap, "tapsrc") + config.app(c, "tap_out", Tap, "tapdst") + config.app(c, "match", Match, {fuzzy=true,modest=true}) + config.app(c, "comparator", Synth, {dst="00:50:56:fd:19:ca", + src="00:0c:29:3e:ca:7d"}) + config.app(c, "source", Synth, {dst="00:50:56:fd:19:ca", + src="00:0c:29:3e:ca:7d"}) + config.link(c, "comparator.output->match.comparator") + config.link(c, "source.output->tap_in.input") + config.link(c, "tap_out.output->match.rx") + engine.configure(c) + engine.main({duration = 0.01, report = {showapps=true,showlinks=true}}) + assert(#engine.app_table.match:errors() == 0) end diff --git a/src/apps/test/README.md b/src/apps/test/README.md new file mode 100644 index 0000000000..8af07fbda0 --- /dev/null +++ b/src/apps/test/README.md @@ -0,0 +1,71 @@ +# Test Apps + +## Match (apps.test.match) + +The `Match` app compares packets received on its input port `rx` with those +received on the reference input port `comparator`, and reports mismatches as +well as packets from `comparator` that were not matched. + + DIAGRAM: Match + +----------+ + | | + rx ----* | + | Match | + comparator ---* | + | | + +----------+ + +— Method **Match:errors** + +Returns the recorded errors as an array of strings. + +### Configuration + +The `Match` app accepts a table as its configuration argument. The following +keys are defined: + +— Key **fuzzy** + +*Optional.* If this key is `true` packets from `rx` that do not match the next +packet from `comparator` are ignored. The default is `false`. 
+ +— Key **modest** + +*Optional.* If this key is `true` unmatched packets from `comparator` are +ignored if at least one packet from `rx` was successfully matched. The default +is `false`. + + +## Synth (apps.test.synth) + +The `Synth` app generates synthetic packets with Ethernet headers and +alternating payload sizes. On each breath it fills each attached output link +with new packets. + + DIAGRAM: Synth + +-------+ + | | + | *---- (any) + | | + | Synth *---- (any) + | | + | *---- (any) + | | + +-------+ + +### Configuration + +The `Synth` app accepts a table as its configuration argument. The following +keys are defined: + +— Key **src** + +— Key **dst** + +Source and destination MAC addresses in human-readable form. The default is +`"00:00:00:00:00:00"`. + +— Key **sizes** + +An array of numbers designating the packet payload sizes. The default is +`{64}`. diff --git a/src/apps/test/lwaftr.lua b/src/apps/test/lwaftr.lua new file mode 100644 index 0000000000..aec09f6f22 --- /dev/null +++ b/src/apps/test/lwaftr.lua @@ -0,0 +1,409 @@ +module(...,package.seeall) + +local lib = require("core.lib") +local app = require("core.app") +local packet = require("core.packet") +local link = require("core.link") +local ethernet = require("lib.protocol.ethernet") +local ipv4 = require("lib.protocol.ipv4") +local ipv6 = require("lib.protocol.ipv6") +local ipsum = require("lib.checksum").ipsum + +local ffi = require("ffi") +local C = ffi.C +local cast = ffi.cast +local copy = ffi.copy + +local PROTO_IPV4_ENCAPSULATION = 0x4 +local PROTO_VLAN = C.htons(0x8100) +local PROTO_IPV4 = C.htons(0x0800) +local PROTO_IPV6 = C.htons(0x86DD) + +local DEFAULT_TTL = 255 +local MAGIC = 0xaffeface + +local ether_header_t = ffi.typeof[[ +struct { + uint8_t ether_dhost[6]; + uint8_t ether_shost[6]; + uint16_t ether_type; +} __attribute__((packed)) +]] +local ether_header_ptr_type = ffi.typeof("$*", ether_header_t) +local ethernet_header_size = ffi.sizeof(ether_header_t) +local OFFSET_ETHERTYPE = 
12 + +local ether_vlan_header_type = ffi.typeof([[ +struct { + uint16_t tag; + uint16_t ether_type; +} +]]) +ether_vlan_header_ptr_type = ffi.typeof("$*", ether_vlan_header_type) +ether_vlan_header_size = ffi.sizeof(ether_vlan_header_type) +local OFFSET_ETHERTYPE_VLAN = OFFSET_ETHERTYPE + ether_vlan_header_size + +local ipv4hdr_t = ffi.typeof[[ +struct { + uint16_t ihl_v_tos; // ihl:4, version:4, tos(dscp:6 + ecn:2) + uint16_t total_length; + uint16_t id; + uint16_t frag_off; // flags:3, fragment_offset:13 + uint8_t ttl; + uint8_t protocol; + uint16_t checksum; + uint8_t src_ip[4]; + uint8_t dst_ip[4]; +} __attribute__((packed)) +]] +local ipv4_header_ptr_type = ffi.typeof("$*", ipv4hdr_t) + +local ipv6_ptr_type = ffi.typeof([[ +struct { + uint32_t v_tc_fl; // version, tc, flow_label + uint16_t payload_length; + uint8_t next_header; + uint8_t hop_limit; + uint8_t src_ip[16]; + uint8_t dst_ip[16]; +} __attribute__((packed)) +]]) +local ipv6_header_ptr_type = ffi.typeof("$*", ipv6_ptr_type) +local ipv6_header_size = ffi.sizeof(ipv6_ptr_type) + +local udp_header_t = ffi.typeof[[ +struct { + uint16_t src_port; + uint16_t dst_port; + uint16_t len; + uint16_t checksum; +} __attribute__((packed)) +]] +local udp_header_ptr_type = ffi.typeof("$*", udp_header_t) +local udp_header_size = ffi.sizeof(udp_header_t) + +local payload_t = ffi.typeof[[ +struct { + uint32_t magic; + uint32_t number; +} __attribute__((packed)) +]] +local payload_ptr_type = ffi.typeof("$*", payload_t) +local payload_size = ffi.sizeof(payload_t) + +local uint16_ptr_t = ffi.typeof("uint16_t*") +local uint32_ptr_t = ffi.typeof("uint32_t*") + +local n_cache_src_ipv6 = ipv6:pton("::") + +local function rd32(offset) + return cast(uint32_ptr_t, offset)[0] +end + +local function wr32(offset, val) + cast(uint32_ptr_t, offset)[0] = val +end + +local function inc_ipv6(ipv6) + for i=15,0,-1 do + if ipv6[i] == 255 then + ipv6[i] = 0 + else + ipv6[i] = ipv6[i] + 1 + break + end + end + return ipv6 +end + 
+Lwaftrgen = {} + +local receive, transmit = link.receive, link.transmit + +function Lwaftrgen:new(arg) + local conf = arg and config.parse_app_arg(arg) or {} + local dst_mac = ethernet:pton(conf.dst_mac) + local src_mac = ethernet:pton(conf.src_mac) + local vlan = conf.vlan + local b4_ipv6 = conf.b4_ipv6 and ipv6:pton(conf.b4_ipv6) + local b4_ipv4 = conf.b4_ipv4 and ipv4:pton(conf.b4_ipv4) + local public_ipv4 = conf.public_ipv4 and ipv4:pton(conf.public_ipv4) + local aftr_ipv6 = conf.aftr_ipv6 and ipv6:pton(conf.aftr_ipv6) + + local ipv4_pkt = packet.allocate() + local eth_hdr = cast(ether_header_ptr_type, ipv4_pkt.data) + eth_hdr.ether_dhost, eth_hdr.ether_shost = dst_mac, src_mac + + local ipv4_hdr, udp_offset + if vlan then + udp_offset = 38 + eth_hdr.ether_type = PROTO_VLAN + local vlan_hdr = cast(ether_vlan_header_ptr_type, ipv4_pkt.data + ethernet_header_size) + vlan_hdr.ether_type = PROTO_IPV4 + vlan_hdr.tag = C.htons(vlan) + ipv4_hdr = cast(ipv4_header_ptr_type, ipv4_pkt.data + ethernet_header_size + ether_vlan_header_size) + else + udp_offset = 34 + eth_hdr.ether_type = PROTO_IPV4 + ipv4_hdr = cast(ipv4_header_ptr_type, ipv4_pkt.data + ethernet_header_size) + end + + ipv4_hdr.src_ip = public_ipv4 + ipv4_hdr.dst_ip = b4_ipv4 + ipv4_hdr.ttl = 15 + ipv4_hdr.ihl_v_tos = C.htons(0x4500) -- v4 + ipv4_hdr.id = 0 + ipv4_hdr.frag_off = 0 + + local ipv4_udp_hdr, ipv4_payload + + ipv4_hdr.protocol = 17 -- UDP(17) + ipv4_udp_hdr = cast(udp_header_ptr_type, ipv4_pkt.data + udp_offset) + ipv4_udp_hdr.src_port = C.htons(12345) + ipv4_udp_hdr.checksum = 0 + ipv4_payload = cast(payload_ptr_type, ipv4_pkt.data + udp_offset + udp_header_size) + ipv4_payload.magic = MAGIC + ipv4_payload.number = 0 + + -- IPv4 in IPv6 packet + copy(n_cache_src_ipv6, b4_ipv6, 16) + local ipv6_pkt = packet.allocate() + local eth_hdr = cast(ether_header_ptr_type, ipv6_pkt.data) + eth_hdr.ether_dhost, eth_hdr.ether_shost = dst_mac, src_mac + + + local ipv6_hdr, ipv6_ipv4_hdr + if vlan then + 
eth_hdr.ether_type = PROTO_VLAN + local vlan_hdr = cast(ether_vlan_header_ptr_type, ipv6_pkt.data + ethernet_header_size) + vlan_hdr.ether_type = PROTO_IPV6 + vlan_hdr.tag = C.htons(vlan) + ipv6_hdr = cast(ipv6_header_ptr_type, ipv6_pkt.data + ethernet_header_size + ether_vlan_header_size) + ipv6_ipv4_hdr = cast(ipv4_header_ptr_type, ipv6_pkt.data + ethernet_header_size + ether_vlan_header_size + ipv6_header_size) + else + eth_hdr.ether_type = PROTO_IPV6 + ipv6_hdr = cast(ipv6_header_ptr_type, ipv6_pkt.data + ethernet_header_size) + ipv6_ipv4_hdr = cast(ipv4_header_ptr_type, ipv6_pkt.data + ethernet_header_size + ipv6_header_size) + end + + lib.bitfield(32, ipv6_hdr, 'v_tc_fl', 0, 4, 6) -- IPv6 Version + lib.bitfield(32, ipv6_hdr, 'v_tc_fl', 4, 8, 1) -- Traffic class + ipv6_hdr.next_header = PROTO_IPV4_ENCAPSULATION + ipv6_hdr.hop_limit = DEFAULT_TTL + ipv6_hdr.dst_ip = aftr_ipv6 + + ipv6_ipv4_hdr.dst_ip = public_ipv4 + ipv6_ipv4_hdr.ttl = 15 + ipv6_ipv4_hdr.ihl_v_tos = C.htons(0x4500) -- v4 + ipv6_ipv4_hdr.id = 0 + ipv6_ipv4_hdr.frag_off = 0 + + local ipv6_ipv4_udp_hdr, ipv6_payload + + local total_packet_count = 0 + for _,size in ipairs(conf.sizes) do + -- count for IPv4 and IPv6 packets (40 bytes IPv6 encap header) + if conf.ipv4_only or conf.ipv6_only then + total_packet_count = total_packet_count + 1 + else + total_packet_count = total_packet_count + 2 + end + end + + ipv6_ipv4_hdr.protocol = 17 -- UDP(17) + ipv6_ipv4_udp_hdr = cast(udp_header_ptr_type, ipv6_pkt.data + udp_offset + ipv6_header_size) + ipv6_ipv4_udp_hdr.dst_port = C.htons(12345) + ipv6_ipv4_udp_hdr.checksum = 0 + ipv6_payload = cast(payload_ptr_type, ipv6_pkt.data + udp_offset + ipv6_header_size + udp_header_size) + ipv6_payload.magic = MAGIC + ipv6_payload.number = 0 + + local o = { + b4_ipv6 = b4_ipv6, + b4_ipv4 = b4_ipv4, + b4_port = conf.b4_port, + current_port = conf.b4_port, + b4_ipv4_offset = 0, + ipv6_address = n_cache_src_ipv6, + count = conf.count, + single_pass = conf.single_pass, + 
current_count = 0, + ipv4_pkt = ipv4_pkt, + ipv4_hdr = ipv4_hdr, + ipv4_payload = ipv4_payload, + ipv6_hdr = ipv6_hdr, + ipv6_pkt = ipv6_pkt, + ipv6_payload = ipv6_payload, + ipv6_ipv4_hdr = ipv6_ipv4_hdr, + ipv4_udp_hdr = ipv4_udp_hdr, + ipv6_ipv4_udp_hdr = ipv6_ipv4_udp_hdr, + ipv4_only = conf.ipv4_only, + ipv6_only = conf.ipv6_only, + vlan = vlan, + udp_offset = udp_offset, + protocol = conf.protocol, + rate = conf.rate, + sizes = conf.sizes, + total_packet_count = total_packet_count, + bucket_content = conf.rate * 1e6, + ipv4_packets = 0, ipv4_bytes = 0, + ipv6_packets = 0, ipv6_bytes = 0, + ipv4_packet_number = 0, ipv6_packet_number = 0, + last_rx_ipv4_packet_number = 0, last_rx_ipv6_packet_number = 0, + lost_packets = 0 + } + return setmetatable(o, {__index=Lwaftrgen}) +end + +function Lwaftrgen:push () + + local output = self.output.output + local input = self.input.input + local ipv6_packets = self.ipv6_packets + local ipv6_bytes = self.ipv6_bytes + local ipv4_packets = self.ipv4_packets + local ipv4_bytes = self.ipv4_bytes + local lost_packets = self.lost_packets + local udp_offset = self.udp_offset + + if self.current == 0 then + main.exit(0) + end + + -- count and trash incoming packets + for _=1,link.nreadable(input) do + local pkt = receive(input) + if cast(uint16_ptr_t, pkt.data + OFFSET_ETHERTYPE)[0] == PROTO_IPV6 then + ipv6_bytes = ipv6_bytes + pkt.length + ipv6_packets = ipv6_packets + 1 + local payload = cast(payload_ptr_type, pkt.data + udp_offset + ipv6_header_size + udp_header_size) + if payload.magic == MAGIC then + if self.last_rx_ipv6_packet_number > 0 then + lost_packets = lost_packets + payload.number - self.last_rx_ipv6_packet_number - 1 + end + self.last_rx_ipv6_packet_number = payload.number + end + else + ipv4_bytes = ipv4_bytes + pkt.length + ipv4_packets = ipv4_packets + 1 + local payload = cast(payload_ptr_type, pkt.data + udp_offset + udp_header_size) + if payload.magic == MAGIC then + if self.last_rx_ipv4_packet_number > 0 then + 
lost_packets = lost_packets + payload.number - self.last_rx_ipv4_packet_number - 1 + end + self.last_rx_ipv4_packet_number = payload.number + end + end + packet.free(pkt) + end + + local cur_now = tonumber(app.now()) + self.period_start = self.period_start or cur_now + local elapsed = cur_now - self.period_start + if elapsed > 1 then + local ipv6_packet_rate = ipv6_packets / elapsed / 1e6 + local ipv4_packet_rate = ipv4_packets / elapsed / 1e6 + local ipv6_octet_rate = ipv6_bytes * 8 / 1e9 / elapsed + local ipv4_octet_rate = ipv4_bytes * 8 / 1e9 / elapsed + local lost_rate = math.abs(lost_packets / (ipv6_octet_rate + ipv4_octet_rate) / 10000) + print(string.format('v6+v4: %.3f+%.3f = %.6f MPPS, %.3f+%.3f = %.6f Gbps, lost %.3f%%', + ipv6_packet_rate, ipv4_packet_rate, ipv6_packet_rate + ipv4_packet_rate, + ipv6_octet_rate, ipv4_octet_rate, ipv6_octet_rate + ipv4_octet_rate, lost_rate)) + self.period_start = cur_now + self.ipv6_bytes, self.ipv6_packets = 0, 0 + self.ipv4_bytes, self.ipv4_packets = 0, 0 + self.lost_packets = 0 + else + self.ipv4_bytes, self.ipv4_packets = ipv4_bytes, ipv4_packets + self.ipv6_bytes, self.ipv6_packets = ipv6_bytes, ipv6_packets + self.lost_packets = lost_packets + end + + local ipv4_hdr = self.ipv4_hdr + local ipv6_hdr = self.ipv6_hdr + local ipv6_ipv4_hdr = self.ipv6_ipv4_hdr + local ipv4_udp_hdr = self.ipv4_udp_hdr + local ipv6_ipv4_udp_hdr = self.ipv6_ipv4_udp_hdr + + local cur_now = tonumber(app.now()) + local last_time = self.last_time or cur_now + self.bucket_content = self.bucket_content + self.rate * 1e6 * (cur_now - last_time) + self.last_time = cur_now + + while link.nwritable(output) > self.total_packet_count and + self.total_packet_count <= self.bucket_content do + self.bucket_content = self.bucket_content - self.total_packet_count + + ipv4_hdr.dst_ip = self.b4_ipv4 + ipv6_ipv4_hdr.src_ip = self.b4_ipv4 + ipv6_hdr.src_ip = self.b4_ipv6 + local ipdst = C.ntohl(rd32(ipv4_hdr.dst_ip)) + ipdst = C.htonl(ipdst + 
self.b4_ipv4_offset) + wr32(ipv4_hdr.dst_ip, ipdst) + wr32(ipv6_ipv4_hdr.src_ip, ipdst) + + ipv4_udp_hdr.dst_port = C.htons(self.current_port) + ipv6_ipv4_udp_hdr.src_port = C.htons(self.current_port) + + for _,size in ipairs(self.sizes) do + + if not self.ipv6_only then + ipv4_hdr.total_length = C.htons(size) + if self.vlan then + ipv4_udp_hdr.len = C.htons(size - 28 + 4) + self.ipv4_pkt.length = size + ethernet_header_size + 4 + else + ipv4_udp_hdr.len = C.htons(size - 28) + self.ipv4_pkt.length = size + ethernet_header_size + end + ipv4_hdr.checksum = 0 + ipv4_hdr.checksum = C.htons(ipsum(self.ipv4_pkt.data + ethernet_header_size, 20, 0)) + self.ipv4_payload.number = self.ipv4_packet_number; + self.ipv4_packet_number = self.ipv4_packet_number + 1 + local ipv4_pkt = packet.clone(self.ipv4_pkt) + transmit(output, ipv4_pkt) + end + + if not self.ipv4_only then + ipv6_hdr.payload_length = C.htons(size) + ipv6_ipv4_hdr.total_length = C.htons(size) + if self.vlan then + ipv6_ipv4_udp_hdr.len = C.htons(size - 28 + 4) + self.ipv6_pkt.length = size + 54 + 4 + else + ipv6_ipv4_udp_hdr.len = C.htons(size - 28) + self.ipv6_pkt.length = size + 54 + end + self.ipv6_payload.number = self.ipv6_packet_number; + self.ipv6_packet_number = self.ipv6_packet_number + 1 + local ipv6_pkt = packet.clone(self.ipv6_pkt) + transmit(output, ipv6_pkt) + end + + self.current_count = self.current_count + 1 + self.current_port = self.current_port + self.b4_port + + self.b4_ipv6 = inc_ipv6(self.b4_ipv6) + + if self.current_port > 65535 then + self.current_port = self.b4_port + self.b4_ipv4_offset = self.b4_ipv4_offset + 1 + end + + if self.current_count >= self.count * self.total_packet_count then + if self.single_pass then + print(string.format("generated %d packets", self.current_count)) + -- make sure we won't generate more packets in the same breath, then exit + self.current = 0 + self.bucket_content = 0 + end + self.current_count = 0 + self.current_port = self.b4_port + self.b4_ipv4_offset 
= 0 + copy(self.b4_ipv6, self.ipv6_address, 16) + end + + end + end +end + diff --git a/src/apps/test/match.lua b/src/apps/test/match.lua new file mode 100644 index 0000000000..48427d50bc --- /dev/null +++ b/src/apps/test/match.lua @@ -0,0 +1,95 @@ +module(...,package.seeall) + +local ffi = require("ffi") +local C = ffi.C +local lib = require("core.lib") + +local function dump (p) + return lib.hexdump(ffi.string(packet.data(p), packet.length(p))) +end + +Match = {} + +function Match:new (arg) + local conf = arg and config.parse_app_arg(arg) or {} + return setmetatable({ fuzzy = conf.fuzzy, + modest = conf.modest, + seen = 0, + errs = { } }, + { __index=Match }) +end + +function Match:push () + while not link.empty(self.input.rx) do + local p = link.receive(self.input.rx) + local cmp = link.front(self.input.comparator) + if not cmp then + elseif packet.length(cmp) ~= packet.length(p) + or C.memcmp(packet.data(cmp), packet.data(p), packet.length(cmp)) ~= 0 then + if not self.fuzzy then + table.insert(self.errs, "Mismatch:\n"..dump(cmp).."\n"..dump(p)) + end + else + self.seen = self.seen + 1 + packet.free(link.receive(self.input.comparator)) + end + packet.free(p) + end +end + +function Match:report () + for _, error in ipairs(self:errors()) do + print(error) + end +end + +function Match:errors () + if not (self.modest and self.seen > 0) then + while not link.empty(self.input.comparator) do + local p = link.receive(self.input.comparator) + table.insert(self.errs, "Not matched:\n"..dump(p)) + packet.free(p) + end + end + return self.errs +end + +function selftest() + local basic_apps = require("apps.basic.basic_apps") + local c = config.new() + + config.app(c, "sink", Match, {modest=true}) + config.app(c, "comparator", basic_apps.Source, 8) + config.link(c, "comparator.output -> sink.comparator") + engine.configure(c) + engine.app_table.sink.input.rx = link.new("null") + engine.app_table.sink.seen = 1 + engine.main({duration=0.0001}) + 
assert(#engine.app_table.sink:errors() == 0) + + engine.configure(config.new()) + config.app(c, "sink", Match) + config.app(c, "src", basic_apps.Source, 8) + config.link(c, "src.output -> sink.rx") + engine.configure(c) + engine.main({duration=0.0001}) + assert(#engine.app_table.sink:errors() == 0) + + engine.configure(config.new()) + config.app(c, "comparator", basic_apps.Source, 12) + engine.configure(c) + engine.main({duration=0.0001}) + assert(#engine.app_table.sink:errors() > 0) + + engine.configure(config.new()) + config.app(c, "sink", Match, {fuzzy=true}) + config.app(c, "comparator", basic_apps.Source, 8) + config.app(c, "garbage", basic_apps.Source, 12) + config.app(c, "join", basic_apps.Join) + config.link(c, "src.output -> join.src") + config.link(c, "garbage.output -> join.garbage") + config.link(c, "join.out -> sink.rx") + engine.configure(c) + engine.main({duration=0.0001}) + assert(#engine.app_table.sink:errors() == 0) +end diff --git a/src/apps/test/synth.lua b/src/apps/test/synth.lua index 9da8590dd2..0ffc1c97fb 100644 --- a/src/apps/test/synth.lua +++ b/src/apps/test/synth.lua @@ -17,11 +17,14 @@ function Synth:new (arg) conf.dst = conf.dst or '00:00:00:00:00:00' local packets = {} for i, size in ipairs(conf.sizes) do - local ether = ethernet:new({ src = ethernet:pton(conf.src), - dst = ethernet:pton(conf.dst) }) local payload_size = size - ethernet:sizeof() + assert(payload_size >= 0 and payload_size <= 1536, + "Invalid payload size: "..payload_size) local data = ffi.new("char[?]", payload_size) local dgram = datagram:new(packet.from_pointer(data, payload_size)) + local ether = ethernet:new({ src = ethernet:pton(conf.src), + dst = ethernet:pton(conf.dst), + type = payload_size }) dgram:push(ether) packets[i] = dgram:packet() end @@ -46,19 +49,16 @@ end function selftest () local pcap = require("apps.pcap.pcap") + local Match = require("apps.test.match").Match local c = config.new() + config.app(c, "match", Match) + config.app(c, "reader", 
pcap.PcapReader, "apps/test/synth.pcap") config.app(c, "synth", Synth, { sizes = {32, 64, 128}, src = "11:11:11:11:11:11", dst = "22:22:22:22:22:22" }) - config.app(c, "writer", pcap.PcapWriter, "apps/test/synth.pcap.output") - config.link(c, "synth.output->writer.input") + config.link(c, "reader.output->match.comparator") + config.link(c, "synth.output->match.rx") engine.configure(c) - engine.main({ duration = 0.00000001, -- hack: one breath. - report = { showlinks = true } }) - - if io.open("apps/test/synth.pcap"):read('*a') ~= - io.open("apps/test/synth.pcap.output"):read('*a') - then - error("synth.pcap and synth.pcap.output differ.") - end + engine.main({ duration = 0.0001, report = {showapps=true,showlinks=true}}) + assert(#engine.app_table.match:errors() == 0) end diff --git a/src/apps/test/synth.pcap b/src/apps/test/synth.pcap index 4dd3dad5ed..0f231e4551 100644 Binary files a/src/apps/test/synth.pcap and b/src/apps/test/synth.pcap differ diff --git a/src/apps/test/synth.pcap.output b/src/apps/test/synth.pcap.output deleted file mode 100644 index 4dd3dad5ed..0000000000 Binary files a/src/apps/test/synth.pcap.output and /dev/null differ diff --git a/src/apps/vlan/README.md b/src/apps/vlan/README.md new file mode 100644 index 0000000000..e47ee1787b --- /dev/null +++ b/src/apps/vlan/README.md @@ -0,0 +1,54 @@ +# VLAN + +There are three VLAN related apps, `Tagger`, `Untagger` and `VlanMux`. The +`Tagger` and `Untagger` apps add or remove a VLAN tag whereas the `VlanMux` app +can multiplex and demultiplex packets to different output ports based on tag. + +## Tagger (apps.vlan.vlan) + +The `Tagger` app adds a VLAN tag, with the configured value, to packets +received on its `input` port and transmits them on its `output` port. + +### Configuration + +— Key **tag** + +*Required*. VLAN tag to add or remove from the packet. 
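For reference, the tag that `Tagger` inserts is the 4-byte 802.1Q header: a 16-bit TPID of 0x8100 followed by a 16-bit TCI whose low 12 bits carry the VLAN id (PCP and DEI are left at 0, as in `build_tag` in `vlan.lua`). The sketch below reproduces that layout with plain arithmetic so it runs on any Lua. The names `make_tag` and `tag_to_vid` are illustrative and not part of the app, and byte-order conversion is deliberately left out.

```lua
-- Illustrative only: mirrors the tag layout in host byte order.
local TPID = 0x8100                      -- 802.1Q tag protocol identifier

-- Combine the TPID with a 12-bit VLAN id (PCP and DEI stay 0).
local function make_tag (vid)
   assert(vid >= 0 and vid <= 0xFFF, "VLAN id must fit in 12 bits")
   return TPID * 0x10000 + vid
end

-- Recover the VLAN id from the low 12 bits of the tag.
local function tag_to_vid (tag)
   return tag % 0x1000
end

print(string.format("0x%08X", make_tag(10)))
print(tag_to_vid(make_tag(10)))
```

With a `tag` of 10 the combined value is 0x8100000A, and taking the low 12 bits gives back 10.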
+ + +## Untagger (apps.vlan.vlan) + +The `Untagger` app checks packets received on its `input` port for a VLAN tag, +removes the tag if it matches the configured VLAN tag and transmits the packets +on its `output` port. Packets that do not carry the configured VLAN tag are +dropped. + +### Configuration + +— Key **tag** + +*Required*. VLAN tag to add or remove from the packet. + + +## VlanMux (apps.vlan.vlan) + +Despite the name, the `VlanMux` app acts both as a multiplexer, i.e. it receives +packets on multiple input ports, adds a VLAN tag and transmits them on a single +output port, and as a demultiplexer, i.e. it receives packets on its `trunk` +port and distributes them over many output ports based on the VLAN tag of each +received packet. + +Packets received on its `trunk` input port with Ethernet type 0x8100 are +inspected for the VLAN tag and transmitted on an output port `vlanX` where *X* +is the VLAN tag parsed from the packet. If no such output port exists the +packet is dropped. Received packets with an Ethernet type other than 0x8100 are +transmitted on its `native` output port. + +Packets received on its `native` input port are transmitted verbatim on its +`trunk` output port. + +Packets received on input ports named `vlanX`, where *X* is a VLAN tag, will +have the VLAN tag *X* added and then be transmitted on its `trunk` output port. + +There is no configuration for the `VlanMux` app; simply connect it to your +other apps and it will base its actions on the names of the ports. 
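As a rough illustration of the port naming described above, the following untested sketch demultiplexes a tagged stream: a `Tagger` tags everything with VLAN 10 and feeds the `trunk` input of a `VlanMux`, which should then hand the untagged packets to its `vlan10` output. The app names (`source`, `tagger`, `mux`, `sink10`), the choice of VLAN 10, and the use of `basic_apps.Source` and `basic_apps.Sink` are assumptions made for the example. It has to run inside Snabb, where `config` and `engine` are available as globals.

```lua
-- Hypothetical wiring, for illustration only; names and VLAN id are assumptions.
local basic_apps = require("apps.basic.basic_apps")
local vlan = require("apps.vlan.vlan")

local c = config.new()
config.app(c, "source", basic_apps.Source)        -- generates test packets
config.app(c, "tagger", vlan.Tagger, {tag = 10})  -- adds VLAN tag 10
config.app(c, "mux", vlan.VlanMux)                -- takes no configuration
config.app(c, "sink10", basic_apps.Sink)          -- receives VLAN 10 traffic

config.link(c, "source.output -> tagger.input")
config.link(c, "tagger.output -> mux.trunk")      -- tagged frames enter on trunk
config.link(c, "mux.vlan10 -> sink10.input")      -- tag is popped before output

engine.configure(c)
engine.main({duration = 0.1})
```

Linking an app to `mux.vlan10` as an input instead would make the mux add tag 10 and send the packet out on `trunk`.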
diff --git a/src/apps/vlan/vlan.lua b/src/apps/vlan/vlan.lua new file mode 100644 index 0000000000..763017beb8 --- /dev/null +++ b/src/apps/vlan/vlan.lua @@ -0,0 +1,177 @@ +module(..., package.seeall) + +local packet = require("core.packet") +local bit = require("bit") +local ffi = require("ffi") +local lib = require("core.lib") + +local C = ffi.C +local receive, transmit = link.receive, link.transmit +local cast = ffi.cast +local htons, htonl = lib.htons, lib.htonl +local ntohs, ntohl = htons, htonl + +Tagger = {} +Untagger = {} + +-- 802.1q +local dot1q_tpid = 0x8100 +local o_ethernet_ethertype = 12 +local uint32_ptr_t = ffi.typeof('uint32_t*') + + +-- build a VLAN tag consisting of 2 bytes of TPID set to 0x8100 followed by the +-- TCI field which in turns consists of PCP, DEI and VID (VLAN id). Both PCP +-- and DEI is always 0 +local function build_tag(vid) + return htonl(bit.bor(bit.lshift(dot1q_tpid, 16), vid)) +end + +-- pop a VLAN tag (4 byte of TPID and TCI) from a packet +function pop_tag(pkt) + local payload = pkt.data + o_ethernet_ethertype + local length = pkt.length + pkt.length = length - 4 + C.memmove(payload, payload + 4, length - o_ethernet_ethertype - 4) +end + +-- push a VLAN tag onto a packet +function push_tag(pkt, tag) + local payload = pkt.data + o_ethernet_ethertype + local length = pkt.length + pkt.length = length + 4 + C.memmove(payload + 4, payload, length - o_ethernet_ethertype) + cast(uint32_ptr_t, payload)[0] = tag +end + +-- extract TCI (2 bytes) from packet, no check is performed to verify that the +-- packet is carrying a VLAN tag, if it's an untagged frame these bytes will be +-- Ethernet payload +function extract_tci(pkt) + return ntohs(cast("uint16_t*", packet.data(pkt) + o_ethernet_ethertype + 2)[0]) +end + +-- extract VLAN id from TCI +function tci_to_vid(tci) + return bit.band(tci, 0xFFF) +end + + +function Tagger:new(conf) + local o = setmetatable({}, {__index=Tagger}) + o.tag = build_tag(assert(conf.tag)) + return o +end + 
+function Tagger:push () + local input, output = self.input.input, self.output.output + local tag = self.tag + for _=1,link.nreadable(input) do + local pkt = receive(input) + push_tag(pkt, tag) + transmit(output, pkt) + end +end + +function Untagger:new(conf) + local o = setmetatable({}, {__index=Untagger}) + o.tag = build_tag(assert(conf.tag)) + return o +end + +function Untagger:push () + local input, output = self.input.input, self.output.output + local tag = self.tag + for _=1,link.nreadable(input) do + local pkt = receive(input) + local payload = pkt.data + o_ethernet_ethertype + if cast(uint32_ptr_t, payload)[0] ~= tag then + -- Incorrect VLAN tag; drop. + packet.free(pkt) + else + pop_tag(pkt) + transmit(output, pkt) + end + end +end + + +VlanMux = {} +function VlanMux:new() + local self = setmetatable({}, {__index=VlanMux}) + self.dot1q_tpid = htons(dot1q_tpid) + return self +end + +function VlanMux:push() + local noutputs = #self.output + if noutputs > 0 then + for name, l in pairs(self.input) do + local maxoutput = link.max + -- find out max number of packets we can put out an interface + -- this is kind of bad because we limit ourselves by the interface with + -- the fullest queue, yet packets might go out a different interface. We + -- don't know until we've looked in the packet and parsed the VLAN id. 
I + -- suppose we kind of get some HOLB with this :( + for _, o in ipairs(self.output) do + maxoutput = math.min(maxoutput, link.nwritable(o)) + end + + if type(name) == "string" then + for _ = 1, math.min(link.nreadable(l), maxoutput) do + local p = receive(l) + local ethertype = cast("uint16_t*", packet.data(p) + o_ethernet_ethertype)[0] + + if name == "trunk" then -- trunk + -- check for ethertype 0x8100 (802.1q VLAN tag) + if ethertype == self.dot1q_tpid then + -- dig out TCI field + local tci = extract_tci(p) + local vid = tci_to_vid(tci) + local oif = self.output["vlan"..vid] + pop_tag(p) + self:transmit(oif, p) + + else -- untagged, send to native output + self:transmit(self.output.native, p) + end + elseif name == "native" then + self:transmit(self.output.trunk, p) + else -- some vlanX interface + local vid = tonumber(string.sub(name, 5)) + push_tag(p, build_tag(vid)) + self:transmit(self.output.trunk, p) + end + end + end + end + end +end + +-- transmit packet out interface if given interface exists, otherwise drop +function VlanMux:transmit(o, pkt) + if o == nil then + packet.free(pkt) + else + transmit(o, pkt) + end +end + + +function selftest() + local app = require("core.app") + local basic_apps = require("apps.basic.basic_apps") + + local c = config.new() + config.app(c, "source", basic_apps.Source) + config.app(c, "vlan_mux", VlanMux) + config.app(c, "sink", basic_apps.Sink) + + config.link(c, "source.output -> vlan_mux.vlan1") + config.link(c, "vlan_mux.trunk -> sink.input") + app.configure(c) + app.main({duration = 1}) + + print("source sent: " .. link.stats(app.app_table.source.output.output).txpackets) + print("sink received: " .. 
link.stats(app.app_table.sink.input.input).rxpackets) +end diff --git a/src/core/counter.lua b/src/core/counter.lua index a479df4040..f0afc0748b 100644 --- a/src/core/counter.lua +++ b/src/core/counter.lua @@ -44,7 +44,7 @@ local private = {} local numbers = {} -- name -> number function open (name, readonly) - if numbers[name] then error("counter already opened: " .. name) end + if numbers[name] then return private[numbers[name]] end local n = #public+1 if readonly then public[n] = shm.open(name, counter_t, readonly) diff --git a/src/core/freelist.lua b/src/core/freelist.lua deleted file mode 100644 index 34d7f51f90..0000000000 --- a/src/core/freelist.lua +++ /dev/null @@ -1,33 +0,0 @@ --- Use of this source code is governed by the Apache 2.0 license; see COPYING. - -module(...,package.seeall) - -local ffi = require("ffi") - -function new (type, size) - return { nfree = 0, - max = size, - -- XXX Better LuaJIT idiom for specifying the array type? - list = ffi.new(type.."[?]", size) } -end - -function add (freelist, element) - -- Safety check - if _G.developer_debug then assert(freelist.nfree < freelist.max, "freelist overflow") end - freelist.list[freelist.nfree] = element - freelist.nfree = freelist.nfree + 1 -end - -function remove (freelist) - if freelist.nfree == 0 then - error("no free packets") - else - freelist.nfree = freelist.nfree - 1 - return freelist.list[freelist.nfree] - end -end - -function nfree (freelist) - return freelist.nfree -end - diff --git a/src/core/main.lua b/src/core/main.lua index a3efec8e66..eac20ec884 100644 --- a/src/core/main.lua +++ b/src/core/main.lua @@ -34,6 +34,10 @@ debug_on_error = false function main () zone("startup") require "lib.lua.strict" + -- Warn on unsupported platforms + if ffi.arch ~= 'x64' or ffi.os ~= 'Linux' then + error("fatal: "..ffi.os.."/"..ffi.arch.." 
is not a supported platform\n") + end initialize() local program, args = select_program(parse_command_line()) if not lib.have_module(modulename(program)) then diff --git a/src/core/memory.lua b/src/core/memory.lua index ec25748211..046eaa3091 100644 --- a/src/core/memory.lua +++ b/src/core/memory.lua @@ -50,9 +50,18 @@ end --- ### HugeTLB: Allocate contiguous memory in bulk from Linux function allocate_hugetlb_chunk () + local fd, err = syscall.open("/proc/sys/vm/nr_hugepages","rdonly") + assert(fd, tostring(err)) + fd:flock("ex") for i =1, 3 do local page = C.allocate_huge_page(huge_page_size) - if page ~= nil then return page else reserve_new_page() end + if page ~= nil then + fd:flock("un") + fd:close() + return page + else + reserve_new_page() + end end end @@ -60,16 +69,15 @@ function reserve_new_page () -- Check that we have permission lib.root_check("error: must run as root to allocate memory for DMA") -- Is the kernel shm limit too low for huge pages? - if huge_page_size > tonumber(syscall.sysctl("kernel.shmmax")) then + if huge_page_size > tonumber(lib.firstline("/proc/sys/kernel/shmmax")) then -- Yes: fix that - local old = syscall.sysctl("kernel.shmmax", tostring(huge_page_size)) + local old = lib.writefile("/proc/sys/kernel/shmmax", tostring(huge_page_size)) io.write("[memory: Enabling huge pages for shm: ", "sysctl kernel.shmmax ", old, " -> ", huge_page_size, "]\n") else - -- No: try provisioning an additional page - local have = tonumber(syscall.sysctl("vm.nr_hugepages")) + local have = tonumber(lib.firstline("/proc/sys/vm/nr_hugepages")) local want = have + 1 - syscall.sysctl("vm.nr_hugepages", tostring(want)) + lib.writefile("/proc/sys/vm/nr_hugepages", tostring(want)) io.write("[memory: Provisioned a huge page: sysctl vm.nr_hugepages ", have, " -> ", want, "]\n") end end diff --git a/src/core/packet.lua b/src/core/packet.lua index 5572ef3762..df0baf516f 100644 --- a/src/core/packet.lua +++ b/src/core/packet.lua @@ -7,11 +7,9 @@ local debug = 
_G.developer_debug local ffi = require("ffi") local C = ffi.C -local freelist = require("core.freelist") local lib = require("core.lib") local memory = require("core.memory") local counter = require("core.counter") -local freelist_add, freelist_remove, freelist_nfree = freelist.add, freelist.remove, freelist.nfree require("core.packet_h") @@ -22,10 +20,41 @@ local header_size = 8 local max_payload = tonumber(C.PACKET_PAYLOAD_SIZE) -- Freelist containing empty packets ready for use. -local max_packets = 1e5 + +ffi.cdef[[ +struct freelist { + uint64_t nfree; + uint64_t max; + struct packet *list[?]; +}; +]] + +local function freelist_add(freelist, element) + -- Safety check + if _G.developer_debug then + assert(freelist.nfree < freelist.max, "freelist overflow") + end + freelist.list[freelist.nfree] = element + freelist.nfree = freelist.nfree + 1 +end + +local function freelist_remove(freelist) + if freelist.nfree == 0 then + error("no free packets") + else + freelist.nfree = freelist.nfree - 1 + return freelist.list[freelist.nfree] + end +end + +local function freelist_nfree(freelist) + return freelist.nfree +end + +local max_packets = 1e6 local packet_allocation_step = 1000 local packets_allocated = 0 -local packets_fl = freelist.new("struct packet *", max_packets) +local packets_fl = ffi.new("struct freelist", max_packets, 0, max_packets) -- Return an empty packet. 
function allocate () @@ -107,9 +136,8 @@ function data (p) return p.data end function length (p) return p.length end function preallocate_step() - if _G.developer_debug then - assert(packets_allocated + packet_allocation_step <= max_packets) - end + assert(packets_allocated + packet_allocation_step <= max_packets, + "packet allocation overflow") for i=1, packet_allocation_step do free_internal(new_packet(), true) diff --git a/src/core/shm.lua b/src/core/shm.lua index 05a2bf53a3..77b5a93917 100644 --- a/src/core/shm.lua +++ b/src/core/shm.lua @@ -67,6 +67,7 @@ module(..., package.seeall) local ffi = require("ffi") local lib = require("core.lib") local S = require("syscall") +local const = require("syscall.linux.constants") -- Root directory where the object tree is created. root = "/var/run/snabb" @@ -131,7 +132,7 @@ function mkdir (name) if not S.stat(root) then local mask = S.umask(0) local status, err = S.mkdir(root, "01777") - assert(status, ("Unable to create %s: %s"):format( + assert(status or err.errno == const.E.EXIST, ("Unable to create %s: %s"):format( root, tostring(err or "unspecified error"))) S.umask(mask) end diff --git a/src/doc/code-of-conduct.md b/src/doc/code-of-conduct.md new file mode 100644 index 0000000000..3cecd6449d --- /dev/null +++ b/src/doc/code-of-conduct.md @@ -0,0 +1,7 @@ +### Code of conduct + +Follow the [Contributor Covenant](http://contributor-covenant.org/) and report problems to `conduct@snabb.co`. + +Help people achieve their goals. Don't stand between people who want to cooperate. + +Keep people in the loop. Use Github Issues for important discussions. diff --git a/src/doc/core-data-structures.md b/src/doc/core-data-structures.md new file mode 100644 index 0000000000..c72c66664f --- /dev/null +++ b/src/doc/core-data-structures.md @@ -0,0 +1,69 @@ +## Core data structures + +Software architectures can sometimes be summarized with a few key data +structures. + +Unix is about processes, pipes, and files. 
Processes are executing +code, pipes are FIFO byte buffers, and files are binary storage. + +Emacs is about text, buffers, and windows. Text is strings of +characters with key-value properties, buffers are collections of text +and positional markers, and windows are user-visible screen areas that +display parts of buffers. + +Snabb Switch is about **packets**, **links**, and **apps**. + +### Packets + +Packets are the basic inputs and outputs of Snabb Switch. A packet is +simply a variable-size array of binary data. Packets usually contain +data in an Ethernet-based format but this is only a convention. + +``` +struct packet { + unsigned char payload[10240]; + uint16_t length; +} +``` + +Packets on the wire in physical networks are bits encoded as a series +of electrical or optical impulses. Snabb Switch just encodes those +same bits into memory. + +### Links + +A link collects a series of packets for processing by an app. Links between apps serve a similar purpose to ethernet cables between network devices, except that links are unidirectional. Links are represented as simple [ring buffers](https://en.wikipedia.org/wiki/Circular_buffer) of packets. + +``` +struct link { + struct packet *packets[256]; + int read, write; // ring cursor positions +} +``` + +### Apps + +Apps are the active part of Snabb Switch. Each app performs either or both of these functions: + +1. "Pull" new packets into Snabb Switch by receiving data from the outside world (e.g. a network interface card) and placing them onto output links for processing. +2. "Push" existing packets from input links through the next step of their processing: output onto a real network, transfer onto one or more output links for processing by other apps, perform filtering or transformation, and so on. + +In principle an app is a piece of machine code: anything that can execute. In practice an app is represented as a Lua object and executes code compiled by LuaJIT. 
(This code can easily call out to C, assembler, or other languages but in practice it seldom does.) + +``` +{ + input = { ... }, -- Table of named input links + output = { ... }, -- Table of named output links + pull = <function>, -- Function to "pull" new packets into the system. + push = <function> -- Function to "push" existing packets onward. +} +``` + +### Summary + +Those are the most important data structures in Snabb Switch. To do +serious Snabb Switch development you only need to write some code that +manipulates packets and links. Usually we write apps in Lua using some +common libraries, but you can realistically write them from scratch in +Lua, C, assembler, or anything else you care to link in. + diff --git a/src/doc/default.nix b/src/doc/default.nix new file mode 100644 index 0000000000..673d319931 --- /dev/null +++ b/src/doc/default.nix @@ -0,0 +1,43 @@ +# Run like this: +# nix-build /path/to/this/directory +# ... and the files are produced in ./result/ + +{ pkgs ? (import <nixpkgs> {}) +}: + +with pkgs; + +stdenv.mkDerivation rec { + name = "snabb-manual"; + src = ../../.; + + buildInputs = [ ditaa pandoc git + (texlive.combine { + inherit (texlive) scheme-small luatex luatexbase sectsty titlesec cprotect bigfoot titling droid; + }) + ]; + + patchPhase = '' + patchShebangs src/doc + patchShebangs src/scripts + ''; + + buildPhase = '' + # needed for font cache + export TEXMFCACHE=`pwd` + + make book -C src + ''; + + installPhase = '' + mkdir -p $out/share/doc + cp src/obj/doc/snabb.* $out/share/doc + + # Give manual to Hydra + mkdir -p $out/nix-support + echo "doc-pdf manual $out/share/doc/snabb.pdf" >> $out/nix-support/hydra-build-products; + echo "doc HTML $out/share/doc/snabb.html" >> $out/nix-support/hydra-build-products; + echo "doc epub $out/share/doc/snabb.epub" >> $out/nix-support/hydra-build-products; + echo "doc markdown $out/share/doc/snabb.markdown" >> $out/nix-support/hydra-build-products; + ''; +} diff --git a/src/doc/genbook.sh b/src/doc/genbook.sh index 
196507778e..83572e0284 100755 --- a/src/doc/genbook.sh +++ b/src/doc/genbook.sh @@ -78,4 +78,8 @@ $(cat $mdroot/program/snabbnfv/README.md) $(cat $mdroot/lib/watchdog/README.md) +# Snabblab + +$(cat $mdroot/doc/snabblab.md) + EOF diff --git a/src/doc/lstopo.png b/src/doc/lstopo.png new file mode 100644 index 0000000000..c10e901955 Binary files /dev/null and b/src/doc/lstopo.png differ diff --git a/src/doc/performance-tuning.md b/src/doc/performance-tuning.md new file mode 100644 index 0000000000..a42ee98995 --- /dev/null +++ b/src/doc/performance-tuning.md @@ -0,0 +1,262 @@ +# Maximizing deployment performance + +For maximum performance, several hardware, operating system and Snabb parameters need +to be tuned. Note that this document covers only tuning for deployment +performance, not how to write performant Snabb code. + +## Snabb +### ring-buffer/num_descriptors +Defined in src/apps/intel/intel10g.lua and adjustable before (!) the NIC gets initialized: + +``` +require('apps.intel.intel10g').num_descriptors = ring_buffer_size +... +config.app(c, "nic", require(device_info.driver).driver, {...}) +``` + +The default of 512 seems too small: in load tests against lwaftr at IMIX line rate, 1024 and 2048 gave equally good results. `num_descriptors` controls the Receive Descriptor Length on the Intel 82599 Controller, which determines the number of bytes allocated to the circular buffer. This value must be a multiple of 128 (the maximum cache line size). Since each descriptor is 16 bytes in length, the total number of receive descriptors is always a multiple of 8. In networking terms, this defines the ingress buffer size in packets (TODO: is this correct?). A larger ingress buffer can reduce packet loss while Snabb is busy handling other packets, but it will also increase latency for packets waiting in the queue to be picked up by Snabb. 
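The sizing rule above can be checked with a little arithmetic. The sketch below is plain Lua and does not touch the driver. The helper name `ring_bytes` is made up for the example, and the 16-byte descriptor size and 128-byte alignment are taken from the paragraph above.

```lua
-- Illustrative arithmetic only; ring_bytes is a hypothetical helper, not part of Snabb.
local DESCRIPTOR_SIZE = 16   -- bytes per receive descriptor (from the text above)
local ALIGNMENT = 128        -- ring length in bytes must be a multiple of this

-- Return the ring length in bytes, or nil plus a message if the count is invalid.
local function ring_bytes (num_descriptors)
   local bytes = num_descriptors * DESCRIPTOR_SIZE
   if bytes % ALIGNMENT ~= 0 then
      return nil, "ring length is not a multiple of 128 bytes"
   end
   return bytes
end

for _, n in ipairs({512, 1024, 2048}) do
   print(n .. " descriptors -> " .. ring_bytes(n) .. " bytes")
end
```

For 512, 1024 and 2048 descriptors this gives 8192, 16384 and 32768 bytes. A count such as 12 is rejected because 12 * 16 = 192 is not a multiple of 128.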
+ +### Enable engine.busywait +Defined in src/core/app.lua and enabled before calling engine.main() via + +``` +engine.busywait = true +engine.main(...) +``` +If true, the engine polls for new data in a tight loop (100% CPU) instead of sleeping according to the Hz setting. This will reduce overall packet latency and increase throughput at the cost of utilizing the CPU hosting Snabb at 100%. + +### Monitor ifInDiscards +Snabb offers SNMP based ifInDiscards counters when SNMP is enabled. (TODO: need an easier way to expose these counters from the Intel register QPRDC). + +Enable SNMP in Snabb: + +``` +config.app(c, nic_id, require(device_info.driver).driver, + {..., snmp = { directory = "/tmp", status_timer = 1 }, ... }) +``` + +Then read the ifInDiscards counter via od (the exact offset can be calculated from the file /tmp/0000:81:00.0.index): + +``` +od -j 305 -A none -N 4 -t u4 /tmp/0000\:81\:00.0 + 94543 +``` +The example above shows 94543 discarded packets at ingress on Port 0 since launching Snabb. + +## Qemu & Vhost-User +[Vhost-User](http://www.virtualopensystems.com/en/solutions/guides/snabbswitch-qemu/) is used to connect Snabb with a high performance virtual interface attached to a Qemu based virtual machine. This requires hugepages (explained further down) to be made available to Qemu: + +``` +cd /bin/x86_64-softmmu/ +qemu-system-x86_64 -enable-kvm -m 8000 -smp 2 \ + -chardev socket,id=char0,path=./xe0.socket,server \ + -netdev type=vhost-user,id=net0,chardev=char0 \ + -device virtio-net-pci,netdev=net0,mac=02:cf:69:15:0b:00 \ + -object memory-backend-file,id=mem,size=8000M,mem-path=/hugetlbfs,share=on \ + -numa node,memdev=mem -mem-prealloc \ + -realtime mlock=on \ + /path/to/img +``` + +The allocated memory must match the memory-backend-file size (example shows 8GB). While qemu will fail to boot if there aren't enough hugepages allocated, it is recommended to have some spare, and note that the pages are split amongst the NUMA nodes. 
Check the paragraph on NUMA in this document. +It is recommended to specify the qemu option '-realtime mlock=on', despite it being the default. This ensures memory doesn't get swapped out. + +## Hardware / BIOS +### Disable Hyper-Threading +Disable hyper-threading (HT) in the BIOS. Even when the correct hyper-threaded CPUs are isolated, HT can create latency spikes when enabled, leading to packet loss. (TODO: do we have one of the automated tests showing this?) +According to [Intel on Hyper-Threading](http://www.intel.com/content/www/us/en/architecture-and-technology/hyper-threading/hyper-threading-technology.html): "Intel® Hyper-Threading Technology (Intel® HT Technology) uses processor resources more efficiently, enabling multiple threads to run on each core. As a performance feature, it also increases processor throughput, improving overall performance on threaded software." Snabb runs single-threaded, so it cannot benefit directly from HT. +### Performance Profile set to Max +Servers are optimized for energy efficiency. While this is great for application servers, Virtual Network Functions like Snabb benefit from performance optimized settings. Each vendor offers different BIOS settings to enable or disable energy efficiency settings or profiles. They are typically named "Max performance", "Energy Efficiency" and "Custom". Select "Max performance" for latency sensitive Snabb use. +### Turbo Mode +Intel Turbo Boost Technology allows processor cores to run faster than the rated operating frequency if they're operating below power, current, and temperature specification limits. (TODO: impact not yet analyzed on Snabb, nor if it is controlled by the performance profile). +## Linux Kernel +### Disable IOMMU +Sandy Bridge CPUs have a known issue with their IOTLB huge page support, impacting small packet performance. Newer CPUs don't have this issue. 
+(TODO: only found this info here: [http://dpdk.org/ml/archives/dev/2014-October/007411.html](http://dpdk.org/ml/archives/dev/2014-October/007411.html)) +(TODO: pass through mode: [https://lwn.net/Articles/329174/](https://lwn.net/Articles/329174/)) + +Add iommu=pt (pass-through) to the kernel command line: + +``` +GRUB_CMDLINE_LINUX_DEFAULT="... iommu=pt ... " +``` +### Enable huge pages +Required for Snabb to function. Select size 1G. On NUMA systems (more than one CPU socket/node), the pages are equally spread between all sockets. + +``` +GRUB_CMDLINE_LINUX_DEFAULT="... default_hugepagesz=1GB hugepagesz=1G hugepages=64 ..." +``` +Actual use of hugepages can be monitored with + +``` +$ cat /proc/meminfo |grep Huge +AnonHugePages: 12310528 kB +HugePages_Total: 64 +HugePages_Free: 58 +HugePages_Rsvd: 0 +HugePages_Surp: 0 +Hugepagesize: 1048576 kB +``` + On NUMA systems, the allocation and usage per node can be seen with + +``` +$ cat /sys/devices/system/node/node*/meminfo|grep Huge +Node 0 AnonHugePages: 12288 kB +Node 0 HugePages_Total: 32 +Node 0 HugePages_Free: 32 +Node 0 HugePages_Surp: 0 +Node 1 AnonHugePages: 12298240 kB +Node 1 HugePages_Total: 32 +Node 1 HugePages_Free: 26 +Node 1 HugePages_Surp: 0 +``` +(The above example shows six 1G pages in use on Node 1 with 2 Snabb processes serving one 10GE port each). + +### Disable irqbalance +The purpose of irqbalance is to distribute hardware interrupts across processors on a multiprocessor system in order to increase performance. Ubuntu has this installed and running in its default server installation. Snabb doesn't use interrupts to read packets. Disabling irqbalance forces CPU 0 to serve all hardware interrupts. To disable, either uninstall irqbalance or stop it: + +``` +$ sudo service irqbalance stop +``` + +To make it permanent, set ENABLED to 0 in /etc/default/irqbalance: + +``` +$ cat /etc/default/irqbalance +#Configuration for the irqbalance daemon + +#Should irqbalance be enabled? +ENABLED="0" +...
+``` + +Various interrupt counters per CPU core can be retrieved via + +``` +$ cat /proc/interrupts + CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7 CPU8 CPU9 CPU10 CPU11 CPU12 CPU13 CPU14 CPU15 CPU16 CPU17 CPU18 CPU19 + 0: 41 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IR-IO-APIC 2-edge timer + 8: 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IR-IO-APIC 8-edge rtc0 +... +``` + +### CPU governor +The CPU governor should be set to performance, rather than ondemand or powersave. (TODO: put the command here. Is this a linux kernel parameter?) + +## CPU Isolation/Pinning +Snabb makes 100% use of a single CPU core, hence it's important that no other task ever uses that core. That's best achieved by telling the kernel scheduler via kernel option 'isolcpus' to ignore cores designated to Snabb. In /etc/default/grub (example reserves cores 18 and 19 for Snabb): + +``` +GRUB_CMDLINE_LINUX_DEFAULT="... isolcpus=18-19 ..." +``` +Note: Never use CPU 0 for Snabb, because the Linux kernel uses CPU 0 to handle interrupts, including NMI (non-maskable interrupts). + +Launch Snabb either via 'taskset' or 'numactl'. Examples to pin it on CPU 18: + +``` +taskset -c 18 ./snabb ... +numactl --physcpubind=18 ./snabb ... +``` + +Note: Always use 'numactl' on NUMA servers, to limit allocation of memory to specified NUMA nodes with option '--membind=nodes': + +``` +numactl --physcpubind=18 --membind=1 ./snabb ... +``` + +## NUMA +Non-uniform memory access (NUMA) enabled systems have two or more CPU sockets (also called nodes), each with its own memory. While accessing memory across sockets is possible (and happening all the time), it's slower than accessing local memory. PCI slots are hard wired to a specific node. It is imperative to pin Snabb to the same node (CPU and memory) as the NIC.
Linux offers the command 'lstopo' to get an overall picture in text and graphical form: + +``` +lstopo --of pdf > lstopo.pdf +``` + +Example from a Lenovo RD650 (Intel(R) Xeon(R) CPU E5-2650 v3): +![lstopo.png](lstopo.png) + +### PCI Card and Snabb on same NUMA node +Memory shared between Snabb (by means of huge page mapping) and the NIC must reside on the same node. This is achieved via 'numactl', once the correct node is identified for a given NIC port/PCI address. +To find the node for a given PCI address, use cpulistaffinity combined with numactl (TODO: is there a more direct way??): + +``` +$ lspci|grep 10- +81:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01) +81:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ +$ cat /sys/class/pci_bus/0000:81/cpulistaffinity +14-27,42-55 + +$ numactl -H|grep cpus +node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 28 29 30 31 32 33 34 35 36 37 38 39 40 41 +node 1 cpus: 14 15 16 17 18 19 20 21 22 23 24 25 26 27 42 43 44 45 46 47 48 49 50 51 52 53 54 55 +``` + +The above example shows the 10GE ports being served by node 1. + +Use 'numactl' to pin Snabb to a specific core (ideally excluded from the kernel scheduler) and memory node: + +``` +numactl --physcpubind=18 --membind=1 ./snabb ... +``` + +Snabb applications like snabbnfv and snabbvmx also share memory with one or more QEMU processes running Virtual Machines via VhostUser. These QEMU processes must also be pinned to the same NUMA node with numactl, optionally with CPU pinning: + +``` +numactl --membind=1 /usr/local/bin/qemu-system-x86_64 ... +numactl --membind=1 --physcpubind=16-17 /usr/local/bin/qemu-system-x86_64 ... +``` + +Actual memory usage per node can be displayed with 'numastat': + +``` +$ sudo numastat -c snabb + +Per-node process memory usage (in MBs) +PID Node 0 Node 1 Total +--------------- ------ ------ ----- +... +6049 (snabb) 0 3731 3731 +6073 (snabb) 0 3732 3732 +...
+--------------- ------ ------ ----- +Total 5 7753 7758 +$ sudo numastat -c qemu + +Per-node process memory usage (in MBs) +PID Node 0 Node 1 Total +--------------- ------ ------ ----- +1899 (qemu-syste 0 7869 7869 +1913 (qemu-syste 0 4171 4171 +... +--------------- ------ ------ ----- +Total 4 12040 12044 +``` +The above example shows two snabb processes (6049 & 6073) using memory only from node 1, as desired, because the NIC ports are served by node 1. Two QEMU based Virtual Machines also only use memory from node 1. +If a given process still uses some memory from another node, investigate its source and fix it. Possible candidates are SHM based filesystems used by Snabb in /var/run/snabb and the SNMP SHM location if enabled. + +### Watch for TLB shootdowns +TLB (Translation Lookaside Buffer) is a cache of the translations from virtual memory addresses to physical memory addresses. When a processor changes the virtual-to-physical mapping of an address, it needs to tell the other processors to invalidate that mapping in their caches. +The action of one processor causing the TLBs to be flushed on other processors is what is called a TLB shootdown. + +Occasional packet loss at ingress has been observed while TLB shootdowns happened on the system. There are per CPU counters of TLB shootdowns available in the output of 'cat /proc/interrupts': + +``` +$ cat /proc/interrupts |grep TLB + TLB: 2012 2141 1778 1942 2357 2076 2041 2129 2209 1960 486 0 0 0 0 0 0 145 0 0 TLB shootdowns +``` +They must remain 0 for the CPU serving Snabb (pinned via taskset or numactl). One possible source of such TLB shootdowns can be the use of SHM between processes on different nodes or pinning Snabb to a list of CPUs instead of a single one (TODO: needs confirmation. Have seen periodic TLB shootdowns before optimization based on this doc). + +## Docker +CPU pinning via taskset and numactl also works within Docker Containers running in privileged mode.
It is important to note that the Container can use all CPU cores, including the ones specifically excluded by the kernel option isolcpus: + +``` +$ taskset -cp $$ +pid 9822's current affinity list: 0,4-8,12-15 + +$ docker run --name ubuntu -ti ubuntu:14.04.4 +root@c819e0f106c4:/# taskset -cp $$ +pid 1's current affinity list: 0-15 +``` + + + + diff --git a/src/doc/porting.md b/src/doc/porting.md new file mode 100644 index 0000000000..8fabdae323 --- /dev/null +++ b/src/doc/porting.md @@ -0,0 +1,35 @@ +# Porting Snabb Switch + +Snabb Switch targets Linux/x86-64 on the master branch. The source +code includes device drivers, assembler code, and optimization for +specific CPU families. + +You are welcome to port Snabb Switch to a new platform. To do this you +can create a branch for your port and advertise this in +[branches.md](branches.md). See below for some technical tips. + +Currently there is no roadmap for supporting more platforms on the +master branch. The first step in this direction would be to have a +well-maintained port that is important for users. + +The master branch accepts code that is specific to Linux/x86-64. It +does not accept code intended for other platforms. + +## Technical tips + +Here are a few challenges you are likely to encounter when porting +Snabb Switch: + +- The `memory` module encodes the physical address of DMA memory into its virtual address using a 64-bit tagging scheme. This would need to be adapted for a 32-bit CPU. +- Device drivers depend on allocating physically contiguous memory in blocks of at least 10KB. +- Virtio-net code assumes a strict (x86-like) memory model that does not reorder stores. If your processor provides a more relaxed memory model then additional hardware memory barrier operations will be needed. +- The shm/counter mechanism assumes that the processor loads and stores 64-bit values atomically. If your processor does not provide atomic 64-bit loads and stores then additional synchronization may be needed. 
+- Certain optimizations depend on specific instruction set extensions such as AVX2. These optimizations may need to be ported in order to achieve adequate performance. (Particularly: multiple SIMD-optimized IP checksum routines.) +- Certain functions may only be available as platform-specific optimized variants. You would either need to live without these routines, or write a generic fallback routine, or write a new optimized variant. (Particularly: AES-GCM encryption with Intel AES-NI instructions.) + +Like we said: Snabb Switch is a low-level piece of +software. Portability is important but simplicity and performance are +urgent. + +Good luck! + diff --git a/src/doc/snabblab.md b/src/doc/snabblab.md new file mode 100644 index 0000000000..d328073909 --- /dev/null +++ b/src/doc/snabblab.md @@ -0,0 +1,50 @@ +Servers devoted to the Snabb project and usable by all known developers. + +Want to be a known developer? Sure! Just edit the [user account list](https://github.com/snabblab/snabblab-nixos/blob/master/modules/users.nix) with your user and send a pull request. No fuss. + +## Guidelines + +- Feel at home. These servers are here for you to play with and enjoy. +- Please run Snabb processes like this: `sudo lock ./snabb ...`. The `lock` command will automatically wait if somebody else is running a Snabb process on the same machine and that helps us avoid conflicts for access to hardware resources. +- Tell `luke@snabb.co` your email address(es) to get an invitation to the [Lab Slack](http://snabb.slack.com/). +- Don't keep precious data on the servers. We might want to reinstall them at short notice.
+ +## Servers + +Name | Purpose | SSH | Xeon model | NICs +------------|---------------------------------------------------|-------------------------| -------- | ------------------------------------------------ +lugano-1 | General use | lugano-1.snabb.co | E3 1650v3 | 2 x 10G (82599), 4 x 10G (X710), 2 x 40G (XL710) +lugano-2 | General use | lugano-2.snabb.co | E3 1650v3 | 2 x 10G (82599), 4 x 10G (X710), 2 x 40G (XL710) +lugano-3 | General use | lugano-3.snabb.co | E3 1650v3 | 2 x 10G (82599), 2 x 100G (ConnectX-4) +lugano-4 | General use | lugano-4.snabb.co | E3 1650v3 | 2 x 10G (82599), 2 x 100G (ConnectX-4) +davos | Continuous Integration tests & driver development | lab1.snabb.co port 2000 | 2x E5 2603 | Diverse 10G/40G: Intel, SolarFlare, Mellanox, Chelsio, Broadcom. Installed upon request. +grindelwald | Snabb NFV testing | lab1.snabb.co port 2010 | 2x E5 2697v2 | 12 x 10G (Intel 82599) +interlaken | Haswell/AVX2 testing | lab1.snabb.co port 2030 | 2x E5 2620v3 | 12 x 10G (Intel 82599) + +## Get started + +You are welcome to play, test, and develop on the `lugano-1` .. `lugano-4` servers. Once your account is added you can connect like this: + + $ ssh user@lugano-1.snabb.co + +and check the PCI devices and their addresses with `lspci`. + +Certain cards (82599 and ConnectX-4) are cabled to themselves. That is, dual-port cards have their ports connected to each other. Certain other cards (X710/XL710) are currently not cabled. If you have special cabling needs then please open an issue on the [snabblab-nixos](https://github.com/snabblab/snabblab-nixos). + +## Using the lab + +All servers run the latest stable version of [NixOS Linux distribution](http://nixos.org/nixos/about.html). + +To quickly install a package: + + $ nox + +For other operations such as uninstalling a package, refer to `man nix-env`. 
+ +## Questions + +If you have any questions or trouble, ask on [the #lab channel](https://snabb.slack.com/messages/lab/) or [open an issue](https://github.com/snabblab/snabblab-nixos). + +## Thanks + +We are grateful to [Silicom](http://www.silicom-usa.com/) for their sponsorship in the form of discounted network cards for `chur` and to [Netgate](http://www.netgate.com/) for giving us `jura`. Thanks gang! diff --git a/src/doc/testing.md b/src/doc/testing.md index 055f7159a1..cbb50eacb2 100644 --- a/src/doc/testing.md +++ b/src/doc/testing.md @@ -134,8 +134,5 @@ SnabbBot is configured through the following environment variables: * `REPO`—Optional. Target GitHub repository. Default is `snabbco/snabb` (upstream). -* `CURRENT`—Optional. The branch to merge pull requests with. Default is - `master`. - * `SNABBBOTDIR`—Optional. SnabbBot cache directory. Default is `/tmp/snabb_bot`. diff --git a/src/doc/trademark.md b/src/doc/trademark.md new file mode 100644 index 0000000000..9c8bd9b183 --- /dev/null +++ b/src/doc/trademark.md @@ -0,0 +1,34 @@ +## Trademark policy + +The Snabb logo is a copyrighted design, and the name "Snabb" is a registered trademark. If you wish to use the name or logo in any way, you must comply with this policy. + +1. The name and logo, or derivatives of it may be used to promote Snabb based products or services. +2. The name and logo, or derivatives of it may be used to promote or serve Snabb or related projects and their communities. +3. You may not state or otherwise lead people to believe, that you represent the Snabb open source community in any way other than as an individual or corporate contributor to the project. + +### Examples + +#### Acceptable use + +The following are examples of acceptable uses of the logo: + +- Inclusion of the name or logo on your website offering Snabb support or consultancy services. +- Inclusion of the name or logo on your website from which you offer Snabb-related or Snabb-derived software products. 
+- Inclusion of a modified version of the logo on a Snabb user forum that you run. +- Inclusion of the name or logo in your software or documentation to indicate that it runs on or with Snabb. +- Use of the name or logo on websites, publications or merchandise, modified or unmodified, for Snabb user groups, events, or non-profit organizations which support Snabb. +- Use of the name or logo to advertise Snabb-related content in a public event. +- Use of the name or logo in news articles or blogs which reference the Snabb Project. + +#### Unacceptable use + +The following are examples of uses of the logo which are NOT acceptable: + +- Use of the name or logo in a software product that is unrelated to Snabb and does not run on it, or work with it. +- Use of the name or logo to promote a product that directly competes with Snabb but is otherwise unrelated. +- Use of the name or logo for a club or function that is unrelated to Snabb. + +### Contact + +If you have any questions about this policy, or its interpretation, +please open a Github Issue on the SnabbCo/snabbswitch repository. diff --git a/src/lib/bloom_filter.lua b/src/lib/bloom_filter.lua deleted file mode 100644 index c254f49b0d..0000000000 --- a/src/lib/bloom_filter.lua +++ /dev/null @@ -1,324 +0,0 @@ --- Use of this source code is governed by the Apache 2.0 license; see COPYING. - --- This module implements a basic Bloom filter as described in --- . --- --- Given the expected number of items n to be stored in the filter and --- the maxium acceptable false-positive rate p when the filter --- contains that number of items, the size m of the storage cell in --- bits and the number k of hash calculations are determined by --- --- m = -n ln(p)/ln(2)^2 --- k = m/n ln(2) = -ln(p)/ln(2) --- --- According to --- , the k --- independent hash functions can be replaced by two h1, h2 and the --- "linear combinations" h[i] = h1 + i*h2 (i=1..k) without changing --- the statistics of the filter.
Furthermore, h1 and h2 can be --- derived from the same hash function using double hashing or seeded --- hashing. This implementation requires the "x64_128" variant of the --- Murmur hash family provided by lib.hash.murmur. --- --- Storing a sequence of bytes of length l in the filter proceeds as --- follows. First, the hash function is applied to the data with seed --- value 0. --- --- h1 = hash(data, l, 0) --- --- In this pseudo-code, h1 represents the lower 64 bits of the actual --- hash. The second hash is obtained by using h1 as seed --- --- h2 = hash(data, l, h1) --- --- Finally, k values in the range [0, m-1] are calculated as --- --- k_i = (h1 + i*h2) % m --- --- In order to be able to implement the mod m operation using bitops, --- m is rounded up to the next power of 2. In that case, the k_i can --- be calculated efficiently by --- --- k_i = bit.band(h1 + i*h2, m-1) --- --- The values k_i represent the original data. Such a set of values --- is called an *item*. The actual filter consists of a data --- structure that stores one bit for each of the m elements in the --- filter, called a *cell*. To store an item in a cell, the bits at --- the positions given by the values k_i are set to one. 
- -module(..., package.seeall) -local ffi = require("ffi") -local lib = require("core.lib") -local bit = require("bit") -local murmur = require("lib.hash.murmur") - -local band, bor, rshift, lshift = bit.band, bit.bor, bit.rshift, bit.lshift - -local filter = subClass(nil) -filter._name = "Bloom filter" -local selftest_config = { verbose = false, performance = false } - --- n = expected maximum number of objects to store --- p = maximum false positive rate in the range 0 < p < 1 -function filter:new (n, p) - assert(n > 0 and p < 1 and p > 0, self:name()..": invalid parameters") - local o = filter:superClass().new(self) - local ln2 = math.log(2) - local m = - n * math.log(p) / ln2^2 - m = 2^math.ceil(math.log(m)/ln2) -- Round up to the next power of two - o._m = m - o._k = math.ceil(m/n*ln2) - o._mod = ffi.new("uint64_t", m-1) - o._hash1 = murmur.MurmurHash3_x64_128:new() - o._hash2 = murmur.MurmurHash3_x64_128:new() - return o -end - --- Return the number of hash functions for this filter. -function filter:nhashes () - return self._k -end - --- Return the number of bits contained in a storage cell for this --- filter. This is guaranteed to be a power of 2. -function filter:width() - return self._m -end - --- Create a new storage cell, consisting of an array of 64-bit --- integers which is large enough to hold the m bits of the filter. -function filter:cell_new () - local nblocks = rshift(self._m, 6) - if band(self._m, 0x3FULL) ~= 0ULL then - nblocks = nblocks + 1 - end - return ffi.typeof("uint64_t [$]", nblocks)() -end - --- Remove all stored items from a cell by setting all bits to zero. 
-function filter:cell_clear (cell) - ffi.fill(cell, ffi.sizeof(cell)) -end - --- Copy cell d to cell s (assuming that they are of equal type) -function filter:cell_copy (s, d) - ffi.copy(d, s, ffi.sizeof(d)) -end - --- Return the ratio of the number of bits which are set to one to the --- total number of bits in the cell's bitset, which is a measure of --- how full the storage cell is. -function filter:cell_usage (cell) - local set = 0 - local width = self._m - for i = 0, width-1 do - if band(cell[rshift(i, 6)], lshift(1ULL, band(i, 0x3FULL))) ~= 0ULL then - set = set+1 - end - end - return set/width -end - --- Create a new storage item, consisting of an array of 64-bit --- integers of size k. We don't use Lua numbers to avoid conversions --- in the store_value() and check_value() methods. The item can --- optionally be filled with the given value. -function filter:item_new (v, l) - local item = ffi.typeof("uint64_t[$]", self._k)() - if v and l then - self:store_value(v, l, item, nil) - end - return item -end - --- Return an array that contains the item's data as Lua values. This --- is primarily intended for debugging or diagnostic purposes. -function filter:item_dump (item) - dump = {} - for i = 0, self._k-1 do - table.insert(dump, tonumber(item[i])) - end - return dump -end - --- Store a value in a item and/or cell. The value is represented by a --- pointer to a location in memory where the data is stored and its --- size l in bytes. The argument v is, in fact, not the pointer itself --- but a cdata object of type "uint8_t *[1]". Example usage where --- "value" is the pointer to the actual data and sizeofvalue is its --- size: --- --- local vptr = ffi.new("uint8_t *[1]") --- vptr[0] = value --- filter:store_value(vptr, sizeofvalue, item) --- --- This avoids the allocation of a cdata object of type "uint8_t*" for --- the method call, removing the dependence on the sink optimizer to --- generate code that is free of garbage. 
-function filter:store_value (v, l, item, cell) - local h1 = self._hash1:hash(v[0], l, 0ULL) - local h2 = self._hash2:hash(v[0], l, h1.u64[0]) - for i = 1, self._k do - local index = band(h1.u64[0] + i*h2.u64[0], self._mod) - if cell then - local block = rshift(index, 6) - cell[block] = bor(cell[block], lshift(1ULL, band(index, 0x3FULL))) - end - if item then - item[i-1] = index - end - end -end - --- Store an item in a cell -function filter:store_item (item, cell) - for i = 0, self._k-1 do - local index = item[i] - local block = rshift(index, 6) - cell[block] = bor(cell[block], lshift(1ULL, band(index, 0x3FULL))) - end -end - --- Check whether a value is contained in a cell and return the result --- as a boolean. The value is represented in the same manner as for --- the store_value() method. --- --- Due to the nature of a Bloom filter, a positive outcome does not --- guarantee that the value has actually been stored in the cell --- before (but the rate of these false positives is bounded by the --- parameter 'p' passed to the constructor of the filter). OTOH, a --- negative outcome guarantees that the value has not been stored in --- the cell. -function filter:check_value (v, l, cell) - local h1 = self._hash1:hash(v[0], l, 0ULL) - local h2 = self._hash2:hash(v[0], l, h1.u64[0]) - for i = 1, self._k do - local index = band(h1.u64[0] + i*h2.u64[0], self._mod) - if band(cell[rshift(index, 6)], lshift(1ULL, band(index, 0x3FULL))) == 0ULL then - return false - end - end - return true -end - --- Check whether a given item is contained in a cell. -function filter:check_item (item, cell) - for i = 0, self._k-1 do - local index = item[i] - if band(cell[rshift(index, 6)], lshift(1ULL, band(index, 0x3FULL))) == 0ULL then - return false - end - end - return true -end - -local function check_buckets(filter, item, expected) - for k, i in ipairs(filter:item_dump(item)) do - assert(i == expected[k], "wrong bucket index "..k - .." 
(expected "..expected[k]..", got "..tostring(i)..")") - end -end - -function selftest() - local murmur = require("lib.hash.murmur") - - local f = filter:new(100, 0.001) - assert(f:width() == 2048, "woring size of bitset, expected 2048, got "..f:width()) - assert(f:nhashes() == 15, "wrong number of hashes, expected 15, got "..f:nhashes()) - local cell, item = f:cell_new(), f:item_new() - - local data1 = ffi.new("uint8_t[9]", 'foobarbaz') - local data2 = ffi.new("uint8_t[1]", 'a') - local s1, s2 = ffi.sizeof(data1), ffi.sizeof(data2) - local dptr1, dptr2 = ffi.new("uint8_t*[1]"), ffi.new("uint8_t *[1]") - dptr1[0] = data1 - dptr2[0] = data2 - local expected_buckets = { 149, 986, 1823, 612, 1449, 238, 1075, - 1912, 701, 1538, 327, 1164, 2001, 790, - 1627 } - - f:store_value(dptr1, s1, item, nil) - check_buckets(f, item, expected_buckets) - f:store_value(dptr1, s1, nil, cell) - assert(f:check_item(item, cell), "check from item failed") - assert(f:check_value(dptr1, s1, cell), "check from value failed") - assert(not f:check_value(dptr2, s2, cell), "non-existance check failed") - f:cell_clear(cell) - assert(not f:check_value(dptr1, s1, cell), "clear store failed") - f:store_value(dptr2, s2, item) - f:store_value(dptr1, s1, item) - check_buckets(f, item, expected_buckets) - - data1 = ffi.new("union { uint32_t i; uint8_t b[4]; }") - dptr1[0] = data1.b - data2 = ffi.new("union { uint32_t i; uint8_t b[4]; }") - dptr2[0] = data2.b - s1, s2 = ffi.sizeof(data1), ffi.sizeof(data2) - - local min, max, step, samples = 50, 150, 10, 20000 - data1 = ffi.new("uint32_t [?]", max) - data2 = ffi.new("uint32_t [?]", samples) - for i = 0, max-1 do - data1[i] = i - end - for i = 0, samples-1 do - data2[i] = i+max - end - for j = min, max, step do - f:cell_clear(cell) - local fail = 0 - for i = 0, j-1 do - dptr1[0] = ffi.cast("uint8_t *", data1 + i) - f:store_value(dptr1, ffi.sizeof("uint32_t"), item, cell) - assert(f:check_item(item, cell)) - assert(f:check_value(dptr1, s1, cell)) - end - - 
local fp = 0 - for i = 0, samples-1 do - dptr1[0] = ffi.cast("uint8_t *", data2 + i) - if f:check_value(dptr1, ffi.sizeof("uint32_t"), cell) then - fp = fp+1 - end - end - if selftest_config.verbose then - print(string.format("False-positive rate @%d (occupancy %02.2f%%): %.4f, %d", - j, 100*f:cell_usage(cell), fp/samples, fp)) - end - if j == 100 then - assert(fp/samples <= 0.001, - "Maximum false-positives rate exceeded, expected 0.1%, got "..fp/samples) - end - end - - if selftest_config.performance then - - local function perfloop (iter, desc, call, ...) - jit.flush() - local start = ffi.C.get_time_ns() - for i = 1, iter do - call(...) - end - local stop = ffi.C.get_time_ns() - print(desc..": "..math.floor(iter/(tonumber(stop-start)/1e9)) - .." iterations per second") - end - - print("Bloom filter performance tests") - data1 = ffi.new("uint8_t [6]", '\x01\x02\x03\x04\x05\x06') - dptr1[0] = ffi.cast("uint8_t *", data1) - local iter = 1e8 - f:store_value(dptr1, 6, item, cell) - perfloop(iter, "check item", f.check_item, f, item, cell) - perfloop(iter, "store item", f.store_item, f, item, cell) - perfloop(iter, "store value cell/item", f.store_value, f, dptr1, 6, item, cell) - perfloop(iter, "store value cell", f.store_value, f, dptr1, 6, nil, cell) - perfloop(iter, "store value item", f.store_value, f, dptr1, 6, item) - perfloop(iter, "check value", f.check_value, f, dptr1, 6, cell) - end -end - -filter.selftest = selftest - -return filter diff --git a/src/lib/hardware/pci.lua b/src/lib/hardware/pci.lua index b1cc789f28..d75ee4a046 100644 --- a/src/lib/hardware/pci.lua +++ b/src/lib/hardware/pci.lua @@ -163,7 +163,7 @@ end -- -- example: qualified("01:00.0") -> "0000:01:00.0" function qualified (address) - return address:gsub("^%d%d:%d%d[.]%d+$", "0000:%1") + return address:gsub("^%x%x:%x%x[.]%x+$", "0000:%1") end --- ### Selftest @@ -175,6 +175,7 @@ function selftest () print("selftest: pci") assert(qualified("0000:01:00.0") == "0000:01:00.0", "qualified 1") 
assert(qualified( "01:00.0") == "0000:01:00.0", "qualified 2") + assert(qualified( "0a:00.0") == "0000:0a:00.0", "qualified 3") assert(canonical("0000:01:00.0") == "01:00.0", "canonical 1") assert(canonical( "01:00.0") == "01:00.0", "canonical 2") scan_devices() diff --git a/src/lib/virtio/net_device.lua b/src/lib/virtio/net_device.lua index 1f9f321191..9c93e66fbb 100644 --- a/src/lib/virtio/net_device.lua +++ b/src/lib/virtio/net_device.lua @@ -5,7 +5,6 @@ module(...,package.seeall) -local freelist = require("core.freelist") local lib = require("core.lib") local link = require("core.link") local memory = require("core.memory") diff --git a/src/lib/virtio/virtq_device.lua b/src/lib/virtio/virtq_device.lua index 0c698b2ee1..2ac95bb35a 100644 --- a/src/lib/virtio/virtq_device.lua +++ b/src/lib/virtio/virtq_device.lua @@ -5,7 +5,6 @@ module(...,package.seeall) -local freelist = require("core.freelist") local lib = require("core.lib") local memory = require("core.memory") local ffi = require("ffi") diff --git a/src/program/lwaftr/setup.lua b/src/program/lwaftr/setup.lua index 873d08fb7a..fdd6f277a5 100644 --- a/src/program/lwaftr/setup.lua +++ b/src/program/lwaftr/setup.lua @@ -10,7 +10,7 @@ local pcap = require("apps.pcap.pcap") local bt = require("apps.lwaftr.binding_table") local ipv4_apps = require("apps.lwaftr.ipv4_apps") local ipv6_apps = require("apps.lwaftr.ipv6_apps") -local vlan = require("apps.lwaftr.vlan") +local vlan = require("apps.vlan.vlan") local ethernet = require("lib.protocol.ethernet") function lwaftr_app(c, conf) diff --git a/src/program/packetblaster/README b/src/program/packetblaster/README index 034c9303d0..13faf95b59 100644 --- a/src/program/packetblaster/README +++ b/src/program/packetblaster/README @@ -1,6 +1,7 @@ Usage: packetblaster replay packetblaster synth + packetblaster lwaftr Use --help for per-command usage. 
Example: diff --git a/src/program/packetblaster/lwaftr/README b/src/program/packetblaster/lwaftr/README new file mode 100644 index 0000000000..c575470d5a --- /dev/null +++ b/src/program/packetblaster/lwaftr/README @@ -0,0 +1,171 @@ +Generate IPv6 B4 and IPv4 Internet traffic towards lightweight 4o6 AFTR server +and report MPPS, Gbps and packet loss on received traffic. + +Usage: packetblaster lwaftr [OPTIONS] + + --pci PCI Interface PCI address, e.g. 0000:05:00.0 + + --tap TAP_INTERFACE Linux tap interface name, e.g. tap0 + + --int Linux network interface, e.g. eth0 + + --pcap PCAP_FILE Write packets into the pcap file. Program terminates + after one cycle of packets (COUNT x SIZES per protocol) + + --sock Socket name for virtio interface + + --vlan VLANID VLAN tag traffic with VLANID if set + + --src_mac SOURCE + Source MAC-Address + Default: 00:00:00:00:00:00 + --dst_mac DESTINATION + Destination MAC-Address + Default: 00:00:00:00:00:00 + --size SIZES + A comma separated list of numbers. Send packets of + SIZES bytes. The size specifies the length of the IPv4 + packet. The actual packet size on the wire is 14 Bytes + longer (Ethernet header). Smallest allowed IPv4 packet + size is 28 (20 Bytes for IPv4 header, 8 Bytes payload + for packet loss detection) + Default: 64,64,64,64,64,64,64,594,594,594,1500 (IMIX) + --b4 IPV6,IPV4,PORT + First B4 Client IPv6 mapped to IPv4 and UDP Port. + B4 IPv6,IPv4 and Port are incremented for every count, + then rolled over. Port is incremented by the port number: + e.g. 1024 -> 1024, 2048, 3072 .. 64512 (63 in total) + Default: 2001:db8::,10.0.0.0,1024 + --aftr IPV6 + IPv6 address of lwaftr server (softwire tunnel endpoint) + Default: 2001:db8:ffff::100 + --ipv4 IPV4 + Public IPv4. Used as source for IPv4 traffic and + as destination in IPv6 packets from B4 + Default: 8.8.8.8 + --count COUNT + Number of B4 clients to simulate. + Default: 1 + --rate RATE + Rate in MPPS for the generated traffic. Fractions are + allowed (e.g.
3.148 for IMIX line rate). If set too high, + the actual transmitted rate depends on the interface's capacity. + Setting rate to 0 puts it into listen-only mode while + reporting on incoming packets + Default: 1 MPPS + --v4only, -4 + Generate only IPv4 packets from the Internet towards lwaftr + --v6only, -6 + Generate only IPv6 packets from B4 to lwaftr + --duration DURATION + Run for DURATION seconds. + Default: unlimited + -V, --verbose + Display verbose link information every second + -h, --help + Print usage information. + +This tool generates two types of traffic according to RFC7596: + + - B4 IPv4-in-IPv6 traffic towards lwAFTR (left to right) + - Internet IPv4 to IPv4 traffic towards lwAFTR (right to left) + +Topology From RFC7596, Figure 1: + ++--------+ +---------+ IPv4-in-IPv6 +------+ +-------------+ +|IPv4 LAN|---|lwB4/NAPT|================|lwAFTR|---|IPv4 Internet| ++--------+ +---------+ +------+ +-------------+ +lw4o6 NAPT model: subscriber state in the lwAFTR, NAPT state in the lwB4 + +Example 1: Measure performance of single stick LWAFTR (handling IPv4 and IPv6 +traffic over a single interface).
Packetblaster lwaftr generates 50% IPv4 and +50% IPv6 encapsulated traffic at IMIX line rate: + + $ sudo ./snabb packetblaster lwaftr --rate 3.2 --count 1000000 --pci 0000:05:00.0 + packetblaster lwaftr: Sending 1000000 clients at 3.200 MPPS to 0000:05:00.0 + + IPv6: 2001:db8:: > 2001:db8:ffff::100: 10.0.0.0:1024 > 8.8.8.8:12345 + source IPv6 and source IPv4/Port adjusted per client + IPv6 packet sizes: 104,104,104,104,104,104,104,634,634,634,1540 + + IPv4: 8.8.8.8:12345 > 10.0.0.0:1024 + destination IPv4 and Port adjusted per client + IPv4 packet sizes: 64,64,64,64,64,64,64,594,594,594,1500 + + v6+v4: 1.416+1.416 = 2.833 MPPS, 4.445+3.983 = 8.428 Gbps, lost 190697 pkts + v6+v4: 1.574+1.574 = 3.148 MPPS, 4.949+4.446 = 9.395 Gbps, lost 46 pkts + v6+v4: 1.574+1.574 = 3.148 MPPS, 4.950+4.446 = 9.395 Gbps, lost 0 pkts + v6+v4: 1.574+1.574 = 3.148 MPPS, 4.950+4.446 = 9.395 Gbps, lost 0 pkts + v6+v4: 1.574+1.574 = 3.148 MPPS, 4.950+4.446 = 9.395 Gbps, lost 0 pkts + v6+v4: 1.574+1.574 = 3.148 MPPS, 4.950+4.446 = 9.395 Gbps, lost 0 pkts + v6+v4: 1.574+1.574 = 3.148 MPPS, 4.950+4.446 = 9.395 Gbps, lost 0 pkts + +Example 2: Uni-directional throughput test with two instances. +Run client1 with a rate of 0 MPPS and client2 on the other end of the wire +with the desired packet rate. 
The first instance will report on received +traffic: + + client1$ sudo ./snabb packetblaster lwaftr --rate 0 --pci 0000:05:00.0 --size 40 + packetblaster lwaftr: Sending 1 clients at 0.000 MPPS to 0000:05:00.0 + + IPv6: 2001:db8:: > 2001:db8:ffff::100: 10.0.0.0:1024 > 8.8.8.8:12345 + source IPv6 and source IPv4/Port adjusted per client + IPv6 packet sizes: 80 + + IPv4: 8.8.8.8:12345 > 10.0.0.0:1024 + destination IPv4 and Port adjusted per client + IPv4 packet sizes: 40 + + v6+v4: 2.364+2.370 = 4.734 MPPS, 1.778+1.137 = 2.915 Gbps, lost 1513191 pkts + v6+v4: 2.499+2.499 = 4.999 MPPS, 1.880+1.200 = 3.079 Gbps, lost 1073 pkts + v6+v4: 2.500+2.500 = 5.000 MPPS, 1.880+1.200 = 3.080 Gbps, lost 0 pkts + v6+v4: 2.500+2.500 = 4.999 MPPS, 1.880+1.200 = 3.080 Gbps, lost 633 pkts + v6+v4: 2.500+2.500 = 5.000 MPPS, 1.880+1.200 = 3.080 Gbps, lost 0 pkts + v6+v4: 2.500+2.500 = 5.000 MPPS, 1.880+1.200 = 3.080 Gbps, lost 54 pkts + v6+v4: 2.500+2.500 = 5.000 MPPS, 1.880+1.200 = 3.080 Gbps, lost 0 pkts + v6+v4: 2.500+2.500 = 5.000 MPPS, 1.880+1.200 = 3.080 Gbps, lost 0 pkts + v6+v4: 2.500+2.500 = 5.000 MPPS, 1.880+1.200 = 3.080 Gbps, lost 0 pkts + v6+v4: 2.500+2.500 = 5.000 MPPS, 1.880+1.200 = 3.080 Gbps, lost 0 pkts + + client2$ sudo ./snabb packetblaster lwaftr --rate 5 --size 40 --pci 0000:04:00.0 + packetblaster lwaftr: Sending 1 clients at 5.000 MPPS to 0000:04:00.0 + + IPv6: 2001:db8:: > 2001:db8:ffff::100: 10.0.0.0:1024 > 8.8.8.8:12345 + source IPv6 and source IPv4/Port adjusted per client + IPv6 packet sizes: 80 + + IPv4: 8.8.8.8:12345 > 10.0.0.0:1024 + destination IPv4 and Port adjusted per client + IPv4 packet sizes: 40 + + v6+v4: 0.000+0.000 = 0.000 MPPS, 0.000+0.000 = 0.000 Gbps, lost 0 pkts + v6+v4: 0.000+0.000 = 0.000 MPPS, 0.000+0.000 = 0.000 Gbps, lost 0 pkts + v6+v4: 0.000+0.000 = 0.000 MPPS, 0.000+0.000 = 0.000 Gbps, lost 0 pkts + v6+v4: 0.000+0.000 = 0.000 MPPS, 0.000+0.000 = 0.000 Gbps, lost 0 pkts + v6+v4: 0.000+0.000 = 0.000 MPPS, 0.000+0.000 = 0.000 Gbps, lost 0 
pkts + v6+v4: 0.000+0.000 = 0.000 MPPS, 0.000+0.000 = 0.000 Gbps, lost 0 pkts + +Example with Linux tap interface: + + $ sudo ip tuntap add dev tap0 mode tap + $ sudo ifconfig tap0 mtu 9000 up + $ sudo tcpdump -n -i tap0 -e -s 1500 -c 12 & + $ sudo ./snabb packetblaster lwaftr --rate 0.001 --tap tap0 -4 -D 1 + packetblaster lwaftr: Sending 1 clients at 0.001 MPPS to tap0 + + IPv4: 8.8.8.8:12345 > 10.0.0.0:1024 + destination IPv4 and Port adjusted per client + IPv4 packet sizes: 64,64,64,64,64,64,64,594,594,594,1500 + + 14:04:43.572712 00:00:00:00:00:00 > 00:00:00:00:00:00, ethertype IPv4 (0x0800), length 78: 8.8.8.8.12345 > 0.0.0.0.1024: UDP, length 28 + 14:04:43.572733 00:00:00:00:00:00 > 00:00:00:00:00:00, ethertype IPv4 (0x0800), length 78: 8.8.8.8.12345 > 0.0.0.0.1024: UDP, length 28 + 14:04:43.572741 00:00:00:00:00:00 > 00:00:00:00:00:00, ethertype IPv4 (0x0800), length 78: 8.8.8.8.12345 > 0.0.0.0.1024: UDP, length 28 + 14:04:43.572749 00:00:00:00:00:00 > 00:00:00:00:00:00, ethertype IPv4 (0x0800), length 78: 8.8.8.8.12345 > 0.0.0.0.1024: UDP, length 28 + 14:04:43.572756 00:00:00:00:00:00 > 00:00:00:00:00:00, ethertype IPv4 (0x0800), length 78: 8.8.8.8.12345 > 0.0.0.0.1024: UDP, length 28 + 14:04:43.572780 00:00:00:00:00:00 > 00:00:00:00:00:00, ethertype IPv4 (0x0800), length 78: 8.8.8.8.12345 > 0.0.0.0.1024: UDP, length 28 + 14:04:43.572788 00:00:00:00:00:00 > 00:00:00:00:00:00, ethertype IPv4 (0x0800), length 78: 8.8.8.8.12345 > 0.0.0.0.1024: UDP, length 28 + 14:04:43.572866 00:00:00:00:00:00 > 00:00:00:00:00:00, ethertype IPv4 (0x0800), length 608: 8.8.8.8.12345 > 0.0.0.0.1024: UDP, length 558 + 14:04:43.573019 00:00:00:00:00:00 > 00:00:00:00:00:00, ethertype IPv4 (0x0800), length 608: 8.8.8.8.12345 > 0.0.0.0.1024: UDP, length 558 + 14:04:43.573161 00:00:00:00:00:00 > 00:00:00:00:00:00, ethertype IPv4 (0x0800), length 608: 8.8.8.8.12345 > 0.0.0.0.1024: UDP, length 558 + 14:04:43.573297 00:00:00:00:00:00 > 00:00:00:00:00:00, ethertype IPv4 (0x0800), 
length 1514: 8.8.8.8.12345 > 0.0.0.0.1024: UDP, length 1464 + 14:04:43.573430 00:00:00:00:00:00 > 00:00:00:00:00:00, ethertype IPv4 (0x0800), length 78: 8.8.8.8.12345 > 0.0.0.0.1024: UDP, length 28 diff --git a/src/program/packetblaster/lwaftr/README.inc b/src/program/packetblaster/lwaftr/README.inc new file mode 120000 index 0000000000..100b93820a --- /dev/null +++ b/src/program/packetblaster/lwaftr/README.inc @@ -0,0 +1 @@ +README \ No newline at end of file diff --git a/src/program/packetblaster/lwaftr/lwaftr.lua b/src/program/packetblaster/lwaftr/lwaftr.lua new file mode 100644 index 0000000000..90dd22c879 --- /dev/null +++ b/src/program/packetblaster/lwaftr/lwaftr.lua @@ -0,0 +1,252 @@ +-- Use of this source code is governed by the Apache 2.0 license; see COPYING. + +module(..., package.seeall) + +local engine = require("core.app") +local config = require("core.config") +local timer = require("core.timer") +local pci = require("lib.hardware.pci") +local intel10g = require("apps.intel.intel10g") +local intel_app = require("apps.intel.intel_app") +local main = require("core.main") +local S = require("syscall") +local Lwaftrgen = require("apps.test.lwaftr").Lwaftrgen +local Tap = require("apps.tap.tap").Tap +local raw = require("apps.socket.raw") +local pcap = require("apps.pcap.pcap") +local VhostUser = require("apps.vhost.vhost_user").VhostUser +local lib = require("core.lib") +local ffi = require("ffi") +local C = ffi.C + +local usage = require("program.packetblaster.lwaftr.README_inc") + +local long_opts = { + pci = "p", -- PCI address + tap = "t", -- tap interface + int = "i", -- Linux network interface, e.g. 
eth0 + sock = "k", -- socket name for virtio + duration = "D", -- terminate after n seconds + verbose = "V", -- verbose, display stats + help = "h", -- display help text + size = "S", -- packet size list (defaults to IMIX) + src_mac = "s", -- source ethernet address + dst_mac = "d", -- destination ethernet address + vlan = "v", -- VLAN id + b4 = "b", -- B4 start IPv6_address,IPv4_address,port + aftr = "a", -- fix AFTR public IPv6_address + ipv4 = "I", -- fix public IPv4 address + count = "c", -- how many b4 clients to simulate + rate = "r", -- rate in MPPS (0 => listen only) + v4only = "4", -- generate only public IPv4 traffic + v6only = "6", -- generate only public IPv6 encapsulated traffic + pcap = "o" -- output packet to the pcap file +} + +local function dir_exists(path) + local stat = S.stat(path) + return stat and stat.isdir +end + +function run (args) + local opt = {} + local duration + local c = config.new() + + function opt.D (arg) + duration = assert(tonumber(arg), "duration is not a number!") + end + + local verbose + function opt.V (arg) + verbose = true + end + + function opt.h (arg) + print(usage) + main.exit(0) + end + + local sizes = { 64, 64, 64, 64, 64, 64, 64, 594, 594, 594, 1500 } + local sizes_ipv6 = { 104, 104, 104, 104, 104, 104, 104, 634, 634, 634, 1540 } + function opt.S (arg) + sizes = {} + sizes_ipv6 = {} + for size in string.gmatch(arg, "%d+") do + local s = tonumber(size) + if s < 28 then + s = 28 + print("Warning: Increasing IPv4 packet size to 28") + end + sizes[#sizes+1] = s + sizes_ipv6[#sizes_ipv6+1] = s + 40 + end + end + + local src_mac = "00:00:00:00:00:00" + function opt.s (arg) src_mac = arg end + + local dst_mac = "00:00:00:00:00:00" + function opt.d (arg) dst_mac = arg end + + local b4_ipv6, b4_ipv4, b4_port = "2001:db8::", "10.0.0.0", 1024 + function opt.b (arg) + for s in string.gmatch(arg, "[%w.:]+") do + if string.find(s, ":") then + b4_ipv6 = s + elseif string.find(s, '.',1,true) then + b4_ipv4 = s + else + b4_port = 
assert(tonumber(s), string.format("UDP port %s is not a number!", s)) + end + end + end + + local public_ipv4 = "8.8.8.8" + function opt.I (arg) public_ipv4 = arg end + + local aftr_ipv6 = "2001:db8:ffff::100" + function opt.a (arg) aftr_ipv6 = arg end + + local count = 1 + function opt.c (arg) + count = assert(tonumber(arg), "count is not a number!") + end + + local rate = 1 + function opt.r (arg) + rate = assert(tonumber(arg), "rate is not a number!") + end + + local target + local pciaddr + function opt.p (arg) + pciaddr = arg + target = pciaddr + end + + local tap_interface + function opt.t (arg) + tap_interface = arg + target = tap_interface + end + + local int_interface + function opt.i (arg) + int_interface = arg + target = int_interface + end + + local sock_interface + function opt.k (arg) + sock_interface = arg + target = sock_interface + end + + local ipv4_only = false + function opt.v4 () ipv4_only = true end + opt["4"] = opt.v4 + + local ipv6_only = false + function opt.v6 () ipv6_only = true end + opt["6"] = opt.v6 + + local vlan = nil + function opt.v (arg) + vlan = assert(tonumber(arg), "vlan is not a number!") + end + + local pcap_file, single_pass + function opt.o (arg) + pcap_file = arg + target = pcap_file + single_pass = true + end + + args = lib.dogetopt(args, opt, "VD:hS:s:a:d:b:I:c:r:46p:v:o:t:i:k:", long_opts) + + if not target then + print("one of --pci, --tap, --sock, --int or --pcap is required") + main.exit(1) + end + + print(string.format("packetblaster lwaftr: Sending %d clients at %.3f MPPS to %s", count, rate, target)) + print() + + if not ipv4_only then + print(string.format("IPv6: %s > %s: %s:%d > %s:12345", b4_ipv6, aftr_ipv6, b4_ipv4, b4_port, public_ipv4)) + print(" source IPv6 and source IPv4/Port adjusted per client") + print("IPv6 packet sizes: " .. 
table.concat(sizes_ipv6,",")) + end + + if not ipv6_only then + print() + print(string.format("IPv4: %s:12345 > %s:%d", public_ipv4, b4_ipv4, b4_port)) + print(" destination IPv4 and Port adjusted per client") + print("IPv4 packet sizes: " .. table.concat(sizes,",")) + end + + if ipv4_only and ipv6_only then + print("Remove options v4only and v6only to generate both") + main.exit(1) + end + + config.app(c, "generator", Lwaftrgen, { + sizes = sizes, count = count, aftr_ipv6 = aftr_ipv6, rate = rate, + src_mac = src_mac, dst_mac = dst_mac, vlan = vlan, + b4_ipv6 = b4_ipv6, b4_ipv4 = b4_ipv4, b4_port = b4_port, + public_ipv4 = public_ipv4, single_pass = single_pass, + ipv4_only = ipv4_only, ipv6_only = ipv6_only }) + + local input, output + + if tap_interface then + if dir_exists(("/sys/devices/virtual/net/%s"):format(tap_interface)) then + config.app(c, "tap", Tap, tap_interface) + input, output = "tap.input", "tap.output" + else + print(string.format("tap interface %s doesn't exist", tap_interface)) + main.exit(1) + end + elseif pciaddr then + local device_info = pci.device_info(pciaddr) + if vlan then + print(string.format("vlan set to %d", vlan)) + end + if device_info then + config.app(c, "nic", require(device_info.driver).driver, + {pciaddr = pciaddr, vmdq = false, macaddr = src_mac, mtu = 9500}) + input, output = "nic.rx", "nic.tx" + else + error(("Couldn't find device info for PCI device %s"):format(pciaddr)) + end + elseif int_interface then + config.app(c, "int", raw.RawSocket, int_interface) + input, output = "int.rx", "int.tx" + elseif sock_interface then + config.app(c, "virtio", VhostUser, { socket_path=sock_interface } ) + input, output = "virtio.rx", "virtio.tx" + else + config.app(c, "pcap", pcap.PcapWriter, pcap_file) + input, output = "pcap.input", "pcap.output" + end + + config.link(c, output .. " -> generator.input") + config.link(c, "generator.output -> " .. 
input) + + engine.busywait = true + engine.configure(c) + + if verbose then + print ("enabling verbose") + local fn = function () + print("Transmissions (last 1 sec):") + engine.report_apps() + end + local t = timer.new("report", fn, 1e9, 'repeating') + timer.activate(t) + end + + if duration then engine.main({duration=duration}) + else engine.main() end +end diff --git a/src/program/packetblaster/lwaftr/test_lwaftr_1.pcap b/src/program/packetblaster/lwaftr/test_lwaftr_1.pcap new file mode 100644 index 0000000000..90b0622b61 Binary files /dev/null and b/src/program/packetblaster/lwaftr/test_lwaftr_1.pcap differ diff --git a/src/program/packetblaster/lwaftr/test_lwaftr_2.pcap b/src/program/packetblaster/lwaftr/test_lwaftr_2.pcap new file mode 100644 index 0000000000..cefd6f4392 Binary files /dev/null and b/src/program/packetblaster/lwaftr/test_lwaftr_2.pcap differ diff --git a/src/program/packetblaster/packetblaster.lua b/src/program/packetblaster/packetblaster.lua index ce0e7e327d..4b1c11b77f 100644 --- a/src/program/packetblaster/packetblaster.lua +++ b/src/program/packetblaster/packetblaster.lua @@ -3,100 +3,13 @@ module(..., package.seeall) local engine = require("core.app") -local config = require("core.config") local timer = require("core.timer") -local pci = require("lib.hardware.pci") local intel10g = require("apps.intel.intel10g") -local intel_app = require("apps.intel.intel_app") -local basic_apps = require("apps.basic.basic_apps") -local main = require("core.main") -local PcapReader= require("apps.pcap.pcap").PcapReader -local Synth = require("apps.test.synth").Synth +local lib = require("core.lib") +local pci = require("lib.hardware.pci") local LoadGen = require("apps.intel.loadgen").LoadGen -local lib = require("core.lib") -local ffi = require("ffi") -local C = ffi.C - -local usage = require("program.packetblaster.README_inc") -local usage_replay = require("program.packetblaster.replay.README_inc") -local usage_synth = 
require("program.packetblaster.synth.README_inc") - -local long_opts = { - duration = "D", - help = "h", - src = "s", - dst = "d", - sizes = "S" -} - -function run (args) - local opt = {} - local mode = table.remove(args, 1) - local duration - local c = config.new() - function opt.D (arg) - duration = assert(tonumber(arg), "duration is not a number!") - end - function opt.h (arg) - if mode == 'replay' then print(usage_replay) - elseif mode == 'synth' then print(usage_synth) - else print(usage) end - main.exit(1) - end - if mode == 'replay' and #args > 1 then - args = lib.dogetopt(args, opt, "hD:", long_opts) - local filename = table.remove(args, 1) - config.app(c, "pcap", PcapReader, filename) - config.app(c, "loop", basic_apps.Repeater) - config.app(c, "source", basic_apps.Tee) - config.link(c, "pcap.output -> loop.input") - config.link(c, "loop.output -> source.input") - elseif mode == 'synth' and #args >= 1 then - local source - local destination - local sizes - function opt.s (arg) source = arg end - function opt.d (arg) destination = arg end - function opt.S (arg) - sizes = {} - for size in string.gmatch(arg, "%d+") do - sizes[#sizes+1] = tonumber(size) - end - end - - args = lib.dogetopt(args, opt, "hD:s:d:S:", long_opts) - config.app(c, "source", Synth, { sizes = sizes, - src = source, - dst = destination }) - else - opt.h() - end - local patterns = args - local nics = 0 - pci.scan_devices() - for _,device in ipairs(pci.devices) do - if is_device_suitable(device, patterns) then - nics = nics + 1 - local name = "nic"..nics - config.app(c, name, LoadGen, device.pciaddress) - config.link(c, "source."..tostring(nics).."->"..name..".input") - end - end - assert(nics > 0, " matches no suitable devices.") - engine.busywait = true - intel10g.num_descriptors = 32*1024 - engine.configure(c) - local fn = function () - print("Transmissions (last 1 sec):") - engine.report_apps() - end - local t = timer.new("report", fn, 1e9, 'repeating') - timer.activate(t) - if duration 
then engine.main({duration=duration}) - else engine.main() end -end -function is_device_suitable (pcidev, patterns) +local function is_device_suitable (pcidev, patterns) if not pcidev.usable or pcidev.driver ~= 'apps.intel.intel_app' then return false end @@ -110,3 +23,40 @@ function is_device_suitable (pcidev, patterns) end end +function run_loadgen (c, patterns, duration) + local nics = 0 + pci.scan_devices() + for _,device in ipairs(pci.devices) do + if is_device_suitable(device, patterns) then + nics = nics + 1 + local name = "nic"..nics + config.app(c, name, LoadGen, device.pciaddress) + config.link(c, "source."..tostring(nics).."->"..name..".input") + end + end + assert(nics > 0, " matches no suitable devices.") + engine.busywait = true + engine.configure(c) + local fn = function () + print("Transmissions (last 1 sec):") + engine.report_apps() + end + local t = timer.new("report", fn, 1e9, 'repeating') + timer.activate(t) + if duration then engine.main({duration=duration}) + else engine.main() end +end +local function show_usage(exit_code) + print(require("program.packetblaster.README_inc")) + main.exit(exit_code) +end + +function run(args) + if #args == 0 then show_usage(1) end + local command = string.gsub(table.remove(args, 1), "-", "_") + local modname = ("program.packetblaster.%s.%s"):format(command, command) + if not lib.have_module(modname) then + show_usage(1) + end + require(modname).run(args) +end diff --git a/src/program/packetblaster/replay/replay.lua b/src/program/packetblaster/replay/replay.lua new file mode 100644 index 0000000000..53f7715e3b --- /dev/null +++ b/src/program/packetblaster/replay/replay.lua @@ -0,0 +1,40 @@ +-- Use of this source code is governed by the Apache 2.0 license; see COPYING. 
+ +module(..., package.seeall) + +local config = require("core.config") +local basic_apps = require("apps.basic.basic_apps") +local main = require("core.main") +local PcapReader = require("apps.pcap.pcap").PcapReader +local lib = require("core.lib") + +local packetblaster = require("program.packetblaster.packetblaster") +local usage = require("program.packetblaster.replay.README_inc") + +local long_opts = { + duration = "D", + help = "h" +} + +function run (args) + local opt = {} + local duration + local c = config.new() + function opt.D (arg) + duration = assert(tonumber(arg), "duration is not a number!") + end + function opt.h (arg) + print(usage) + main.exit(1) + end + + args = lib.dogetopt(args, opt, "hD:", long_opts) + local filename = table.remove(args, 1) + print (string.format("filename=%s", filename)) + config.app(c, "pcap", PcapReader, filename) + config.app(c, "loop", basic_apps.Repeater) + config.app(c, "source", basic_apps.Tee) + config.link(c, "pcap.output -> loop.input") + config.link(c, "loop.output -> source.input") + packetblaster.run_loadgen(c, args, duration) +end diff --git a/src/program/packetblaster/selftest.sh b/src/program/packetblaster/selftest.sh index 3c726af70e..1a2ca09f32 100755 --- a/src/program/packetblaster/selftest.sh +++ b/src/program/packetblaster/selftest.sh @@ -1,6 +1,45 @@ #!/usr/bin/env bash echo "selftest: packetblaster" + +# do tests first that don't require PCI + +function test_lwaftr_pcap { + PCAP=$1 + shift + TEMP_PCAP=/tmp/lwaftr$$.pcap + echo "testing lwaftr pcap $PCAP ..." + ./snabb packetblaster lwaftr --pcap $TEMP_PCAP $@ + status=$? 
+ if [ $status != 0 ]; then + echo "Error: lwaftr pcap generation failed for ${PCAP} with ${status}" + rm $TEMP_PCAP + exit 1 + fi + if ! cmp $TEMP_PCAP $PCAP; then + echo "Error: lwaftr generated pcap differs from ${PCAP}" + rm $TEMP_PCAP + exit 1 + fi + rm $TEMP_PCAP +} + +test_lwaftr_pcap program/packetblaster/lwaftr/test_lwaftr_1.pcap --count 1 +test_lwaftr_pcap program/packetblaster/lwaftr/test_lwaftr_2.pcap --count 2 --vlan 100 --size 0 + +# lwaftr tap test +sudo ip netns add snabbtest || exit $TEST_SKIPPED +sudo ip netns exec snabbtest ip tuntap add tap0 mode tap +sudo ip netns exec snabbtest ip link set up dev tap0 +sudo ip netns exec snabbtest ./snabb packetblaster lwaftr --tap tap0 -D 1 +status=$? +ip netns exec snabbtest ifconfig tap0 +sudo ip netns delete snabbtest +if [ $status != 0 ]; then + echo "Error: lwaftr tap failed for tap0 with ${status}" + exit 1 +fi + export PCIADDR=$SNABB_PCI_INTEL0 [ ! -z "$PCIADDR" ] || export PCIADDR=$SNABB_PCI0 if [ -z "${PCIADDR}" ]; then @@ -24,4 +63,11 @@ if [ $status != 124 ]; then exit 1 fi +timeout 5 ./snabb packetblaster lwaftr --pci ${PCIADDR} +status=$? +if [ $status != 124 ]; then + echo "Error: expected timeout (124) but got ${status}" + exit 1 +fi + echo "selftest: ok" diff --git a/src/program/packetblaster/synth/synth.lua b/src/program/packetblaster/synth/synth.lua new file mode 100644 index 0000000000..d657d5b654 --- /dev/null +++ b/src/program/packetblaster/synth/synth.lua @@ -0,0 +1,49 @@ +-- Use of this source code is governed by the Apache 2.0 license; see COPYING. 
+ +module(..., package.seeall) + +local config = require("core.config") +local main = require("core.main") +local Synth = require("apps.test.synth").Synth +local lib = require("core.lib") + +local packetblaster = require("program.packetblaster.packetblaster") +local usage = require("program.packetblaster.synth.README_inc") + +local long_opts = { + duration = "D", + help = "h", + src = "s", + dst = "d", + sizes = "S" +} + +function run (args) + local opt = {} + local duration + local c = config.new() + function opt.D (arg) + duration = assert(tonumber(arg), "duration is not a number!") + end + function opt.h (arg) + print(usage) + main.exit(1) + end + + local source + local destination + local sizes + function opt.s (arg) source = arg end + function opt.d (arg) destination = arg end + function opt.S (arg) + sizes = {} + for size in string.gmatch(arg, "%d+") do + sizes[#sizes+1] = tonumber(size) + end + end + + args = lib.dogetopt(args, opt, "hD:s:d:S:", long_opts) + config.app(c, "source", Synth, { sizes = sizes, + src = source, dst = destination }) + packetblaster.run_loadgen(c, args, duration) +end diff --git a/src/program/snabbmark/snabbmark.lua b/src/program/snabbmark/snabbmark.lua index 8a554ffa8b..e1122f7b09 100644 --- a/src/program/snabbmark/snabbmark.lua +++ b/src/program/snabbmark/snabbmark.lua @@ -7,7 +7,6 @@ local usage = require("program.snabbmark.README_inc") local basic_apps = require("apps.basic.basic_apps") local pci = require("lib.hardware.pci") local ethernet = require("lib.protocol.ethernet") -local freelist = require("core.freelist") local lib = require("core.lib") local ffi = require("ffi") local C = ffi.C diff --git a/src/program/snabbnfv/neutron2snabb/neutron2snabb.lua b/src/program/snabbnfv/neutron2snabb/neutron2snabb.lua index 44523cdd87..45a8d81ab1 100644 --- a/src/program/snabbnfv/neutron2snabb/neutron2snabb.lua +++ b/src/program/snabbnfv/neutron2snabb/neutron2snabb.lua @@ -32,8 +32,8 @@ default_schemas = { ports = {'tenant_id', 'id', 
'name', 'network_id', 'mac_address', 'admin_state_up', 'status', 'device_id', 'device_owner'}, - ml2_port_bindings = {'port_id', 'host', 'vif_type', 'driver', 'segment', - 'vnic_type', 'vif_details', 'profile'}, + ml2_port_bindings = {'port_id', 'host', 'vif_type', + 'vnic_type', 'profile', 'vif_details'}, securitygrouprules = {'tenant_id', 'id', 'security_group_id', 'remote_group_id', 'direction', 'ethertype', 'protocol', 'port_range_min', 'port_range_max', @@ -81,10 +81,10 @@ function create_config (input_dir, output_dir, hostname) for _, port in pairs(ports) do print("PortID: ", port.id) local binding = port_bindings[port.id] - -- If the port is a 'snabb' port, lives on our host and is online + -- If the port is a 'vhostuser' port, lives on our host and is online -- then we compile its configuration. - print("BindingID ", binding.id, " has driver ", binding.driver) - if binding.driver == "snabb" then + print("BindingID ", binding.id, " has vif_type ", binding.vif_type) + if binding.vif_type == "vhostuser" then local vif_details = json.decode(binding.vif_details) -- See https://github.com/snabbco/snabb/pull/423 local profile = vif_details["binding:profile"] diff --git a/src/program/snabbnfv/test_env/test_env.sh b/src/program/snabbnfv/test_env/test_env.sh index 834ed88593..4bb0ab1c7f 100644 --- a/src/program/snabbnfv/test_env/test_env.sh +++ b/src/program/snabbnfv/test_env/test_env.sh @@ -32,7 +32,11 @@ export qemu_smp=$QUEUES export qemu_vectors=$((2*$QUEUES + 1)) export sockets="" -export assets=$HOME/.test_env +if [ -z "$SNABB_TEST_FIXTURES" ]; then + export assets=$HOME/.test_env +else + export assets=$SNABB_TEST_FIXTURES +fi export qemu=qemu/obj/x86_64-softmmu/qemu-system-x86_64 export host_qemu=$(which qemu-system-x86_64) diff --git a/src/program/snabbnfv/test_fixtures/neutron_csv/ml2_port_bindings.sql b/src/program/snabbnfv/test_fixtures/neutron_csv/ml2_port_bindings.sql index 78e08e223e..5e84888cb2 100644 --- 
a/src/program/snabbnfv/test_fixtures/neutron_csv/ml2_port_bindings.sql +++ b/src/program/snabbnfv/test_fixtures/neutron_csv/ml2_port_bindings.sql @@ -24,15 +24,11 @@ CREATE TABLE `ml2_port_bindings` ( `port_id` varchar(36) NOT NULL, `host` varchar(255) NOT NULL, `vif_type` varchar(64) NOT NULL, - `driver` varchar(64) DEFAULT NULL, - `segment` varchar(36) DEFAULT NULL, `vnic_type` varchar(64) NOT NULL DEFAULT 'normal', - `vif_details` varchar(4095) NOT NULL DEFAULT '', `profile` varchar(4095) NOT NULL DEFAULT '', + `vif_details` varchar(4095) NOT NULL DEFAULT '', PRIMARY KEY (`port_id`), - KEY `segment` (`segment`), CONSTRAINT `ml2_port_bindings_ibfk_1` FOREIGN KEY (`port_id`) REFERENCES `ports` (`id`) ON DELETE CASCADE, - CONSTRAINT `ml2_port_bindings_ibfk_2` FOREIGN KEY (`segment`) REFERENCES `ml2_network_segments` (`id`) ON DELETE SET NULL ) ENGINE=InnoDB DEFAULT CHARSET=utf8; /*!40101 SET character_set_client = @saved_cs_client */; diff --git a/src/program/snabbnfv/test_fixtures/neutron_csv/ml2_port_bindings.txt b/src/program/snabbnfv/test_fixtures/neutron_csv/ml2_port_bindings.txt index 26d233bfb7..29166d977b 100644 --- a/src/program/snabbnfv/test_fixtures/neutron_csv/ml2_port_bindings.txt +++ b/src/program/snabbnfv/test_fixtures/neutron_csv/ml2_port_bindings.txt @@ -1,2 +1,2 @@ -523276c7-73e3-4154-8b67-9c7199bdbb8c cdn1 vhostuser snabb eb949b3f-9430-4999-bf5d-09ef6032c1cc normal {"zone_port": "port0", "zone_ip": "2006::16", "port_filter": true, "zone_vlan": 243, "zone_gbps": 8.0, "zone_host": "cdn1", "binding:profile": {"rx_police_gbps":3.0, "tx_police_gbps":1.0, "tunnel_type": "L2TPv3", "l2tpv3_remote_ip":"1:2:3:4:5:6:7:8","l2tpv3_session":16,"l2tpv3_local_cookie":"616263","l2tpv3_remote_cookie":"646566","l2tpv3_next_hop":"2:2:2:2:2:2:2:2"}} \N -9745ff46-986f-4f74-bc37-a35f481c0b9b cdn1 vhostuser snabb 39b025e0-5ee4-4054-b851-3179dddc10b4 normal {"zone_port": "port2", "zone_ip": "2007:2::1a", "port_filter": true, "zone_vlan": 442, "zone_gbps": 8.0, 
"zone_host": "cdn1", "binding:profile": {"packetfilter":"stateless"}} {} +523276c7-73e3-4154-8b67-9c7199bdbb8c cdn1 vhostuser normal \N {"zone_port": "port0", "zone_ip": "2006::16", "port_filter": true, "zone_vlan": 243, "zone_gbps": 8.0, "zone_host": "cdn1", "binding:profile": {"rx_police_gbps":3.0, "tx_police_gbps":1.0, "tunnel_type": "L2TPv3", "l2tpv3_remote_ip":"1:2:3:4:5:6:7:8","l2tpv3_session":16,"l2tpv3_local_cookie":"616263","l2tpv3_remote_cookie":"646566","l2tpv3_next_hop":"2:2:2:2:2:2:2:2"}} +9745ff46-986f-4f74-bc37-a35f481c0b9b cdn1 vhostuser normal {} {"zone_port": "port2", "zone_ip": "2007:2::1a", "port_filter": true, "zone_vlan": 442, "zone_gbps": 8.0, "zone_host": "cdn1", "binding:profile": {"packetfilter":"stateless"}} diff --git a/src/scripts/snabb_bot.sh b/src/scripts/snabb_bot.sh index 07b3c819f6..548e48770f 100755 --- a/src/scripts/snabb_bot.sh +++ b/src/scripts/snabb_bot.sh @@ -5,7 +5,6 @@ export SNABBBOTDIR=${SNABBBOTDIR:-"/tmp/snabb_bot"} export REPO=${REPO:-"snabbco/snabb"} -export CURRENT=${CURRENT:-"master"} export JQ=${JQ:-$(which jq)} export SNABB_TEST_IMAGE=${SNABB_TEST_IMAGE:-eugeneia/snabb-nfv-test} export CONTEXT=${CONTEXT:-"$(hostname)-$SNABB_TEST_IMAGE"} @@ -46,14 +45,23 @@ function pull_request_head { echo "$(pull_request_by_id $1)" | "$JQ" -r ".head.sha" } +function pull_request_target { + echo "$(pull_request_by_id $1)" | "$JQ" -r ".base.ref" +} + function repo_path { echo "$tmpdir/repo"; } -function current_head { - (cd $(repo_path) && git log --format=%H -n1 $CURRENT) +function target_head { + (cd $(repo_path) && git rev-parse --verify $(pull_request_target $1)) +} + +function ensure_target_fetched { + (cd $(repo_path) && \ + (git rev-parse --verify $1 >/dev/null 2>&1 || git fetch origin $1:$1)) } function pull_request_log { - echo "$logdir/$(current_head)+$(pull_request_head $id)" + echo "$logdir/$(target_head $1)+$(pull_request_head $1)" } function pull_request_new_p { @@ -83,7 +91,7 @@ function log_header { echo Host: 
$machine echo Image: $SNABB_TEST_IMAGE echo Pull Request: \#$1 - echo Current Head: $(current_head) + echo Target Head: $(target_head $1) echo Pull Request Head: $(pull_request_head $1) pci_info SNABB_PCI0 $SNABB_PCI0 pci_info SNABB_PCI1 $SNABB_PCI1 @@ -96,21 +104,21 @@ function log_header { function benchmark_results { echo $tmpdir/$1_benchmarks; } -function benchmark_current1 { - git checkout --force $CURRENT \ +function benchmark_target1 { + git checkout --force $(target_head $1) \ && build \ - && dock_make benchmarks > $(benchmark_results current) + && dock_make benchmarks > $(benchmark_results $(pull_request_target $1)) } -function benchmark_current { benchmark_current1 >/dev/null 2>&1; } +function benchmark_target { benchmark_target1 $1 >/dev/null 2>&1; } -function merge_pr_with_current1 { +function merge_pr_with_target1 { git fetch origin pull/$1/head:pr$1 \ && git checkout --force pr$1 \ - && git merge $CURRENT \ + && git merge $(target_head $1) \ && build } -function merge_pr_with_current { - out=$(merge_pr_with_current1 $1 2>&1) +function merge_pr_with_target { + out=$(merge_pr_with_target1 $1 2>&1) if [ "$?" 
!= 0 ]; then echo "ERROR: Failed to build $1" echo "$out" @@ -123,11 +131,13 @@ function dock_make { (cd src/; scripts/dock.sh make $1); } function check_for_performance_regressions { echo "Checking for performance regressions:" - dock_make benchmarks > $(benchmark_results pr) - for bench in $(cut -d " " -f 1 $(benchmark_results pr)); do - if grep $bench $(benchmark_results current) >/dev/null 2>&1; then - echo $(grep $bench $(benchmark_results current)) \ - $(grep $bench $(benchmark_results pr)) \ + local head=$(pull_request_head $1) + local target=$(pull_request_target $1) + dock_make benchmarks > $(benchmark_results $head) + for bench in $(cut -d " " -f 1 $(benchmark_results $head)); do + if grep $bench $(benchmark_results $target) >/dev/null 2>&1; then + echo $(grep $bench $(benchmark_results $target)) \ + $(grep $bench $(benchmark_results $head)) \ | awk ' BEGIN { minratio = 0.85; @@ -192,12 +202,15 @@ EOF init fetch_pull_requests && clone_upstream || exit 1 for id in $(pull_request_ids); do - pull_request_new_p $id || continue + ensure_target_fetched $(pull_request_target $id) \ + && pull_request_new_p $id \ + || continue (cd $(repo_path) - [ -f $(benchmark_results current) ] || benchmark_current + [ -f $(benchmark_results $(pull_request_target $id)) ] \ + || benchmark_target $id log_header $id - if merge_pr_with_current $id; then - check_for_performance_regressions + if merge_pr_with_target $id; then + check_for_performance_regressions $id check_test_suite fi) 2>&1 > $(pull_request_log $id) [ ! -z "$GITHUB_CREDENTIALS" ] || continue