TCP checksum computation: improve performance #31

Open
hannesm opened this issue Nov 28, 2023 · 5 comments

Comments

@hannesm
Contributor

hannesm commented Nov 28, 2023

from #30

siege -c 1 -r 1000 on my laptop (and a retreat unikernel on the other side) results in:
mirage-tcpip (which doesn't validate any checksums): ~2000 req/s
current main (cea1509): ~1400 req/s
this head (d639adc): ~1500 req/s
utcp without checksum verification: ~2300 req/s

It may be worth investigating C or assembly (see e.g. https://blogs.igalia.com/dpino/2018/06/14/fast-checksum-computation/ and especially snabbco/snabb#899).
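For reference, here is a minimal sketch of the computation under discussion: the Internet checksum as described in RFC 1071, written as plain portable C. The function name `inet_csum_ref` is made up for illustration and is not part of utcp or of the linked projects; it is a slow baseline to compare faster variants against, not a proposed implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference routine (name is an assumption).
   Sums the buffer as big-endian 16-bit words using one's complement
   arithmetic and returns the complemented result. */
uint16_t inet_csum_ref(const uint8_t *buf, size_t len)
{
    uint64_t sum = 0;   /* wide accumulator, so carries are folded once at the end */
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];

    /* An odd trailing byte is treated as if padded with a zero byte. */
    if (len & 1)
        sum += (uint32_t)buf[len - 1] << 8;

    /* Fold the carries back into the low 16 bits (end-around carry). */
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);

    return (uint16_t)~sum;
}
```

On the example bytes from RFC 1071 (00 01 f2 03 f4 f5 f6 f7) the one's complement sum is 0xddf2, so this function returns 0x220d.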

@hannesm
Contributor Author

hannesm commented Dec 3, 2023

with 1b341a8 (avoiding bounds checks) we get ~1800 req/s

@palainp

palainp commented Dec 4, 2023

The article linked in the PR (http://locklessinc.com/articles/tcp_checksum/) is very interesting and, after a quick test, I can confirm that it computes the csum very fast :) So far the result has the wrong endianness for me (e.g. CS=0x7EC7 instead of 0xC77E), but after 150000 iterations with random lengths in [8B, 63kB]:

  • C 32b word csum = 22.08 us
  • C 64b word csum = 13.04 us
  • checksum15 = 3.12 us
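One possible explanation for a byte-swapped result such as 0x7EC7 versus 0xC77E: RFC 1071 notes that the one's complement sum is byte-order independent, so summing native little-endian words yields the byte-swapped sum, and a single swap at the end corrects it. The sketch below shows a 64-bit word variant with that final swap. It is not the code that was benchmarked above, and `inet_csum_64` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 64-bit word variant (name is an assumption). */
uint16_t inet_csum_64(const uint8_t *buf, size_t len)
{
    uint64_t sum = 0, w;
    uint16_t s, probe = 1;
    uint8_t first;

    while (len >= 8) {
        memcpy(&w, buf, 8);      /* load that is safe for unaligned input */
        sum += w;
        if (sum < w)             /* overflow: add the carry back in */
            sum++;
        buf += 8;
        len -= 8;
    }
    if (len) {                   /* 1..7 trailing bytes, zero padded */
        w = 0;
        memcpy(&w, buf, len);
        sum += w;
        if (sum < w)
            sum++;
    }

    /* Fold 64 -> 32 -> 16 bits, twice each to absorb the carry. */
    sum = (sum & 0xFFFFFFFF) + (sum >> 32);
    sum = (sum & 0xFFFFFFFF) + (sum >> 32);
    sum = (sum & 0xFFFF) + (sum >> 16);
    sum = (sum & 0xFFFF) + (sum >> 16);
    s = (uint16_t)sum;

    /* Words were read in native order; on a little-endian host the
       folded sum is byte-swapped relative to network order. */
    memcpy(&first, &probe, 1);
    if (first == 1)
        s = (uint16_t)((s >> 8) | (s << 8));

    return (uint16_t)~s;
}
```

The swap happens once per call rather than once per word, so it should not change the relative timings much.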

@hannesm
Contributor Author

hannesm commented Dec 5, 2023

Interesting numbers @palainp -- so do you have a comparison to the current implementation in this library (Checksum.digest)?

If you happen to have a comparison table and the C code also integrated into this library, I'd appreciate a PR (if the C code is much faster).

@palainp

palainp commented Dec 5, 2023

Unfortunately this was only a local test. I'll try to add a C binding here and open a PR when it's done, if it turns out to be faster. (I suppose it might be hard to bind against/maintain the asm version?)

@hannesm
Contributor Author

hannesm commented Dec 5, 2023

well, asm -- why not? ;) Considering that lots of deployed systems are amd64, we can have special assembly for that. E.g. mirage-crypto has feature detection to decide which code path to use. What is crucial from that experience: while it is fine to always ship the assembly, a binary built on one system (with specific CPU features) must not be restricted to running on systems that have the very same features (i.e. mirage/mirage-crypto#53 was a great achievement).

This matters because, especially with unikernels but also in general, I prefer to keep the build machine separate from the machine that runs the binary.

Embedding of asm code is also best done using C mnemonics (see mirage-crypto repository as example).
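To illustrate the "ship both code paths, choose at run time" idea, here is a rough sketch. All names (`csum`, `csum_select`, `csum_portable`, `csum_amd64`) are hypothetical, the "avx2" feature is just an example, and `__builtin_cpu_supports` is a GCC/Clang builtin for x86 that may not be what mirage-crypto uses or what is available in a unikernel toolchain. The amd64 routine is only a placeholder that falls back to the portable code.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint16_t (*csum_fn)(const uint8_t *buf, size_t len);

/* Portable fallback: plain C, usable on every target. */
static uint16_t csum_portable(const uint8_t *buf, size_t len)
{
    uint64_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (len & 1)
        sum += (uint32_t)buf[len - 1] << 8;
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Placeholder: an amd64-specific routine (inline asm or intrinsics)
   would go here. It must return the same values as csum_portable. */
static uint16_t csum_amd64(const uint8_t *buf, size_t len)
{
    return csum_portable(buf, len);
}

/* Decide on the running CPU, not on the build machine. */
static csum_fn csum_select(void)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))   /* example feature only */
        return csum_amd64;
#endif
    return csum_portable;
}

/* Public entry point: resolves the implementation on first use. */
uint16_t csum(const uint8_t *buf, size_t len)
{
    static csum_fn impl = NULL;

    if (impl == NULL)
        impl = csum_select();
    return impl(buf, len);
}
```

Because the choice is made when the binary starts rather than when it is compiled, the same binary can run on a host with fewer CPU features than the build machine.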
