Ability to monitor XDP handled packets #1339

Open
oblazek opened this issue Jan 8, 2024 · 11 comments

Comments


oblazek commented Jan 8, 2024

It seems that neither Hubble nor cilium monitor can observe traffic to services that are XDP accelerated.

We've been running standalone L4LB in production for approximately two years, but haven't used Hubble in this context until now.
The only visible traffic is traffic destined for the L4LB host itself, not traffic for the configured services.
Cilium is configured in ipip mode with native acceleration, so unfortunately it cannot be switched to opt+tc.

Example of what is visible (SSH, etcd, node exporter, BGP):

Jan  8 07:31:51.954: 10.69.59.2:42996 (ID:9) <> 10.249.188.39:9100 (host) from-network FORWARDED (TCP Flags: ACK)
Jan  8 07:31:51.954: 10.69.59.2:42996 (ID:9) <> 10.249.188.39:9100 (host) from-network FORWARDED (TCP Flags: ACK)
Jan  8 07:31:51.954: 10.69.59.2:42996 (ID:9) <> 10.249.188.39:9100 (host) from-network FORWARDED (TCP Flags: ACK)
Jan  8 07:31:51.955: 10.69.59.2:42996 (ID:9) <> 10.249.188.39:9100 (host) from-network FORWARDED (TCP Flags: ACK)
Jan  8 07:31:51.968: 10.249.188.39:41230 (host) <> 10.249.188.1:179 (ID:9) to-network FORWARDED (TCP Flags: ACK)
Jan  8 07:31:51.984: 10.249.188.39:50560 (host) <> 10.244.11.93:2379 (ID:9) to-network FORWARDED (TCP Flags: ACK)
Jan  8 07:31:52.158: [2a02:598:a:ffff::75]:58888 (ID:10) <> [2a02:598:96::2639]:22 (host) from-network FORWARDED (TCP Flags: ACK)
Jan  8 07:31:52.174: [2a02:598:a:ffff::75]:58888 (ID:10) <> [2a02:598:96::2639]:22 (host) from-network FORWARDED (TCP Flags: ACK, PSH)

But nothing like this:

root@lb-l3-13:~# docker exec -it cilium-l4lb cilium service get 5552
ID     Frontend             Service Type   Backend                           
5552   10.248.11.112:7005   NodePort       1 => 10.231.2.34:42069 (active)

Possibly related Cilium errors:

level=debug msg="failed to decode payload" error="Invalid (too small) IP header length (0 < 5)" event="&{c02be881-72b1-4f39-b0da-7ff2fd93715d 2024-01-08 08:14:02.804980049 +0000 UTC m=+271.846731729 lb-l3-13.ng.iszn.cz 0xc0004afce0}" subsys=hubble
level=debug msg="failed to decode payload" error="Invalid (too small) IP header length (0 < 5)" event="&{954928ad-23ef-4cab-bfe6-352d8aa93288 2024-01-08 08:14:02.804985149 +0000 UTC m=+271.846736829 lb-l3-13.ng.iszn.cz 0xc0004afd20}" subsys=hubble
level=debug msg="failed to decode payload" error="Invalid (too small) IP header length (0 < 5)" event="&{25a10a36-3ace-4f6e-85ba-232072ce02a8 2024-01-08 08:14:02.804991449 +0000 UTC m=+271.846743139 lb-l3-13.ng.iszn.cz 0xc0004afd80}" subsys=hubble
level=debug msg="failed to decode payload" error="Invalid ip6-extension header. Length 38 less than specified length 248" event="&{eeada1a5-23f3-4696-a5b0-79a08d7dbdba 2024-01-08 08:14:02.804995099 +0000 UTC m=+271.846746789 lb-l3-13.ng.iszn.cz 0xc0004afda0}" subsys=hubble
level=debug msg="failed to decode payload" error="Invalid ip6-extension header. Length 38 less than specified length 1112" event="&{e8e14e31-4179-43a1-9bbc-1037e7b11d35 2024-01-08 08:14:02.805219509 +0000 UTC m=+271.846971189 lb-l3-13.ng.iszn.cz 0xc0004afe20}" subsys=hubble
level=debug msg="failed to decode payload" error="Invalid ip6-extension header. Length 38 less than specified length 1048" event="&{d2165de8-16b2-4bfb-b736-e57152cd235a 2024-01-08 08:14:02.805240549 +0000 UTC m=+271.846992229 lb-l3-13.ng.iszn.cz 0xc0004afe80}" subsys=hubble
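
For context, the first error string has the shape of the sanity check an IPv4 decoder performs on the header-length (IHL) nibble. Below is a minimal Go sketch of such a check using only the standard library. `checkIPv4IHL` is a hypothetical helper written for illustration and is not Hubble's parser code. It shows how a buffer that does not begin at a valid IPv4 header (for example a zeroed or misaligned payload) would be rejected with a message of this form:

```go
package main

import (
	"errors"
	"fmt"
)

// checkIPv4IHL is a hypothetical helper (not part of Cilium or Hubble).
// It reads the IHL nibble of the first byte and rejects values below 5,
// since the minimum IPv4 header is 5 32-bit words (20 bytes).
func checkIPv4IHL(data []byte) (int, error) {
	if len(data) < 1 {
		return 0, errors.New("empty payload")
	}
	ihl := int(data[0] & 0x0f) // low nibble: header length in 32-bit words
	if ihl < 5 {
		return 0, fmt.Errorf("Invalid (too small) IP header length (%d < 5)", ihl)
	}
	return ihl * 4, nil // header length in bytes
}

func main() {
	// 0x45: version 4, IHL 5, i.e. the start of a plain IPv4 header.
	n, err := checkIPv4IHL([]byte{0x45})
	fmt.Println(n, err)

	// 0x00: what a decoder sees when the buffer does not start at an IP header.
	n, err = checkIPv4IHL([]byte{0x00})
	fmt.Println(n, err)
}
```

If the decoder is handed bytes at the wrong offset, or bytes of a different layer, a check like this fails before any address or port can be extracted, which would be consistent with no flow being reported.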

Note: we're using xdpdump when debugging.

Member

gandro commented Jan 8, 2024

Thanks for raising this. These errors really should be investigated. I suspect this is because L4LB does IP-in-IP encapsulation, which the Hubble parser doesn't understand. Potentially related issue: cilium/cilium#15515

Author

oblazek commented Jan 8, 2024

We might have a solution for this. My colleague will try to send a PR with the fix once I verify that it works.

Author

oblazek commented Jan 8, 2024

Yeah, it looks like we can fix the "failed to decode payload" errors, but we still can't monitor XDP-handled traffic :/

Member

gandro commented Jan 8, 2024

What do you see in cilium monitor? Do you observe usable events there? Otherwise we have to investigate what exactly the datapath sends down to userspace in XDP mode.

Author

oblazek commented Jan 8, 2024

Unfortunately, there seems to be nothing there that is handled by XDP :/

Author

oblazek commented Jan 8, 2024

I can investigate myself but would be glad for some pointers

Member

gandro commented Jan 8, 2024

@borkmann Do you know if/where we emit trace events from the XDP datapath (the monitor ones, not the pcap recorder)? Or was this never implemented?

Author

oblazek commented Feb 26, 2024

imho this is the issue: cilium/cilium#19703

Member

brb commented Mar 17, 2024

May be of interest: cilium/pwru#339.

Author

oblazek commented Mar 17, 2024

Let's see what comes out of tomorrow's summit 🙂

Member

brb commented Jul 10, 2024

pwru with the XDP tracing is finally released - https://github.com/cilium/pwru/releases/tag/v1.0.7.
