feat(bpf): implement stack bypass #458
- listen tcp inside daens
- setup routing inside daens
- attach new bpf prog to dae0 + dae0peer
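Could not get proxying to work (openwrt23.05_aarch64). The log is as follows: time="2024-02-20T12:16:08+08:00" level=info msg="Loading eBPF programs and maps into the kernel..." Update: this is probably because I paired this new core with dae-wing. |
If the new hijack path works, will it be able to proxy pppoe-wan? |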
Awesome. Tested successfully on x86_64 (kernel 6.7.5), and it seems to work well. BTW: to apply this pull request, the auto_config_firewall_rule configuration needs to be removed if you have set it. |
Runs stably on immortalWrt 23.05.01, and the various DNS problems from 0.5 also appear to be resolved. |
Nice testing! Could you conveniently test which ones are unnecessary in this issue: #79. Let's see if dae can run without modifications. |
Very good, traffic is diverted successfully. |
Thank you for your efforts to fix this issue, unfortunately the warning log has not been cleared. |
Sorry, I pulled the source code but forgot to compile it. Damn. There's no problem now. ✅ By the way, through PR 458, DAE will have a widespread impact on the OpenWRT world's proxy. Good job, big brother. |
c533215 to e7525dc
why drop b7d836d ? |
skb->mark will be reset when going across netns (skb_scrub_packet), so this commit sets a special value in cb[0] which can survive bpf_redirect and netns crossing. This solves issues like: level=warning msg="No AddrPort presented: reading map: key [[::ffff:0.0.0.0]:68, 17, 255.255.255.255:67]: lookup: key does not exist"
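The effect described in this commit message can be modelled in plain user-space C. This is only a sketch under assumptions: `struct fake_skb`, `REDIRECT_MAGIC`, and the helper names are hypothetical stand-ins (the real marker value is not shown in this thread), and `cross_netns()` models nothing more than the mark reset mentioned above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical marker value; the constant actually used by the PR is not shown here. */
#define REDIRECT_MAGIC 0x1234u

/* Simplified stand-in for the two __sk_buff fields discussed above. */
struct fake_skb {
    uint32_t mark;  /* reset when the packet crosses a netns */
    uint32_t cb[5]; /* control buffer; cb[0] carries the marker */
};

/* Tag a packet just before it is redirected towards the other netns. */
static void tag_before_redirect(struct fake_skb *skb)
{
    skb->mark = REDIRECT_MAGIC; /* will not survive the crossing */
    skb->cb[0] = REDIRECT_MAGIC; /* survives, per the commit message */
}

/* Models only the mark reset that the commit message attributes to skb_scrub_packet. */
static void cross_netns(struct fake_skb *skb)
{
    skb->mark = 0;
}

/* Receiver side: recognise a redirected packet from cb[0] instead of mark. */
static bool came_from_redirect(const struct fake_skb *skb)
{
    return skb->cb[0] == REDIRECT_MAGIC;
}
```

After `tag_before_redirect()` followed by `cross_netns()`, `mark` reads 0 while `came_from_redirect()` still returns true, which is the property the commit relies on.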
e7525dc to a1a4012
a1a4012 does not cause any conflicts - just confirmed. cc @douglarek @jschwinger233 |
Current implementation simply throws away these packets. This commit is the first step of implementing re-sending them after handshake.
Stack bypass implementation for the hijack path
Background
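Previous problems centered on:
This PR tries to make the "hijack path" ("diversion path") bypass the kernel stack while keeping the datapath symmetric, which should hopefully resolve most of these problems.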
Datapath
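0.5 wan: note that requests and replies on the hijack path are asymmetric, and the network stack between wan0 and dae caused most of the problems.
0.5 lan for udp: requests and replies are also asymmetric, and the fact that dae0's lladdr and sysctl settings may be modified by systemd also caused quite a few problems.
New wan datapath: note that traffic between wan0 and dae0 skips the kernel stack in both directions via bpf_redirect, so no nft or sysctl configuration is needed. The numbered labels in the figure are explained as follows:
New lan datapath: traffic between lan0 and dae0 also goes through bpf_redirect. The numbered labels in the figure are explained as follows:
The new path is fully symmetric, which should minimize potential problems.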
Implementation
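Four tc bpf progs are needed:
a. Listen on :12345 only inside the dae netns.
b. No need to monitor dae0's lladdr. Now only dae0-peer's lladdr is needed, but it lives inside the dae netns and should not get modified (hopefully...).
c. Remove the autoConfigFirewall flag, because nft no longer needs to be configured.
d. ip rule only needs to be set inside the dae netns.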
Code Walkthrough
Hijack Path Stack Bypass Implementation
Background
Previous problems centered on:
- […] `sysctl accept_local=1`, and dae0 requires setting `rp_filter=0`.
Solution
This PR attempts to bypass the kernel stack for the "hijack path" ("diversion path") and keep the datapath symmetric, which is expected to solve most of the problems.
Datapath
0.5 wan:
Note that hijack path requests and replies are asymmetric, and the network stack from wan0 to dae causes most of the problems.
0.5 lan for udp:
New wan datapath:
New lan datapath:
Implementation
bpf prog
Four tc bpf progs are needed:
- […] `route()` to make routing decisions, redirects routed traffic to dae0 using `bpf_redirect`. Before redirection, it records a `redirect_track` where the key is `(sip, dip, l4proto)` and the value is `(smac, dmac, ifindex)`. Before redirection, it also modifies `ethhdr->dest` to dae0-peer's lladdr.
- […] `route()` to make routing decisions, redirects routed traffic to dae0 using `bpf_redirect`. Before redirection, it records a `redirect_track` where the key is `(sip, dip, l4proto)` and the value is `(smac, dmac, ifindex)`. Before redirection, it also modifies `ethhdr->dest` to dae0-peer's lladdr.
- […] `redirect_track`, modifies layer 2 header, calls `bpf_redirect` to redirect to wan0 or lan0.
- […] `sk_lookup` + `sk_assign`.
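The `redirect_track` bookkeeping described above can be sketched in user-space C. This is a rough model under stated assumptions, not the PR's eBPF code: the fixed-size array stands in for a BPF map, addresses are IPv4-only for brevity, and every name except the key and value fields is hypothetical. How the reply path forms its lookup key (swapping source and destination) and how it rewrites the layer 2 header (swapping the recorded MACs) are also assumptions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6
#define TRACK_CAP 64 /* illustrative capacity */

struct redirect_key { uint32_t sip, dip; uint8_t l4proto; };
struct redirect_entry { uint8_t smac[MAC_LEN], dmac[MAC_LEN]; uint32_t ifindex; };
struct eth_hdr { uint8_t dest[MAC_LEN], source[MAC_LEN]; uint16_t proto; };

struct track_slot { bool used; struct redirect_key key; struct redirect_entry val; };
static struct track_slot track_table[TRACK_CAP]; /* stand-in for the BPF map */

static struct track_slot *track_lookup(const struct redirect_key *k)
{
    for (size_t i = 0; i < TRACK_CAP; i++) {
        struct track_slot *s = &track_table[i];
        if (s->used && s->key.sip == k->sip && s->key.dip == k->dip &&
            s->key.l4proto == k->l4proto)
            return s;
    }
    return NULL;
}

/* Request path: remember the original L2 header and ingress ifindex,
 * then point ethhdr->dest at dae0-peer's lladdr before redirecting. */
static bool record_and_retarget(struct eth_hdr *eth, const struct redirect_key *k,
                                uint32_t ifindex, const uint8_t peer_mac[MAC_LEN])
{
    struct track_slot *s = track_lookup(k);
    for (size_t i = 0; !s && i < TRACK_CAP; i++)
        if (!track_table[i].used)
            s = &track_table[i];
    if (!s)
        return false; /* table full */
    s->used = true;
    s->key = *k;
    memcpy(s->val.smac, eth->source, MAC_LEN);
    memcpy(s->val.dmac, eth->dest, MAC_LEN);
    s->val.ifindex = ifindex;
    memcpy(eth->dest, peer_mac, MAC_LEN);
    return true;
}

/* Reply path: look up the tracked entry with source and destination swapped
 * (assumption), restore the L2 header, and return the ifindex to redirect to.
 * Returns -1 when no entry exists. */
static int restore_and_pick_ifindex(struct eth_hdr *eth, uint32_t reply_sip,
                                    uint32_t reply_dip, uint8_t l4proto)
{
    struct redirect_key k = { .sip = reply_dip, .dip = reply_sip, .l4proto = l4proto };
    struct track_slot *s = track_lookup(&k);
    if (!s)
        return -1;
    memcpy(eth->source, s->val.dmac, MAC_LEN); /* reply leaves from the old dest */
    memcpy(eth->dest, s->val.smac, MAC_LEN);   /* and goes back to the old source */
    return (int)s->val.ifindex;
}
```

Storing `ifindex` in the value is what lets a single reply-side prog send traffic back to either wan0 or lan0 without knowing in advance which interface the request arrived on.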
control
a. Listen only in dae netns on :12345.
b. No need to monitor dae0's lladdr. Now only dae0-peer's lladdr is needed, but it lives within the dae netns and shouldn't be modified (hopefully...).
c. Remove the `autoConfigFirewall` flag because nft configuration is not needed.
d. Set `ip rule` only in dae netns.
Code Walkthrough
- […] `c.ListenAndServe()` to run within `DaeNetns`.
- […] `AutoConfigFirewallRule`.
- […] `DaeNetns.Setup()` earlier, remove `nft AcceptInputMark`, move `setupRoutingPolicy` to run within `DaeNetns`, add a new function call `bindDaens`.
- […] `bindDaens` function, where `dae0_ingress` and `dae0peer_ingress` bpf programs are attached.
- […] `setupSysctl` settings, remove `monitorDae0LinkAddr`.
Checklist
Full Changelogs
Issue Reference
Closes #[issue number]
Test Result