TLS renegotiation sometimes causes remaining ghost connections #56
Hi @SwizzAppz and thanks for the report! Can you share the status output with the "dead entries" too?
The error can happen multiple times per client/connection, though the newest connection still works normally afterwards. As far as I remember, nothing looked strange in the status info either: there are just several connections with the same IP, and the byte counters of the "dead" connections don't change. I will share the status log as soon as it happens again, as I have to enable it first.
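The symptom described above (several entries for the same IP, with byte counters that no longer move) can be checked mechanically by comparing two status snapshots taken some time apart. The following is only a minimal sketch: the dictionary keys `id`, `real_ip`, `bytes_rx` and `bytes_tx` are hypothetical names for values you would first have to extract from your own status output, and the function is not part of OpenVPN.

```python
from collections import defaultdict


def find_stale_entries(before, after):
    """Return ids of entries that look "dead" between two snapshots.

    Each snapshot is a list of dicts with the hypothetical keys
    'id', 'real_ip', 'bytes_rx' and 'bytes_tx'.
    """
    previous = {entry["id"]: entry for entry in before}

    # Group the newer snapshot by client IP.
    by_ip = defaultdict(list)
    for entry in after:
        by_ip[entry["real_ip"]].append(entry)

    stale = []
    for entries in by_ip.values():
        if len(entries) < 2:
            continue  # a single entry per IP may simply be an idle client
        for entry in entries:
            old = previous.get(entry["id"])
            if old is None:
                continue  # entry appeared after the first snapshot
            unchanged = (old["bytes_rx"], old["bytes_tx"]) == (
                entry["bytes_rx"],
                entry["bytes_tx"],
            )
            if unchanged:
                stale.append(entry["id"])
    return stale
```

An entry is only flagged when another entry shares its IP, so a lone client that is merely idle is not reported.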
It just happened again, right after that netlink error. This is how the management status info looks:
This is a dead connection. I'll try to reproduce it again with status.log enabled.
This is another example on another server, where reneg-sec is not set on the client (default 3600). The first connection is the dead one:
Any news on this? It happened again, with just one client (the same one) connected. Things I observed: the load average is extremely high (ranging from 3.xx to 9.xx), but the CPU is 99.x% idle. I can neither stop nor kill the openvpn processes, and a soft reboot also fails, so I have to reset the VM from the KVM host.
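A high load average combined with an almost fully idle CPU is what Linux shows when tasks sit in uninterruptible sleep (state `D`), since those tasks count towards the load average and cannot be killed. Whether that is what happens here is an assumption, but it can be checked by reading the state field from `/proc/<pid>/stat`. A minimal sketch (the helper names are made up for illustration):

```python
import glob


def parse_proc_stat(line):
    """Split one /proc/<pid>/stat line into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so use the first '(' and the last ')' as delimiters.
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    pid = int(line[:open_paren])
    comm = line[open_paren + 1:close_paren]
    state = line[close_paren + 1:].split()[0]
    return pid, comm, state


def uninterruptible(stat_lines):
    """Return (pid, comm) for every task in state 'D'."""
    parsed = (parse_proc_stat(line) for line in stat_lines)
    return [(pid, comm) for pid, comm, state in parsed if state == "D"]


if __name__ == "__main__":
    lines = []
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as handle:
                lines.append(handle.read())
        except OSError:
            pass  # the process exited while we were scanning
    for pid, comm in uninterruptible(lines):
        print(pid, comm)
```

If the openvpn processes show up in this list while the box is "hung", that would point at something blocking inside the kernel rather than at userspace.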
@SwizzAppz when you try soft rebooting and the reboot hangs, can you please run
Hope this helps:
BTW: It has nothing to do with TLS renegotiation. The issue also happens with reneg-sec 0 (on both client and server) and with default settings; it just takes longer until it shows up.
Thanks a lot.
Any news on this issue? OpenVPN without DCO is really slow.
Hi, I just noticed a new ovpn-dco-v2 version (20240320). Are there any related fixes in that build I could test? Thanks.
Hi @SwizzAppz, sorry, but there is no fix for this issue. ovpn-dco is undergoing a major restructuring as it was sent to the kernel mailing list for review. I expect any bug fixing to happen after that (if the bugs are still there after the restructuring, of course).
Using the default(?) value of keepalive (10 60) seems to have fixed the issue: no more crashes for a week now.
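For reference, `keepalive` is a helper directive that expands into `ping` and `ping-restart` settings, and in server mode it also pushes them to clients. The sketch below reproduces that expansion as I read it in the OpenVPN 2.x manual page; the function itself is hypothetical and only illustrates which timers `keepalive 10 60` ends up setting. Why these timers would avoid the ghost connections is not established in this thread.

```python
def expand_keepalive(interval, timeout, server_mode=True):
    """Return the directives that `keepalive interval timeout` stands for."""
    if server_mode:
        # The server uses twice the timeout locally and pushes the plain
        # values to its clients.
        return [
            f"ping {interval}",
            f"ping-restart {timeout * 2}",
            f'push "ping {interval}"',
            f'push "ping-restart {timeout}"',
        ]
    return [f"ping {interval}", f"ping-restart {timeout}"]


if __name__ == "__main__":
    for directive in expand_keepalive(10, 60):
        print(directive)
```

So with `keepalive 10 60` on the server, both sides send a ping every 10 seconds; the client restarts after 60 seconds of silence and the server after 120.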
@ordex I have the same problem here. What I noticed in the source code is that in ovpn_peer_delete_work(), ovpn_peer_release(peer) is executed before ovpn_netlink_notify_del_peer(peer).
Yeah, the idea was to stop I/O for that peer before notifying userspace. However, the peer should not be freed until the next RCU grace period. Do you have a reliable way to reproduce the issue? (Also please note that this part of the code is being totally rewritten, so it's likely that any bug existing in this code won't exist any more.)
No, unfortunately not.
Yeah, I found some other related issues here, but I didn't get any further either. For now I have implemented these (dirty) hacks, since I don't want to disable DCO as it's really a big improvement:
That way I can run it "stable" without any major downtime (reboots luckily only take about 20 seconds).
Wouldn't we need an rcu_read_lock() in the ovpn_netlink_notify_del_peer() function?
Technically yes; however, this is not the right thing to do. That said, it's not clear to me what is becoming NULL. I'd rather expect peer to become garbage memory (if truly released), but not NULL. Anyway, this part is being rewritten, so I don't think it makes much sense for me to spend time on it.
Any news on the rewrite and the kernel integration? Is there any status update or roadmap one can look up? BTW, when this issue happens I get another error in dmesg: list_del corruption
You can monitor the netdev mailing list and look for mails with "ovpn" in the subject.
Not really, because the lists have also been rearranged/reworked big time. Thanks for poking around though.
Hi
I'm running OpenVPN 2.6.8 with DCO and an openvpn3.8.3 iOS client. The problem is that sometimes, after a TLS renegotiation, the old connection remains active on the OpenVPN server, which leads to several dead entries in the status info over time. Once it even locked up the whole server after about 50 dead entries. It's difficult to reproduce, but it can be done: it happens roughly every 30th renegotiation. It doesn't happen if I use "reneg-sec 0" on server and client, or if I set disable-dco. I set "reneg-sec 15" for testing.
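To put the numbers above into perspective, here is a rough back-of-the-envelope estimate of how long it takes to accumulate dead entries. It assumes a single, permanently connected client and a failure on every 30th renegotiation on average; both are simplifications of the figures in this report, not measurements.

```python
def seconds_until(dead_entries, reneg_sec, one_in):
    """Expected seconds until `dead_entries` ghost connections exist.

    Assumes one client that renegotiates every `reneg_sec` seconds and
    leaves a dead entry behind on every `one_in`-th renegotiation.
    """
    renegotiations = dead_entries * one_in
    return renegotiations * reneg_sec


if __name__ == "__main__":
    test_setup = seconds_until(50, 15, 30)      # reneg-sec 15
    default_setup = seconds_until(50, 3600, 30)  # default reneg-sec 3600
    print(test_setup / 3600, "hours")      # 6.25 hours
    print(default_setup / 86400, "days")   # 62.5 days
```

Under these assumptions, the 50 entries that locked up the server would build up in a few hours with "reneg-sec 15", but only after roughly two months with the default interval.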
server.conf
client.conf
openvpn.log
I think the connection gets "stuck" when that netlink error comes up, but I'm not 100% sure. The client also reconnects fine and still works. It doesn't matter whether IPv4 or IPv6 is used; it happens on both.
Regards