Multiple APs with mcDebian #70
Comments
Oh, this is all with the newer 5.6.14 kernel. I've upgraded nftables and the firewall is nft, but that's only on one of the APs (I've got 3... :). Even if I turn off the radio on the actual firewall and just try to roam between the other two, which don't have any nftables configuration or any other firewalling, it still takes 5 minutes to move between them... |
Once in a while I have seen this, but it hasn't been much of an issue. I mostly see it if the APs see each other at greater than -65 dBm, which means they are too far apart, so the client will not be able to change AP once it reaches -70 dBm from the AP it's currently connected to. |
Either of these APs would do an alright job serving the entire house, so I don't think it's an issue like that. It's very consistent and just about always 5 minutes of not being able to get to anything when switching APs, feels very much like a timeout or something. Once the device is back online, as long as it isn't moved too far, it works great with strong signal strength and everything. |
I recommend you run a scan on each AP to see what the other AP signal strengths are like between them. |
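One way to collect those numbers is to condense each AP's scan into one line per neighbouring BSS. The sketch below assumes the scans are taken with `iw dev <interface> scan` and that its output has the usual `BSS <mac>(on <iface>)`, `signal:` and `SSID:` lines; the function name `summarize_scan` is made up for illustration.

```shell
#!/bin/bash
# Sketch: reduce `iw dev <iface> scan` output (on stdin) to
# "BSSID SIGNAL_dBm SSID", one line per BSS.
summarize_scan() {
    awk '
        # A new BSS block starts with "BSS <mac>(on <iface>)".
        # "BSS Load:" lines are skipped because $2 is not a MAC.
        $1 == "BSS" && $2 ~ /^[0-9a-f][0-9a-f]:/ {
            if (bssid != "") print bssid, signal, ssid
            bssid = substr($2, 1, 17); signal = "?"; ssid = ""
        }
        $1 == "signal:" { signal = $2 }
        $1 == "SSID:"   { sub(/^[ \t]*SSID:[ ]?/, ""); ssid = $0 }
        END { if (bssid != "") print bssid, signal, ssid }
    '
}

# Assumed usage on an AP (needs root; wlp1s0 is an example interface):
#   iw dev wlp1s0 scan | summarize_scan
```

Running this on every AP and radio gives a small table of who hears whom and how strongly, which is easier to compare than the raw scan files.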
OK, I've run scans on each of the APs, on both interfaces, and included the results here.

ap1 is at 00:25:9c:13:a2:35 / 00:25:9c:13:a2:36
On 2.4 GHz, ap1 -> ap2 has signal: -52.00 dBm
On 2.4 GHz, ap2 -> ap1 has signal: -59.00 dBm

I'm a bit surprised at these results since the two are only 1 floor apart (though on opposite sides of the townhouse).

ap1_2ghz_scan.txt

Also included is a client scan: This shows that signal strength to ap1 on 2.4 GHz is -40 dBm, on 5 GHz -46 dBm. This client is in the basement, so one AP is a floor closer (and is on the same side as the client), so it's not surprising that this is the case.

Generally speaking, once a device (e.g. my iPhone) is associated with an AP, everything is fine and works great. The issue is that moving between floors with the device ends up having it switch APs (which should be fine...), but then it takes about 5 minutes before it's able to actually connect to anything, and that's the part that I just can't figure out.

It does look like the client sees FT-PSK enabled, which seems like it'd be a good thing, and I have that configured as:

ap1: wpa_key_mgmt=WPA-PSK FT-PSK
ap2: wpa_key_mgmt=WPA-PSK FT-PSK

but it doesn't seem to have helped at all. I get the same effect with the original config in mcdebian, which has:

wpa_key_mgmt=WPA-PSK

and then doesn't specify anything for nas_identifier/mobility_domain/ft_psk_generate_local.

Any thoughts or suggestions you have on this would be great to hear, happy to try out different things too. Have to say that it's driving me a bit crazy! I would have thought this would be straightforward to get going... |
I ran into the same problems you describe in the initial post. If you want roaming but do not want to set up NAT on every access point, the problem is that the bridge's forwarding database keeps sending packets to the old port after you have roamed to a new port. After some time the database entries are cleaned up and you finally have your internet connection. The same happens in any Ethernet switch between router and access points. I programmed 2 solutions for this problem and added them to my repositories. |
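For anyone comparing configs, an FT-PSK setup using the options named above might look roughly like the fragment below. This is only a sketch for ap1: the mobility domain and NAS identifier values are placeholders, and nothing here is confirmed to change the 5-minute delay being discussed.

```ini
# Hypothetical hostapd.conf excerpt for ap1; all values are placeholders.
wpa_key_mgmt=WPA-PSK FT-PSK
# Same 2-octet (4 hex digit) mobility domain on every AP in the roaming set.
mobility_domain=a1b2
# Must be unique per AP (ap2 would use a different string).
nas_identifier=ap1.example
# Derive the FT keys locally from the shared PSK, so no r0kh/r1kh lists
# are needed as long as all APs use the same passphrase.
ft_psk_generate_local=1
```

The same fragment would go on ap2 with only `nas_identifier` changed.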
Where are the solutions..? Could you provide a link? |
The one I use is: As I also run my router on Ubuntu, not only my AP. |
OK, this is starting to look promising and pretty curious. When looking at the FDB, what I'm seeing is that I'm getting two entries for a client's MAC, e.g.:

d0:3f:aa:e8:XX:XX dev lan1 self

Removing the one associated with 'lan1' by hand does indeed make things start working, but I'm trying to figure out how this ends up happening in the first place. Surely when a given device is connected to the wireless it shouldn't also be picked up as being on the LAN, but that's what is happening, and somehow that confuses the AP so that traffic destined for that MAC isn't sent over the wifi.

I've further noticed that packets only fail to be forwarded when the two interfaces are bridged; when there's actual routing happening, everything seems to work fine.

I'm still not sure if there's something configured incorrectly or if there's just a bug, but I ended up writing and running this script on all my APs and it seems to have "fixed" the issue for me:

`#!/bin/bash PRIOR_MAC="" while [ 1 ]; do`

Thanks for the pointer to go look at the bridge FDB! If anyone learns more about what's going on here, I'd love to hear it, since this is quite frustrating! |
Constantly deleting MACs from the switch's FDB turns the Ethernet switch into an Ethernet hub... This is why I wrote bridgefdbd, which only deletes when it is necessary. |
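The workaround described in this comment (drop the wired entry when the same MAC is also present on a wifi port) can be sketched as follows. This is not the script from the comment, only an illustration of the idea; the interface patterns `^wl` and `^lan` and the function names are assumptions, and it relies on the usual `bridge fdb show` line layout `MAC dev DEV [flags...]`.

```shell
#!/bin/bash
# Sketch only: not the script from the comment above.
# Interface name patterns (wl*, lan*) are assumptions; override them
# through WIFI_RE / LAN_RE.

# Read `bridge fdb show` output on stdin. Print "MAC DEV" for every
# non-permanent "self" entry on a wired port whose MAC is also learned
# on a wireless port.
find_stale_fdb() {
    awk -v wifi_re="${WIFI_RE:-^wl}" -v lan_re="${LAN_RE:-^lan}" '
        function has_flag(flag,   i) {
            for (i = 4; i <= NF; i++) if ($i == flag) return 1
            return 0
        }
        # Ignore malformed lines and permanent (local/multicast) entries.
        $2 != "dev" || has_flag("permanent") { next }
        $3 ~ wifi_re { on_wifi[$1] = 1; next }
        $3 ~ lan_re && has_flag("self") { wired[$1 " " $3] = $1 }
        END { for (k in wired) if (wired[k] in on_wifi) print k }
    ' | sort
}

# Delete the stale wired entries (needs root; not called automatically).
purge_stale_fdb() {
    bridge fdb show | find_stale_fdb | while read -r mac dev; do
        bridge fdb del "$mac" dev "$dev" self
    done
}
```

Keeping the detection separate from the deletion makes it possible to run `bridge fdb show | find_stale_fdb` first and look at what would be removed before calling `purge_stale_fdb` from a loop or timer.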
Sorry I don't have a lot of time to help with this but my (older) multi AP deployment of McDebian with Enterprise wifi looks like this:
I make sure the channels of both the 2.4 GHz (1, 6 and 11) and 5 GHz (36 and 149) bands are staggered. Also, all MAC addresses are unique and generated via https://www.hellion.org.uk/cgi-bin/randmac.pl?scope=local&type=unicast and I change the wireless ones to end in 0 because of the way the wireless driver creates MAC addresses for multiple BSSIDs. In this example the WRT WAN port is connected to the main network switches, which trunk the different network VLANs to the AP. I hope this helps in some way. |
The little shell script I wrote also only deletes the MAC when it needs to, that is, when the MAC is listed both on a lan interface and on a wifi interface.

I really don't understand what's going on in Linux (or maybe it's the switch..?) that results in the MAC showing up associated with both interfaces, or why having it associated with both interfaces causes traffic to not get passed through. The more I think about it, the more it seems like this must be a bug somewhere. Maybe because they're both connected to a bridge, Linux is propagating the MAC from the wifi interface to the lan interface? I don't know why that would make sense though.

As near as I can tell, the MAC ends up on both interfaces even if the wifi device has only associated to one AP and hasn't been on any of the others recently, meaning that there's absolutely no reason for its MAC to show up on the lan interface of the AP it's connected to. But that's what is happening, and it makes it impossible for the wifi device to communicate with anything, because the traffic isn't getting passed through from the wifi interface to the lan interface to be sent out. Very odd.

As a side note, I doubt having actual switches involved or in-between would be an issue: in my experience, physical switches very quickly pick up on MAC address changes. I'm pretty surprised that Linux is having such an issue. |
Thanks so much @Chadster766 for that config! I'm not sure that it'll help with what's going on here, but I'll definitely review what you shared and see if there are improvements to my setup that I should make. Thanks again! |
What you are trying to do is not a bug; it is simply not supported on Ethernet networks. On Windows it is even prohibited:

Quote: Layer 2 bridging is prohibited between the AP adapter and any other adapters in the system.

This is why you need to use ICS/SoftAP on Windows to have the AP create a local network for the wireless connection. On Linux you use iptables/nftables masquerading in combination with a DHCP server. Any AP you buy does the same thing. Reportedly a few use MAC spoofing, just like my other solution, mcspoof.

On Linux it is not prohibited, but if you bridge wifi to LAN on your AP you will run into the problems you have described. You need to apply some fix.

You can connect your mcdebian APs to your router via the WAN port, as @Chadster766 does. This basically sets up the local network for the AP. Sadly, it also puts the LAN ports of the AP inside the same local network. I believe @Chadster766 uses VLANs to tackle this.

As I do not like to set up a local network for wifi on every AP, I created other solutions. |
Where is that quote from...? What is the reasoning for that? I mean, mcdebian is specifically set up to do exactly what I'm doing- creating one bridge between the wireless interfaces and the ethernet interfaces, and it sure looks like that's how most APs work when viewing them from the outside. All I'm trying to get to is a point where I've got one L2 network while using more than one mcdebian access point. This is really not a complicated setup... |
The word Quote in the previous post is a link to the webpage on the Microsoft site. It is the documentation of MS Wireless Hosted Network, the Microsoft version of hostapd.
Even the developer of mcdebian uses the WAN port.
Not complicated at all... But on the inside.... It seems you will be looking for other solutions. Please let me know if you find any, I would be very interested. |
I don't see why what is on Microsoft's site would be relevant, we aren't running Windows on these things. :)
Sure- for upstream, but the non-WAN ports are all part of the same bridge that the wifi adapters are connected to, just like in my setup.
I mean, what I've got now with the shell script I posted before more-or-less works, but it definitely isn't ideal. |
@ericwoud I looked into those posts but the files they reference don't seem to exist in new kernel versions. |
It looks like they are working on it: This is a V2, which I tried on kernel 5.10 without success. There is a V3, but I'm not so sure this patch will affect our system. One of the patched files is drivers/net/dsa/ocelot/felix.c, which is not a Marvell DSA driver. It is, however, tackling the same problem we are having here... |
I don't think the WRT Marvell DSA switches are affected by this issue. Keep in mind that the MAC addresses of the WRT wireless interfaces wlp1s0 and wlp2s0 should end in 0 because of the way the driver creates BSSIDs. This is especially important if you have multiple SSIDs per radio. |
sfrost wrote:
The self entry on lan1 is the entry in the hardware. This is the one that is stuck in the FDB, actually the DSA's FDB. Removing this entry, manually or by any other means, makes things work again. |
Looks like there's a v4 of that patch set you mentioned now; it would be interesting if someone could play with it and see if it fixes the issue.

I did go back and figure out that setting the MAC addresses on the wireless interfaces (to make sure the last digit is a 0) wasn't working properly in a couple of cases. That didn't seem to be causing any particular issues, but I fixed it anyway and now they definitely all have a '0' at the end. It didn't fix the roaming issue, though, where the MAC of the device ends up in multiple places, doesn't get cleaned up, and traffic therefore doesn't get passed through. Ultimately, whatever issue that fixed, it didn't seem to be one that I'm running into.

The post here, as mentioned above, https://gitlab.nic.cz/turris/turris-build/-/issues/165 seems to really be spot-on, since it mentions the 300 second timeout which I was also definitely experiencing. |
Hi Stephen, see this commit. This could be the answer. I have not tried it as I am migrating to a Banana Pi R64. |
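To see which table a given client MAC is sitting in, the `bridge fdb show` output can be filtered per MAC. The sketch below is illustrative only: `fdb_where` is a made-up helper name, and it simply reports the `self` or `master` flag printed on each line, following the reading above that the `self` entry on a DSA port is the one held by the switch hardware.

```shell
#!/bin/bash
# Sketch: read `bridge fdb show` output on stdin and print, for one MAC
# address, each port it is recorded on and whether that line carries the
# "self" or "master" flag.
fdb_where() {
    awk -v mac="$(printf '%s' "$1" | tr 'A-F' 'a-f')" '
        tolower($1) == mac && $2 == "dev" {
            table = "unknown"
            for (i = 4; i <= NF; i++) {
                if ($i == "self")   table = "self"
                if ($i == "master") table = "master"
            }
            print $3, table
        }
    '
}

# Assumed usage (the MAC is a placeholder):
#   bridge fdb show | fdb_where d0:3f:aa:e8:00:01
```

A client that has roamed away but still shows a `self` line on a wired port would match the stuck-entry situation described here.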
Greetings,
Having had very little luck figuring it out, I figure I'll ask here- have you any experience with getting multiple APs to work with McDebian all providing the same SSID and allowing devices to roam between APs?
I've done relatively minimal changes to the mcDebian hostapd.conf files (change the SSID and the passphrase, mainly) and have discovered that when Windows laptops and iPhones switch APs, there's 5 minutes of 'dead' time before they're able to actually get to the internet. I thought this might be due to bridging VLANs in the Linux kernel and so I ended up moving entirely over to just regular switches for the VLAN work and have kept the actual APs relatively simple, but still there's this delay when switching access points.
Next, I tried to get FT-PSK to work with hostapd and while I was able to run hostapd with FT-PSK, it didn't seem to help- I'm not sure if that's because I didn't configure for 802.11r properly (I tried to use the 'ft psk' mode rather than configuring all the r0hk stuff, but maybe that was wrong?).
Basically, just hoping that maybe you, or one of the other folks who follows this project, has already dealt with this and figured out the right hostapd.conf incantation to make this work cleanly and smoothly. Having to wait 5 minutes when moving between floors is driving me crazy. :)
Thanks!