Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Network boot taking nearly 2 minutes if STP is active on managed Ethernet switch #480

Open
maxnet opened this issue Jun 18, 2023 · 5 comments

Comments

@maxnet
Copy link

maxnet commented Jun 18, 2023

Describe the bug

On managed Ethernet switches that have the Spanning Tree Protocol enabled, every time the network link goes up it tests for network loops for +/- 30 seconds before it lets normal traffic through.

In an ideal world, everyone buying a managed Ethernet switch would know how to properly configure it, and only enable full STP on uplink switch ports that connect to other Ethernet switches, and disable it (or set it to portfast/rapid STP flavour) on Ethernet ports leading to host devices like computers and Pi.
We could provide some education on how to do configuration, but some users may still struggle with the console cable and CLI commands necessary (not all switches offer a convenient option to set this in the management web interface).
And therefore it is realistic to expect that some will end up using a switch that has STP enabled, and suffer from this problem.

In that case my expectancy would be that network boot is delayed by 30 seconds, but not much more.
But reality is if STP is enabled it currently takes the boot firmware like 1 minutes 50 seconds instead:

20230618_143130

It waits 25 seconds for USB MSD to timeout, goes to network boot, does not succeed because switch port is not forwarding traffic yet, goes to other boot methods again having to wait 25 seconds for USB MSD again, does succeed to network boot the second time.

If I look at the events from the switch:

Switch#
*Mar  1 00:20:27.412:     pm_vp 2/14(1): during state not_present, got event 0(add)
*Mar  1 00:20:27.412: @@@ pm_vp 2/14(1): not_present -> present
*Mar  1 00:20:27.412: *** vp_statechange: single: added: 2/14(1)
*Mar  1 00:20:27.412:     pm_vp 2/14(1): during state present, got event 8(linkup)
*Mar  1 00:20:27.412: @@@ pm_vp 2/14(1): present -> authentication
*Mar  1 00:20:27.421:     pm_vp 2/14(1): idle during state authentication
*Mar  1 00:20:27.421: @@@ pm_vp 2/14(1): authentication -> authentication
*Mar  1 00:20:27.421:     pm_vp 2/14(1): during state authentication, got event 18(authen_disable)
*Mar  1 00:20:27.421: @@@ pm_vp 2/14(1): authentication -> notforwarding
*Mar  1 00:20:27.421: STP SW: Fa2/0/12 new blocking req for 1 vlans
*Mar  1 00:20:27.421: *** vp_linkchange: single: up: 2/14(1)
*Mar  1 00:20:29.417: %LINK-3-UPDOWN: Interface FastEthernet2/0/12, changed state to up
*Mar  1 00:20:30.424: %LINEPROTO-5-UPDOWN: Line protocol on Interface FastEthernet2/0/12, changed state to up
*Mar  1 00:20:42.428: STP SW: Fa2/0/12 new learning req for 1 vlans
*Mar  1 00:20:51.991:     pm_vp 2/14(1): during state notforwarding, got event 4(remove)
*Mar  1 00:20:51.991: @@@ pm_vp 2/14(1): notforwarding -> present
*Mar  1 00:20:51.991: *** vp_linkchange: single: down: 2/14(1)
*Mar  1 00:20:51.991: @@@ pm_vp 2/14(1): present -> not_present
*Mar  1 00:20:51.991: *** vp_statechange: single: remove: 2/14(1)
*Mar  1 00:20:52.997: %LINEPROTO-5-UPDOWN: Line protocol on Interface FastEthernet2/0/12, changed state to down
*Mar  1 00:20:54.004: %LINK-3-UPDOWN: Interface FastEthernet2/0/12, changed state to down
*Mar  1 00:20:54.759:     pm_vp 2/14(1): during state not_present, got event 0(add)
*Mar  1 00:20:54.759: @@@ pm_vp 2/14(1): not_present -> present
*Mar  1 00:20:54.759: *** vp_statechange: single: added: 2/14(1)
*Mar  1 00:20:54.759:     pm_vp 2/14(1): during state present, got event 8(linkup)
*Mar  1 00:20:54.767: @@@ pm_vp 2/14(1): present -> authentication
*Mar  1 00:20:54.767:     pm_vp 2/14(1): idle during state authentication
*Mar  1 00:20:54.767: @@@ pm_vp 2/14(1): authentication -> authentication
*Mar  1 00:20:54.767:     pm_vp 2/14(1): during state authentication, got event 18(authen_disable)
*Mar  1 00:20:54.767: @@@ pm_vp 2/14(1): authentication -> notforwarding
*Mar  1 00:20:54.767: STP SW: Fa2/0/12 new blocking req for 1 vlans
*Mar  1 00:20:54.767: *** vp_linkchange: single: up: 2/14(1)
*Mar  1 00:20:56.764: %LINK-3-UPDOWN: Interface FastEthernet2/0/12, changed state to up
*Mar  1 00:20:57.771: %LINEPROTO-5-UPDOWN: Line protocol on Interface FastEthernet2/0/12, changed state to up
*Mar  1 00:21:09.775: STP SW: Fa2/0/12 new learning req for 1 vlans
*Mar  1 00:21:24.782: STP SW: Fa2/0/12 new forwarding req for 1 vlans
*Mar  1 00:21:24.782:     pm_vp 2/14(1): during state notforwarding, got event 14(forward_notnotify)
*Mar  1 00:21:24.782: @@@ pm_vp 2/14(1): notforwarding -> forwarding
*Mar  1 00:21:24.782: *** vp_list_fwdchange: forward: 2/14(1)
  • It seems that when the boot firmware starts, network link goes up, and it start testing for network loops.
  • However 25 seconds later, link goes down and up again. Causing it to start testing for network loops again from the beginning.
  • It seems you are only sending DHCP requests for a short time. So network boot times out in the first round, before the switch forwards traffic.

Would it be possible to change the boot firmware so that it only set the network link carrier to up at the very beginning, and no longer resets the network link when it goes from USB MSD to network boot mode?
So that the 25 seconds spent waiting for USB MSD to timeout, would count towards the forwarding delay.

And it would be nice if it would try DHCP a little longer. The PXE standard demands DHCP discover is retried at 4, 8, 16 and 32 seconds for a reason.

Steps to reproduce the behaviour

  • Get a managed Ethernet switch that has STP enabled by default. Most notorious brand for having it on by default is Cisco.
  • Use default settings (on most Cisco models you can hold the "mode" button for some seconds to reset to factory settings)
  • Network boot a Pi

Device (s)

Raspberry Pi 4 Mod. B, Raspberry Pi 400, Raspberry Pi CM4, Raspberry Pi CM4 Lite

@pelwell
Copy link
Collaborator

pelwell commented Jun 18, 2023

One could argue that the longer delay is harder to ignore, and therefore more likely to be detected and solved by turning off STP on that port.

@maxnet
Copy link
Author

maxnet commented Jun 18, 2023

If detection is the goal: then I suggest the boot firmware prints a big fat warning message if it detects STP is enabled on the network port to alert the user.
While the switch does not forward traffic, the Pi do can see the packets the switch sends to test for network loops, so can test for that.

In the Piserver code I am already displaying a warning on the server if it detects STP is active in the network.
https://github.com/raspberrypi/piserver/blob/master/src/stpanalyzer.cpp
Could do something similar in cmprovision (currently no tests there).
However that detection is better done on the client. As server-side I can only test if STP is active on the server's port, while on most managed switches it is configurable on a per-port basis, and it is the client's port that counts.
And the testing also does not work if the server OS is running under virtualization.

@timg236
Copy link
Collaborator

timg236 commented Jun 19, 2023

The DHCP timeout is quite long already and can be overridden in the EEPROM config. Not resetting the PHY is likely to cause as many problems as it solves so I think the answer is that the network boot behaviour is not likely to change in the near future.

@maxnet
Copy link
Author

maxnet commented Jun 19, 2023

The DHCP timeout is quite long already and can be overridden in the EEPROM config.

How long does it actually keep sending new DHCP discover packets, as opposed to just waiting for replies?

As you can see in my switch log. The network link carrier goes up at 00:20:57.771, and it allows packets to be forwarded at 00:21:24.782
So things do are functional in 27 seconds.

The fine documentation suggests the default timeout is 45 seconds: https://www.raspberrypi.com/documentation/computers/raspberry-pi.html#DHCP_TIMEOUT
45 seconds is more than 27.
So how come it does not work the first round?
And only boots after it goes to the second round (in which it does not seem to reset the PHY anymore).

Not resetting the PHY is likely to cause as many problems as it solves

And the resetting has to wait until it gets to network boot order?
It cannot be done first thing when the firmware boots?

@maxnet
Copy link
Author

maxnet commented Jun 19, 2023

How long does it actually keep sending new DHCP discover packets, as opposed to just waiting for replies?

To answer my own question.
If for testing purposes I kill my DHCP server, and sniff how many packets your boot firmware sends, it seems to be only the grand total of 3 packets in a round.

$ sudo tcpdump -i eth0 port bootpc -n
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
17:57:06.143461 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from b8:27:eb:57:76:be, length 322
17:57:14.136637 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from b8:27:eb:57:76:be, length 322
17:57:30.136505 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from b8:27:eb:57:76:be, length 322

Last DHCP discover packet is 24 seconds after the first.

Your default timeout of 45 seconds may be long, but if it is not sending out any DHCP discovers anymore after 24 seconds, and everything in the first +/- 30 seconds is dropped if the user has a STP enabled switch, that is not going to work...

==

The fine documentation mentions that it is supposed to be retrying at 4 second intervals.

DHCP_REQ_TIMEOUT
The timeout in milliseconds before retrying DHCP DISCOVER or DHCP REQ.

Minimum: 500
Default: 4000

While it seems more like an exponential back-off delay starting at 8 seconds in practice.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants