rtsold: specific cases maybe not send signals to dhcp6c #215

Open
2 tasks done
wevsty opened this issue Aug 19, 2024 · 5 comments

@wevsty

wevsty commented Aug 19, 2024

Describe the bug

When the ISP forcibly updates the IPv6 prefix, rtsold does not seem to notify dhcp6c to handle it.

My ISP seems to force out a new IPv6 prefix every so often, and it does not seem to keep to the lease times it has agreed with its customers.
When a new prefix is issued, all machines on the LAN are disconnected. Normally, one would expect rtsold to send a SIGHUP to dhcp6c, and dhcp6c to respond by sending a new DHCPv6 request to update the prefix.

However, that is not what happens in my case. As a result, all devices lose connectivity for a period of time (about 10 minutes in my case).
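
For illustration only, the notification expected above boils down to sending a signal to the process named in a PID file. The following is a minimal sketch of that idea, not the code rtsold or OPNsense actually uses; the helper names `read_pid` and `notify_daemon` are hypothetical, and the PID file location is left to the caller because it is an assumption.

```python
import os
import signal


def read_pid(pidfile):
    """Return the PID stored in pidfile, or None if it is missing or malformed."""
    try:
        with open(pidfile, "r", encoding="ascii") as fh:
            return int(fh.read().strip())
    except (OSError, ValueError):
        return None


def notify_daemon(pidfile, signum=signal.SIGHUP):
    """Send signum to the process named in pidfile; return True on success."""
    pid = read_pid(pidfile)
    if pid is None or pid <= 0:
        return False
    try:
        os.kill(pid, signum)
    except OSError:
        # The process is gone or we lack permission to signal it.
        return False
    return True
```

A caller would use something like `notify_daemon("/var/run/dhcp6c.pid")`, where that path is a guess and has to be replaced with wherever dhcp6c writes its PID on the system in question.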

Packet captures and system logs taken while the problem occurred can be found in opnsense/dhcp6c#37 (comment) and opnsense/dhcp6c#37 (comment).

Since rtsold gives limited debug information, if there is a way to get more useful information please let me know and I will try to get more information.
And if there are any other suggestions to help diagnose the problem, I'd be more than happy to try them.

Tip: to validate your setup was working with the previous version, use opnsense-revert (https://docs.opnsense.org/manual/opnsense_tools.html#opnsense-revert)

To Reproduce

Wait for the ISP to issue a new IPv6 prefix.

Expected behavior

rtsold should notify dhcp6c within a few seconds so that the disconnection lasts only a few seconds.

Describe alternatives you considered

None.

Screenshots

None.

Relevant log files

None.

Environment

OPNsense 24.7.1-amd64
Dell Optiplex 3070 MFF
Intel(R) Core(TM) i3-8100T CPU @ 3.10GHz (4 cores, 4 threads)
Network card:
Intel I210 (WAN)
Realtek NIC (LAN)

@fichtner fichtner self-assigned this Aug 19, 2024
@fichtner
Member

Thanks for the ticket! Some initial digging in the manual page:

     Specifically, rtsold sends at most 3 Router Solicitations on an interface
     after one of the following events:

     •   Just after invocation of rtsold daemon.
     •   The interface is up after a temporary interface failure.  rtsold
         detects such failures by periodically probing to see if the status of
         the interface is active or not.  Note that some network cards and
         drivers do not allow the extraction of link state.  In such cases,
         rtsold cannot detect the change of the interface status.
     •   Every 60 seconds if the -m option is specified and the rtsold daemon
         cannot get the interface status.  This feature does not conform to
         the IPv6 neighbor discovery specification, but is provided for mobile
         stations.  The default interval for router advertisements, which is
         on the order of 10 minutes, is slightly long for mobile stations.
         This feature is provided for such stations so that they can find new
         routers as soon as possible when they attach to another link.

So the -m would maybe speed this up if the daemon even considers the link as reset. But I have the feeling it doesn't as your connection never recovered before and now dhcp6c can on its own.

Now if this is a dead end the other possibility is to look into default router and SLAAC advertisements. If your ISP sends a router advertisement of zero lifetime rtsold could consider this an invitation to reset and try again?
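
To make the "zero lifetime" case concrete, here is a rough sketch of reading the Router Lifetime field from a raw ICMPv6 Router Advertisement, following the fixed header layout in RFC 4861. It is not taken from rtsold; the function names `router_lifetime` and `router_withdrawn` are made up for the example.

```python
import struct

ND_ROUTER_ADVERT = 134  # ICMPv6 type for Router Advertisement (RFC 4861)
RA_HEADER_LEN = 16      # fixed part of the RA, before any options


def router_lifetime(ra):
    """Return the Router Lifetime in seconds from a raw ICMPv6 RA message."""
    if len(ra) < RA_HEADER_LEN:
        raise ValueError("truncated Router Advertisement")
    # type, code, checksum, cur hop limit, flags, router lifetime
    icmp_type, _code, _cksum, _hop, _flags, lifetime = struct.unpack_from(
        "!BBHBBH", ra, 0
    )
    if icmp_type != ND_ROUTER_ADVERT:
        raise ValueError("not a Router Advertisement")
    return lifetime


def router_withdrawn(ra):
    """True if the RA announces that the sender is no longer a default router."""
    return router_lifetime(ra) == 0
```

Whether a zero router lifetime should make rtsold reset and solicit again is exactly the open question here; the sketch only shows where that value lives in the packet.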

The information could be extracted from the kernel and handled separately but I have the feeling another daemon to do this wouldn't make much sense either.

Cheers,
Franco

@fichtner fichtner added the upstream Third party issue label Aug 19, 2024
@wevsty
Author

wevsty commented Aug 20, 2024

> So the -m would maybe speed this up if the daemon even considers the link as reset. But I have the feeling it doesn't as your connection never recovered before and now dhcp6c can on its own.

I could try changing the -m option to confirm if there is an improvement, but I don't think it's a good idea to wait for a polling check.

> Now if this is a dead end the other possibility is to look into default router and SLAAC advertisements. If your ISP sends a router advertisement of zero lifetime rtsold could consider this an invitation to reset and try again?

I think it's possible.
The manual for rtadvd states that

       Basically, hosts MUST NOT send Router Advertisement messages at any
       time (RFC 4861, Section 6.2.3).  However, it would sometimes be useful
       to allow hosts to advertise some parameters such as prefix information
       and link MTU.  Thus, rtadvd can be invoked if router lifetime is
       explicitly set zero on every advertising interface.

       ……

       Use SIGHUP to reload the configuration file /etc/rtadvd.conf.  If an
       invalid parameter is found in the configuration file upon the reload,
       the entry will be ignored and the old configuration will be used.  When
       parameters in an existing entry are updated, rtadvd will send Router
       Advertisement messages with the old configuration but zero router
       lifetime to the interface first, and then start to send a new message.

       Use SIGTERM to kill rtadvd gracefully.  In this case, rtadvd will
       transmit router advertisement with router lifetime 0 to all the
       interfaces (in accordance with RFC 4861 6.2.5).

This suggests that advertising a lifetime of 0 is a standard action, and upstream ISPs are likely doing the same thing.
But I think any IPv6 prefix change should trigger sending a signal to dhcp6c.
I'm not sure whether a prefix advertised with a lifetime of 0 is reset immediately when the advertisement is received.

> The information could be extracted from the kernel and handled separately but I have the feeling another daemon to do this wouldn't make much sense either.

The rtsold manual documents the -O parameter:


       -O script-name
               Specifies a supplement script file to handle the Other
               Configuration flag of the router advertisement.  When the flag
               changes from FALSE to TRUE, rtsold will invoke script-name with
               a first argument of the receiving interface name and a second
               argument of the sending router address, expecting the script
               will then start a protocol for the other configuration.  The
               script will not be run if the Managed Configuration flag in the
               router advertisement is also TRUE.  script-name must be the
               absolute path from root to the script file, be a regular file,
               and be created by the same owner who runs rtsold.

This parameter will handle the Other Configuration flag.
In my case rtsold is started with:

    /usr/sbin/rtsold -p /var/run/rtsold.pid -A /var/etc/rtsold_script.sh -R /usr/local/opnsense/scripts/interfaces/rtsold_resolvconf.sh -a -u -D

Note that the -O parameter is not specified.
I think by using this parameter and specifying a new script, we can handle the prefix change.

@fichtner
Member

fichtner commented Aug 20, 2024

> I could try changing the -m option to confirm if there is an improvement, but I don't think it's a good idea to wait for a polling check.

Well, it is a workaround for "mobile" connections after all.

> But I think any IPv6 prefix change should trigger sending a signal to dhcp6c.

You're conflating SLAAC with DHCPv6 maybe because your ISP handles it this way. While you need SLAAC for DHCPv6 to work (DHCPv6 doesn't provide routers!) the two should operate independently from each other after a lease has been successfully acquired. Much of where this fails is when the ISP restarts their DHCP servers and leases are "lost" on the server side but still used by the client. Contrary to SLAAC/RA, DHCPv6 doesn't have a mechanism to revoke a valid lease. Fun stuff. :)

That being said I still agree with you that a prefix deprecation should be considered a link event because of its impact on the overall connectivity.

My best guess is that ISPs try to avoid zero lifetime advertisements in the average cases which would allow us to get away with a change in behaviour from rtsold, maybe coupled with a new option. The code to read the DHCP options presented by the router is already inside rtsol.c so it should be relatively easy to read the vltime of the prefix and generate an event when it is zero.
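
As a rough illustration of the check described above (not the rtsol.c code, which is C and not shown in this thread), the sketch below walks the options of a raw Router Advertisement and collects every Prefix Information option whose valid lifetime (vltime) is zero. The option layout follows RFC 4861; the function name `deprecated_prefixes` is hypothetical, and what rtsold would do with the result is left open.

```python
import ipaddress
import struct

RA_HEADER_LEN = 16             # fixed part of the RA, before any options
ND_OPT_PREFIX_INFORMATION = 3  # option type from RFC 4861
PREFIX_INFO_LEN = 32           # the option is always 4 units of 8 bytes


def deprecated_prefixes(ra):
    """Return prefixes ("addr/len") advertised with a valid lifetime of zero."""
    found = []
    offset = RA_HEADER_LEN
    while offset + 2 <= len(ra):
        opt_type = ra[offset]
        size = ra[offset + 1] * 8  # option length is given in 8-byte units
        if size == 0 or offset + size > len(ra):
            break  # malformed option, stop parsing
        if opt_type == ND_OPT_PREFIX_INFORMATION and size == PREFIX_INFO_LEN:
            # prefix length, flags, valid lifetime, preferred lifetime
            plen, _flags, vltime, _pltime = struct.unpack_from(
                "!BBII", ra, offset + 2
            )
            if vltime == 0:
                prefix = ipaddress.IPv6Address(ra[offset + 16:offset + 32])
                found.append(f"{prefix}/{plen}")
        offset += size
    return found
```

A non-empty result would be the point at which an event could be generated, for example by treating it like a link reset.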

> This parameter will handle the Other Configuration flag.

The -A parameter supersedes this for convoluted reasons.

Cheers,
Franco

@wevsty
Author

wevsty commented Aug 20, 2024

> My best guess is that ISPs try to avoid zero lifetime advertisements in the average cases which would allow us to get away with a change in behaviour from rtsold, maybe coupled with a new option. The code to read the DHCP options presented by the router is already inside rtsol.c so it should be relatively easy to read the vltime of the prefix and generate an event when it is zero.

I have no experience with ISPs in other countries, so I don't know whether setting the lifetime to 0 is an unusual operation; answering that may take more reports or experience from others.
For me, adding an option to change the behavior of rtsold is acceptable.

Please contact me if you need any testing done. And thank you for your help.

@fichtner
Member

It's just a guess based on the fact that the SLAAC prefixes should be/could be rather static in the average case, but I'm willing to bet on it.

I'll give this code a try and report back. Your packet captures are a great resource by the way. Thanks! :)

Cheers,
Franco
