Since adding the SRP service functionality in this PR, there is an issue where the client rejects the success response packets sent by the SRP service registrar (server), despite successful SRP service registration.
I have not fully root-caused the issue yet and have not had much time to dig in, so I am opening this issue for others' awareness until a solution is found.
Observed Behavior
On the client side, the service client attempts registration but ultimately believes it has failed. The error code that OpenThread bubbles up on the client side is 28, which maps to a DNS timeout error; you can see it in the logs when running the SRP client example on both the h2 and c6. The client then keeps trying to register the service, entering a backoff-retry loop.
On the SRP server side, the SRP service registration is successful. I have verified that there is no issue on the server side (it is running the stock image used for BR cert, so I am fairly confident there).
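For anyone reading the client logs, here is a minimal sketch of a helper that turns the raw code into a readable name. The function `ot_error_name` is hypothetical and not part of esp-openthread; the numeric values are what I understand OpenThread's `otError` enum to define (28 being `OT_ERROR_RESPONSE_TIMEOUT`), so check them against `openthread/error.h` in the version being built.

```rust
/// Hypothetical helper (not part of esp-openthread): map a raw OpenThread
/// error code to the name of the matching `otError` constant.
///
/// Only a few values are listed. They are assumed from OpenThread's
/// `openthread/error.h` and should be verified against the version in use.
pub fn ot_error_name(code: u32) -> &'static str {
    match code {
        0 => "OT_ERROR_NONE",
        1 => "OT_ERROR_FAILED",
        6 => "OT_ERROR_PARSE",
        8 => "OT_ERROR_SECURITY",
        // The code seen on the SRP client in this issue.
        28 => "OT_ERROR_RESPONSE_TIMEOUT",
        // Anything not listed here needs a lookup in the header.
        _ => "unknown (see openthread/error.h)",
    }
}
```

Logging `ot_error_name(28)` next to the numeric code makes it easier to tell a timeout apart from, say, a parse or security failure when comparing the h2 and c6 logs.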
Root Cause / Other Details
This may have been an existing issue that was simply never observable until SRP client functionality was added, but since the root cause is currently unknown it is hard to say.
I have verified with pcaps and debug logs that the client does appear to get the response packet from the server, and attempts to perform validation, but ultimately rejects it. So it believes the registration has failed while the SRP service registrar is actually successfully publishing that service on the LAN on behalf of the service client.
Using a debugger (probe-rs) has been somewhat difficult: at some point while stepping through the code, timers expire and processing ends before reaching the actual point of rejection. Setting breakpoints in the C++ side of the code is also still finicky with the probe-rs tool and does not always work. I have at least verified that the client starts to process the response packet. I have not yet had time to identify the exact spot where it fails, but it will likely be obvious what is wrong once that is found.
Will update as more information is available.
Some Ideas
Given the above, I suspect this may have something to do with how the platform layer code handles the keys used to sign the packets sent between the SRP client and the service registrar. It could be a bug in how keys/settings are stored, in how packets are processed in esp-openthread, or in the esp-ieee802154 driver itself. All of this could be totally off base though, and I will update as I am able.
The code currently uses the platform-static-utils lib from OT, which provides example C code for platform handling of settings and keys. This storage could be getting overwritten, or there could be a bug somewhere that keeps the settings and/or keys from being stored correctly (perhaps we need to use the esp-hal APIs for anything that involves storing settings in RAM?). That is one avenue to investigate. Another possibility is that we are not processing received packets correctly in all cases as defined here, and more logic may be needed to apply the correct MAC frame key, frame counter, or some other similar processing. Alternatively, it could be due to how the esp-ieee802154 driver processes packets before passing them along to the OT stack, given that the esp-idf C version of the 802154 driver has a lot of functionality that is not present in the esp-ieee802154 driver code.