Renew device identity certificate automatically with EST #6911
Comments
These are great questions. I'll take a first pass here but might need @gordonwang0 and @eustacea to keep me honest.
Pros: mitigate key compromise risk. Cons: changing the key can be considered as a cert reissuance as opposed to a renewal.
The certificate renewal procedure is based on a timer that tracks the expiration time of each managed certificate. On startup, each IoT Edge runtime service collects a list of its managed certificates and checks each certificate for expiry. Expired and close-to-expiry certificates are renewed immediately. The service then schedules renewal of the remaining certificates based on each certificate's expiration time. The renewal timer then fires according to the renewals scheduled at startup. Because the timer may fail to renew a certificate due to network unavailability or other issues, IoT Edge runtime services also attempt to renew a certificate upon use. If the device ID cert has expired and there is no EST connectivity, your modules (including edgeHub) can continue to send messages, and the runtime should keep trying to renew the cert as long as it's up. But if the IoT Edge runtime stops, I believe it might fail to restart, because the device ID cert might be required at startup...? @gordonwang0 can you confirm?
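The startup behavior described above can be sketched roughly as follows. This is an illustration only, not IoT Edge's actual implementation: the function names are hypothetical, and "close to expiry" is modeled here as having passed a fraction of the certificate's lifetime (80% is borrowed from the question in this thread).

```python
from datetime import datetime, timedelta, timezone


def renewal_due(not_before, not_after, threshold=0.80):
    """Time at which renewal first becomes due: a fraction of the cert lifetime.

    Assumption: the threshold is a fraction of the total validity period.
    """
    return not_before + (not_after - not_before) * threshold


def plan_at_startup(certs, now, threshold=0.80):
    """Split managed certs into 'renew immediately' and 'schedule for later'.

    certs maps a (hypothetical) cert name to a (not_before, not_after) pair.
    """
    renew_now, scheduled = [], {}
    for name, (not_before, not_after) in certs.items():
        due = renewal_due(not_before, not_after, threshold)
        if now >= due:
            # Expired or close to expiry: renew right away.
            renew_now.append(name)
        else:
            # Still fresh: the timer fires when the threshold is reached.
            scheduled[name] = due
    return renew_now, scheduled
```

A cert that is already past its renewal point (including one that has fully expired) ends up in the first list; everything else gets a timer entry.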
I believe your concern stems from the worst-case scenario where certificate renewal fails so many times that the certificate expires and the device becomes unrecoverable. If that's the case, then the answer depends on the answer to the earlier question. Also, if the EST service is potentially this unreliable, maybe symmetric key provisioning is the better approach? I'm also not sure whether this plan, where you switch attestation method upon reprovisioning, is practical. AFAIK each DPS enrollment (group) can only use one method, so if you want to change it on the device side you'll have to use a different enrollment (group), which might cause all sorts of issues with losing twin/deployment type data - data associated with the device identity - on the IoT Hub side.
Hi @jlian. Thanks for your comprehensive explanation. Appreciate it. Now it's much clearer how the renewal works. Regarding the recovery plan, we already have two DPS enrollment groups, one set up with CA and one set up with symmetric key. We already tested switching (re-provisioning) the same device from CA to symmetric key and vice versa, remotely via the OSConfig module by Microsoft. It actually works pretty well. The device maintains its identity and only the authentication fingerprint changes, which means IoT Hub and the data flow work as expected. In terms of the EST server, we already have an agreement with a big certificate issuer organization, and the whole renewal and reissuance process works flawlessly. The worst-case scenario is just about having a recovery plan in place in case the renewal fails in some rare situation.
Ok, then that seems like a good plan. @gordonwang0 could you chime in to confirm what happens if the device ID cert is expired, IoT Edge restarts, and renewal attempt fails?
In that case, the runtime will continue to restart and attempt to renew the certificate until it performs a successful renewal.
Thanks both for the explanations. Closing the thread.
@jlian @gordonwang0
Good question. Also tagging @micahl, do you know?
We don't currently have a specific metric for cert renewal. If you have a system set up to regularly gather the logs from the device and send them to a Log Analytics workspace (e.g. ELMS cloud workflow), then you could devise a query + log-based alert.
This issue is being marked as stale because it has been open for 30 days with no activity. |
On our production edge devices, we use EST for automatic device identity certificate issuance and renewal. My questions mainly relate to the following part, for which I couldn't find answers in the documentation.
For retry = "4%", does it mean that if the renewal fails at 80%, it will try again at 84%, 88%, and so on, up until 100%? What happens after 100%?

Besides these questions, we have been thinking about a recovery plan for edge re-provisioning when certificate renewal doesn't go as expected. If renewal fails and there is no disaster recovery plan, we will lose the device, with no way to recover it remotely. We are thinking of developing a service on the device that checks the certificate's expiration time every hour and compares it with the current time. If the cert is expired and no new cert has been issued, which means EST renewal didn't go as expected, then the edge runtime should switch to symmetric key immediately. This requires having the derived device key already on the device as a backup, which is doable. Does this process sound feasible? What are the best practices for implementing this scenario?
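A minimal sketch of the hourly check proposed above, under stated assumptions: `read_not_after` and `switch_to_symmetric_key` are hypothetical callbacks (reading the cert's expiry and re-provisioning are device-specific and not shown), and the clock and sleep functions are injectable only so the loop can be exercised without waiting.

```python
import time
from datetime import datetime, timedelta, timezone

CHECK_INTERVAL = timedelta(hours=1)  # hourly, as proposed in this issue


def cert_is_expired(not_after, now):
    """True once the current time has reached the cert's expiry."""
    return now >= not_after


def monitor(read_not_after, switch_to_symmetric_key,
            now_fn=lambda: datetime.now(timezone.utc),
            sleep_fn=time.sleep, max_checks=None):
    """Check the cert periodically; fall back once it is found expired.

    read_not_after and switch_to_symmetric_key are hypothetical hooks
    supplied by the caller. Returns True if the fallback was triggered.
    """
    checks = 0
    while max_checks is None or checks < max_checks:
        # Re-read the expiry each time: a successful renewal moves it forward.
        if cert_is_expired(read_not_after(), now_fn()):
            switch_to_symmetric_key()
            return True
        checks += 1
        sleep_fn(CHECK_INTERVAL.total_seconds())
    return False
```

Because the expiry is re-read on every pass, a renewal that succeeds late simply keeps the loop from ever triggering the fallback.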
The overall aim is, of course, to mitigate the risk and have a proper automatic recovery plan.
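To make the retry question concrete, here is a small sketch of the schedule under the interpretation asked about in this issue: the first attempt at 80% of the certificate's lifetime, then one retry every further 4% until expiry. The function name and integer-percent parameters are hypothetical, and this illustrates the question's reading rather than confirmed runtime behavior.

```python
from datetime import datetime, timedelta, timezone


def renewal_attempts(not_before, not_after, threshold_pct=80, retry_pct=4):
    """List (percent_of_lifetime, time) pairs for each attempt before expiry.

    Assumption: both settings are percentages of the cert's total lifetime.
    """
    lifetime = not_after - not_before
    attempts = []
    # Integer percentages avoid floating-point drift in the schedule.
    for pct in range(threshold_pct, 100, retry_pct):
        attempts.append((pct, not_before + lifetime * pct / 100))
    return attempts
```

For a 100-day certificate this yields attempts on days 80, 84, 88, 92 and 96. What happens at and beyond 100% is outside this sketch.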