MacOS/Android AVD CI failures #68
Comments
This seems like a good time to try moving the Android jobs to Linux runners instead. We would also need to enable KVM to keep decent performance.
Sounds reasonable 👍
I was able to unbreak my local dev environment by pinning a specific android-ndk version (23.1.7779620) and when run locally the missing real world test suite panic logs do appear. It's the Let's Encrypt real world verification suite that seems to be failing in the Android VM (if we assume my local env matches what happens in CI). The relevant part was:
The unknown issuer error is curious since (AIUI) the Android trust store shouldn't have changed out from under us and the chain is baked into the test. On a lark I tried running the test data update script to fetch an updated chain. Only the leaf EE cert changed, and after updating the mock verification time to be after the new EE cert's I'll put a PR with the updated test data, but something feels fishy here.
For one thing, the verification time and chain are both intended to be fixed, so there shouldn't be an expiry issue. Even if the time is somehow not fixed in practice, why is it reported as an unknown issuer error instead of an expiry error? Lastly, why are logs not showing up in CI when they work locally? I think there's something more worth understanding here...
Edit: it's also curious that the same tests with the same chain are passing on the other platforms.
With the linked PRs merged, the only items remaining in this issue are to determine:
I think I understand that one: the new end-entity certificate has a NotBefore date that was after the previous fixed mock verification time:
Our mock verification time was set to Wednesday, January 3, 2024 6:03:08 PM prior to the update, and so the new EE leaf was considered as having been issued in the future and not yet valid.
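To make the timing problem concrete, here is a minimal sketch of a validity-window check. It is not the verifier's actual code: the `Validity` enum and `check_validity` function are hypothetical names, the timestamp assumes the mock time quoted above is in UTC, and the refreshed leaf's `not_before`/`not_after` values are made up for illustration since the real ones aren't quoted in this thread.

```rust
/// Outcome of comparing a reference time against a certificate's validity window.
/// Hypothetical type, for illustration only.
#[derive(Debug, PartialEq)]
enum Validity {
    Valid,
    NotYetValid,
    Expired,
}

/// All arguments are Unix timestamps in seconds.
fn check_validity(now: u64, not_before: u64, not_after: u64) -> Validity {
    if now < not_before {
        Validity::NotYetValid
    } else if now > not_after {
        Validity::Expired
    } else {
        Validity::Valid
    }
}

fn main() {
    // Previous mock verification time: January 3, 2024 6:03:08 PM (assumed UTC).
    let old_mock_now: u64 = 1_704_304_988;

    // Assumed window for the refreshed leaf: issued 30 days after the old
    // mock time and valid for 90 days. These are not the real cert's dates.
    let not_before = old_mock_now + 30 * 24 * 60 * 60;
    let not_after = not_before + 90 * 24 * 60 * 60;

    // The old mock time predates the new leaf, so it is "issued in the future".
    println!("{:?}", check_validity(old_mock_now, not_before, not_after)); // NotYetValid

    // Moving the mock time past NotBefore brings it inside the window.
    let new_mock_now = not_before + 60;
    println!("{:?}", check_validity(new_mock_now, not_before, not_after)); // Valid
}
```

This is why refreshing the chain also requires bumping the mock verification time: the two have to be updated together so the fixed time stays inside the leaf's window.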
This is indeed still a mystery. I'm going to try to investigate further this afternoon.
It turns out there's a simple explanation for what we've seen. The fixed verification time is used up front on a call to Since We get an unknown issuer error instead of a more sensible error about the expiry because while the earlier Line 256 in d68c2ed
In this instance The fixed verification time avoids the
That resolves the remaining mysteries but leaves a question about the best fix. It doesn't look like we can set the trust manager's reference time for
At the very least, I think we could return a better error so that, in combination with the fixed CI logs, it is very obvious what the problem is (test data certificate expired) and what the fix is (run the tooling to update the test data). In practice the error won't be surfaced by this part of the code outside of tests since the earlier call to
Here's one attempt at this: #75
I personally think it's worth landing this while we figure out a better solution to use a static verification time throughout, but I also acknowledge it's kind of gross 😿
Edit: probably #59 or similar to fix the verification time issue is a better long-term solution.
I'm going to close this now. I think the open PRs and #59 are probably enough to track any additional work here.
Sounds good, and thank you for diving into this. If nothing else, we've made some great Android testing improvements as a result, even if we haven't solved everything yet.
It looks like the MacOS CI jobs running the Android virtual device tests started failing on main ~yesterday (one, two). Kicking the jobs doesn't resolve the failure, so it's probably not a transient flake. Unfortunately the mentioned Rust panic logs are not being passed through (similar to #27).