Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: android: avoid revocation check missing OCSP URI exceptions. #34

Closed

Conversation

cpu
Copy link
Member

@cpu cpu commented Sep 13, 2023

This is a draft for now, as there are some trade-offs to discuss here. Apologies for the length of this PR description. I did a lot of digging around to come up with this approach and I wanted to provide as much context as I could (both for my future self and for other maintainers).

Description

Revocation checking on Android has a few peculiarities that must be accounted for in order to maximize the value of the revocation checking that rustls-platform-verifier does given the platform constraints. See #13 for more information.

Most importantly, when we check a certificate chain preferring OCSP (default behaviour), Android's revocation checker implementation will throw a fatal exception if any certificates within the considered chain depth are missing an OCSP staple (which is true for all intermediates, only end-entity certs can have stapled OCSP) or are missing an authority information access extension with an OCSP access method and URI location (which is common for real-world intermediates like Let's Encrypt's R3 intermediate certificate).

In essence we have two decisions to make for any given validated certificate chain when it comes time to check revocation status:

  • Should we check for the entire chain (minus root trust anchor) or, just check the end entity status.
  • Should we check preferring OCSP and falling back to CRLs, or should we check CRLs only.

We want our decisions to:

  • Prefer OCSP when possible
  • Maximize the revocation checking we perform
  • Avoid exceptions/errors for real-world certificate chains that are unrevoked

Unfortunately the existing APIs and PKI practices mean we have to make some compromises. Read on for more.

End entity vs Full Chain

Prior to this branch we made the determination of revocation checking depth exclusively by determining if the certificate chain was issued by a known system root trust anchor. If it was, we checked the whole certificate path (minus root trust anchor, since these can not be revoked by traditional means). If it wasn't, we checked only the end entity certificate:

revocationChecker.options = if (isKnownRoot(validChain.last())) {
EnumSet.of(PKIXRevocationChecker.Option.SOFT_FAIL)
} else {
EnumSet.of(
PKIXRevocationChecker.Option.SOFT_FAIL,
PKIXRevocationChecker.Option.ONLY_END_ENTITY
)
}

In this branch we adjust this logic so that:

  • As before, if the trust anchor isn't a known system trust anchor we default to checking only the end entity certificate
  • If the trust anchor is known, but we only have OCSP revocation information for the end-entity certificate, we check only the end entity certificate
  • In all other cases we check the entire chain.

Checking only the end entity certificate when we lack OCSP revocation information for the rest of the chain is a trade-off: we can't specify OCSP vs CRL preferences per-certificate, just per-chain, so if we have a chain that has an OCSP URI or stapled OCSP response for the end-entity certificate, we prefer to check only the end-entity certificate revocation status rather than checking the whole chain when we know the chain is missing OCSP revocation information for an intermediate. The alternatives are to proceed with full chain verification, at which point an exception will be thrown from the missing OCSP information (current behaviour), or to prefer CRL validation for the full chain (which would ignore that we have enough information to check OCSP for the end-entity).

Prefer OCSP vs CRL only

Prior to this branch we always preferred OCSP (the default setting). In theory the Android revocation checker should fall back to checking CRLs if it encounters an error checking OCSP, but in practice this doesn't seem to work (TODO: Can we get a better answer for why?). Instead the first certificate that the revocation checker encounters without an OCSP staple or a suitable AIA OCSP URL results in a fatal exception of the form "certificate was revoked: java.security.cert.CertPathValidatorException: Certificate does not specify OCSP responder".

To try to meet our goals this branch changes the revocation logic so that:

  • If we're checking only the end-entity certificate, we prefer OCSP only when we have either an OCSP staple or know the end-entity certificate contains a suitable AIA extension.
  • If we're checking the entire chain, we prefer OCSP only when the entire chain has OCSP information available (e.g. the end-entity cert has a stapled OCSP response or suitable AIA extension, and each intermediate in the chain has a suitable AIA extension).
  • In all other cases we prefer CRLs and don't try to fall back to OCSP since we know it will fail.

"Does this cert have an AIA OCSP URL?"

As part of our new logic we need to make the determination about whether a certificate has an Authority Information Access extension that specifies an OCSP URI. Unfortunately Android/Java make this difficult :-(

The system provider's own OCSP implementation used by the revocation checker has all of the logic required to parse an AIA extension to find an OCSP URI, but it's locked away in platform internal only classes under the sun.security.provider.* namespace that we can not import. None of the X509Certificate and Certificate classes, or the X509Extension interface provide access to the AIA extension.

However X509Certificate implements X509Extension, which offers a method for getting a raw DER encoded extension value by OID. We can use this to access the raw DER encoding of the AIA extension.

Our next challenge is doing something useful with the raw DER extension value. Unfortunately again the tools required for DER parsing are present, but locked away in internal-only classes that we can't use. Frequently folks will take a dependency on BouncyCastle to parse DER but this is an extremely heavy-weight cryptography library (and one I find particularly gross to work with...).

As a last resort we take a more extreme, but lightweight, approach: we have the DER encoding of the overall extension value, and we know the DER encoding of the OID that specifies a OCSP access method within that extension (id-ad-ocsp) so we can simply try to find the matching byte sequence for that OID in the raw DER extension. This is sufficient for our purposes as a heuristic for whether or not OCSP checking will fail from a missing OCSP URI. A maliciously crafted certificate could include an AIA extension designed to fool our heuristic into thinking there is an OCSP URI when there isn't, but this will only result in a revocation rejection when the provider OCSP checking code fails to find a real OCSP URI. It would also have to be a certificate issued by a known trust anchor since we do all of this work after verifying the chain signatures up to a trust anchor.

If this approach is too gross I think the best alternative is to have Rust code do this parsing and provide the extension URI (if present) along with each certificate. Note that webpki and rustls don't have existing code to parse/expose the AIA extension either, so this will require implementing that code somewhere, or taking a dependency on a large crate like x509-parser.

Let's Encrypt testcase

So far this branch includes one additional real world verification test, bundling a Let's Encrypt certificate chain of the form EE (AIA OCSP present) -> R3 intermediate (no AIA OCSP) -> ISRG Root X1 (no AIA OCSP) -> trust anchor.

Before the changes in this branch revocation checking would always fail for this chain, producing the "certificate was revoked: java.security.cert.CertPathValidatorException: Certificate does not specify OCSP responder" result observed in #13 due to the two intermediate certificates missing AIA OCSP URLs.

With the changes in this branch revocation checking succeeds:

  • When deciding whether to check the full chain or just the end entity cert we find the root is known but we only have OCSP checking information for the end entity certificate (it has an AIA OCSP URL, but the intermediates do not). So we decide to check only the end entity certificate.
  • When deciding whether to prefer OCSP or use CRL w/o fallback, we find that we're only checking the end entity and we do have sufficient OCSP information, so we prefer OCSP.

Ultimately this translates into using the revocation checker options [ONLY_END_ENTITY, SOFT_FAIL], and a positive "not revoked" determination.

Other approaches

Unfortunately there's precious little information or example code in the wild for Android revocation checking. What I did find tended to either:

a) Ignore this problem, presumably encountering the exception we see when checking full chains when some certificates are missing OCSP information (e.g. this gist)
b) Tried to ignore this exception in particular, by matching the error message.

Neither approach seemed viable to me.

Todo

  • Vet overall idea is reasonable
  • Implement additional test coverage - the test coverage in-branch isn't sufficient, but I wanted to vet the approach first.

Revocation checking on Android has a few peculiarities that must be
accounted for in order to maximize the value of the revocation checking
that rustls-platform-verifier does given the platform constraints.

TODO: Write more here.
@ctz
Copy link
Member

ctz commented Sep 14, 2023

Android does not provide any way only to attempt to validate revocation from cached
data like the other platforms do

Literally unplayable. How bad would it be to only check stapled OCSP responses? Does this caveat also apply to CRLs? Is android going to go and download a 30MB CRL every time?

(Thanks for correcting the OSCPs in this file!)

@cpu
Copy link
Member Author

cpu commented Sep 14, 2023

How bad would it be to only check stapled OCSP responses?

I think to do this we'd have to forgo checking beyond end entity depth. It looks like this is what Conscrypt is doing.

Does this caveat also apply to CRLs? Is android going to go and download a 30MB CRL every time?

Yup, that's my understanding 😵 Looking at Conscrypt again they explicitly added a no-fallback flag when checking revocation up to end entity depth preferring OCSP because:

if the OCSP check fails (e.g. because the date on the device is wrong) then the Sun PKIXRevocationChecker will fall back to downloading a CRL and checking that. On Android 8.0/8.1 this causes large latency and possible OOM.

@ctz
Copy link
Member

ctz commented Sep 14, 2023

😮‍💨

@djc
Copy link
Member

djc commented Sep 14, 2023

If we had Rust code to do CRLite, would we be able to plug that in here somehow?

@cpu
Copy link
Member Author

cpu commented Sep 14, 2023

If we had Rust code to do CRLite, would we be able to plug that in here somehow?

I think we'd bypass using the platform verifier for revocation checking in that case, and do it in the Rust code after using the platform verifier to check the chain similar to how subject name validation is done after-the-fact on Android using webpki.

@cpu
Copy link
Member Author

cpu commented Nov 2, 2023

A simpler alternative to consider: #40

@cpu
Copy link
Member Author

cpu commented Nov 17, 2023

Closing in favour of #40. Thanks for the input folks!

@cpu cpu closed this Nov 17, 2023
@cpu cpu deleted the cpu-13-android-rev-check-adjustments_dev branch January 3, 2024 20:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants