Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add external types CI check + config #183

Merged
merged 10 commits into from
Dec 9, 2023

Conversation

cpu
Copy link
Member

@cpu cpu commented Oct 27, 2023

Having types from dependent crates appear in rcgen's public API makes taking semver incompatible updates to those dependencies tricky: we need to treat it as a semver incompatible change to rcgen itself. To make sure whenever this type leak happens that it was done because of a deliberate choice this branch adds cargo-check-external-types to CI.

Three existing types that leaked into the API are fixed: ring::error::KeyRejected, ring::error::Unspecified, and pem::PemError. In each case it's simple to translate the specific error types from the dependency into our own Error variants that don't expose the type.

Two types are allow-listed: time::offset_date_time::OffsetDateTime and zeroize::Zeroize.

The first, OffsetDateTime is used throughout the public API. It isn't clear to me if we want to keep that as part of the public API and treat time updates carefully, or if we should pursue using a more generic representation. I've left this out for now so it can be discussed.

The second, Zeroize, could probably be avoided by implementing Drop and calling zeroize on the sensitive fields manually. That's the approach Rustls implemented (rustls/rustls#1492). I've left this out for in case there was a reason to prefer implementing the trait on the public types.

@codecov
Copy link

codecov bot commented Oct 27, 2023

Codecov Report

Attention: 17 lines in your changes are missing coverage. Please review.

Comparison is base (1a579ca) 72.86% compared to head (fbe7cd9) 73.40%.

❗ Current head fbe7cd9 differs from pull request most recent head ce54465. Consider uploading reports for the commit ce54465 to get more accurate results

Files Patch % Lines
rcgen/src/key_pair.rs 56.41% 17 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #183      +/-   ##
==========================================
+ Coverage   72.86%   73.40%   +0.54%     
==========================================
  Files           7        7              
  Lines        1861     1854       -7     
==========================================
+ Hits         1356     1361       +5     
+ Misses        505      493      -12     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@djc djc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that these changes are semver-incompatible. I usually like to proactively bump the version number in cases like these (in this case, to 0.12.0) to make sure we don't release a semver-compatible release with semver-incompatible changes by accident.

I'm not sure whether it's useful to have separate RingKeyRejected and RingUnspecified error variants, I'd probably just merge them into a single Ring variant.

@cpu cpu force-pushed the cpu-ci-external-types-check branch from 405827f to 46da70c Compare October 30, 2023 15:20
@cpu
Copy link
Member Author

cpu commented Oct 30, 2023

I usually like to proactively bump the version number in cases like these

SGTM. I added a commit for that. I'll reconcile with #184 if that one lands first.

I'm not sure whether it's useful to have separate RingKeyRejected and RingUnspecified error variants, I'd probably just merge them into a single Ring variant.

Sure, I don't feel strongly. I've done that in a separate commit in case it proves to be controversial. If it isn't, I'll squash the change to use a unified error with the two changes that fixed the type leaks.

@cpu
Copy link
Member Author

cpu commented Oct 30, 2023

to make sure we don't release a semver-compatible release with semver-incompatible changes by accident.

Probably a good idea to bring cargo-semver-check into CI as well. I'll look at doing that shortly.

@cpu
Copy link
Member Author

cpu commented Oct 30, 2023

Probably a good idea to bring cargo-semver-check into CI as well

tacked on another commit for that.

@cpu cpu force-pushed the cpu-ci-external-types-check branch from 985cbe4 to dda6554 Compare November 1, 2023 13:12
@est31
Copy link
Member

est31 commented Nov 1, 2023

Some points:

  • IDK about hiding of all these error types. These types might contain more context that goes beyond just strings. People might want to match on variants, etc.
  • As for dependency upgrades being semver breaking, I think it's not a black and white thing: people can depend on any possible behaviour of a library and get broken by a change (like say the people who ask for stability in rcgen's output). This applies only to the error handling but the way this PR approaches the potentially breaking changes that it "fixes" is by making the relevant behaviour entirely impossible. I'm not sure that's an improvement.
  • I'm okay with switching away from exposing the zeroize trait and doing it manually upon drop, but only if the zeroize cargo feature is enabled.

I'm not sure whether it's useful to have separate RingKeyRejected and RingUnspecified error variants, I'd probably just merge them into a single Ring variant.

Ring's error variants are already unpleasingly limited (part of the greater problem of Ring's public API being overly restrictive; i understand that this is done for purposes of security but it's gone too far). In any case, I think it's useful to expose the full variety that Ring exposes.

@est31
Copy link
Member

est31 commented Nov 1, 2023

What about exposing only a Display impl by default for the error and then also exposing a cargo feature that adds a way to access the inner ErrorKind with the semantic info? It would be accompanied by a warning that if you enable the feature you should pin the rcgen dependency as minor updates might include major new versions of the dependencies?

So:

  • struct Error { kind: ErrorKind } with Display impl
  • enum ErrorKind { ... }
  • Error is always public, ErrorKind is private by default
  • the cargo feature makes the kind accessor for Error public, as well as ErrorKind

Does this sound good?

@djc
Copy link
Member

djc commented Nov 3, 2023

If you want to preserve more of the inner stuff, I think we should just store a Box<dyn std::error::Error> instead of a String instead. That way you could still get a bunch of context out, including the original ring type if you're so inclined, via a downcast. I generally feel like the ErrorKind pattern ends up making things more complex with limited benefit. Changing the API here based on a Cargo feature feels like an anti-pattern.

@est31
Copy link
Member

est31 commented Nov 3, 2023

For applications I agree that enums are not good. For libraries, I have a different opinion: it's better to have all the possible cases listed rather than "some error, can be converted to a string maybe".

The Box<dyn std::error::Error> solution would technically still mean a semver risk. So we'd also have to store it inside a wrapper type and then have to add a cargo-feature-gated accessor to the inner box.

@est31
Copy link
Member

est31 commented Nov 3, 2023

Actually, given that many error kind's don't carry data, we should maybe actually do it via the opaque wrapper type indeed...

@djc
Copy link
Member

djc commented Nov 6, 2023

The Box<dyn std::error::Error> solution would technically still mean a semver risk. So we'd also have to store it inside a wrapper type and then have to add a cargo-feature-gated accessor to the inner box.

Because of the ability to downcast? I think yielding a Box<dyn> makes it pretty obvious that we give no guarantees of any particular backing type. (I think all the Cargo feature-gated APIs are the wrong path for this kind of thing and just complicate the API for little benefit.)

@cpu cpu marked this pull request as draft November 6, 2023 18:02
@est31
Copy link
Member

est31 commented Nov 7, 2023

The semver risk of Box<dyn ...> is that if you try to downcast it to a specific type (which you usually want to do if you are interested in it beyond just the display impl) then you need to a) name that type, which means you have to import that crate and b) your code may break by an update of rcgen updating the major version of the dependency providing that type (because it's now a different one).

Compared to enums, the semver-breaking nature is worse in two ways:

  • first, you must name the type. previously you might have just called .backtrace() or .is_fatal() or such on the type, without having to actually import it (maybe your use would involve a naming, but there is the chance that it didn't). now that you must name the type, it also means that you have to include the crate yourself and this means more things that can break.
  • second, breakage is less noticeable, as it is moved from compile time to runtime.

maybe this "breakage moved to runtime" thing is what makes this a non-breakage with some definitions of semver, but I'm not a fan of that.

@djc
Copy link
Member

djc commented Nov 7, 2023

IMO using a Box<dyn> clearly communicates that the library doesn't intend to guarantee a specific type. As such, while it is technically possible to downcast I feel the type pretty clearly communcates that the only thing specified are the prescribed trait bounds.

@cpu
Copy link
Member Author

cpu commented Nov 9, 2023

I'm not sure I'm convinced there's an actual use-case for preserving the "inner stuff". Can you find a downstream user that unpacks the error variants that existed prior to this branch to do something other than display the string-ified err? None of the variants being obscured strike me as things you would want to try and programmatically recover from through distinguishing the variant. I'm generally hesitant about making things more complicated for theoretical uses when we could land something simpler and then adjust if someone arrived wanting to do a thing that wasn't possible.

@est31
Copy link
Member

est31 commented Nov 10, 2023

I don't really agree that the issue the PR tries to resolve needs a change in the API: IMO it's enough to document that if you rely on these types, you should pin your dependency.

I think we all agree that error inspection is a niche thing at best, but I still think that it is a use case. Say you want to handle file not found errors specially and want to also look for files with a different extension. or you might want to display "seems there is no network" when the error was caused by dns.

Box<dyn> feels like a black box to me, you don't know what is going on in there, same goes for errors that are only available via string apis. don't get me wrong, they are fine for end applications, but you know way less about the use cases that someone might have as the library author.

@cpu
Copy link
Member Author

cpu commented Nov 10, 2023

IMO it's enough to document that if you rely on these types, you should pin your dependency.

Where would you propose documenting that? I think it will likely be overlooked and result in a mismatch between expectations and reality that will break downstream builds.

Say you want to handle file not found errors specially and want to also look for files with a different extension. or you might want to display "seems there is no network" when the error was caused by dns.

Right, but those aren't the error variants we're talking about here so I'm not sure it's relevant. I'm specifically interested in situations where you think a real user needs to match the specific error variants I'm proposing making more opaque.

I would be in favour of the Box API @djc proposes, or the one I have in this branch. I'm not in favour of a feature based approach. If you feel strongly and are unwilling to adopt one of the two options we prefer then I suggest we close this PR and consider every dependency to be part of Rcgen's API, with the downsides that come with that.

@est31
Copy link
Member

est31 commented Nov 10, 2023

Where would you propose documenting that?

the type level docs of the RcgenError enum.

mismatch between expectations and reality that will break downstream builds.

it only happens in the rare instances where people try to use these types (which you argue nobody wants to do anyways). And I don't want to repeat myself but a compile time breakage something easily discoverable and it's a very actionable thing, as you can easily pin the rcgen version even if it was some upstream crate which introduced the breakage.

I think it will likely be overlooked

that's a potential danger that the features are meant to resolve.

I'm specifically interested in situations where you think a real user needs to match the specific error variants I'm proposing making more opaque.

for example there is PemError::InvalidData, which sometimes contains the offset if something is not valid Base64. This could be used by some users to turn that into a span used in pretty printing of the error.

@est31
Copy link
Member

est31 commented Nov 10, 2023

I've looked at some errors in the ecosystem. most of these don't have just a string-only way to access the error.

  • reqwest::Error is implemented with the Box plus ErrorKind way, but it has tons of accessors like a function to get the URL
  • glutin::Error has a data-less ErrorKind but also additional accessor methods
  • pem::Error uses rcgen's current approach, also including DecodeError from the base64 crate.
  • syn::Error is a wrapper of a list of (string plus span) objects.

so it seems to me that the best practice is to not just expose glorified opaque Display objects, but enrich them with data that is maybe useful. Which also confirms what I thought earlier that the best practice was.

@djc
Copy link
Member

djc commented Nov 10, 2023

Note that I wasn't proposing using Box<dyn StdErr> to replace an Error enum, just to use it inside some of the variants (mainly the Ring variant) -- sorry if that was unclear. Given that ring's errors are pretty much opaque anyway, I don't think we lose much value here, and I think it's still a meaningful way of making it clear that ring's exact types are not meant to leak through the rcgen public API (downcasting notwithstanding).

@est31
Copy link
Member

est31 commented Nov 11, 2023

FTR, we already don't expose any of Ring's internal types (except via the From conversion stuff, and I'm not sure if the proposed solution here is the one we want to take).

@cpu
Copy link
Member Author

cpu commented Nov 13, 2023

Note that I wasn't proposing using Box to replace an Error enum, just to use it inside some of the variants (mainly the Ring variant)

That was how I understood your suggestion fwiw 👍

I've looked at some errors in the ecosystem. most of these don't have just a string-only way to access the error.
...
so it seems to me that the best practice is to not just expose glorified opaque Display objects, but enrich them with data that is maybe useful.

For the Error:Ring case there's only one bit of data being lost with the impl as it stands in branch now: whether it's an unspecified error or a key rejection error. If we revert b0fef6b no data is lost. You get Error::RingUnspecified or Error::RingKeyRejected(string). Is that acceptable? If not, what context do you think is missing?

For the Error::PemError case, I'm still unconvinced there's a need for additional data. You suggested one use-case:

for example there is PemError::InvalidData, which sometimes contains the offset if something is not valid Base64. This could be used by some users to turn that into a span used in pretty printing of the error.

The Error::PemError variant is only used in two places: KeyPair::from_pem and KeyPair::from_pem_and_sign_algo when the pem feature is enabled. Both of these have corresponding DER versions that are always present: KeyPair::from_der and KeyPair::from_der_and_sign_algo.

Since rcgen exposes interfaces for accepting DER a consumer that's interested in maximum programmatic specificity around PEM decoding could import pem themselves, handle the errors however they like, and give rcgen DER instead of PEM. From my perspective the PEM feature is a helpful quickstart and any advanced usage would be better accomplished by handling PEM outside of rcgen.

@cpu cpu force-pushed the cpu-ci-external-types-check branch from fbe7cd9 to ce54465 Compare December 5, 2023 21:54
@cpu
Copy link
Member Author

cpu commented Dec 5, 2023

This branch has a +1 from djc, and I've restored the "lost" information for Ring errors.

Since rcgen exposes interfaces for accepting DER a consumer that's interested in maximum programmatic specificity around PEM decoding could import pem themselves, handle the errors however they like, and give rcgen DER instead of PEM.

I still believe this is the best path forward for an imagined consumer that needs maximum error specificity for PEM decoding. This puts full control into their hands, allowing any choice of PEM and Base64 library or configuration, while imposing no downstream costs to rcgen w.r.t managing its own dependency versions for the simple use case.

@est31 Are you still unswayed by these arguments and putting a hard -1 on merging this? If so then it feels like this branch has reached an impasse and I'll probably close it. I'd prefer to reach an end state soon so that I don't have to keep rebasing and can direct that energy elsewhere.

Copy link
Member

@est31 est31 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay let's try it with a stringly typed Pem error now and if users complain we can always change it.

For the From removals, it's a bit sad that this leads to more verbose code, but I don't think there is much we can do about it. We can always improve it in the future also. Let's merge it.

@cpu
Copy link
Member Author

cpu commented Dec 5, 2023

Sounds good to me. Thank you. If there's any follow-up work related to this I'm happy to do it.

@est31
Copy link
Member

est31 commented Dec 5, 2023

If there's any follow-up work related to this I'm happy to do it.

Maybe instead of the map_err invocations there could be a trait like:

trait ToRcgenErr: Sized {
    fn _to_rcgen_err(self) -> Result<(), Error> { todo!() }
}

impl<T> ToRcgenErr for Result<T, ring::error::KeyRejected> {
    fn _to_rcgen_err(self) -> Result<(), Error> { Error::RingKeyRejected(err.to_string()) }
}

// ... similar impl blocks for the other conversions to Error

The advantage of that trait is that it is private and thus nothing leaks to the outside, and when you write/read code you don't have to immediately worry about how the error is converted. As it is an extension trait, we just need to make sure that the function name would never ever be added to std, which my suggestion fulfills. Shorter name suggestions welcome.

@cpu
Copy link
Member Author

cpu commented Dec 7, 2023

Maybe instead of the map_err invocations there could be a trait like:

I'll experiment with this idea and get back to you 👍

cpu added 10 commits December 8, 2023 17:04
Previously this file had some places where members of a type were
separated by whitespace, and some places where they weren't. In my
experience it's more common to use whitespace here, and I think it reads
better, so this commit standardizes on that (at least within
key_pair.rs).
This trait offers a way to convert errors from dependencies into
`rcgen::Error` instances without manually mapping the error in each
instance, or leaking the error type into the public API with a `From<x>
for Error` impl.
Having `From<ring::error::KeyRejected>` defined on the public `Error`
type means that the `*ring*` type leaks into rcgen's public API,
complicating semver incompatible updates.

This commit takes the simple approach of using the crate-internal
`ExternalError` trait to convert from the `*ring*` error type to our
generic rcgen `Error::RingKeyRejected` type as needed. This allows the
`From` impl on `Error` to be removed, fixing the type leak.
Having `From<ring::error::Unspecified>` defined on the public `Error`
type means that the `*ring*` type leaks into rcgen's public API,
complicating semver incompatible updates.

This commit updates the sites that previously used this trait to
instead use a crate internal `ExternalError` trait implementation
to map to the generic rcgen `Error::RingUnspecified` err.
This allows the `From` impl on `Error` to be removed, fixing the type
leak.
Having `From<pem::PemError>` defined on the public `Error`
type means that the `pem` type leaks into rcgen's public API,
complicating semver incompatible updates.

This commit updates the sites that previously used this trait to
use the crate internal `ExternalError` extension trait to map the
`PemError` err to the generic rcgen `Error::PemError` err.

Additionally, the `rcgen::Error::PemError` variant is changed to hold
a `String` with the `pem::PemError` error string instead of the type
itself. This allows the `From` impl on `Error` to be removed, fixing the
type leak.
This commit adds configuration and a CI task for checking that no
types from dependencies are accidentally leaked through the Rcgen public
API unintentionally.

The previous commits in this branch fixed the `*ring*` type leaks, so
our configuration only has two white-listed types as of this branch:

1. `time::offset_date_time::OffsetDateTime`

It's unclear whether usage of that type should be adjusted, so for now
we explicitly allow-list it in the cargo-check-external-types config. We
can deal with this type (or not) in the future.

2. `zeroize::Zeroize`

We could probably avoid leaking this type by implementing `Drop` and
calling `zeroize` on fields directly from the `drop` impl. In the
meantime we add this type to the allow list.
This commit bumps the rcgen version from `0.11.3` to `0.12.0` to reflect
there are breaking changes in `main`.
This commit adds `cargo-semver-checks`[0] to CI. This tool helps detect
when semver incompatible changes are being made without properly
incrementing the `Cargo.toml` version.

Note that this is necessary, but not sufficient, for ensuring semver
compatibility. This tool is helpful, but not perfect, and can miss
some breakages.

[0]: https://github.com/obi1kenobi/cargo-semver-checks
@cpu cpu force-pushed the cpu-ci-external-types-check branch from ce54465 to 11f31a2 Compare December 8, 2023 22:36
@cpu
Copy link
Member Author

cpu commented Dec 8, 2023

Maybe instead of the map_err invocations there could be a trait like:

@est31 I implemented your suggestion in branch, but with the trait named ExternalError and the fn named _err to be short and sweet. Does this look better to you?

@est31
Copy link
Member

est31 commented Dec 9, 2023

Does this look better to you?

Yes, perfect, thank you. _err is a bit riskier but I think it's still extremely unlikely std adds a name like that.

@cpu cpu marked this pull request as ready for review December 9, 2023 14:41
@cpu cpu added this pull request to the merge queue Dec 9, 2023
Merged via the queue into rustls:main with commit 8798e81 Dec 9, 2023
16 of 17 checks passed
@cpu cpu deleted the cpu-ci-external-types-check branch December 9, 2023 14:53
@est31 est31 mentioned this pull request Dec 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants