-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Document why the changes made in that fork couldn't be done in encoding directly #3
Comments
Whether the ecosystem needs to crates and whether the changes made in encoding_rs_compat could be made in rust-encoding directly are two distinct questions. The ecosystem might not need two crates that have the same scope if one is clearly better than the other on all metrics that people care about. At present, rust-encoding and encoding_rs have different scope and neither is clearly better than the other on all metrics. As far as the scope goes, encoding_rs is narrower in scope than rust-encoding in the sense that encoding_rs is strictly an implementation of the Encoding Standard but rust-encoding implements also non-Encoding Standard encodings and names things in a way that's suggestive of rust-encoding not treating Encoding Standard as being the only thing that exists. (HZ used to be in the Encoding Standard and US-ASCII and ISO-8859-1 have a special trivial algorithmic relationship with Unicode, so their inclusion in rust-encoding might not alone be a definitive indication of broader scope, but see the recent merge of ARMSCII-8, as well as the open TODO list and the open CESU-8 issue.) In contrast, encoding_rs's CONTRIBUTING.md is very clear about non-Encoding Standard encodings being out of scope. On the other hand, encoding_rs is broader in scope in the sense that FFI-friendliness and catering to UTF-16 as an in-RAM Unicode representation are in scope. On one hand, considering the needs of Gecko, an implementation of the Encoding Standard that has FFI-friendliness and catering to UTF-16 is an in-RAM Unicode representation is necessary. On the other hand, while an encoding library for software designed for interacting with the Web (not just browsers but also other pieces of software that wish to interact with the Web in a browser-compatible way) should not expose non-Encoding Standard encodings to a Web context, it would be unreasonable to take the position that the Rust ecosystem should not have crates that cater to non-Web legacy character encoding conversion use cases. Aside: E.g. an email client likely needs to encode UTF-8 and decode the Encoding Standard encodings plus UTF-7 as well and decode and encode the IMAP-modified UTF-7. Exposing the protocol-internal IMAP-modified UTF-7 via a general interface seems like a bad idea, since it shouldn't be possible to use that encoding outside the protocol. OTOH, you'd want to share code with normal UTF-7 and, therefore, put those two in one crate. So that would be scope creep beyond UTF-7 itself for encoding_rs if encoding_rs would try to support just the seeming one additional encoding to bring email in scope. Furthermore, UTF-7 doesn't fit as nicely into the streaming model of encoding_rs as the Encoding Standard encodings, and an email client may be happy with non-streaming conversion anyway. So a separate crate that implements non-streaming UTF-7 and delegates the Encoding Standard encodings (non-streaming mode only) to encoding_rs and has a separate but back end sharing (with UTF-7) entry point for IMAP modified UTF-7 would likely be a happier solution for everyone involved than trying to grow the scope of one crate. The question whether the changes made in encoding_rs_compat could be made in rust-encoding directly still makes sense to ask for the overlap of the scopes of encoding_rs and rust-encoding. Ultimately, answering that question will be up to @lifthrasiir. However, I would prefer @lifthrasiir not to evaluate encoding_rs in the light of that question at this time, because encoding_rs is still immature. Specifically:
To your concern regarding working with rust-encoding, it is worth noting, that I have offered to rust-encoding a couple of small fixes that arose as a side effect of developing encoding_rs_compat. Also, just now, I created a PR to change to the Upstreaming this change would allow future use of the upstream encoding-types crate, once published. Still, the rust-encoding types becoming a distinct crate in the future doesn't address the issue that encoding_rs_compat exists in order to allow Gecko to depend on crates from the Rust ecosystem without shipping duplicate data tables or converter code, because the ecosystem depends on rust-encoding 0.2.x, so 0.2.x is what encoding_rs_compat will need to be a drop-in replacement for in Gecko in the foreseeable near future. For the time being encoding_rs_compat includes its own (patched with the same changes I've submitted upstream PRs for) copy of rust-encoding types. Anyway, encoding_rs maturity concerns aside, I'd be quite OK with code from encoding_rs_compat being adopted by the upstream in order to delegate to encoding_rs where the scopes overlap and encoding_rs internals are deemed better on relevant metrics, but, as noted, that's really @lifthrasiir's call, and one I'd prefer not to be made based on the current immature state of encoding_rs. (Considering this comment to be the requested documentation.) |
Very nice! Thanks for the wall of text! |
I still don't understand why the ecosystem needs two crates, even after reading the design document and whatnot.
The text was updated successfully, but these errors were encountered: