Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect Basic authentication string with Unicode special chars #293

Open
n-peugnet opened this issue Feb 22, 2022 · 5 comments
Open

Incorrect Basic authentication string with Unicode special chars #293

n-peugnet opened this issue Feb 22, 2022 · 5 comments

Comments

@n-peugnet
Copy link
Contributor

n-peugnet commented Feb 22, 2022

When the Basic authentication string contains Unicode special chars (for instance é) it is incorrectly base64 encoded. As the documentation of the used base64 library states:

base64.encode(input)

This function takes a byte string (the input parameter) and encodes it according to base64. The input data must be in the form of a string containing only characters in the range from U+0000 to U+00FF, each representing a binary byte with values 0x00 to 0xFF. The base64.encode() function is designed to be fully compatible with btoa() as described in the HTML Standard.

To base64-encode any Unicode string, encode it as UTF-8 first:

var base64 = require('base-64');
var utf8 = require('utf8');

var text = 'foo © bar 𝌆 baz';
var bytes = utf8.encode(text);
var encoded = base64.encode(bytes);
console.log(encoded);
// → 'Zm9vIMKpIGJhciDwnYyGIGJheg=='

So any Unicode character outside of the 0x00 to 0xFF range will be incorrectly encoded. I think it is safe to assume that most Web-servers and authentication back-ends use UTF-8 to decode the authentication string, so I would say that it should be modified in this library rather than in the clients. If you don't want to change it, this should be added in the docs that the strings must be UTF-8 encoded before passed to createClient().

I do think it's an easy fix on your side, as you just need to call utf8.encode in the toBase64() function (and utf8.decode in the corresponding fromBase64() function):

return encode(text);

It does require one more dependency and creates a breaking change though.

n-peugnet added a commit to club-1/webdav-drive that referenced this issue Feb 22, 2022
to support Unicode Special chars (like `é`) in username or passord,
as the auth-backend decodes them using UTF-8.

See perry-mitchell/webdav-client#293
@perry-mitchell
Copy link
Owner

@n-peugnet Thanks a lot for this, and for the link from your implementation. I'd definitely like this added as I'm sure many have or will experience this. I'm not worried about an extra dependency in this case. Why do you feel it would be a breaking change?

@n-peugnet
Copy link
Contributor Author

I assumed that calling utf8.encode two times in a row on the string would create an incorrect string too. And that it would break clients that fixed it on their side as I did. But it may be fine, I didn't make the test.

@n-peugnet
Copy link
Contributor Author

n-peugnet commented Feb 25, 2022

After further research, UTF-8 encoding is not the defaut yet according to the RFC 7617 (Appendix B.3), but all major browsers seem to have switched to use it by default (Opera since at least 2009 Chrome since at least 2017 and Firefox since 2018. I could not find data about Safari using it by default.

MDN docs states :

Browsers use utf-8 encoding for usernames and passwords.

Firefox once used ISO-8859-1, but changed to utf-8 for parity with other browsers and to avoid potential problems as described in bug 1419658.

This is definitely a breaking change as UTF-8 is byte compatible with the previous default (ISO-8859-1) only for Unicode chars from 0 to 127.
So changing it would break existing auth using chars from 128 to 255.

By the way, I made a mistake. The character that caused me trouble : é is in fact U+00E9 (Unicode number 233), so my initial thought was false, it is a char in the range 00-FF, and it was correctly base64 encoded based on the charset ISO-8859-1. But the problem was my backend (Apache + mod_authnz_ldap + OpenLdap) was expecting UTF-8.

@n-peugnet
Copy link
Contributor Author

n-peugnet commented Feb 25, 2022

Also, if you want to make a clean change about making UTF-8 encoding the default for auth, then it would be a good idea to also change it in Digest auth. So only updating toBase64() will not be sufficient.

@perry-mitchell
Copy link
Owner

This is ancient but happy to see a PR on this, if someone has time. I'll most likely close it otherwise due to lack of interest/concern.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants