-
-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
statusandheaders: if setting a header fails on Headers object, retry with encoded header #83
Conversation
…with encodeURIComponent-encoded value instead of just skipping the header fixes #81
Trying to decide if we want to go with encodeURI or with latin-1 encoding here... I think latin-1 is technically more 'correct', though encodeURI() more portable / easier to understand. As mentioned in #81, fetch() does the equivalent of latin-1 / iso-8859-1, so maybe we do go with that? |
I can see the arguments either way but my vote would be for ISO-8859-1, since that's what the HTTP spec seems to have originally supported and is consistent with fetch. |
Updated it to the latin-1 encoding again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One suggestion to take or leave re: encoding, otherwise looks good and thanks for the tests
src/lib/statusandheaders.ts
Outdated
buf.forEach((x) => (str += String.fromCharCode(x))); | ||
//buf.forEach(x => str += x < 128 ? String.fromCharCode(x) : `\\x${x}`); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if using buffer.transcode() might be a good idea here, as there might be edge cases where TextEncoder().encode(value)
might encode certain characters as two bytes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, yeah, though buffer is not available in the browser, so would have to use the equivalent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah true, so used to working with node I forgot! It is also an edge case within an edge case, so I trust how much attention you want to give this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess if we change this, it will break this assertion from the test case:
function decodeLatin1(buf) {
let str = "";
for (let i = 0; i < buf.length; i++) {
str += String.fromCharCode(buf[i]);
}
return str;
}
d = decodeLatin1(new TextEncoder().encode("https://wiki.archlinux.jp/index.php/Arch_User_Repository?rdfrom=https%3A%2F%2Fwiki.archlinux.org%2Findex.php%3Ftitle%3DAUR_Metadata_%28%25E6%2597%25A5%25E6%259C%25AC%25E8%25AA%259E%29%26redirect%3Dno#AUR_メタデータ"));
// this matches the headers from node fetch:
a = await fetch("https://wiki.archlinux.org/title/AUR_Metadata_(%E6%97%A5%E6%9C%AC%E8%AA%9E)", {redirect: "manual"})
assert(a.headers.get("location") === d)
If we do: decodeLatin1(Buffer.from("https://wiki.archlinux.jp/index.php/Arch_User_Repository?rdfrom=https%3A%2F%2Fwiki.archlinux.org%2Findex.php%3Ftitle%3DAUR_Metadata_%28%25E6%2597%25A5%25E6%259C%25AC%25E8%25AA%259E%29%26redirect%3Dno#AUR_メタデータ", "ascii"));
it will give a different result than what node fetch is doing..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolved this in the most correct way, I think. Still encoding utf8-latin1 to store in Headers object, but then reencoding back when serializing
- add a UTF8ToLatin1 and Latin1ToUTF8 conversions for using Headers - when using Headers, convert UTF8 to Latin1, set 'needsReencode' flag - when serializing, if reencoding flag set, be sure to reencode Latin1 back to UTF8
instead of skipping invalid headers:
fixes Headers with non-ASCII characters silently disappear #81