zwrite: Assume UTF-8 rather than ISO-8859-1 in an ASCII locale #132

Open
wants to merge 1 commit into
base: master

andersk
Member

@andersk andersk commented May 15, 2014

It turns out that lots of scripts still run zwrite without setting any locale environment variables. The charset of the default C locale is ANSI_X3.4-1968 (ASCII), so we were sending z_charset = ZCHARSET_ISO_8859_1. But UTF-8 is much more common these days, so this results in a lot of mislabeled zephyrs.

Other clients simply ignore z_charset (Roost made a good-faith effort to respect it, then gave up after finding that it has no real correlation with the actual charset), but the mislabeling still caused incorrect display in zwgc.

Since UTF-8 is just as good a superset of ASCII as ISO-8859-1 is, we should just assume UTF-8 by default in this case.

(We still assume ISO-8859-1 if the locale explicitly specifies that, such as en_US. Almost nobody uses such locales anymore.)
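The selection rule described above can be sketched roughly as follows. This is an illustration only, not the code in this commit: the `sketch_charset` enum and both function names are hypothetical stand-ins for the real `z_charset` values, and the codeset strings are the ones glibc reports (other platforms may name ASCII differently, e.g. `US-ASCII` or `646`).

```c
#include <langinfo.h>
#include <locale.h>
#include <string.h>

/* Hypothetical stand-ins for the z_charset values; not the zephyr.h definitions. */
enum sketch_charset { SKETCH_UNKNOWN, SKETCH_ISO_8859_1, SKETCH_UTF_8 };

/* Map a codeset name, as returned by nl_langinfo(CODESET), to a charset label. */
enum sketch_charset charset_for_codeset(const char *codeset)
{
    if (strcmp(codeset, "UTF-8") == 0)
        return SKETCH_UTF_8;
    /* An explicitly Latin-1 locale (e.g. en_US without a suffix) is still honored. */
    if (strcmp(codeset, "ISO-8859-1") == 0)
        return SKETCH_ISO_8859_1;
    /* ASCII is what the default C locale reports on glibc. Both candidates are
     * supersets of ASCII, so label the message with the more common one. */
    if (strcmp(codeset, "ANSI_X3.4-1968") == 0)
        return SKETCH_UTF_8;
    return SKETCH_UNKNOWN;
}

/* Pick a label from the process environment (LANG, LC_ALL, LC_CTYPE, ...). */
enum sketch_charset charset_for_environment(void)
{
    setlocale(LC_ALL, "");
    return charset_for_codeset(nl_langinfo(CODESET));
}
```

A script that runs with no locale variables set ends up in the C locale, so under these assumptions `charset_for_environment()` would take the ASCII branch and label the message as UTF-8.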

Signed-off-by: Anders Kaseorg <[email protected]>
@davidben
Contributor

LGTM!!
