Normalization: Identify Java's default Normalization form #10

Open
cweb opened this issue Aug 2, 2013 · 0 comments

See: http://websec.github.io/unicode-security-guide/character-transformations/#normalization

Identify which normalization form, if any, Java applies by default when handling Unicode. Is this documented? If so, skip the following tests.

If not documented, test major versions to identify:

  • normalization behavior: which normalization form, if any, do the core encoding APIs use by default?

One way to test this might be to use a few specific code points which have known transformations in certain normalization forms. These include (from http://www.unicode.org/reports/tr15/):

U+212B in NFC becomes U+00C5
U+212B in NFD becomes U+0041 U+030A
The sequence U+1E9B U+0323 in NFKC becomes U+1E69
The sequence U+1E9B U+0323 in NFKD becomes U+0073 U+0323 U+0307
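
A rough sketch of how such a probe might look, using the standard `java.text.Normalizer` API to produce the reference output for each form and a plain UTF-8 encode/decode round trip as one example of a "core" API to compare against. The class and helper names (`NormalizationProbe`, `codePoints`, `normalizedCodePoints`, `utf8RoundTrip`) are made up for illustration, and the round trip is only one candidate code path; other APIs would need their own probes.

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;
import java.util.stream.Collectors;

// Hypothetical probe class; not part of any existing test suite.
final class NormalizationProbe {

    // Render a string as space-separated code points, e.g. "U+0041 U+030A".
    static String codePoints(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    // Reference result: explicitly normalize the input to the given form.
    static String normalizedCodePoints(String input, Normalizer.Form form) {
        return codePoints(Normalizer.normalize(input, form));
    }

    // One candidate "default" code path: encode to UTF-8 bytes and decode again.
    static String utf8RoundTrip(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Probe inputs taken from the TR15 examples listed above.
        String[] probes = { "\u212B", "\u1E9B\u0323" };

        for (String probe : probes) {
            System.out.println("Input: " + codePoints(probe));

            // Print the reference output for each normalization form.
            for (Normalizer.Form form : Normalizer.Form.values()) {
                System.out.println("  " + form + " -> " + normalizedCodePoints(probe, form));
            }

            // Compare the code path under test against each reference form.
            String actual = utf8RoundTrip(probe);
            System.out.println("  UTF-8 round trip -> " + codePoints(actual));
            for (Normalizer.Form form : Normalizer.Form.values()) {
                boolean matches = actual.equals(Normalizer.normalize(probe, form));
                System.out.println("    matches " + form + ": " + matches);
            }
        }
    }
}
```

If the round-trip output equals the input and matches none of the forms that change the probe, that would suggest this code path does not normalize at all. The same comparison could be repeated on each major Java version and for other APIs of interest.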
