Normalization: Identify Java's default Normalization form #10

cweb · 2013-08-02T06:21:51Z

See: http://websec.github.io/unicode-security-guide/character-transformations/#normalization

Identify which normalization form, if any, Java applies when handling Unicode. Is this documented? If so, skip the tests below.

If it is not documented, test the major Java versions to identify the normalization behavior: which normalization form, if any, do the core encoding APIs apply by default?

One way to test this might be to use a few specific code points that have known transformations under particular normalization forms. These include (from http://www.unicode.org/reports/tr15/):

U+212B in NFC becomes U+00C5
U+212B in NFD becomes U+0041 U+030A
The sequence U+1E9B U+0323 in NFKC becomes U+1E69
The sequence U+1E9B U+0323 in NFKD becomes U+0073 U+0323 U+0307
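
A minimal sketch of such a probe, assuming `java.text.Normalizer` (Java 6+) is an acceptable reference for what each form should produce. The class name `NormalizationProbe` and its helper methods are hypothetical, and the UTF-8 round trip through `String.getBytes` / `new String` is only one illustrative guess at what "core Encoding APIs" covers. The probe prints the code points each form yields for the samples above, then reports which forms (if any) match the output of the API under test:

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

// Hypothetical probe class; the names here are illustrative, not from any existing test suite.
class NormalizationProbe {

    // Render a string as space-separated code points, e.g. "U+0041 U+030A".
    static String toCodePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    // List every normalization form whose result for 'input' equals 'output'.
    // Returns "none" when 'output' matches no form.
    static String matchingForms(String input, String output) {
        StringBuilder sb = new StringBuilder();
        for (Normalizer.Form form : Normalizer.Form.values()) {
            if (Normalizer.normalize(input, form).equals(output)) {
                if (sb.length() > 0) {
                    sb.append(',');
                }
                sb.append(form.name());
            }
        }
        return sb.length() == 0 ? "none" : sb.toString();
    }

    // API under test (an assumption): encode to UTF-8 bytes and decode again.
    static String roundTripUtf8(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Sample inputs taken from the UAX #15 examples listed above.
        String[] samples = { "\u212B", "\u1E9B\u0323" };
        for (String sample : samples) {
            System.out.println("Input: " + toCodePoints(sample));
            for (Normalizer.Form form : Normalizer.Form.values()) {
                String normalized = Normalizer.normalize(sample, form);
                System.out.println("  " + form + ": " + toCodePoints(normalized));
            }
            String result = roundTripUtf8(sample);
            System.out.println("  UTF-8 round trip: " + toCodePoints(result)
                    + " (unchanged: " + result.equals(sample)
                    + ", matches forms: " + matchingForms(sample, result) + ")");
        }
    }
}
```

If the round trip reports `unchanged: true`, that API applied no normalization to the sample. To probe a different API, replace the body of `roundTripUtf8` and rerun on each major Java version.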
