String encoding breaks in several edge cases #27

Open
ryancdotorg opened this issue Oct 24, 2019 · 2 comments

Comments

@ryancdotorg

The root cause seems to be the string encoder assuming that any character at or above \ud800 is the first half of a well-formed UTF-16 surrogate pair. That assumption fails in the following cases:

  • Code points U+E000 to U+FFFF
  • Unpaired surrogates

JavaScript strings are not necessarily well-formed UTF-16. The encoder needs to check whether each character in the range \ud800 to \udbff is followed by a character in the range \udc00 to \udfff and, if not, encode U+FFFD instead. A character in the range \udc00 to \udfff that does not follow one in the range \ud800 to \udbff should also be encoded as U+FFFD.
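The check described above could look roughly like the sketch below. It is only an illustration under my own assumptions, not the library's code: `utf8Bytes` is a hypothetical helper name, and it returns a plain array of byte values rather than writing into the encoder's buffer.

```javascript
// Hypothetical helper (not part of the library): encode a JavaScript string
// as UTF-8 byte values, replacing unpaired surrogates with U+FFFD.
function utf8Bytes(str) {
  const bytes = [];
  for (let i = 0; i < str.length; i++) {
    let cp = str.charCodeAt(i);
    if (cp >= 0xd800 && cp <= 0xdbff) {
      // High surrogate: only valid when a low surrogate follows it.
      const next = i + 1 < str.length ? str.charCodeAt(i + 1) : 0;
      if (next >= 0xdc00 && next <= 0xdfff) {
        cp = 0x10000 + ((cp - 0xd800) << 10) + (next - 0xdc00);
        i++; // consume the low surrogate as well
      } else {
        cp = 0xfffd; // unpaired high surrogate
      }
    } else if (cp >= 0xdc00 && cp <= 0xdfff) {
      cp = 0xfffd; // low surrogate with no preceding high surrogate
    }
    // Everything else, including U+E000 to U+FFFF, is an ordinary BMP character.
    if (cp < 0x80) {
      bytes.push(cp);
    } else if (cp < 0x800) {
      bytes.push(0xc0 | (cp >> 6), 0x80 | (cp & 0x3f));
    } else if (cp < 0x10000) {
      bytes.push(0xe0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3f), 0x80 | (cp & 0x3f));
    } else {
      bytes.push(
        0xf0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3f),
        0x80 | ((cp >> 6) & 0x3f),
        0x80 | (cp & 0x3f)
      );
    }
  }
  return bytes;
}
```

With this sketch, `utf8Bytes("\ud800")` yields the three bytes of U+FFFD (ef bf bd), and a well-formed pair such as `"\ud83d\ude00"` yields a single four-byte sequence.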

For example, CBOR.encode("\uff08\u9999\u6e2f\uff09") gives 6bf3928699e6b8aff3929080 rather than the expected 6cefbc88e9a699e6b8afefbc89.
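The expected value can be cross-checked independently of the library. The sketch below assumes a runtime with a global `TextEncoder` (modern browsers, recent Node.js); `cborShortTextHex` is a hypothetical name, and it only handles strings whose UTF-8 form is shorter than 24 bytes, where the CBOR length fits in the initial byte.

```javascript
// Hypothetical helper: hex of a CBOR text string (major type 3) for short strings.
function cborShortTextHex(str) {
  const utf8 = new TextEncoder().encode(str);
  if (utf8.length >= 24) {
    throw new RangeError("sketch only handles UTF-8 lengths below 24");
  }
  // Initial byte: major type 3 (0x60) combined with the byte length.
  const bytes = [0x60 | utf8.length, ...utf8];
  return bytes.map((b) => b.toString(16).padStart(2, "0")).join("");
}

console.log(cborShortTextHex("\uff08\u9999\u6e2f\uff09"));
// prints "6cefbc88e9a699e6b8afefbc89"
```

The reported output starts with 6b (length 11) rather than 6c (length 12), which is consistent with \uff08 and \u9999 having been merged into one four-byte sequence as if they were a surrogate pair.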

aaronhuggins referenced this issue in aaronhuggins/cbor-redux Sep 3, 2020
Typescript rewrite

Rewritten to TypeScript for Deno support and Node typescript definitions, now as a ponyfill.
Added tests for more scenarios and to bring coverage up.
Modernized syntax and tests.
Added code quality and coverage analysis.
Removed default polyfill behavior and added back as an optional module.

Addresses the following issues:
paroga#27
paroga#24
paroga#21
paroga#13
@aaronhuggins

This is resolved at https://github.com/aaronhuggins/cbor-redux.

@taisukef

This is resolved in CBOR-es, a JavaScript ES module that uses TextEncoder/TextDecoder:
https://github.com/code4fukui/CBOR-es/blob/master/CBOR.js
