Encode valid characters #1408
base: master
It seems the whole point of the json_encode is to change the unicode characters to their escaped versions. This seems like what the RegExp.escape method is perfect for: dproofreaders/tools/proofers/srchrep.js Line 10 in 15d9310
but I don't think it handles the unicode issue you're addressing here, so we should fix both at the same time if possible.
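For readers without the repository open, here is a minimal sketch of what an escaping helper of this kind typically does. The name `escapeRegExp` and the sample strings are hypothetical; this is not the `RegExp.escape` defined in srchrep.js, only an illustration of the general technique of backslash-escaping regex metacharacters before building a pattern from a string.

```javascript
// Hypothetical helper (not the project's code): prefix every regex
// metacharacter with a backslash so the text is matched literally.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Assumed sample input containing two metacharacters, "." and "*".
const customText = "a.b*";

// Escaped: the pattern matches only the literal text.
const escapedPattern = new RegExp("^" + escapeRegExp(customText) + "$");

// Unescaped: "." matches any character and "*" acts as a quantifier.
const rawPattern = new RegExp("^" + customText + "$");

console.log(escapedPattern.test("a.b*")); // literal match
console.log(rawPattern.test("axbbb"));    // unintended match
```

Note that this sketch only deals with metacharacters; it does nothing about `\u` escapes being evaluated inside a string literal, which is the separate issue raised in this pull request.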
Thanks Chris, I found it difficult to understand what's going on here. The PHP function build_character_regex_filter() in unicode.inc produces a string with escaped characters like
so it will get interpreted as a single backslash rather than the code getting interpreted as a character.
Actually, they get converted to 'plain characters' as soon as the enclosing string literal is evaluated (probably when the browser parses the JS code). I.e., the string value that's passed to
Rather than protect the unicode escapes from JS string-literal evaluation, it might be simpler to avoid the latter altogether. One way would be to (in effect) replace:
with
i.e. the unicode escapes are interpolated directly into a regexp literal, where they are treated as intended. (As far as I can tell, there's no benefit to going through
Thanks Michael, that seems the best way to do it.
so escaped characters don't get changed back to plain characters. This should fix the issue with the test project projectID462699ad8f1c6, whose custom characters include regex special characters (found by @srjfoo).
Sandbox at: https://www.pgdp.org/~rp31/c.branch/regex_escape
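The difference between string-literal and regexp-literal evaluation discussed in this thread can be sketched in a few lines. This is an illustration only, not the code from this pull request: the pattern and the choice of `\u002b` (the escape for `+`, which is a regex special character) are assumptions made for the example.

```javascript
// In a string literal, \u002b is evaluated at once and becomes a plain "+".
const fromStringLiteral = "\u002b";

// So the RegExp constructor receives "^a+$", where "+" is a quantifier.
const viaString = new RegExp("^a\u002b$");

// In a regexp literal, \u002b stays an escape and matches a literal "+".
const viaRegexLiteral = /^a\u002b$/;

// Doubling the backslash protects the escape inside a string literal,
// so the constructor receives the six characters \u002b unchanged.
const viaProtectedString = new RegExp("^a\\u002b$");

console.log(fromStringLiteral.length);         // one character
console.log(viaString.test("aaa"));            // quantifier behaviour
console.log(viaRegexLiteral.test("a+"));       // literal plus sign
console.log(viaProtectedString.test("a+"));    // same as the regexp literal
```

The regexp-literal form and the doubled-backslash form behave the same way; the first simply avoids the extra layer of string-literal evaluation.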