-
-
Notifications
You must be signed in to change notification settings - Fork 226
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Incorrect handling of tabs in link resources compared to spaces, newlines #191
Comments
It looks like all that would be required is to change one line in inlines.js: https://github.com/commonmark/commonmark.js/blob/master/lib/inlines.js#L74
to
but I'v not built commonmark.js before, so I'm not sure how to go about testing it. |
Actually, that new regex only covers this one case, but the 0.30 spec says we should interpret "whitespace" as:
so while we're in here we should add those. |
It looks like all of the whitespace definitions in that part of the code could do with a cleanup.
I'm not even sure that we need to differentiate between unicode-white-space and white-space, the spec seems to have changed between 0.29 and 0.30 to only have the definition for unicode-white-space. |
Where did you see that this case should be covered by unicode whitespace? |
The spec says "white-space" and the only definition for whitespace is now "unicode whitespace". Maybe ascii-whitespace was intended, and the definition should not have been removed... |
In 0.29 it says whitespace. In 0.30 it doesn’t. I believe I fixed that confusion. Or, where do you see whitespace? |
You're right the spec does explicitly call it out as spaces/tabs in links.
|
In that case, my initial fix should be sufficient I think. I'm miss-remembering from some other issues I've been digging into for space handling in HTML elements... sorry. |
Yeah. I also found this confusing. That’s why I fixed it :) |
This is wandering off-topic a little - but this is enlightening for me. Does this mean the current parsing of cmark, commonmark.js and comrak (rust parser) all accept it as HTML though... |
I believe that cmark, commonmark.js, and comrak, are incorrect for accepting it as HTML according to the CommonMark spec. |
@mikeando want to submit a PR for your simple fix above? |
whitespace
was used since the start: https://github.com/commonmark/commonmark-spec/blame/858a28941d0dd17c24b7240f21372652111bd38b/spec.txt#L7495The bug probably stems from here:
commonmark.js/lib/inlines.js
Line 651 in 8c698a2
This bug can be reproduced with the following permalink to the dingus: https://spec.commonmark.org/dingus/?text=tab%3A%20%5Bx%5D(%09y)%0Aspace%3A%20%5Bx%5D(%20y)%0A%0Atab%3A%20%5Bx%5D(y%09)%0Aspace%3A%20%5Bx%5D(y%20)%0A%0Atab%3A%20%5Bx%5D(%09%3Cy%3E)%0Aspace%3A%20%5Bx%5D(%20%3Cy%3E)%0A%0Atab%3A%20%5Bx%5D(y%09%22z%22)%0Aspace%3A%20%5Bx%5D(y%20%22z%22)%0A.
The expected behavior is that tabs and spaces behave the same.
The text was updated successfully, but these errors were encountered: