The JSON specification allows escapes of the form \uxxxx, that is, \u followed by four hexadecimal digits. Since the newline character is Unicode code point U+000A, I expect parseJSON "\"\\u000A\"" to yield Right "\n" (note that this is the representation produced by Show), but in fact it yields a Left containing
Furthermore, JSON allows a surrogate pair to encode a code point above U+FFFF. A fully conforming parseJSON would parse "\"\\uD835\\uDD4C\"" into "𝕌", a string containing the single code point U+1D54C.
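For reference, the surrogate pair arithmetic can be sketched as follows. This is only an illustration using base; combineSurrogates is a hypothetical name and not part of Ideas or Parsec.

```haskell
import Data.Char (chr)

-- | Combine a UTF-16 high surrogate (0xD800..0xDBFF) and a low surrogate
-- (0xDC00..0xDFFF) into the code point they encode.
-- Returns Nothing if either half is outside its range.
-- Hypothetical helper, for illustration only.
combineSurrogates :: Int -> Int -> Maybe Char
combineSurrogates hi lo
  | isHigh hi && isLow lo =
      Just (chr (0x10000 + (hi - 0xD800) * 0x400 + (lo - 0xDC00)))
  | otherwise = Nothing
  where
    isHigh x = x >= 0xD800 && x <= 0xDBFF
    isLow x  = x >= 0xDC00 && x <= 0xDFFF
```

For the example above: 0xD835 - 0xD800 = 0x35 and 0xDD4C - 0xDC00 = 0x14C, so the result is 0x10000 + 0x35 * 0x400 + 0x14C = 0x1D54C.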
This incompatibility is caused by the use of Parsec's stringLiteral parser, which follows Haskell's syntax rules for string literals. This parser also supports a large number of escapes that are not valid in JSON, such as \a and \&.
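As a rough sketch of what JSON-only escape handling could look like, here is a decoder for the body of a string literal (the text between the quotes), written against base only. All names here (decodeJSONString, escape, unicode, hex4, simpleEscapes) are hypothetical and not taken from Ideas or Parsec. Rejecting unpaired surrogates is my own assumption; other decoders may choose to be more lenient.

```haskell
import Data.Char (chr, digitToInt, isHexDigit)

-- | Decode the body of a JSON string literal (without the surrounding quotes).
-- Only the escapes defined by JSON are accepted, so \a and \& are errors.
decodeJSONString :: String -> Either String String
decodeJSONString [] = Right []
decodeJSONString ('\\' : rest) = escape rest
decodeJSONString ('"' : _) = Left "unescaped quote"
decodeJSONString (c : rest)
  | c < ' '   = Left "unescaped control character"
  | otherwise = (c :) <$> decodeJSONString rest

-- | Handle the character(s) following a backslash.
escape :: String -> Either String String
escape [] = Left "dangling backslash"
escape ('u' : rest) = unicode rest
escape (c : rest) =
  case lookup c simpleEscapes of
    Just x  -> (x :) <$> decodeJSONString rest
    Nothing -> Left ("invalid escape: \\" ++ [c])

-- | The single-character escapes permitted by JSON.
simpleEscapes :: [(Char, Char)]
simpleEscapes =
  [ ('"', '"'), ('\\', '\\'), ('/', '/'), ('b', '\b')
  , ('f', '\f'), ('n', '\n'), ('r', '\r'), ('t', '\t') ]

-- | Read exactly four hexadecimal digits, returning their value and the rest.
hex4 :: String -> Maybe (Int, String)
hex4 s =
  case splitAt 4 s of
    (ds, rest)
      | length ds == 4 && all isHexDigit ds ->
          Just (foldl (\n d -> 16 * n + digitToInt d) 0 ds, rest)
    _ -> Nothing

-- | Handle the part after \u, including a following low surrogate if needed.
unicode :: String -> Either String String
unicode s =
  case hex4 s of
    Nothing -> Left "expected four hex digits after \\u"
    Just (hi, rest)
      | hi >= 0xD800 && hi <= 0xDBFF ->
          case rest of
            '\\' : 'u' : rest'
              | Just (lo, rest'') <- hex4 rest'
              , lo >= 0xDC00 && lo <= 0xDFFF ->
                  (chr (0x10000 + (hi - 0xD800) * 0x400 + (lo - 0xDC00)) :)
                    <$> decodeJSONString rest''
            _ -> Left "unpaired high surrogate"
      | hi >= 0xDC00 && hi <= 0xDFFF -> Left "unpaired low surrogate"
      | otherwise -> (chr hi :) <$> decodeJSONString rest
```

With this sketch, decodeJSONString "\\u000A" gives Right "\n", and decodeJSONString "\\a" gives a Left, since \a is not a JSON escape.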
This issue is not critical for Communicate (certainly not important enough to delay ideas-1.5), but we expect to start working with JSON generated by many different encoders, some of which are likely to use this feature.
Yes, I just tried my examples and aeson gets all of this right.
The dependency philosophy of aeson doesn't seem to match that of Ideas very well. Aeson uses Attoparsec and ByteString, i.e. the more "advanced" and performance-focused text tools, and has four dependencies outside the Haskell Platform (dlist, fail, tagged, semigroups; fortunately there are no build problems on Windows). Ideas, on the other hand, uses the "basic" String for almost all operations and implements many things internally (XML parsing, UTF-8 encoding), taking on very few dependencies. I don't know whether this is intentional (or historical?), or whether you would want to switch to relying more on external libraries.