Handling Non-UTF-8 Encodings

Paul Crovella edited this page Sep 19, 2016 · 5 revisions

RFC 7159 specifies that JSON be encoded in UTF-8, UTF-16, or UTF-32 (without a byte order mark). This parser supports only UTF-8, and there are no plans to change that.

If you need to parse JSON in another encoding, you can use an iconv stream filter to transcode it to UTF-8 on the fly.

Either append the filter to an existing stream with stream_filter_append() and then pass that stream to stream():

use pcrov\JsonReader\JsonReader;

// Open the source file and transcode it to UTF-8 as it is read.
$stream = fopen("UTF-32.json", "rb");
stream_filter_append($stream, "convert.iconv.UTF-32/UTF-8");

$reader = new JsonReader();
$reader->stream($stream);

while ($reader->read()) { /* do stuff */ }

// The reader does not own the stream, so close both.
$reader->close();
fclose($stream);

Or specify the filter as part of the URI passed to open():

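$reader = new JsonReader();

// URL-encode the filter name so its "/" is not read as a path separator.
$reader->open(
    "php://filter/read=" .
    urlencode("convert.iconv.UTF-32/UTF-8") .
    "/resource=UTF-32.json"
);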
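
Both approaches can be checked without the reader. The following is a minimal sketch, assuming PHP 7.0 or later with the iconv extension enabled. It uses UTF-16LE rather than UTF-32 so that the byte order is explicit, and the temporary file and variable names are illustrative only.

```php
<?php
// Build a small UTF-16LE document to stand in for a real non-UTF-8 input file.
$utf8 = json_encode(["name" => "caf\u{e9}"], JSON_UNESCAPED_UNICODE);
$path = tempnam(sys_get_temp_dir(), "json");
file_put_contents($path, iconv("UTF-8", "UTF-16LE", $utf8));

// Approach 1: append the filter to an already open stream (read side only).
$stream = fopen($path, "rb");
stream_filter_append($stream, "convert.iconv.UTF-16LE/UTF-8", STREAM_FILTER_READ);
$viaAppend = stream_get_contents($stream);
fclose($stream);

// Approach 2: name the filter in a php://filter URI.
$viaUri = file_get_contents(
    "php://filter/read=" .
    urlencode("convert.iconv.UTF-16LE/UTF-8") .
    "/resource=" . $path
);

unlink($path);

// Each comparison checks that the transcoded bytes match the original UTF-8 text.
var_dump($viaAppend === $utf8, $viaUri === $utf8);
```

If either comparison is false, the usual suspects are a mismatched source encoding name or a missing iconv extension.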

Keep in mind that any output from the reader will be in UTF-8.
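
If the rest of an application expects the original encoding, values can be converted back after reading. This is a minimal sketch, assuming PHP 7.0 or later with the iconv extension. The literal string is a stand-in for a value obtained from the reader, and UTF-16LE is only an example target.

```php
<?php
// Stand-in for a string value obtained from the reader; it is always UTF-8.
$value = "caf\u{e9}";

// Transcode to the encoding the rest of the application expects (UTF-16LE here).
$utf16 = iconv("UTF-8", "UTF-16LE", $value);

// Byte lengths differ between encodings even though the text is the same.
printf("UTF-8: %d bytes, UTF-16LE: %d bytes\n", strlen($value), strlen($utf16));

// Converting back recovers the original UTF-8 bytes.
$roundTrip = iconv("UTF-16LE", "UTF-8", $utf16);
var_dump($roundTrip === $value);
```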