Automatically converts HTTP response bodies to UTF-8.
The source charset is detected from the charset in the response header and from the HTML meta charset.
(Several charsets are handled with extra care: SJIS-win, TIS-620 and others.)
This library is intended for use in web scraping.
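The detection order described above can be sketched roughly as follows. This is a minimal illustration, not the library's actual implementation: the function names (`normalizeCharset`, `detectCharset`, `convertToUtf8`) and the alias table are hypothetical, and only mbstring is used. Charsets that mbstring does not recognise would need a different path, such as iconv, which is not shown here.

```php
<?php
// Illustrative sketch only: these function names are hypothetical and
// are not part of the Diggin\Http\Charset API.

// Map a declared charset label to the mbstring encoding that tends to
// match real-world pages (assumed mapping, for illustration only).
function normalizeCharset($charset)
{
    $aliases = array(
        'shift_jis' => 'SJIS-win',
        'x-sjis'    => 'SJIS-win',
        'euc-jp'    => 'eucJP-win',
    );
    $key = strtolower(trim($charset));
    return isset($aliases[$key]) ? $aliases[$key] : trim($charset);
}

// Look for charset=... in the Content-Type header first, then in the
// HTML meta elements; return null when neither declares one.
function detectCharset($contentType, $body)
{
    if (preg_match('/charset\s*=\s*["\']?([\w.:-]+)/i', $contentType, $m)) {
        return normalizeCharset($m[1]);
    }
    // Covers both <meta charset="..."> and the http-equiv form.
    if (preg_match('/<meta[^>]+charset\s*=\s*["\']?([\w.:-]+)/i', $body, $m)) {
        return normalizeCharset($m[1]);
    }
    return null;
}

// Convert the body to UTF-8; leave it untouched when no charset is
// declared or when it is already declared as UTF-8.
function convertToUtf8($contentType, $body)
{
    $charset = detectCharset($contentType, $body);
    if ($charset === null || strcasecmp($charset, 'UTF-8') === 0) {
        return $body;
    }
    return mb_convert_encoding($body, 'UTF-8', $charset);
}

// Usage: the header declares no charset, so the meta element decides.
// The body text is the word 日本 encoded as Shift_JIS bytes.
$html = "<meta charset=\"Shift_JIS\"><p>\x93\xFA\x96\x7B</p>";
echo convertToUtf8('text/html', $html), "\n";
```

The header is checked before the meta element here because that is the order in which the two sources are listed above; a real implementation may weigh them differently or verify the declared charset against the actual bytes.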
- PHP 5.3 or later
- mbstring and iconv extensions
- Wrap the response object:
<?php
use Diggin\Http\Charset\WrapperFactory;
$url = 'http://example.com/'; // placeholder: the page to fetch
$client = new Zend\Http\Client($url);
$response = $client->send();
// Wrap the response; getBody() then returns the body converted to UTF-8.
$response = WrapperFactory::factory($response);
See demos/Diggin/Http/Charset for more examples.
guzzle-plugin-AutoCharsetEncodingPlugin adds support for use with Guzzle3.
Usage with Behat, by @MugeSo
Diggin_Http_Charset is based on HTMLScraping.
Diggin_Http_Charset is licensed under the LGPL (GNU Lesser General Public License).
- Perl: HTTP::Response::Encoding
- Python: Universal Encoding Detector
- Handling of content types other than text/html.
- Better APIs that follow the ZF2 coding standard.
- Coping with more charsets :-\