Thanks to issue #86 I realized that there are modern versions of libxml2. I always wondered why the bundled version of libxml was so old (2.9.4 was released in 2016). It turns out I was checking the wrong website: what seems to be the official website lists a really old version as the latest, while on GitLab the latest version was released just months ago!
So there are newer versions, and starting with 2.9.5 the default behavior changed, breaking all our tests. Luckily the change is "cosmetic" (whitespace differences compared to 2.9.4), so the tests are still "valid", but PHPUnit will complain anyway. If you know a way to compare HTML documents while ignoring whitespace, let me know.
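One possible direction, as a rough sketch and not something this release ships: normalize both HTML strings before comparing them. The helper name `normalizeHtmlWhitespace` is hypothetical. It works on plain strings with regular expressions, so it does not depend on how any particular libxml version serializes the document. It is deliberately lossy: it also drops spaces around inline tags and inside `<pre>` blocks, and it assumes attribute values never contain `>`.

```php
<?php
// Hypothetical helper, not part of this library: a lossy normalization
// intended only for comparing two HTML strings in tests.
function normalizeHtmlWhitespace(string $html): string
{
    // Collapse every run of whitespace (spaces, tabs, newlines) into one space.
    $html = preg_replace('/\s+/', ' ', $html);

    // Remove whitespace directly before and after every tag.
    // Assumption: attribute values never contain a literal ">".
    $html = preg_replace('/\s*(<[^>]+>)\s*/', '$1', $html);

    return trim($html);
}

// Two documents that differ only in indentation and line breaks.
$compact  = '<div><p>Hello world</p></div>';
$indented = "<div>\n  <p>\n    Hello   world\n  </p>\n</div>\n";

var_dump(normalizeHtmlWhitespace($compact) === normalizeHtmlWhitespace($indented));
```

In a PHPUnit test this could be used as `$this->assertSame(normalizeHtmlWhitespace($expected), normalizeHtmlWhitespace($actual));`. Because the normalization throws away whitespace that can be significant, it would hide real regressions in spacing, so treat it as a workaround and not a proper HTML comparison.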
Anyway, the following changes are included in this release:
- Avoid overwriting extracted metadata with similarly named keys (like `og:image` and `og:image:width`)
- Imported new `getSiteName()` feature from JS version as of 21 Dec 2018
- Added getFirstElementChild function to NodeTrait + test case (Issue #83)
- Reworked the test suite to use TestPage objects and give more hints about what failed
- Removed getWordThreshold and setWordThreshold configuration functions
- Added NodeUtility::filterTextNodes and deprecated NodeTrait getChildren()
- Added new DOMNodeList fake class that mimics the original DOMNodeList class but allows adding new nodes to the list
- Added new Dockerfiles that pull different versions of PHP and libxml. We now support 4 versions of PHP and 6 versions of libxml!
I reworked the 4 Dockerfiles we had before and created a dedicated repo for PHP with custom libxml versions. Here it is: https://github.com/andreskrey/php-libxml-docker-images
Each PR will be tested against 4 versions of PHP and 6 versions of libxml, which means that Travis will run 24 virtual machines every time there are changes in the repo. Let's see how long we can abuse their free resources.
And that's it. Let me know if something is broken for you. Tell your mom you love her. Don't forget to call your father.