This repository has been archived by the owner on Aug 14, 2021. It is now read-only.

Merge pull request #71 from andreskrey/development
v2.0.0
andreskrey authored Nov 25, 2018
2 parents 992a112 + fcbb76a commit e1b31f9
Showing 167 changed files with 7,060 additions and 2,523 deletions.
7 changes: 4 additions & 3 deletions .travis.yml
@@ -1,19 +1,20 @@
language: php

before_install:
- apt-cache policy libxml2

install:
- composer install

php:
- "5.6"
- "7.0"
- "7.1"
- "7.2"
- "7.3"

script:
- ./vendor/bin/phpunit --coverage-clover ./test/clover.xml

after_script:
- composer require php-coveralls/php-coveralls:^2.0
- php ./vendor/php-coveralls/php-coveralls/bin/php-coveralls -v

sudo: false
20 changes: 20 additions & 0 deletions CHANGELOG.md
@@ -3,6 +3,26 @@ All notable changes to this project will be documented in this file.

## Unreleased

## [v2.0.0](https://github.com/andreskrey/readability.php/releases/tag/v2.0.0)

- [BREAKING CHANGE] Bumped the minimum supported version of PHP to 7.0
- Clean `<aside>` tags during `prepArticle()`.
- Merged PR #58: Fix notice non-object on $parentOfTopCandidate for tumblr.com
- Fixed issue #63: Division by zero
- Housekeeping:
  - Removed $parseSuccessful flag that wasn't needed anymore
  - Rename wordThreshold to charThreshold and throw deprecation notices. WordThreshold will be removed in version 3.0.
- Added "-ad-" as unlikely candidate
- Added Docker containers with PHP 7.0, 7.1, and 7.2 and makefile to trigger the tests.
- Imported new code from the JS version as of [19 Nov 2018](https://github.com/mozilla/readability/commit/876c81f710711ba2afb36dd83889d4c5b4fc2743), which includes the following changes:
  - Move phrasing contents [into paragraphs](https://github.com/mozilla/readability/commit/9f2c5cb42ee9635f091178271d66888cbb47e5dc)
  - Improved the title detection
  - Remove [single cell tables](https://github.com/mozilla/readability/commit/ea4165721f9105d8f1e53cfecdcfdafceaf3e4bf)
  - Improved the detection of video related elements
  - New test cases
  - Various minor fixes


## [v1.2.0](https://github.com/andreskrey/readability.php/releases/tag/v1.2.0)

- Merged PR#49 (Missing object when calling `->getContent()`)
15 changes: 15 additions & 0 deletions Makefile
@@ -0,0 +1,15 @@
.PHONY: test-all test-7.2 test-7.1 test-7.0

test-all: start test-7.2 test-7.1 test-7.0

test-7.2:
	docker-compose exec php-7.2 php /app/vendor/phpunit/phpunit/phpunit --configuration /app/phpunit.xml

test-7.1:
	docker-compose exec php-7.1 php /app/vendor/phpunit/phpunit/phpunit --configuration /app/phpunit.xml

test-7.0:
	docker-compose exec php-7.0 php /app/vendor/phpunit/phpunit/phpunit --configuration /app/phpunit.xml

start:
	docker-compose up -d
18 changes: 16 additions & 2 deletions README.md
@@ -11,7 +11,7 @@ The project aim is to be a 1 to 1 port of Mozilla's version and to follow closel

## Requirements

-PHP 5.6+, ext-dom, ext-xml, and ext-mbstring. To install all these dependencies (in the rare case your system does not have them already), you could try something like this in *nix-like environments:
+PHP 7.0+, ext-dom, ext-xml, and ext-mbstring. To install all these dependencies (in the rare case your system does not have them already), you could try something like this in *nix-like environments:

`$ sudo apt-get install php7.1-xml php7.1-mbstring`

@@ -26,6 +26,7 @@ Then, create a Readability class and pass a Configuration class, feed the `parse
```php
use andreskrey\Readability\Readability;
use andreskrey\Readability\Configuration;
use andreskrey\Readability\ParseException;

$readability = new Readability(new Configuration());

@@ -158,14 +159,27 @@ Readability.php uses the [PSR Log](https://github.com/php-fig/log) interface to

- Keep up with Readability.js changes
- Add a small template engine for the __toString() method, instead of using a hardcoded one.
- Replace all the `iterator_to_array` calls with a custom PHP generator that keeps track of the removed or altered nodes.

## How it works

Readability parses the whole document with DOMDocument, scans the text nodes, and gives them a score based on the number of words, the number of links, and the type of element. It then selects the highest-scoring element and creates a new DOMDocument with all its siblings. Each sibling is scored so that useless elements, such as nav bars and empty nodes, are discarded.
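
As a rough illustration of the scoring idea described above, the sketch below scores the parent of each paragraph and penalises link-heavy containers. It is not the library's actual algorithm: the function names (`linkDensity`, `topCandidate`), the 25-character cutoff, and the weights are assumptions chosen for this example. It only needs ext-dom and ext-mbstring.

```php
<?php
// Illustrative sketch only: the helper names, the 25-character cutoff and the
// weights are assumptions for this example, not Readability.php's real rules.

/** Share of a node's text that sits inside <a> elements (0.0 to 1.0). */
function linkDensity(DOMElement $node): float
{
    $textLength = mb_strlen(trim($node->textContent));
    if ($textLength === 0) {
        return 0.0;
    }

    $linkLength = 0;
    foreach ($node->getElementsByTagName('a') as $link) {
        $linkLength += mb_strlen(trim($link->textContent));
    }

    return min(1.0, $linkLength / $textLength);
}

/**
 * Scores the parent of every paragraph and returns the best one.
 *
 * @return DOMElement|null
 */
function topCandidate(DOMDocument $dom)
{
    $candidates = [];

    foreach ($dom->getElementsByTagName('p') as $paragraph) {
        $parent = $paragraph->parentNode;
        $text = trim($paragraph->textContent);

        // Skip very short paragraphs and paragraphs without an element parent.
        if (!$parent instanceof DOMElement || mb_strlen($text) < 25) {
            continue;
        }

        // One base point, one point per comma, up to three points for length.
        $score = 1 + substr_count($text, ',') + min(3, intdiv(mb_strlen($text), 100));

        $key = $parent->getNodePath();
        if (!isset($candidates[$key])) {
            $candidates[$key] = ['node' => $parent, 'score' => 0.0];
        }
        $candidates[$key]['score'] += $score;
    }

    $best = null;
    $bestScore = 0.0;
    foreach ($candidates as $candidate) {
        // Link-heavy containers (nav bars, footers) are scaled down.
        $final = $candidate['score'] * (1 - linkDensity($candidate['node']));
        if ($final > $bestScore) {
            $best = $candidate['node'];
            $bestScore = $final;
        }
    }

    return $best;
}

$html = '<html><body>'
    . '<div id="nav"><p><a href="/">Home, About, Contact, Blog and more links here</a></p></div>'
    . '<div id="article"><p>' . str_repeat('This is real article text, with commas. ', 5) . '</p></div>'
    . '</body></html>';

libxml_use_internal_errors(true); // tolerate imperfect real-world markup
$dom = new DOMDocument();
$dom->loadHTML($html);

$best = topCandidate($dom);
echo $best === null ? 'no candidate' : $best->getAttribute('id'), PHP_EOL;
```

In this toy input the `nav` container consists entirely of link text, so its score is scaled to zero and the `article` container is selected.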

## Testing

Any version of PHP installed locally should be enough to develop new features and add new test cases. If you want to be 100% sure that your change doesn't create any issues with other versions of PHP, you can use the provided Docker containers to test against PHP 7.0, 7.1, and 7.2.

You'll need Docker and Docker Compose for this. To run all the tests in all the available versions just type the following command:

```bash
make test-all
```

This will start all the containers and run all the tests on every supported version of PHP. If you want to test against a specific version, you can use `make test-7.0`, `make test-7.1`, or `make test-7.2`.

## Code porting

-Up to date with readability.js as of [2 Mar 2018](https://github.com/mozilla/readability/commit/8525c6af36d3badbe27c4672a6f2dd99ddb4097f).
+Up to date with readability.js as of [19 Nov 2018](https://github.com/mozilla/readability/commit/876c81f710711ba2afb36dd83889d4c5b4fc2743).

## License

6 changes: 3 additions & 3 deletions composer.json
@@ -18,15 +18,15 @@
         }
     },
     "require": {
-        "php": ">=5.6.0",
+        "php": ">=7.0.0",
         "ext-dom": "*",
         "ext-xml": "*",
         "ext-mbstring": "*",
         "psr/log": "^1.0"
     },
     "require-dev": {
-        "phpunit/phpunit": "^5.7",
-        "monolog/monolog": "^1.23"
+        "phpunit/phpunit": "^6.5",
+        "monolog/monolog": "^1.24"
     },
     "suggest": {
         "monolog/monolog": "Allow logging debug information"
20 changes: 20 additions & 0 deletions docker-compose.yml
@@ -0,0 +1,20 @@
version: '3'

services:
  php-7.0:
    build: ./docker/php-7.0
    volumes:
      - ./:/app
    tty: true

  php-7.1:
    build: ./docker/php-7.1
    volumes:
      - ./:/app
    tty: true

  php-7.2:
    build: ./docker/php-7.2
    volumes:
      - ./:/app
    tty: true
2 changes: 2 additions & 0 deletions docker/php-7.0/Dockerfile
@@ -0,0 +1,2 @@
FROM php:7.0-cli
RUN pecl install xdebug && docker-php-ext-enable xdebug
2 changes: 2 additions & 0 deletions docker/php-7.1/Dockerfile
@@ -0,0 +1,2 @@
FROM php:7.1-cli
RUN pecl install xdebug && docker-php-ext-enable xdebug
2 changes: 2 additions & 0 deletions docker/php-7.2/Dockerfile
@@ -0,0 +1,2 @@
FROM php:7.2-cli
RUN pecl install xdebug && docker-php-ext-enable xdebug
40 changes: 33 additions & 7 deletions src/Configuration.php
@@ -21,7 +21,7 @@ class Configuration
     /**
      * @var int
      */
-    protected $wordThreshold = 500;
+    protected $charThreshold = 500;

     /**
      * @var bool
@@ -109,9 +109,9 @@ public function getLogger()
         // If no logger has been set, just return a null logger
         if ($this->logger === null) {
             return new NullLogger();
-        } else {
-            return $this->logger;
         }
+
+        return $this->logger;
     }

     /**
@@ -149,19 +149,45 @@ public function setMaxTopCandidates($maxTopCandidates)
     /**
      * @return int
      */
+    public function getCharThreshold()
+    {
+        return $this->charThreshold;
+    }
+
+    /**
+     * @param int $charThreshold
+     *
+     * @return $this
+     */
+    public function setCharThreshold($charThreshold)
+    {
+        $this->charThreshold = $charThreshold;
+
+        return $this;
+    }
+
+    /**
+     * @deprecated Use getCharThreshold. Will be removed in version 3.0
+     *
+     * @return int
+     */
     public function getWordThreshold()
     {
-        return $this->wordThreshold;
+        @trigger_error('getWordThreshold was replaced with getCharThreshold and will be removed in version 3.0', E_USER_DEPRECATED);
+
+        return $this->charThreshold;
     }

     /**
-     * @param int $wordThreshold
+     * @param int $charThreshold
      *
      * @return $this
      */
-    public function setWordThreshold($wordThreshold)
+    public function setWordThreshold($charThreshold)
     {
-        $this->wordThreshold = $wordThreshold;
+        @trigger_error('setWordThreshold was replaced with setCharThreshold and will be removed in version 3.0', E_USER_DEPRECATED);
+
+        $this->charThreshold = $charThreshold;

         return $this;
     }
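
The deprecated accessors in this hunk call `@trigger_error(..., E_USER_DEPRECATED)`, which stays silent by default but is still delivered to a handler registered with `set_error_handler`. The sketch below shows one way a caller might collect such notices while migrating to `setCharThreshold`. It is self-contained: `ThresholdConfig` is a hypothetical stand-in that mimics the pattern, not the library's `Configuration` class, and the `$deprecations` array is an assumption of this example.

```php
<?php
// Hypothetical stand-in that mimics the deprecation pattern from the diff;
// it is NOT the library's Configuration class.
final class ThresholdConfig
{
    /** @var int */
    private $charThreshold = 500;

    public function getCharThreshold(): int
    {
        return $this->charThreshold;
    }

    public function setCharThreshold(int $charThreshold): self
    {
        $this->charThreshold = $charThreshold;

        return $this;
    }

    /** @deprecated Use setCharThreshold. */
    public function setWordThreshold(int $charThreshold): self
    {
        // The @ keeps the notice out of normal output; handlers still see it.
        @trigger_error('setWordThreshold was replaced with setCharThreshold and will be removed in version 3.0', E_USER_DEPRECATED);

        return $this->setCharThreshold($charThreshold);
    }
}

// Collect deprecation notices instead of letting them pass unnoticed.
$deprecations = [];
set_error_handler(function (int $type, string $message) use (&$deprecations): bool {
    $deprecations[] = $message;

    return true; // handled, skip PHP's internal error handler
}, E_USER_DEPRECATED);

$config = (new ThresholdConfig())->setWordThreshold(250); // old, deprecated call
restore_error_handler();

echo $config->getCharThreshold(), PHP_EOL;
echo count($deprecations), ' deprecation notice(s) collected', PHP_EOL;
```

Once the collected list is empty for a code base, the old `*WordThreshold` calls have all been replaced and the handler can be dropped.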
1 change: 1 addition & 0 deletions src/Nodes/DOM/DOMNode.php
@@ -6,6 +6,7 @@

/**
* @method getAttribute($attribute)
* @method hasAttribute($attribute)
*/
class DOMNode extends \DOMNode
{