Release 1.0.0 (#157)
* Improved relative link handling

* New Group type, docs, and tests that didn't make it in before the namespace change

* Introduces allowed_classes filtering. Fixes encoding issues

* UnwrapLinks processor

* More comprehensive unwrap links but still WIP

* Add option for referer to fetcher

* Unescape slashes on json output

* Whitespace leave one space

* Latest group type

* namespace exception

* Support for uuidv3 on group item content and json output to be consumed as paragraphs in Drupal world

* Allow generic any name of output

* Optionally use Guzzle redirect info for speed

* Use Guzzle redirect

* Check a url exists in cache and report path

* Use Guzzle redirect info

* composer

* Group crawl by query string

* Track redirects on crawl in Guzzle

* Add mandatory support for field in group

* Build effective after redirect url lists

* Option to use effective url in fetcher if redirect

* Group crawled urls by regex

* PHP warnings

* More unicode fixing

* More options and features

* More unicode fixes

* Pass in whole object to callback

* Fix redirect check

* More unicode support

* More unicode support

* More unicode support

* Add method to return console io

* General group_uuid instead of paragraph

* Support for extra media attributes

* Use results from fetcher, remove JSON UTF8 error check

* New sub_fetch processor to fetch and process a URL.  Nested Merls.

* Proper check for config and rename entity based on config

* Track what page media was on

* Support for a prebuilt alias map

* comment

* Generator for mappings

* spelling

* Array config holder for sub_fetch processor

* composer

* Moved uuid generation to standard MerlinUuid method.

* Unicode menu links

* process_file for xpath Type/Media

* Comment typo

* Better error reporting for SubFetch.  WIP Still needs a bit more tidying up in the case the fetched thing wasn't TEXT/HTML.

* Use v4 ip resolve for Curl options, a lot faster

* Use v4 ip resolve for Curl options, a lot faster

* Resolve robots.txt ignore.

* Fixed ordered type to emit the field name as well.

* Updated error message to be more descriptive.

* Allowed to have dot in cache dir name.

* Fixing ordered.

* Allowed to group URLs by the value of a meta tag.

* Added a URL options flag to control content duplicates for redirects.

* Print url cache path from CLI exists lookup.

* Remove alpha UnwrapLinks type.

* MD rendering.

* Linting.

* Remove old unused functions.

* Fix existing tests.

* Linting.

* Remove old getMapping().

* Comment typo.

* Return original reset comment.

* phpcs

* Add cURL IP resolve method as option.

* Default address IP resolve to any/whatever.

* Update Fetcher Docs.

* Make some feature of group optional.

* Update Group type tests and docs.

* Docs update.

* Pass same config object to Output as used in GenerateCommand

* Rename _redirected_from. Add curl ip resolve func.

* Use ip resolve func.

* Getter for multicurl object

* Separate build duplicates function

* Options for SubFetch.

* phpcs, typos

* Save sub fetch status error similar to normal fetch.

* Subfetch tests.

* Composer update.

* Typo and missing JSON files for subfetch test.

* sub_fetch processor docs.

* Add is_external flag to redirect info.

* Only add internal or non redirect links to queue when loading from cache.

* Only add redirect to effective url list if internal.

* Update browsershot for dependencies vulnerability.

* Minor package update.
Moved from drupal-entity to drupal-media tags.

* Updated packages.

* MediaNullAttributeTest update.

* Update tests.

* Use puppeteer orb.

* Remove orb in favour of hardcoding.

* Apt-update.

* Update to non-stretch Debian.

* Add the google signing key

Co-authored-by: Andrew Rowlands <[email protected]>
Co-authored-by: Stuart Rowlands <[email protected]>
Co-authored-by: Stuart Rowlands <[email protected]>
Co-authored-by: Suchi Garg <[email protected]>
Co-authored-by: Sonny Kieu <[email protected]>
Co-authored-by: Stuart Rowlands <[email protected]>
7 people authored Oct 28, 2022
1 parent a130915 commit 16dcc65
Showing 71 changed files with 4,810 additions and 1,290 deletions.
22 changes: 18 additions & 4 deletions .circleci/config.yml
@@ -2,22 +2,36 @@
#
# Check https://circleci.com/docs/2.0/language-php/ for more details
#
version: 2.0
version: 2.1

jobs:
build:
docker:
# Specify the version you desire here
- image: circleci/php:7.3-stretch-node-browsers

- image: circleci/php:7.3-node-browsers
steps:
- checkout
- run:
name: Update apt repositories
command: |
wget -q -O - https://dl.google.com/linux/linux_signing_key.pub | sudo apt-key add
sudo apt update
- run:
name: Install Headless Chrome dependencies
command: |
sudo apt-get install -yq \
gconf-service libasound2 libatk1.0-0 libatk-bridge2.0-0 libc6 libcairo2 libcups2 libdbus-1-3 \
libexpat1 libfontconfig1 libgcc1 libgconf-2-4 libgdk-pixbuf2.0-0 libglib2.0-0 libgtk-3-0 libnspr4 \
libpango-1.0-0 libpangocairo-1.0-0 libstdc++6 libx11-6 libx11-xcb1 libxcb1 libxcomposite1 libxcursor1 \
libxdamage1 libxext6 libxfixes3 libxi6 libxrandr2 libxrender1 libxss1 libxtst6 ca-certificates \
fonts-liberation libappindicator1 libnss3 lsb-release xdg-utils wget
- run:
name: Install puppeteer with chromium
command: |
npm i puppeteer
- run: sudo apt update
- run: sudo docker-php-ext-install zip
- run: sudo docker-php-ext-install exif && sudo docker-php-ext-enable exif
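
The hunk above has lost its indentation and its `+`/`-` markers, so old and new lines appear interleaved. As a reading aid, the post-change fragment might look roughly like the sketch below. The nesting is assumed from standard CircleCI config layout, and it is also an assumption that the standalone `sudo apt update` step survives the change. The superseded `version: 2.0` and `7.3-stretch-node-browsers` lines are left out.

```yaml
# Sketch of the post-change .circleci/config.yml fragment.
# Indentation is assumed; the flattened diff does not preserve it.
version: 2.1

jobs:
  build:
    docker:
      # Specify the version you desire here
      - image: circleci/php:7.3-node-browsers
    steps:
      - checkout
      - run:
          name: Update apt repositories
          command: |
            wget -q -O - https://dl.google.com/linux/linux_signing_key.pub | sudo apt-key add
            sudo apt update
      - run:
          name: Install Headless Chrome dependencies
          command: |
            sudo apt-get install -yq \
            gconf-service libasound2 libatk1.0-0 libatk-bridge2.0-0 libc6 libcairo2 libcups2 libdbus-1-3 \
            libexpat1 libfontconfig1 libgcc1 libgconf-2-4 libgdk-pixbuf2.0-0 libglib2.0-0 libgtk-3-0 libnspr4 \
            libpango-1.0-0 libpangocairo-1.0-0 libstdc++6 libx11-6 libx11-xcb1 libxcb1 libxcomposite1 libxcursor1 \
            libxdamage1 libxext6 libxfixes3 libxi6 libxrandr2 libxrender1 libxss1 libxtst6 ca-certificates \
            fonts-liberation libappindicator1 libnss3 lsb-release xdg-utils wget
      - run:
          name: Install puppeteer with chromium
          command: |
            npm i puppeteer
      # Assumed to be retained from the previous config.
      - run: sudo apt update
      - run: sudo docker-php-ext-install zip
      - run: sudo docker-php-ext-install exif && sudo docker-php-ext-enable exif
      # The hunk is truncated here; later steps are not shown in the source.
```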

1 change: 1 addition & 0 deletions .gitignore
@@ -7,3 +7,4 @@ website
# Commit your application's lock file https://getcomposer.org/doc/01-basic-usage.md#commit-your-composer-lock-file-to-version-control
# You may choose to ignore a library lock file http://getcomposer.org/doc/02-libraries.md#lock-file
# composer.lock
node_modules
27 changes: 14 additions & 13 deletions composer.json
@@ -31,22 +31,23 @@
"migration"
],
"require": {
"symfony/yaml": "~4.2.0",
"mustangostang/spyc": "^0.6.2",
"chuyskywalker/rolling-curl": "^3.1",
"symfony/dom-crawler": "~4.2.0",
"symfony/css-selector": "~4.2.0",
"consolidation/comments": "^1.0",
"dompdf/dompdf": "^0.8.3",
"league/uri": "^6.0",
"masterminds/html5": "^2.5",
"symfony/console": "~4.2.0",
"mustangostang/spyc": "^0.6.2",
"myclabs/deep-copy": "^1.9",
"php-curl-class/php-curl-class": "^8.6",
"ramsey/uuid": "^3.8",
"spatie/crawler": "^4.4",
"consolidation/comments": "^1.0",
"samchristy/piechart": "^2.0",
"spatie/browsershot": "^3.32",
"php-curl-class/php-curl-class": "^8.6",
"myclabs/deep-copy": "^1.9",
"twig/twig": "^2.0",
"dompdf/dompdf": "^0.8.3",
"samchristy/piechart": "^2.0"
"spatie/crawler": "^4.4",
"symfony/console": "~4.4.0",
"symfony/css-selector": "~4.4.0",
"symfony/dom-crawler": "~4.4.0",
"symfony/yaml": "~4.4.0",
"twig/twig": "^2.0"
},
"require-dev": {
"phpunit/phpunit": "^7.5",
@@ -60,7 +61,7 @@
},
"config": {
"platform": {
"php": "7.2.0"
"php": "7.2.5"
},
"optimize-autoloader": true,
"sort-packages": true