Releases: salsadigitalauorg/merlin-framework
Merlin 1.1.0
What's Changed
- Fixed ordered type to emit the field name as well. by @gargsuchi in #139
- Fixed typo in documentation by @gargsuchi in #151
- Bump twig/twig from 2.12.2 to 2.15.3 by @dependabot in #154
- Bump dompdf/dompdf from 0.8.3 to 2.0.1 by @dependabot in #153
New Contributors
- @gargsuchi made their first contribution in #139
- @dependabot made their first contribution in #154
Full Changelog: 1.0.0...1.1.0
Merlin 1.0.0
What's Changed
- Improved relative link handling
- New Group type, docs, and tests that didn't make it in before the namespace change
- Introduces allowed_classes filtering. Fixes encoding issues
- UnwrapLinks processor
- Add option for referer to fetcher
- Unescape slashes on json output
- Support for uuidv3 on group item content and json output to be consumed as paragraphs in Drupal world
- Allow any generic name for output
- Optionally use Guzzle redirect info for speed
- Group crawl by query string
- Track redirects on crawl in Guzzle
- General group_uuid instead of paragraph
- Support for extra media attributes
- New sub_fetch processor to fetch and process a URL. Nested Merls.
- Proper check for config and rename entity based on config
- Use IPv4 resolution in cURL options, which is much faster
- Resolve robots.txt ignore.
- Fixed ordered type to emit the field name as well.
Co-authored-by: Andrew Rowlands
Co-authored-by: Stuart Rowlands
Co-authored-by: Suchi Garg
Co-authored-by: Sonny Kieu
Merlin 0.4.3
What's Changed
- Add a new group plugin for crawler. by @steveworley in #137
Full Changelog: 0.4.2...0.4.3
Merlin 0.4.2
Merlin 0.4.1
0.4.0
Release notes:
- Added caching support to spider #85
- Allow cache_dir override
- Added support for multiple selectors #40
- Multiple selectors error reporting #110
- Vastly improve crawler performance at scale #115
- Added support for optional starting routes for spider #83
- Added additional crawler options (timeout, connect_timeout, verify, cookies)
- Added support for separate menu selector #35
- Better relative URL resolution
- Exclude external media assets #112
- Added support for spider inclusion by regex
- [DRUPAL] Added support for linkit link conversion
- Ensures uniqueness in logging output files #89
- Adds runtime --limit flag to limit total number of results
- Split exported media results by type
- Added PR template, CI badge, various documentation improvements
- Various bugfixes #81 #87 #104 #105
0.4.1 bugfix release:
- Re-added missing binary
- Fixed broken links in docs
Contributors
- Stuart Rowlands
- Andy Rowlands
- Steve Worley
- Nick Georgiou
- Alex Skrypnyk
Merlin 0.3.0
- Feature: Adds crawler to generate URL list and group by criteria (DOM or path regex)
- Feature: Local caching layer on runs
- Feature: JavaScript support
- Feature: Sub-field processing
- Feature: Mandatory fields
- Feature: Support loading URL list from separate file
- Feature: Crawled URLs can be merged directly into config files
- Feature: Default value support
- Feature: Support for query parameters and fragments in URLs
- Bugfix: Validate file permissions on output files
- Bugfix: Blank attributes caused media processor to fail
- Bugfix: Ensure error arrays are merged appropriately for logging
- Bugfix: Ensure same output for both dom + xpath selectors (long_text)
- Bugfix: Resolve issue with raw selectors
- Bugfix: Media configuration for data_embed_button and data_entity_embed_display resolved
Contributors
- Stuart Rowlands
- Andy Rowlands
- Steve Worley
- Nick Georgiou
Merlin 0.2.0
Welcome to the Merlin framework! A configurable and repeatable way to build structured representations of content to assist with migrating content between content management systems.
This initial release provides the base framework to build configuration files and run the program.
Contributors
- Steve Worley
- Stuart Rowlands
- Andy Rowlands
Merlin 0.1.0
Welcome to the Merlin framework! A configurable and repeatable way to build structured representations of content to assist with migrating content between content management systems.
This initial release provides the base framework to build configuration files and run the program.
Contributors
- Steve Worley
- Stuart Rowlands
- Andy Rowlands