Welcome to Merlin, a simple tool to aid content migration from an arbitrary source to a structured format ready for consumption by another system.
Detailed documentation can be found at https://salsadigitalauorg.github.io/merlin-framework/.
- PHP > 7.2
- Composer (optional)
The Merlin framework is intended to be used as a standalone executable. It can be kept local to your project or installed globally and added to your path. To download it, visit the release page and get the latest bundled .phar executable.
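# Assumes the GitHub REST API "latest release" endpoint and a .phar release asset.
curl -s https://api.github.com/repos/salsadigitalauorg/merlin-framework/releases/latest \
| grep "browser_download_url.*phar" \
| cut -d : -f 2,3 \
| tr -d \" \
| wget -qi -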
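Once a .phar has been downloaded, putting it on your path might look like the sketch below. The function name install_merlin, the file name merlin.phar and the default target directory are assumptions made for illustration; they are not part of the framework.

```shell
# Sketch only: install_merlin, the merlin.phar file name and the default
# target directory are illustrative assumptions, not part of Merlin.
install_merlin() {
  phar="$1"                          # path to the downloaded .phar
  bin_dir="${2:-$HOME/.local/bin}"   # directory expected to be on your PATH

  # Refuse to continue if the download is missing.
  [ -f "$phar" ] || { echo "not found: $phar" >&2; return 1; }

  # Copy the archive under the name "merlin" and mark it executable.
  mkdir -p "$bin_dir" || return 1
  cp "$phar" "$bin_dir/merlin" || return 1
  chmod +x "$bin_dir/merlin"
}

# Usage (hypothetical paths):
#   install_merlin ./merlin.phar "$HOME/.local/bin"
```

If the target directory is not already on your path, it would need to be added before the merlin command can be found.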
Merlin can also be installed as a Composer dependency; this changes how the application is executed for your project.
Add the repository
"repositories": [
    {
        "type": "vcs",
        "url": "https://github.com/salsadigitalauorg/merlin-framework"
    }
]
Add the dependency
composer require salsadigitalauorg/merlin-framework
There are two primary commands: crawl and generate.
- crawl will crawl a domain and find URLs on it for migration. Read the crawler docs and check the example for more information.
- generate will generate structured output based on the mapping configuration. Read the migration docs and check the example for more information.
To run the framework, specify a command (e.g. crawl or generate), a YAML configuration file, and a path to the output, for example:
merlin crawl -c <path/to/crawler-config.yml> -o <path/to/output>
merlin generate -c <path/to/migrate-config.yml> -o <path/to/output>
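The two commands above could be chained in a small wrapper, sketched here under stated assumptions. The function name run_migration and the MERLIN_BIN override are hypothetical; only the crawl and generate commands and the -c and -o options come from the usage shown above. Sharing one output directory between both commands, and creating it up front, are choices made here for brevity rather than requirements of the tool.

```shell
# Sketch only: run_migration and MERLIN_BIN are illustrative assumptions.
# The crawl/generate commands and the -c/-o options are taken from the README.
run_migration() {
  if [ "$#" -ne 3 ]; then
    echo "usage: run_migration <crawler-config> <migrate-config> <output>" >&2
    return 2
  fi

  crawl_config="$1"     # e.g. path/to/crawler-config.yml
  migrate_config="$2"   # e.g. path/to/migrate-config.yml
  output_dir="$3"       # e.g. path/to/output
  merlin_bin="${MERLIN_BIN:-merlin}"   # override to point at a local .phar

  mkdir -p "$output_dir" || return 1

  # Run the two commands in sequence; stop if the crawl fails.
  "$merlin_bin" crawl -c "$crawl_config" -o "$output_dir" || return 1
  "$merlin_bin" generate -c "$migrate_config" -o "$output_dir"
}

# Usage (hypothetical paths):
#   run_migration crawler-config.yml migrate-config.yml ./output
```

Reading the executable name from a variable keeps the wrapper usable whether Merlin is on your path or kept local to the project.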
The configuration file should be treated as a schema file: it contains the paths, domains and mapping information needed to transform an HTML representation of content into structured JSON.
Example configuration files can be found in the examples.
The automated test suite checks standard configuration values against representative HTML structures to make sure that the tool builds the correct JSON structure.
Running the tests
./vendor/bin/phpunit
We encourage you to file issues in the GitHub issue queue.