Run CWL workflows using Xenon through a REST API.
The following diagram shows a rough overview of the interaction when using xenonflow. The overview shows 3 file systems, all of which can be configured for Xenonflow (See the quick-start guide below).
- The input filesystem: this should contain all the input files needed for running the cwl workflow.
- The cwl filesystem: this should contain the cwl workflows you want to run with Xenonflow.
- The output filesystem: this is where xenonflow will put the output of the workflows.
On the right you can see a compute resource: Xenonflow can be configured to run on a number of computing backends, including the local machine, to actually execute the cwl workflow.
Before making a call to the Xenonflow REST API, make sure the data is available on the input filesystem and the workflow is available on the cwl filesystem. The REST call will return a JSON object which contains some information on the job you just submitted, such as its current state, what input was provided, a uri to the job (for instance to poll the server for new states) and a uri to the log of the job.
After the workflow is completed the results will be available in the output (target) filesystem.
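The polling step mentioned above can be sketched as follows. This is a minimal illustration, not part of Xenonflow: the helper names fetch_job and poll_until are hypothetical, the api-key header name is taken from the curl example later in this document, and the "state" field and its values are assumptions that should be checked against the JSON your server actually returns.

```python
import json
import time
import urllib.request


def fetch_job(job_uri, api_key, header_name="api-key"):
    """GET the job description from its uri and decode the JSON body."""
    request = urllib.request.Request(job_uri, headers={header_name: api_key})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def poll_until(fetch, is_finished, interval=5.0, max_polls=120):
    """Call fetch() repeatedly until is_finished(job) is true or polls run out."""
    for _ in range(max_polls):
        job = fetch()
        if is_finished(job):
            return job
        time.sleep(interval)
    raise TimeoutError("job did not finish within the polling budget")
```

A caller would pass something like `lambda: fetch_job(job_uri, api_key)` as fetch, together with a predicate that recognises both the success and the failure states reported by the server, so that a failed job does not keep the loop running until the timeout.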
- Java 11
- a cwl runner
For the cwl runner you can use the reference implementation called cwltool. It can be installed with:
pip install cwltool
You may need to use pip3 on some systems. For a full list of available cwl runners check https://www.commonwl.org/#Implementations
After installing the cwl runner it is a good idea to double check that your workflow can be run using the runner on the command line.
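One way to script that check is sketched below, assuming cwltool is on your PATH. The helper names are hypothetical, and the workflow file and input names are whatever your own workflow defines. The sketch relies on cwltool accepting a `--validate` flag and inputs given as `--name value` pairs after the workflow path.

```python
import subprocess


def cwltool_command(workflow, inputs=None, validate_only=False):
    """Build the argument list for a cwltool invocation."""
    command = ["cwltool"]
    if validate_only:
        # Only check that the document is valid CWL; do not execute it.
        command.append("--validate")
    command.append(workflow)
    if not validate_only:
        for name, value in (inputs or {}).items():
            command.extend([f"--{name}", str(value)])
    return command


def run_cwltool(workflow, inputs=None, validate_only=False):
    """Run cwltool locally and return the completed process for inspection."""
    return subprocess.run(
        cwltool_command(workflow, inputs, validate_only),
        capture_output=True,
        text=True,
        check=False,
    )
```

A non-zero `returncode` on the returned object, together with its `stderr`, indicates that the workflow would also fail when submitted through Xenonflow with the same runner.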
wget https://github.com/xenon-middleware/xenon-flow/releases/V1.0
unzip xenonflow-1.0.zip
Configuration of the server is done by editing the XENONFLOW_HOME/config/config.yml file as well as the XENONFLOW_HOME/config/application.properties file.
By default it is set to use the local file system as the source and the local computer to run workflows.
For information on which filesystems and schedulers can be used refer to the xenon documentation: https://xenon-middleware.github.io/xenon/versions/3.1.0/javadoc/.
Xenon-flow configuration consists of:
- sourceFileSystem: Any filesystem supported by Xenon can be used here
- targetFileSystem: Any filesystem supported by Xenon can be used here
- cwlFileSystem: Any filesystem supported by Xenon can be used here
- ComputeResources: A map of compute resource descriptions for Xenon
  - Each resource has the following settings:
    - cwlCommand: A script to run the cwl runner, allowing for python environments to be started first.
      - Default:
            #!/usr/bin/env bash
            cwltool $@
    - scheduler: A Xenon scheduler
    - filesystem: A Xenon filesystem
    - Both the scheduler and filesystem have the following format:
      - adaptor: The name of the Xenon adaptor (for instance slurm for scheduler or sftp for filesystem)
      - location: The URI for the resource
      - credential: Optional credentials (if not supplied the current user and ~/.ssh/id_rsa is used)
        - user: Username
        - password: Password, base64 encoded
      - properties: Optional properties (usually not needed)
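Since the password setting expects a base64 encoded value, a small helper such as the one below can produce it. This is a sketch: the function name is hypothetical, and it assumes the password is encoded as UTF-8 before base64 encoding, which the configuration description does not state explicitly.

```python
import base64


def encode_password(password):
    """Return the base64 encoding of a password for use in config.yml."""
    # Assumption: the plain-text password is UTF-8 encoded first.
    return base64.b64encode(password.encode("utf-8")).decode("ascii")


if __name__ == "__main__":
    print(encode_password("secret"))  # prints c2VjcmV0
```

Note that base64 is an encoding, not encryption, so the config.yml file should still be protected with appropriate file permissions.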
There are two environment variables that can be set in your environment and then used in the config.yml file: XENONFLOW_FILES and XENONFLOW_HOME.
The application.properties file needs configuration for the following things:
- api-key
  - xenonflow.http.auth-token-header-name: controls the name of the header that holds the api key
  - xenonflow.http.auth-token: the value of the api key. IMPORTANT: you should really change this one
- The Database Configuration
  - These settings should be changed!
    - spring.datasource.username: The database username
    - spring.datasource.password: The database password
- The following settings can be left as is:
  - server.port: The port for the server to run on.
  - local.server.address=localhost: The servername.
  - server.http.interface: Set up the server to be publicly available by setting this to 0.0.0.0
The following command will run the server.
./bin/xenonflow
Put the workflow and any input files and directories into the location defined by the sourceFileSystem in the config. For instance when using a webdav server, upload the files there.
Send a POST HTTP request with a job description to the server.
Given the echo command-line-tool (in yaml notation):
cwlVersion: v1.0
class: CommandLineTool
inputs:
  - id: inp
    type: string
    inputBinding: {}
outputs:
  - id: out
    type: string
    outputBinding:
      glob: out.txt
      loadContents: true
      outputEval: $(self[0].contents)
baseCommand: echo
stdout: out.txt
The job description looks something like the following.
Note that the input map contains a key inp which refers to the corresponding input of the echo command-line-tool.
{
    "name": "My First Workflow",
    "workflow": "cwl/echo.cwl",
    "input": {
        "inp": "Hello CWL Server!"
    }
}
curl -X POST -H "Content-Type: application/json" -H "api-key: <insert api key here>" -d '{"name": "My First Workflow","workflow": "cwl/echo.cwl","input": {"inp": "Hello CWL Server!"}}' http://localhost:8080/jobs
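The same request can be built from a script. The sketch below mirrors the curl call using only the Python standard library. The function names are hypothetical, and the header name defaults to the api-key header used in the curl example (it should match xenonflow.http.auth-token-header-name in your application.properties).

```python
import json
import urllib.request


def build_job_request(base_url, api_key, name, workflow, job_input,
                      header_name="api-key"):
    """Build (but do not send) the POST request that submits a job."""
    body = json.dumps(
        {"name": name, "workflow": workflow, "input": job_input}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/jobs",
        data=body,
        headers={"Content-Type": "application/json", header_name: api_key},
        method="POST",
    )


def submit_job(request):
    """Send the request and decode the JSON job description in the reply."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Separating request construction from sending makes it possible to inspect or log the request before it reaches the server.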
If you need access to the jobid generated by xenonflow, or the jobname that was used to submit the workflow, you can add them as inputs to your cwl file as parameters with the ids xenonflow_jobid and xenonflow_jobname respectively.
Xenonflow will then automatically inject the values into the job-order.json as input to the cwl file.
For example, the following cwl file will echo the xenonflow_jobid and xenonflow_jobname:
cwlVersion: v1.0
class: CommandLineTool
inputs:
  - id: xenonflow_jobid
    type: string
    inputBinding:
      position: 1
  - id: xenonflow_jobname
    type: string
    inputBinding:
      position: 2
outputs:
  - id: out
    type: string
    outputBinding:
      glob: out.txt
      loadContents: true
      outputEval: $(self[0].contents)
baseCommand: echo
stdout: out.txt
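To make the effect of this injection concrete, the sketch below shows what the resulting job order could look like. It is an illustration of the behaviour described above, not Xenonflow's actual implementation, and the function name and arguments are hypothetical.

```python
def inject_job_metadata(job_input, declared_inputs, job_id, job_name):
    """Return a copy of the job input with the job id and name added.

    Values are only added for parameters the cwl file actually declares.
    """
    job_order = dict(job_input)
    if "xenonflow_jobid" in declared_inputs:
        job_order["xenonflow_jobid"] = job_id
    if "xenonflow_jobname" in declared_inputs:
        job_order["xenonflow_jobname"] = job_name
    return job_order
```

For the cwl file above, which declares both parameters, a submission with an empty input map would end up with a job order holding just those two values.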
We recommend running xenonflow behind a proxy server. Both nginx and apache httpd are good candidates for this. In addition, both nginx and apache httpd can act as webdav servers which xenonflow can use as a sourceFileSystem.
Doing this requires no changes to the configuration of xenonflow as long as the correct X-Forwarded-* headers are set in the proxy server.
To ensure that xenonflow returns the correct URIs for the jobs you should set the following headers:
- X-Forwarded-Host
- X-Forwarded-Server
- X-Forwarded-Proto
- X-Forwarded-Port
- X-Forwarded-Prefix
Below is an example location from an nginx config that correctly proxies a xenonflow instance running at localhost:8080:
...
location /api/ {
    include cors;
    proxy_pass http://localhost:8080/;
    proxy_redirect off;
    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-Host $host;
    proxy_set_header X-Forwarded-Server $host;
    proxy_set_header X-Forwarded-Proto http;
    proxy_set_header X-Forwarded-Port $server_port;
    proxy_set_header X-Forwarded-Prefix /api/;
}
...
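To see why these headers matter, the sketch below shows roughly how a server behind a proxy can rebuild the externally visible uri of a job from them. This is a simplified illustration under stated assumptions, not the logic Xenonflow or its web framework actually uses. The function name is hypothetical, and it assumes X-Forwarded-Host carries no port of its own.

```python
def external_uri(headers, path):
    """Rebuild the public uri for a path from X-Forwarded-* headers."""
    proto = headers.get("X-Forwarded-Proto", "http")
    host = headers.get("X-Forwarded-Host", "localhost")
    port = headers.get("X-Forwarded-Port")
    prefix = headers.get("X-Forwarded-Prefix", "").rstrip("/")
    # Leave out the port when it is the default one for the protocol.
    default_port = {"http": "80", "https": "443"}.get(proto)
    authority = host if port in (None, default_port) else f"{host}:{port}"
    return f"{proto}://{authority}{prefix}/{path.lstrip('/')}"
```

With the nginx location above, a job served internally at /jobs/42 would be reported under the /api/ prefix on the proxy's host. Without the headers the server can only report its internal localhost:8080 address.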
To run xenonflow in ssl (https) mode, follow these steps:
- Please read https://dzone.com/articles/spring-boot-secured-by-lets-encrypt for setup using Letsencrypt.
- You should now have a certificate with a private key store.
- Set the following properties in the application.properties file:
  - server.ssl.enabled=true: Enable ssl encryption in the server.
  - server.ssl.key-store-type: The keystore type (spring boot only supports PKCS12).
  - server.ssl.key-store: The store for the certificate files.
  - server.ssl.key-store-password: The password to the key store.
  - server.ssl.key-alias: The alias as given to the keystore.
Warning: This will delete the input data in the source directory. It is recommended to set the input filesystem to a different location than the cwl and output filesystems so files are not lost by accident.
You can have xenonflow clean up the input files after a task has run by setting the clearOnJobDone parameter to true in the sourceFileSystem, for example:
sourceFileSystem:
  adaptor: file
  location: ${XENONFLOW_FILES}/input
  clearOnJobDone: true