A proof of concept based on GA4GH's DRS specifications.
This Flask application offers an interface for querying files by URIs of
the form drs://some-domain/some-ID.
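As an illustration of the URI form above, here is a minimal sketch of how a client might turn a drs:// URI into an HTTP URL for this service. The helper name `drs_uri_to_url` and the default `https` scheme are assumptions made for the example; the `/ga4gh/drs/v1/objects/` path is the one listed among the endpoints of this README.

```python
from urllib.parse import urlsplit


def drs_uri_to_url(drs_uri: str, scheme: str = "https") -> str:
    """Map drs://<host>/<id> to <scheme>://<host>/ga4gh/drs/v1/objects/<id>.

    Hypothetical helper, not part of this project. The scheme is a
    parameter because a deployment may be served over http or https.
    """
    parts = urlsplit(drs_uri)
    if parts.scheme != "drs":
        raise ValueError(f"not a DRS URI: {drs_uri!r}")

    object_id = parts.path.lstrip("/")
    if not parts.netloc or not object_id:
        raise ValueError(f"DRS URI needs a host and an object ID: {drs_uri!r}")

    return f"{scheme}://{parts.netloc}/ga4gh/drs/v1/objects/{object_id}"


if __name__ == "__main__":
    print(drs_uri_to_url("drs://some-domain/some-ID"))
```

Keeping the scheme as an argument leaves the http vs https decision to whoever knows the deployment setup.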
Two storage backends are currently supported: the local filesystem or a MinIO instance (S3-compatible object storage).
- Ingestion is done either through the command line or through the endpoint of the same name (which creates a single object).
- Consider how to handle http vs https depending on the deployment setup (Singularity, Docker, or as is).
At the root of this project there is a sample dotenv file (.env-sample). Its values can be exported as environment variables or used as is. Simply copy the sample file and provide the missing values:
cp .env-sample .env
Development dependencies are described in requirements.txt and can be installed using the following command:
poetry install
Afterward, we need to set up the DB:
poetry run flask db upgrade
Most likely you will want to load some objects to serve through this service. This can be done with this command (ingestion is recursive for directories):
poetry run flask ingest $A_FILE_OR_A_DIRECTORY
The Flask development server can be run with the following command:
FLASK_DEBUG=True poetry run flask run
To generate migrations while in the shell of a development Docker container, run the following command:
poetry run flask db migrate -m "describe what has changed here"
Migrations will be automatically applied in the Docker environment (dev/prod) on container restart, but if you want to run them manually, use the following command:
poetry run flask db upgrade
To run all tests and calculate coverage, run the following command:
poetry run tox
Tox is configured to run both pytest and flake8. You may want to uncomment the second line of tox.ini (envlist = ...) to run these commands for multiple versions of Python.
In production, the service should be deployed using a WSGI server such as uWSGI.
With uWSGI, point to chord_drs.app:application. The wsgi.py file at the root of the project is there to simplify running commands (such as "ingest").
/objects/<string:object_id>
/ga4gh/drs/v1/objects/<string:object_id>
Returns a standard GA4GH record for the object.
/search
exact match /search?name=P-1001.hc.g.vcf.gz
partial match /search?fuzzy_name=1001
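The two query forms above can be sketched as a small URL builder. This is only an illustration: the helper name `search_url` and the base URL `http://localhost:5000` (Flask's default development address) are assumptions, and the sketch limits itself to one parameter per request for simplicity, which is not a rule stated by the service.

```python
from typing import Optional
from urllib.parse import urlencode


def search_url(base_url: str, *, name: Optional[str] = None,
               fuzzy_name: Optional[str] = None) -> str:
    """Build a /search URL for an exact (name) or partial (fuzzy_name) match.

    Hypothetical helper, not part of this project.
    """
    if (name is None) == (fuzzy_name is None):
        raise ValueError("provide exactly one of name or fuzzy_name")

    params = {"name": name} if name is not None else {"fuzzy_name": fuzzy_name}
    # urlencode escapes any characters that are not safe in a query string.
    return f"{base_url.rstrip('/')}/search?{urlencode(params)}"


if __name__ == "__main__":
    base = "http://localhost:5000"  # assumed development address
    print(search_url(base, name="P-1001.hc.g.vcf.gz"))
    print(search_url(base, fuzzy_name="1001"))
```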
/objects/<string:object_id>/download
/ingest
e.g. POST body (multipart form-encoded)
path=examples/P-1001.hc.g.vcf.gz
This will automatically deduplicate with existing DRS objects if the file matches.
To ingest and force-create a duplicate record, provide the deduplicate parameter, set to false:
path=examples/P-1001.hc.g.vcf.gz&deduplicate=false
If path is left out and a file is provided instead, the file will be uploaded rather than copied from the specified local filesystem path.
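The path-based form bodies shown above can be assembled with a small helper. This is a sketch under assumptions: the function name `ingest_form_fields` is hypothetical, and serializing the flag as the string "false" simply mirrors the example body in this README.

```python
def ingest_form_fields(path: str, deduplicate: bool = True) -> dict:
    """Return the form fields for a path-based POST to /ingest.

    Hypothetical helper, not part of this project. The deduplicate field
    is only included when a duplicate record should be force-created,
    since deduplication is described above as the automatic behaviour.
    """
    fields = {"path": path}
    if not deduplicate:
        fields["deduplicate"] = "false"
    return fields


if __name__ == "__main__":
    print(ingest_form_fields("examples/P-1001.hc.g.vcf.gz"))
    print(ingest_form_fields("examples/P-1001.hc.g.vcf.gz", deduplicate=False))
```

With the third-party `requests` library, one way to send such fields as multipart form data is `requests.post(url, files={k: (None, v) for k, v in fields.items()})`, where `url` is the /ingest address of your deployment.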
/service-info
Returns a GA4GH+Bento-formatted service info response.