Wolfgang Gerlach edited this page Jul 2, 2015 · 5 revisions

These data types are only recommendations. Shock does no validation.

Type "data-library"

Example:

"attributes": {
      "type": "data-library",
      "name": "Solr M5NR",
      "version": "1",
      "description": "Solr M5NR v1 with Solr v4.10.3",
      "member": "1/1",
      "project": "production",
      "provenance": {
        "creation_type": "manual",
        "note": "tar -zcvf solr-m5nr_v1_solr_v4.10.3.tgz -C /mnt/m5nr_1/data/index/ ."
      }
}

Required fields:
type=data-library (application scope/name)
name=<string> (e.g. “M5NR”, or “Bowtie index of human genome”)
version=<string> (version number, date, or similar identifier for the named data library)

Optional fields:
member=<string> (a name for the data library member, could be the same as filename, or chunk number, e.g. “m5nr.1”)

The filename is stored under the Shock metadata field file->name and is not part of this specification.

description=<string> (longer description)
file_format=<string> (e.g. fasta, bt2; a controlled vocabulary would be nice in the long term)
created=<date> (creation date of the file/member, not upload date; is this provenance?)
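
Since Shock does no validation, a client may want to check these fields itself before uploading. The following is a minimal sketch in Python, assuming the attributes are available as a plain dict and that every field value (including created) is a string; the function name check_data_library is hypothetical and not part of Shock.

```python
# Field names taken from the lists above; everything else is an assumption.
REQUIRED_FIELDS = ("type", "name", "version")
OPTIONAL_FIELDS = ("member", "description", "file_format", "created")


def check_data_library(attributes):
    """Return a list of problems found in a data-library attributes dict."""
    problems = []
    # Every required field must be present and a non-empty string.
    for field in REQUIRED_FIELDS:
        value = attributes.get(field)
        if not isinstance(value, str) or not value:
            problems.append(f"missing or non-string required field: {field}")
    # The type field has exactly one allowed value for this data type.
    if isinstance(attributes.get("type"), str) and attributes["type"] not in ("", "data-library"):
        problems.append('type must be "data-library"')
    # Optional fields may be absent, but if present they should be strings.
    for field in OPTIONAL_FIELDS:
        if field in attributes and not isinstance(attributes[field], str):
            problems.append(f"optional field is not a string: {field}")
    return problems
```

An empty list means the attributes follow the recommendation; because these data types are only recommendations, the result is best treated as a warning rather than a reason to reject the upload.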

attributes->provenance->creation_type = clone | workflow | manual

  1. clone
    simple case: the data has simply been copied from another server (e.g. BLAST NR)
    attributes->provenance->url=<string> URI of the original file (download/copy location)

  2. workflow (AWE workflow)
    attributes->provenance->workflow=<url/string> Reference to a workflow document (including its input), if available, when this is a computed product rather than a copy

  3. manual
    attributes->provenance->note=<string> Description of how the file was created, if it was not copied/downloaded
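
The pairing of creation_type with its accompanying field can be sketched as a small lookup. This is only an illustration under the assumption that the provenance block is a plain dict; check_provenance is a hypothetical helper, and the workflow reference is not flagged when absent because it is only expected "if available".

```python
# Field that usually accompanies each creation_type, per the list above.
PROVENANCE_FIELD = {"clone": "url", "workflow": "workflow", "manual": "note"}


def check_provenance(provenance):
    """Return a list of hints; an empty list means nothing looked off."""
    creation_type = provenance.get("creation_type")
    if not isinstance(creation_type, str) or creation_type not in PROVENANCE_FIELD:
        return [f"unknown creation_type: {creation_type!r}"]
    field = PROVENANCE_FIELD[creation_type]
    # The workflow reference is optional, so only clone and manual are checked.
    if creation_type != "workflow" and not provenance.get(field):
        return [f"{creation_type} provenance usually carries {field!r}"]
    return []
```

For the example at the top of this page, the provenance block has creation_type "manual" and a note, so the helper returns an empty list.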

Comment

Every “data-library” consists of a finite number of Shock nodes. The library name and version number together uniquely identify a specific library. (Unsolved: how do we prevent people from uploading data under the same name? Protected namespaces?)
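
To illustrate how name and version together identify a library, the sketch below groups nodes by that pair. It assumes the nodes have already been fetched and are plain dicts with an "attributes" key, as in the example above; group_libraries is a hypothetical helper, and it does nothing to solve the namespace question.

```python
from collections import defaultdict


def group_libraries(nodes):
    """Map (name, version) to the list of member names found among the nodes."""
    libraries = defaultdict(list)
    for node in nodes:
        # Nodes without attributes, or of another type, are skipped.
        attrs = node.get("attributes") or {}
        if attrs.get("type") != "data-library":
            continue
        key = (attrs.get("name"), attrs.get("version"))
        # member is optional, so None is recorded when it is absent.
        libraries[key].append(attrs.get("member"))
    return dict(libraries)
```

Two uploads by different people that reuse the same name and version would end up under the same key here, which is exactly the collision the unsolved question is about.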

Type "dockerimage"

TODO

Type <other>

TODO
