Create version JSONs and upload to S3
Creates one version JSON for each Nextclade TSV and one version JSON for the metadata TSV.

Since the metadata uses the Nextclade TSV columns directly, the metadata version JSON is the SARS-CoV-2 dataset version JSON with `metadata_tsv_sha256sum` added. If we ever want to track data provenance by column, we will update the schema to include the 21L dataset version.

The two Nextclade version JSONs will be used to check whether the workflow should use the existing cache. The metadata version JSON will be used to surface the version info to downstream users of the data.
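The cache check mentioned above could look roughly like the following. This is a sketch, not the workflow's actual logic: the function name `cache_is_usable`, its file arguments, and the choice of which fields to compare are assumptions. Only the field names are taken from the version JSON created in this commit.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical helper: succeed (exit 0) only when the cached version JSON
# records the same Nextclade version and dataset as the current version JSON.
cache_is_usable() {
    local current_json="$1" cached_json="$2"
    local fields='{schema_version, nextclade_version, nextclade_dataset_name, nextclade_dataset_version}'
    local current cached

    # A missing or empty cached version JSON means there is nothing to reuse.
    [[ -s "$cached_json" ]] || return 1

    # Compare only the fields that describe how the TSV was produced.
    # The TSV checksum is left out because it changes as new data arrives.
    current="$(jq -c -S "$fields" "$current_json")" || return 1
    cached="$(jq -c -S "$fields" "$cached_json")" || return 1

    [[ "$current" == "$cached" ]]
}
```

A caller could then branch on it, for example `if cache_is_usable new_version.json cached_version.json; then ...; fi`, where both file names are placeholders.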
1 parent 9a2ca57, commit ccc9fa7
Showing 3 changed files with 77 additions and 0 deletions.
@@ -0,0 +1,30 @@
#!/bin/bash

set -euo pipefail

vendored="$(dirname "$0")"/../vendored


nextclade="${1:?A path to the Nextclade executable is required as the first argument}"
nextclade_dataset="${2:?A path to the Nextclade dataset is required as the second argument}"
nextclade_tsv="${3:?A path to the Nextclade TSV is required as the third argument}"


nextclade_version="$("$nextclade" --version)"
dataset_pathogen_json="$(unzip -p "$nextclade_dataset" pathogen.json)"
dataset_name="$(echo "$dataset_pathogen_json" | jq -r '.attributes.name')"
dataset_version="$(echo "$dataset_pathogen_json" | jq -r '.version.tag')"
nextclade_tsv_sha256sum="$("$vendored/sha256sum" < "$nextclade_tsv")"

jq -c --null-input \
    --arg NEXTCLADE_VERSION "$nextclade_version" \
    --arg DATASET_NAME "$dataset_name" \
    --arg DATASET_VERSION "$dataset_version" \
    --arg NEXTCLADE_TSV_SHA256SUM "$nextclade_tsv_sha256sum" \
    '{
        "schema_version": "v1",
        "nextclade_version": $NEXTCLADE_VERSION,
        "nextclade_dataset_name": $DATASET_NAME,
        "nextclade_dataset_version": $DATASET_VERSION,
        "nextclade_tsv_sha256sum": $NEXTCLADE_TSV_SHA256SUM
    }'
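The commit message describes the metadata version JSON as the SARS-CoV-2 dataset version JSON plus a `metadata_tsv_sha256sum` field. The file that does this is not shown here, so the following is only a minimal sketch of that step. The function name `metadata_version_json` and its arguments are hypothetical, and it assumes the coreutils `sha256sum` command rather than the vendored wrapper used in the script above.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical helper: print a metadata version JSON by adding the metadata
# TSV checksum to an existing Nextclade version JSON.
metadata_version_json() {
    local nextclade_version_json="$1" metadata_tsv="$2"
    local metadata_tsv_sha256sum

    # Assumption: coreutils `sha256sum`, which prints "<hash>  -" for stdin;
    # keep only the hash.
    metadata_tsv_sha256sum="$(sha256sum < "$metadata_tsv" | awk '{print $1}')"

    # Merge the new field into the existing object, leaving other fields as-is.
    jq -c \
        --arg METADATA_TSV_SHA256SUM "$metadata_tsv_sha256sum" \
        '. + {"metadata_tsv_sha256sum": $METADATA_TSV_SHA256SUM}' \
        "$nextclade_version_json"
}
```

Merging with `. +` keeps every field from the Nextclade version JSON, so downstream users would see the Nextclade and dataset versions alongside the metadata checksum.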