diff --git a/workflows/raster/README.md b/workflows/raster/README.md
index e48ef9063..6eb8522ea 100644
--- a/workflows/raster/README.md
+++ b/workflows/raster/README.md
@@ -15,36 +15,37 @@ Publishing to the AWS Registry of Open Data is an optional step [publish-odr](#P
 
 ## Workflow Input Parameters
 
-| Parameter | Type | Default | Description |
-| ---------------------- | ----- | ------------------------------------- | -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| ticket | str | | Ticket ID e.g. 'AIP-55' |
-| region | enum | | Region of the dataset |
-| source | str | s3://linz-imagery-staging/test/sample | the uri (path) to the input tiffs |
-| include | regex | .tiff?$ | A regular expression to match object path(s) or name(s) from within the source path to include in standardising\*. |
-| scale | enum | 500 | The scale of the TIFFs |
-| validate | enum | true | Validate the TIFFs files with `tileindex-validate`. |
-| retile | enum | false | Prepare the data for retiling TIFFs files to `scale` with `tileindex-validate`. |
-| group | int | 50 | The number of files to group into the pods (testing has recommended using 50 for large datasets). |
-| compression | enum | webp | Standardised file format |
-| cutline | str | | (Optional) location of a cutline file to cut the imagery to `.fgb` or `.geojson` (leave blank if no cutline) |
-| collection_id | str | | (Optional) Provide a Collection ID if re-processing an existing published survery, otherwise a ULID will be generated for the collection.json ID field. |
-| category | enum | urban-aerial-photos | Dataset type for collection metadata, also used to Build Dataset title & description |
-| gsd | str | 0.3m | Dataset GSD for collection metadata, also used to build dataset title |
-| producer | enum | Unknown | Imagery producer :warning: Ignored if `producer_list` is used. |
-| producer_list | str | | List of imagery producers, separated by semicolon (;). :warning: Has no effect unless a semicolon delimited list is entered. |
-| licensor | enum | Unknown | Imagery licensor. :warning: Ignored if `licensor_list` is used. |
-| licensor_list | str | | List of imagery licensors, separated by semicolon (;). :warning: Has no effect unless a semicolon delimited list is entered. |
-| start_datetime | str | YYYY-MM-DD | Imagery start date (flown from), must be in default formatting |
-| end_datetime | str | YYYY-MM-DD | Imagery end date (flown to), must be in default formatting |
-| geographic_description | str | Hamilton | (Optional) Additional datatset description, to be used in dataset title / description in place of the Region. |
-| lifeycle | enum | Completed | Lifecycle Status of Collection, from [linz STAC extension](https://github.com/linz/stac/tree/master/extensions/linz#collection-fields). Options: `completed`, `preview`, `ongoing`, `under development`, `deprecated` |
-| event | str | Cyclone Gabrielle | (Optional) Event name if dataset has been captured in association with an event. |
-| historic_survey_number | str | SNC8844 | (Optional) Survey Number associated with historical datasets. |
-| source_epsg | str | 2193 | The EPSG code of the source imagery |
-| target_epsg | str | 2193 | The target EPSG code - if different to source the imagery will be reprojected |
-| publish_to_odr | str | false | Run [publish-odr](#Publish-odr) after standardising has completed successfully |
-| target_bucket_name | enum | | Used only if `publish_to_odr` is true. The bucket name of the target ODR location |
-| copy_option | enum | --no-clobber | Used only if `publish_to_odr` is true. <br /> `--no-clobber` <br /> Skip overwriting existing files. <br /> `--force` <br /> Overwrite all files. <br /> `--force-no-clobber` <br /> Overwrite only changed files, skip unchanged files. |
+| Parameter | Type | Default | Description |
+| ---------------------- | ----- | ------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| ticket | str | | Ticket ID e.g. 'AIP-55' |
+| region | enum | | Region of the dataset |
+| source | str | s3://linz-imagery-staging/test/sample | the uri (path) to the input tiffs |
+| include | regex | .tiff?$ | A regular expression to match object path(s) or name(s) from within the source path to include in standardising\*. |
+| scale | enum | 500 | The scale of the TIFFs |
+| validate | enum | true | Validate the TIFFs files with `tileindex-validate`. |
+| retile | enum | false | Prepare the data for retiling TIFFs files to `scale` with `tileindex-validate`. |
+| group | int | 50 | The number of files to group into the pods (testing has recommended using 50 for large datasets). |
+| compression | enum | webp | Standardised file format |
+| create_capture_area | enum | true | Create a GeoJSON capture area for the dataset |
+| cutline | str | | (Optional) location of a cutline file to cut the imagery to `.fgb` or `.geojson` (leave blank if no cutline) |
+| collection_id | str | | (Optional) Provide a Collection ID if re-processing an existing published survery, otherwise a ULID will be generated for the collection.json ID field. |
+| category | enum | urban-aerial-photos | Dataset type for collection metadata, also used to Build Dataset title & description |
+| gsd | str | 0.3m | Dataset GSD for collection metadata, also used to build dataset title |
+| producer | enum | Unknown | Imagery producer :warning: Ignored if `producer_list` is used. |
+| producer_list | str | | List of imagery producers, separated by semicolon (;). :warning: Has no effect unless a semicolon delimited list is entered. |
+| licensor | enum | Unknown | Imagery licensor. :warning: Ignored if `licensor_list` is used. |
+| licensor_list | str | | List of imagery licensors, separated by semicolon (;). :warning: Has no effect unless a semicolon delimited list is entered. |
+| start_datetime | str | YYYY-MM-DD | Imagery start date (flown from), must be in default formatting |
+| end_datetime | str | YYYY-MM-DD | Imagery end date (flown to), must be in default formatting |
+| geographic_description | str | Hamilton | (Optional) Additional datatset description, to be used in dataset title / description in place of the Region. |
+| lifeycle | enum | Completed | Lifecycle Status of Collection, from [linz STAC extension](https://github.com/linz/stac/tree/master/extensions/linz#collection-fields). Options: `completed`, `preview`, `ongoing`, `under development`, `deprecated` |
+| event | str | Cyclone Gabrielle | (Optional) Event name if dataset has been captured in association with an event. |
+| historic_survey_number | str | SNC8844 | (Optional) Survey Number associated with historical datasets. |
+| source_epsg | str | 2193 | The EPSG code of the source imagery |
+| target_epsg | str | 2193 | The target EPSG code - if different to source the imagery will be reprojected |
+| publish_to_odr | str | false | Run [publish-odr](#Publish-odr) after standardising has completed successfully |
+| target_bucket_name | enum | | Used only if `publish_to_odr` is true. The bucket name of the target ODR location |
+| copy_option | enum | --no-clobber | Used only if `publish_to_odr` is true. <br /> `--no-clobber` <br /> Skip overwriting existing files. <br /> `--force` <br /> Overwrite all files. <br /> `--force-no-clobber` <br /> Overwrite only changed files, skip unchanged files. |
 
 \* This regex can be used to exclude paths as well, e.g. if there are RBG and RGBI directories, the following regex will only include TIFF files in the RGB directory: `RGB(?!I).*.tiff?$`. For more complicated exclusions, there is an `--exclude` parameter, which would need to be added to the Argo WorkflowTemplate.
 
@@ -61,6 +62,7 @@ Publishing to the AWS Registry of Open Data is an optional step [publish-odr](#P
 | retile | false |
 | group | 50 |
 | compression | webp |
+| create_capture_area | true |
 | cutline | s3://linz-imagery-staging/cutline/bay-of-plenty_2021-2022.fgb |
 | collection_id | 01FP371BHWDSREECKQAH9E8XQ |
 | category | rural-aerial-photos |
diff --git a/workflows/raster/standardising.yaml b/workflows/raster/standardising.yaml
index 1a9612ee5..41d7d91cf 100644
--- a/workflows/raster/standardising.yaml
+++ b/workflows/raster/standardising.yaml
@@ -94,6 +94,11 @@ spec:
           - 'webp'
           - 'lzw'
           - 'dem_lerc'
+      - name: create_capture_area
+        value: 'true'
+        enum:
+          - 'false'
+          - 'true'
       - name: cutline
         description: '(Optional) location of a cutline file to cut the imagery to .fgb or .geojson'
         value: ''
@@ -482,6 +487,8 @@ spec:
           - '{{=sprig.trim(workflow.parameters.end_datetime)}}'
           - '--collection-id'
           - '{{inputs.parameters.collection-id}}'
+          - '--create-footprints'
+          - '{{workflow.parameters.create_capture_area}}'
           - '--cutline'
           - '{{=sprig.trim(workflow.parameters.cutline)}}'
           - '--source-epsg'
@@ -502,6 +509,7 @@ spec:
         artifacts:
           - name: capture-area
             path: '/tmp/capture-area.geojson'
+            optional: true
             archive:
               none: {}
       container:
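
Note on the `include` footnote in the README hunk above: the negative lookahead in `RGB(?!I).*.tiff?$` is what excludes the RGBI directory. The sketch below illustrates that behaviour only. It assumes the pattern is applied as an unanchored search against the full object path, with Python's `re.search` standing in for whatever regex engine the workflow tooling actually uses; the `select` helper and the sample paths are hypothetical.

```python
import re

# Pattern quoted verbatim from the README footnote.
INCLUDE = r"RGB(?!I).*.tiff?$"


def select(paths, pattern=INCLUDE):
    """Return the paths that the pattern matches anywhere in the string.

    Assumption: the pattern is applied as an unanchored search; this helper
    is illustrative and not part of the workflow code.
    """
    rx = re.compile(pattern)
    return [p for p in paths if rx.search(p)]


# Hypothetical object paths, for illustration only.
paths = [
    "s3://example-bucket/survey/RGB/tile_1.tiff",
    "s3://example-bucket/survey/RGB/tile_2.tif",
    "s3://example-bucket/survey/RGBI/tile_1.tiff",
    "s3://example-bucket/survey/RGB/readme.txt",
]

for path in select(paths):
    print(path)
```

Only the two paths under `RGB/` are selected: `(?!I)` rejects any `RGB` that is immediately followed by `I`, and the `.txt` file fails the `tiff?$` suffix. The dot before `tiff?` is unescaped in the footnote's pattern, so it matches any character rather than a literal `.`.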