NOAA semi-annual terrain_aggregator DB details #81

Open
1 of 4 tasks
dhardestylewis opened this issue Aug 17, 2022 · 2 comments

dhardestylewis commented Aug 17, 2022

TNRIS high-resolution terrain database details

terrain_aggregator provides a back-to-front approach to aggregating and serving source Lidar DEM tiles from a high-performance computing environment.

Context

Processing terrain data at scale requires relying on

  • scalable raster & vector image processing libraries such as @OSGeo's GDAL,
  • scalable terrain analysis tools such as @dtarb's TauDEM, and
  • scalable flood mapping toolsets such as @passaH2O's GeoFlood tools.

DB preparation

tl;dr: 100% of TNRIS Lidar DEM tiles with GDAL-incompatible metadata were successfully "corrected" so that they could be included in scalable pre-processing with GDAL. With the corrections, 100% of these tiles ran successfully against common GDAL routines.

To prepare source terrain imagery tiles for use at scale, terrain_aggregator gathers all desired terrain tiles into a central PostgreSQL database and records basic but necessary metadata from each tile. Current TNRIS best practices require DEM tile metadata to be FGDC-compliant but do not require the metadata to be produced in a way that supports essential DEM processing libraries such as GDAL. At least 10% of TNRIS's ~350,000 DEM tiles cannot be used with GDAL by default, usually for one of the following reasons:

  • GDAL-incompatible # 1 because GDAL cannot detect included projection information
    👉 impacts ~10% or ~35,000 tiles
  • GDAL-incompatible # 2 because the original tiles had incorrectly stated projection information
    👉 impacts ~1% or ~4,000 tiles
  • GDAL-incompatible # 3 because the original tile is corrupted
    👉 impacts ~0.001% or exactly 3 tiles
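
As a rough illustration of the catalog described above, the sketch below records one row per tile and counts tiles per incompatibility category. It is only a sketch under stated assumptions: the table name, column names and issue labels are hypothetical, and `sqlite3` stands in for the PostgreSQL database so that the example runs without a server.

```python
import sqlite3

# Hypothetical schema. sqlite3 is a stand-in for PostgreSQL here.
SCHEMA = """
CREATE TABLE tile (
    path           TEXT PRIMARY KEY,  -- location of the source DEM tile
    detected_srs   TEXT,              -- projection as detected, NULL if none
    corrected_epsg INTEGER,           -- assigned EPSG code, NULL if not needed
    issue          TEXT               -- NULL, 'srs_undetected', 'srs_wrong', 'corrupt'
);
"""


def open_catalog():
    """Create an in-memory catalog with the hypothetical tile table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def record_tile(conn, path, detected_srs=None, corrected_epsg=None, issue=None):
    """Insert one tile row; issue=None means the tile works with GDAL as-is."""
    conn.execute(
        "INSERT INTO tile (path, detected_srs, corrected_epsg, issue) "
        "VALUES (?, ?, ?, ?)",
        (path, detected_srs, corrected_epsg, issue),
    )


def issue_counts(conn):
    """Return {category: tile count}, with 'ok' for tiles without an issue."""
    rows = conn.execute(
        "SELECT COALESCE(issue, 'ok') AS status, COUNT(*) "
        "FROM tile GROUP BY status"
    )
    return dict(rows.fetchall())
```

A query like `issue_counts` is one way the per-category tile counts listed above could be reproduced from the catalog.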

GDAL-incompatible # 1 usually occurs in newer TNRIS Lidar DEM tilesets, because these provide more detailed projection information and record the provenance of the projection using a BOUNDCRS WKT2 key. Common GDAL operations do not yet support the BOUNDCRS WKT2 key, so these tiles cannot be processed at scale using GDAL except by explicitly naming the correct projection code. terrain_aggregator stores the "corrected" projection code for each of these tiles as an attribute in a PostgreSQL database so that bulk processing can include them.

  • This is a candidate for automation when looking towards
    • including future TNRIS Lidar DEM tilesets
    • replicating this work
    • expanding to other states' Lidar DEM tilesets
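
One way the stored projection code could be applied during bulk processing is sketched below. It builds a `gdalwarp` argument list and adds an explicit `-s_srs` override whenever the tile's WKT is missing or starts with a `BOUNDCRS` node. The function names and the idea of passing the WKT string in directly are assumptions for illustration, not terrain_aggregator's actual interface.

```python
def needs_srs_override(wkt):
    """True when the WKT is missing or is wrapped in a WKT2 BOUNDCRS node."""
    if wkt is None or not wkt.strip():
        return True
    return wkt.lstrip().upper().startswith("BOUNDCRS")


def warp_command(src, dst, wkt, corrected_epsg, target_epsg):
    """Build a gdalwarp argument list, naming the source SRS only if needed."""
    cmd = ["gdalwarp"]
    if needs_srs_override(wkt):
        if corrected_epsg is None:
            # No stored correction yet: leave the tile for manual review.
            raise ValueError(f"no corrected projection code stored for {src}")
        cmd += ["-s_srs", f"EPSG:{corrected_epsg}"]
    cmd += ["-t_srs", f"EPSG:{target_epsg}", src, dst]
    return cmd
```

The returned list can be handed to `subprocess.run(cmd, check=True)` on a machine with GDAL installed; the sketch itself does not invoke GDAL.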

A handful of tiles are impacted by GDAL-incompatible # 1 because no projection information has been included whatsoever. Currently, these tiles or tilesets containing these tiles require manual intervention in order to determine and assign the correct projection code.

  • Since the number of likely projections is small, a guess-and-check projection finding routine could be implemented to simplify this.

GDAL-incompatible # 2 usually occurs in some older tiles and tilesets. In the vast majority of these cases, the tiles are labelled with a UTM zone adjacent to the one they actually fall in. Currently these tiles require manual intervention to correct their projections.

  • However, this could also be automated for the vast majority of the tiles, by applying a guess-and-check projection finding routine.
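
A guess-and-check routine of the kind suggested above might look like the following sketch. It tries each candidate projection code, converts the tile's center to longitude/latitude, and accepts a candidate only if exactly one lands inside the area the tileset is known to cover. Everything here is an assumption for illustration: the function name, the `expected_box` input (for example a county or project footprint), and the injected `to_lonlat` callable that performs the actual coordinate transform.

```python
from typing import Callable, Iterable, Optional, Tuple

LonLat = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (min_lon, min_lat, max_lon, max_lat)


def find_projection(
    center_xy: Tuple[float, float],
    candidates: Iterable[int],
    expected_box: Box,
    to_lonlat: Callable[[int, float, float], LonLat],
) -> Optional[int]:
    """Return the single candidate code that places the tile inside
    expected_box, or None when zero or several candidates fit."""
    min_lon, min_lat, max_lon, max_lat = expected_box
    matches = []
    for code in candidates:
        try:
            lon, lat = to_lonlat(code, center_xy[0], center_xy[1])
        except Exception:
            # A candidate that cannot transform the point is not a match.
            continue
        if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat:
            matches.append(code)
    # An ambiguous or empty result is left for manual review.
    return matches[0] if len(matches) == 1 else None
```

If pyproj is available (an assumption, it is not mentioned in this issue), `to_lonlat` could be written as `lambda code, x, y: Transformer.from_crs(code, 4326, always_xy=True).transform(x, y)`. For the adjacent-UTM-zone case, the candidate list would simply be the labelled zone plus its neighbours.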

GDAL-incompatible # 3 refers to a few tiles with integer pixel data type and palette color interpretation. Some common GDAL routines will break if either the data type or the color interpretation is not consistent throughout. Reference to these tiles is maintained in the terrain_aggregator PostgreSQL DB, but these tiles are dropped from any further processing.
Beyond the fact that these tiles have highly suspect elevation data, we can safely drop them:

  • 🤔 because their paletting guarantees at least 1m vertical inaccuracy and
  • 🤔 because there exist alternate statewide seamless terrain datasets with at most 1m vertical inaccuracy,
  • 👉 so these tiles can be safely disregarded in favor of the alternate terrain data available.
  • automation for dropping these tiles
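
Automating the drop could be as simple as the sketch below, which separates tiles whose bands combine an integer data type with palette color interpretation from the rest, while still returning the dropped paths so that a reference to them can be kept. The `BandInfo` record and function names are hypothetical, and the data type and color interpretation strings are assumed to follow GDAL's naming (for example "Byte", "Float32", "Palette").

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class BandInfo:
    dtype: str         # data type name, e.g. "Float32" or "Byte" (assumed GDAL naming)
    color_interp: str  # color interpretation, e.g. "Gray" or "Palette"


INTEGER_TYPES = {"Byte", "UInt16", "Int16", "UInt32", "Int32"}


def should_drop(bands: List[BandInfo]) -> bool:
    """True if any band is both integer-typed and palette-interpreted."""
    return any(
        b.color_interp == "Palette" and b.dtype in INTEGER_TYPES for b in bands
    )


def split_tiles(tiles: Dict[str, List[BandInfo]]) -> Tuple[List[str], List[str]]:
    """Return (kept, dropped) tile paths, each sorted for stable output."""
    kept, dropped = [], []
    for path, bands in tiles.items():
        (dropped if should_drop(bands) else kept).append(path)
    return sorted(kept), sorted(dropped)
```

The `dropped` list is what would remain referenced in the database while being excluded from further processing.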
@dhardestylewis dhardestylewis changed the title NOAA annual report out terrain_aggregator details NOAA semi-annual terrain_aggregator DB details Aug 17, 2022

@dhardestylewis dhardestylewis added the documentation Improvements or additions to documentation label Aug 20, 2022

dhardestylewis commented Aug 21, 2022

#79 (comment)

  • 👉 return GDAL-incompatible tile results as a vector image file to both @TNRIS & @NOAA

Here is how to do it: #28 (comment)
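
Independently of the approach in the linked comment, which is not reproduced here, one possible shape for such a vector file is a GeoJSON FeatureCollection with one bounding-box polygon per incompatible tile. The sketch below is an assumption-laden illustration: the input dictionaries with `path`, `issue` and `bbox` keys are hypothetical, and the bounding boxes are assumed to already be in longitude/latitude.

```python
import json


def tiles_to_geojson(tiles):
    """Build a GeoJSON FeatureCollection of tile bounding boxes.

    Each tile is a dict with hypothetical keys 'path', 'issue' and
    'bbox' = (min_lon, min_lat, max_lon, max_lat).
    """
    features = []
    for tile in tiles:
        min_x, min_y, max_x, max_y = tile["bbox"]
        # Closed exterior ring, counterclockwise.
        ring = [
            [min_x, min_y],
            [max_x, min_y],
            [max_x, max_y],
            [min_x, max_y],
            [min_x, min_y],
        ]
        features.append(
            {
                "type": "Feature",
                "geometry": {"type": "Polygon", "coordinates": [ring]},
                "properties": {"path": tile["path"], "issue": tile["issue"]},
            }
        )
    return {"type": "FeatureCollection", "features": features}


def write_geojson(tiles, filename):
    """Serialize the FeatureCollection to a file."""
    with open(filename, "w", encoding="utf-8") as handle:
        json.dump(tiles_to_geojson(tiles), handle)
```

A file written this way can be opened in common GIS software, which would let recipients see where the affected tiles are.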
