Merge pull request #63 from cloudnativegeo/kyle/ipynb-to-qmd
Convert cogs intro notebook to markdown
kylebarron authored Sep 18, 2023
2 parents 7bc4980 + 3cd2c16 commit 1d2ee6e
Showing 5 changed files with 52 additions and 95 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/preview.yml
@@ -24,4 +24,4 @@ jobs:
- name: Deploy preview
uses: rossjrw/pr-preview-action@v1
with:
source-dir: docs
source-dir: _site
2 changes: 1 addition & 1 deletion _quarto.yml
@@ -34,7 +34,7 @@ website:
contents:
- section: Cloud Optimized GeoTIFFs
contents:
- cloud-optimized-geotiffs/intro.ipynb
- cloud-optimized-geotiffs/intro.qmd
- cloud-optimized-geotiffs/cogs-examples.ipynb
- section: Zarr
contents:
91 changes: 0 additions & 91 deletions cloud-optimized-geotiffs/intro.ipynb

This file was deleted.

48 changes: 48 additions & 0 deletions cloud-optimized-geotiffs/intro.qmd
@@ -0,0 +1,48 @@
# Cloud-Optimized GeoTIFFs

## What is a Cloud-Optimized GeoTIFF?

A Cloud-Optimized GeoTIFF (COG) is a variant of the TIFF image format that specifies a particular layout of internal data within the GeoTIFF specification, allowing optimized (subsetted or aggregated) access over a network for display or data reading. The key components are overviews and internal tiling.

For more details see https://www.cogeo.org/

<img alt="COG Diagram" src="../images/cog-diagram-1.png" width=300/>

### Dimensions

This attribute is also sometimes called **chunks** or **internal tiles**.

Dimensions are the number of bands, rows, and columns stored in a GeoTIFF. There is a tradeoff between storing lots of data in one GeoTIFF and storing less data in many GeoTIFFs. The larger a single file, the larger the GeoTIFF header, and multiple requests may be required just to read the spatial index before any data is retrieved. The opposite problem occurs with too many small files: it takes many reads to retrieve the data, which can greatly increase load time when rendering a combined visualization.

If you plan to pan and zoom a large amount of data through a tiling service in a web browser, there is a tradeoff between one large file and many smaller files. The current recommendation is to meet somewhere in the middle: a moderate number of medium-sized files.

### Internal Blocks

Internal blocks are required if the dimensions of the data exceed 512x512. However, you can control the size of the internal blocks; 256x256 or 512x512 is recommended. When displaying data at full resolution or reading only part of the data, the block size affects the number of reads required. A block size of 256 takes less time to read per block and reads less data outside the desired bounding box; however, reading large parts of a file may take more total read requests. Some clients aggregate neighboring block reads to reduce the total number of requests.
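
To make the block-size tradeoff concrete, the sketch below counts how many internal blocks a pixel window touches and how many pixels are read as a result. The function name `blocks_for_window` and the example window are hypothetical, and the sketch assumes square blocks aligned to the image origin with no request merging by the client.

```python
def blocks_for_window(col_off, row_off, width, height, block_size):
    """Return (blocks touched, pixels read) for a pixel window.

    Assumes square internal blocks aligned to the top-left corner of the image.
    """
    first_col = col_off // block_size
    last_col = (col_off + width - 1) // block_size
    first_row = row_off // block_size
    last_row = (row_off + height - 1) // block_size

    n_blocks = (last_col - first_col + 1) * (last_row - first_row + 1)
    # Whole blocks are fetched even when the window covers them only partly.
    pixels_read = n_blocks * block_size * block_size
    return n_blocks, pixels_read


# A 600x600 pixel window (360,000 pixels) starting at column 300, row 300.
for size in (256, 512):
    n_blocks, pixels_read = blocks_for_window(300, 300, 600, 600, size)
    print(f"block size {size}: {n_blocks} blocks, {pixels_read} pixels read")
```

For this window, 256x256 blocks need 9 block reads covering 589,824 pixels, while 512x512 blocks need 4 block reads covering 1,048,576 pixels: more requests with the smaller block, but less data outside the requested area.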

### Overviews

Overviews are downsampled (aggregated) data intended for visualization.
The best resampling algorithm depends on the range, type, and distribution of the data.

The smallest overview should match the tiling component's fetch size, typically 256x256. Because aspect ratios vary, aim to have at least one dimension at or slightly below 256. The COG driver in GDAL and the rio-cogeo tools should do this.
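
As a rough illustration of the sizing rule above, the following sketch lists power-of-two decimation factors until the smaller image dimension is at or below 256. The function name `overview_factors` and the stopping rule are assumptions chosen to mirror the text; this is not taken from the GDAL or rio-cogeo source, which may decide the levels differently.

```python
import math


def overview_factors(width, height, target=256):
    """List power-of-two decimation factors for a width x height raster.

    Stops once the smaller dimension of the last overview is <= target
    (an assumed rule that follows the guidance in the text).
    """
    factors = []
    factor = 2
    w, h = width, height
    while min(w, h) > target:
        # Overview size at this factor, rounded up to whole pixels.
        w = math.ceil(width / factor)
        h = math.ceil(height / factor)
        factors.append(factor)
        factor *= 2
    return factors


print(overview_factors(10000, 6000))  # [2, 4, 8, 16, 32]
```

For a 10000x6000 raster, the last overview (factor 32) is 313x188 pixels, so one dimension falls below 256 while the other stays slightly above it.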

There are many resampling algorithms for generating overviews. When creating overviews, compare several options before deciding which resampling method to apply.

## How to create and validate COGs

1. [rio-cogeo: Cloud Optimized GeoTIFF creation and validation plugin for rasterio](https://github.com/cogeotiff/rio-cogeo)
2. [GDAL: COG (Cloud Optimized GeoTIFF generator) driver documentation](https://gdal.org/drivers/raster/cog.html)


## Additional Resources

* [Planet Blog: An Introduction to Cloud Optimized GeoTIFFS (COGs) Part 1: Overview](https://developers.planet.com/docs/planetschool/an-introduction-to-cloud-optimized-geotiffs-cogs-part-1-overview/)
* [COG Talk — Part 1: What’s new?](https://medium.com/devseed/cog-talk-part-1-whats-new-941facbcd3d1)
* [Development Seed Blog: Do you really want people using your data?](https://medium.com/@_VincentS_/do-you-really-want-people-using-your-data-ec94cd94dc3f)

## How to visualize COGs

* GDAL vsi* drivers (vsicurl, vsis3, vsiaz)
* Titiler https://github.com/developmentseed/titiler
* Rio-viz https://github.com/developmentseed/rio-viz
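
The GDAL virtual file system drivers listed above are selected by a path prefix. The sketch below shows one way to turn a remote URL into such a path; the helper name `to_gdal_vsi_path`, the example URLs, and the use of `az://` as the Azure URL scheme are assumptions for illustration only.

```python
from urllib.parse import urlparse


def to_gdal_vsi_path(url):
    """Map a remote URL to a GDAL virtual file system path (hypothetical helper)."""
    parsed = urlparse(url)
    if parsed.scheme in ("http", "https"):
        # /vsicurl/ takes the full URL, including the scheme.
        return "/vsicurl/" + url
    # Assumed scheme-to-prefix mapping; bucket or container comes first in the path.
    prefixes = {"s3": "/vsis3/", "az": "/vsiaz/"}
    if parsed.scheme in prefixes:
        return prefixes[parsed.scheme] + parsed.netloc + parsed.path
    raise ValueError(f"unsupported URL scheme: {parsed.scheme!r}")


print(to_gdal_vsi_path("https://example.com/data/cog.tif"))
print(to_gdal_vsi_path("s3://my-bucket/data/cog.tif"))
```

The resulting path can be passed to GDAL-based tools in place of a local filename, so that only the byte ranges needed for a given read are fetched.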
4 changes: 2 additions & 2 deletions index.qmd
@@ -54,7 +54,7 @@ Notes:

1. [Overview of Formats (slideshow)](./overview.qmd)
2. Formats
a. [Cloud-Optimized GeoTIFFs](./cloud-optimized-geotiffs/intro.ipynb)
a. [Cloud-Optimized GeoTIFFs](./cloud-optimized-geotiffs/intro.qmd)
b. [Zarr](./zarr/intro.qmd)
c. [Kerchunk](./kerchunk/intro.qmd)
c. [Cloud-Optimized NetCDF4/HDF5](./cloud-optimized-netcdf4-hdf5/index.qmd)
@@ -82,4 +82,4 @@ Most of the data formats covered in this guide have a Jupyter Notebook example t
4. What is the expected access method?
5. How much of your data is typically rendered or selected at once?

{{< include _thankyous.qmd >}}
{{< include _thankyous.qmd >}}
