diff --git a/format-specs/geoparquet.md b/format-specs/geoparquet.md
index 6821c1c..4782e57 100644
--- a/format-specs/geoparquet.md
+++ b/format-specs/geoparquet.md
@@ -13,6 +13,9 @@ This is version 1.1.0-dev of the GeoParquet specification. See the [JSON Schema
 ## Geometry columns
 
 Geometry columns MUST be stored using the `BYTE_ARRAY` parquet type. They MUST be encoded as [WKB](https://en.wikipedia.org/wiki/Well-known_text_representation_of_geometry#Well-known_binary).
+
+Implementation note: when using the ecosystem of Arrow libraries, Parquet types such as `BYTE_ARRAY` might not be directly accessible. Instead, the corresponding Arrow data type can be `Arrow::Type::BINARY` (for arrays whose elements can be indexed through a 32-bit index) or `Arrow::Type::LARGE_BINARY` (64-bit index). It is recommended that GeoParquet readers be compatible with both data types, and that writers preferably use `Arrow::Type::BINARY` (thus limiting to row groups with content smaller than 2 GB) for broader compatibility.
+
 See the [encoding](#encoding) section below for more details.
 
 ### Nesting
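
To make the 2 GB limit in the implementation note concrete, here is a minimal stdlib-only sketch of why the 32-bit variant is bounded. It assumes the Arrow variable-size binary layout, in which each array stores cumulative byte offsets as signed 32-bit integers (`BINARY`) or signed 64-bit integers (`LARGE_BINARY`). The helper names `point_wkb`, `binary_offsets`, and `pick_arrow_binary_type` are hypothetical and exist only for illustration; they are not part of GeoParquet or of any Arrow library.

```python
import struct

# Largest offset a signed 32-bit integer can hold; Arrow's 32-bit binary
# layout cannot address more than this many bytes in a single array.
INT32_MAX = 2**31 - 1


def point_wkb(x: float, y: float) -> bytes:
    """Encode a 2D point as little-endian WKB (byte order 1, geometry type 1)."""
    return struct.pack("<BIdd", 1, 1, x, y)


def binary_offsets(values):
    """Return the cumulative byte offsets a binary array would store."""
    offsets = [0]
    for value in values:
        offsets.append(offsets[-1] + len(value))
    return offsets


def pick_arrow_binary_type(total_bytes: int) -> str:
    """Hypothetical helper: name the narrowest binary type that fits total_bytes."""
    return "binary" if total_bytes <= INT32_MAX else "large_binary"


geometries = [point_wkb(0.0, 0.0), point_wkb(1.5, -2.5)]
offsets = binary_offsets(geometries)
chosen = pick_arrow_binary_type(offsets[-1])
```

A writer following the recommendation would keep each row group small enough that the final offset stays within `INT32_MAX`, so the 32-bit type remains usable, while a reader would accept either offset width.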