CRS spec definition for version 0.1 #25

Merged
merged 23 commits into from
Mar 6, 2022
Changes from 18 commits
Binary file modified examples/geoparquet/example.parquet
Binary file not shown.
33 changes: 17 additions & 16 deletions examples/geoparquet/example.py
@@ -5,24 +5,25 @@

.. code-block:: python

>>> import json, pprint, pyarrow.parquet
>>> import json, pprint, pyarrow.parquet as pq
>>> pprint.pprint(json.loads(pq.read_schema("example.parquet").metadata[b"geo"]))
{'columns': {'geometry': {'crs': 'GEOGCRS["WGS 84",ENSEMBLE["World Geodetic '
'System 1984 ensemble",MEMBER["World Geodetic '
'System 1984 (Transit)"],MEMBER["World '
'Geodetic System 1984 (G730)"],MEMBER["World '
'Geodetic System 1984 (G873)"],MEMBER["World '
'Geodetic System 1984 (G1150)"],MEMBER["World '
'Geodetic System 1984 (G1674)"],MEMBER["World '
'Geodetic System 1984 '
'(G1762)"],ELLIPSOID["WGS '
'84",6378137,298.257223563],ENSEMBLEACCURACY[2.0]],CS[ellipsoidal,2],AXIS["geodetic '
'latitude (Lat)",north],AXIS["geodetic '
'longitude '
'(Lon)",east],UNIT["degree",0.0174532925199433],USAGE[SCOPE["Horizontal '
'component of 3D '
'system."],AREA["World."],BBOX[-90,-180,90,180]],ID["EPSG",4326]]',
'encoding': 'WKB'}},
'System 1984 ensemble",MEMBER["World Geodetic '
'System 1984 (Transit)"],MEMBER["World '
'Geodetic System 1984 (G730)"],MEMBER["World '
'Geodetic System 1984 (G873)"],MEMBER["World '
'Geodetic System 1984 (G1150)"],MEMBER["World '
'Geodetic System 1984 (G1674)"],MEMBER["World '
'Geodetic System 1984 (G1762)"],MEMBER["World '
'Geodetic System 1984 '
'(G2139)"],ELLIPSOID["WGS '
'84",6378137,298.257223563],ENSEMBLEACCURACY[2.0]],CS[ellipsoidal,2],AXIS["geodetic '
'latitude (Lat)",north],AXIS["geodetic '
'longitude '
'(Lon)",east],UNIT["degree",0.0174532925199433],USAGE[SCOPE["Horizontal '
'component of 3D '
'system."],AREA["World."],BBOX[-90,-180,90,180]],ID["EPSG",4326]]',
'encoding': 'WKB'}},
'primary_column': 'geometry',
'version': '0.1.0'}
"""
54 changes: 38 additions & 16 deletions format-specs/geoparquet.md
@@ -50,33 +50,55 @@ Each geometry column in the dataset must be included in the columns field above

| Field Name | Type | Description |
| ---------- | ----------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------- |
| crs | string | **REQUIRED** [WKT2](http://docs.opengeospatial.org/is/12-063r5/12-063r5.html) string representing the Coordinate Reference System (CRS) of the geometry. |
| crs | string | **REQUIRED** [WKT2](https://docs.opengeospatial.org/is/18-010r7/18-010r7.html) string representing the Coordinate Reference System (CRS) of the geometry. |
| encoding | string | **REQUIRED** Name of the geometry encoding format. Currently only 'WKB' is supported. |
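
As a rough illustration of the two required fields in the table above, the following sketch checks one entry of the `columns` field. The helper name `check_column_metadata` and the wording of its messages are assumptions made for this example; they are not part of the specification.

```python
# Hypothetical helper: checks one per-column metadata entry against the
# two REQUIRED fields listed in the table ('crs' and 'encoding').
REQUIRED_COLUMN_FIELDS = ("crs", "encoding")
SUPPORTED_ENCODINGS = {"WKB"}  # the only encoding named in this version


def check_column_metadata(column_meta):
    """Return a list of problems found in one entry of the 'columns' field."""
    problems = []
    for field in REQUIRED_COLUMN_FIELDS:
        # Both required fields are typed as strings in the table.
        if not isinstance(column_meta.get(field), str):
            problems.append(f"missing or non-string required field: {field}")
    encoding = column_meta.get("encoding")
    if isinstance(encoding, str) and encoding not in SUPPORTED_ENCODINGS:
        problems.append(f"unsupported encoding: {encoding}")
    return problems
```

An empty list means the entry carries both required fields and a supported encoding; the sketch does not parse or validate the WKT2 string itself.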

#### crs

It is strongly recommended to use [EPSG:4326 (lat, long)](https://spatialreference.org/ref/epsg/4326/) for all data, so in most cases the value of the crs should be:
The Coordinate Reference System (CRS) is a mandatory parameter for each geometry column defined in the GeoParquet format.

The CRS must be provided in [WKT](https://en.wikipedia.org/wiki/Well-known_text_representation_of_coordinate_reference_systems) version 2, also known as **WKT2**. WKT2 has several revisions; this specification supports the revisions from [2015](http://docs.opengeospatial.org/is/12-063r5/12-063r5.html) and [2019](https://docs.opengeospatial.org/is/18-010r7/18-010r7.html): WKT2_2015, WKT2_2015_SIMPLIFIED, WKT2_2019, and WKT2_2019_SIMPLIFIED.


As the most common CRS for datasets is latitude/longitude, for the widest interoperability we recommend [EPSG:4326](https://epsg.org/crs_4326/WGS-84.html) for all data, so in most cases the value of the crs should be:

```
GEOGCS["WGS 84",
DATUM["WGS_1984",
SPHEROID["WGS 84",6378137,298.257223563,
AUTHORITY["EPSG","7030"]],
AUTHORITY["EPSG","6326"]],
PRIMEM["Greenwich",0,
AUTHORITY["EPSG","8901"]],
UNIT["degree",0.01745329251994328,
AUTHORITY["EPSG","9122"]],
AUTHORITY["EPSG","4326"]]
GEOGCRS["WGS 84",
ENSEMBLE["World Geodetic System 1984 ensemble",
MEMBER["World Geodetic System 1984 (Transit)"],
MEMBER["World Geodetic System 1984 (G730)"],
MEMBER["World Geodetic System 1984 (G873)"],
MEMBER["World Geodetic System 1984 (G1150)"],
MEMBER["World Geodetic System 1984 (G1674)"],
MEMBER["World Geodetic System 1984 (G1762)"],
MEMBER["World Geodetic System 1984 (G2139)"],
ELLIPSOID["WGS 84",6378137,298.257223563],
ENSEMBLEACCURACY[2.0]],
CS[ellipsoidal,2],
AXIS["geodetic latitude (Lat)",north],
AXIS["geodetic longitude (Lon)",east],
UNIT["degree",0.0174532925199433],
USAGE[
SCOPE["Horizontal component of 3D system."],
AREA["World."],
BBOX[-90,-180,90,180]],
ID["EPSG",4326]]
```

Data that is better served in particular projections can choose to use an alternate coordinate reference system.
Due to the large number of CRSes available and the difficulty of implementing all of them, we expect that a number of implementations will, at least initially, support only a single CRS. To maximize interoperability we strongly recommend that GeoParquet tool providers always implement support for [EPSG:4326](https://epsg.org/crs_4326/WGS-84.html).
We recommend that users store their data in EPSG:4326 so that it works with the widest range of tools. Data that is better served in a particular projection can use an alternate coordinate reference system. We expect many tools will support alternate CRSes, but encourage users to check.
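
A tool that starts with EPSG:4326 only might want a cheap first check on the `crs` value. The sketch below is a heuristic, not a WKT2 parser: it assumes the string ends with a top-level `ID["EPSG",<code>]` element, as the recommended value does, and the function name `trailing_epsg_code` is made up for this example. A real implementation would more likely hand the string to a CRS library.

```python
import re

# Matches a final ID["EPSG",<code>] element that closes the CRS definition.
# Heuristic only: WKT2 does not require an ID element, and a CRS without one
# can still be equivalent to EPSG:4326.
_TRAILING_EPSG_ID = re.compile(r'ID\[\s*"EPSG"\s*,\s*(\d+)\s*\]\s*\]\s*$')


def trailing_epsg_code(wkt2):
    """Return the EPSG code from a trailing ID element, or None if absent."""
    match = _TRAILING_EPSG_ID.search(wkt2)
    return int(match.group(1)) if match else None
```

For the recommended string this returns `4326`; for a string without a trailing `ID` element (including the older WKT1 `AUTHORITY` form) it returns `None`, which a caller should treat as "unknown" rather than "not EPSG:4326".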

#### encoding

This is the binary format that the geometry is encoded in. The string 'WKB' to represent
[Well Known Binary](https://en.wikipedia.org/wiki/Well-known_text_representation_of_geometry#Well-known_binary) is the only current option, but future versions
of the spec may support alternative encodings. This should be the ["standard"](https://libgeos.org/specifications/wkb/#standard-wkb) WKB representation.
This is the binary format that the geometry is encoded in.
The string 'WKB', signifying [Well Known Binary](https://en.wikipedia.org/wiki/Well-known_text_representation_of_geometry#Well-known_binary), is the only current option, but future versions
of the spec may support alternative encodings. This should be the ["standard"](https://libgeos.org/specifications/wkb/#standard-wkb) WKB
representation. This means 3D coordinates are not supported in this version of GeoParquet, but we expect
this to come in a future version.
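
To make the "standard" WKB layout concrete, here is a minimal sketch that encodes and decodes a single 2D point with Python's `struct` module. It covers only the Point type (geometry type code 1), and the function names are illustrative assumptions, not an API defined by the spec.

```python
import struct

WKB_LITTLE_ENDIAN = 1  # byte-order flag: 1 = little endian, 0 = big endian
WKB_POINT = 1          # standard WKB geometry type code for a 2D Point


def point_to_wkb(x, y):
    """Encode a 2D point as little-endian standard WKB (21 bytes)."""
    # Layout: 1-byte order flag, uint32 geometry type, two float64 coordinates.
    return struct.pack("<BIdd", WKB_LITTLE_ENDIAN, WKB_POINT, x, y)


def wkb_to_point(data):
    """Decode a standard WKB 2D point written in either byte order."""
    prefix = "<" if data[0] == WKB_LITTLE_ENDIAN else ">"
    geom_type, x, y = struct.unpack(prefix + "Idd", data[1:21])
    if geom_type != WKB_POINT:
        raise ValueError(f"not a 2D WKB point (type code {geom_type})")
    return x, y
```

Because standard WKB has no type codes with a Z dimension flag, a point always occupies exactly 21 bytes here, which is one way to see why 3D coordinates fall outside this version.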

#### Coordinate axis order

The axis order of the coordinates in WKB stored in a GeoParquet file follows the de facto standard for axis order in WKB and is therefore always (x, y), where x is easting or longitude and y is northing or latitude. This ordering explicitly overrides the axis order specified in the CRS.
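
The practical consequence is easiest to see with EPSG:4326, whose CRS definition lists latitude first. A writer holding a (lat, lon) pair still has to emit longitude as x and latitude as y. The sketch below shows this for a single point; `latlon_to_wkb_point` is a hypothetical helper name.

```python
import struct


def latlon_to_wkb_point(lat, lon):
    """Encode a (lat, lon) pair as a WKB point in (x, y) = (lon, lat) order."""
    # 1 = little-endian flag, 1 = Point type code; x (longitude) is written
    # first, regardless of the axis order declared in the CRS.
    return struct.pack("<BIdd", 1, 1, lon, lat)
```

Reading the bytes back with `struct.unpack("<BIdd", ...)` yields longitude before latitude, which is the order a GeoParquet reader should assume.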

### Additional information
