CRS spec definition for version 0.1 #25

Merged
merged 23 commits into from
Mar 6, 2022
Changes from 11 commits
Binary file modified examples/geoparquet/example.parquet
Binary file not shown.
33 changes: 17 additions & 16 deletions examples/geoparquet/example.py
@@ -5,24 +5,25 @@

.. code-block:: python

>>> import json, pprint, pyarrow.parquet
>>> import json, pprint, pyarrow.parquet as pq
>>> pprint.pprint(json.loads(pq.read_schema("example.parquet").metadata[b"geo"]))
{'columns': {'geometry': {'crs': 'GEOGCRS["WGS 84",ENSEMBLE["World Geodetic '
'System 1984 ensemble",MEMBER["World Geodetic '
'System 1984 (Transit)"],MEMBER["World '
'Geodetic System 1984 (G730)"],MEMBER["World '
'Geodetic System 1984 (G873)"],MEMBER["World '
'Geodetic System 1984 (G1150)"],MEMBER["World '
'Geodetic System 1984 (G1674)"],MEMBER["World '
'Geodetic System 1984 '
'(G1762)"],ELLIPSOID["WGS '
'84",6378137,298.257223563],ENSEMBLEACCURACY[2.0]],CS[ellipsoidal,2],AXIS["geodetic '
'latitude (Lat)",north],AXIS["geodetic '
'longitude '
'(Lon)",east],UNIT["degree",0.0174532925199433],USAGE[SCOPE["Horizontal '
'component of 3D '
'system."],AREA["World."],BBOX[-90,-180,90,180]],ID["EPSG",4326]]',
'encoding': 'WKB'}},
'System 1984 ensemble",MEMBER["World Geodetic '
'System 1984 (Transit)"],MEMBER["World '
'Geodetic System 1984 (G730)"],MEMBER["World '
'Geodetic System 1984 (G873)"],MEMBER["World '
'Geodetic System 1984 (G1150)"],MEMBER["World '
'Geodetic System 1984 (G1674)"],MEMBER["World '
'Geodetic System 1984 (G1762)"],MEMBER["World '
'Geodetic System 1984 '
'(G2139)"],ELLIPSOID["WGS '
'84",6378137,298.257223563],ENSEMBLEACCURACY[2.0]],CS[ellipsoidal,2],AXIS["geodetic '
'latitude (Lat)",north],AXIS["geodetic '
'longitude '
'(Lon)",east],UNIT["degree",0.0174532925199433],USAGE[SCOPE["Horizontal '
'component of 3D '
'system."],AREA["World."],BBOX[-90,-180,90,180]],ID["EPSG",4326]]',
'encoding': 'WKB'}},
'primary_column': 'geometry',
'version': '0.1.0'}
"""
52 changes: 37 additions & 15 deletions format-specs/geoparquet.md
@@ -50,33 +50,55 @@ Each geometry column in the dataset must be included in the columns field above

| Field Name | Type | Description |
| ---------- | ----------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------- |
| crs | string | **REQUIRED** [WKT2](http://docs.opengeospatial.org/is/12-063r5/12-063r5.html) string representing the Coordinate Reference System (CRS) of the geometry. |
| crs | string | **REQUIRED** [WKT2](https://docs.opengeospatial.org/is/18-010r7/18-010r7.html) string representing the Coordinate Reference System (CRS) of the geometry. |
| encoding | string | **REQUIRED** Name of the geometry encoding format. Currently only 'WKB' is supported. |

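As a rough illustration of the two required column fields in the table above, the following sketch checks a `geo` metadata document. The function name `check_geo_columns` and the constants are hypothetical and not part of the spec, and the CRS string in the sample is a shortened placeholder rather than a complete WKT2 definition.

```python
import json

# Hypothetical helper names; only the field names come from the table above.
REQUIRED_COLUMN_FIELDS = ("crs", "encoding")
SUPPORTED_ENCODINGS = {"WKB"}


def check_geo_columns(geo_metadata_json):
    """Return a list of problems found in the per-column metadata."""
    geo = json.loads(geo_metadata_json)  # accepts str or bytes
    problems = []
    for name, column in geo.get("columns", {}).items():
        # Both fields are marked REQUIRED and typed as string.
        for field in REQUIRED_COLUMN_FIELDS:
            if not isinstance(column.get(field), str):
                problems.append(f"{name}: missing or non-string '{field}'")
        # 'WKB' is the only encoding named in the table.
        encoding = column.get("encoding")
        if isinstance(encoding, str) and encoding not in SUPPORTED_ENCODINGS:
            problems.append(f"{name}: unsupported encoding '{encoding}'")
    return problems


# Placeholder CRS string for illustration only, not a full WKT2 definition.
example = json.dumps({
    "columns": {"geometry": {"crs": 'GEOGCRS["WGS 84"]', "encoding": "WKB"}},
    "primary_column": "geometry",
    "version": "0.1.0",
})
print(check_geo_columns(example))
```

The value read with `pq.read_schema("example.parquet").metadata[b"geo"]` in `example.py` is a bytes object, which `json.loads` accepts directly.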
#### crs

It is strongly recommended to use [EPSG:4326 (lat, long)](https://spatialreference.org/ref/epsg/4326/) for all data, so in most cases the value of the crs should be:
The Coordinate Reference System (CRS) is a mandatory parameter for each geometry column defined in the GeoParquet format.

The CRS must be provided in [WKT](https://en.wikipedia.org/wiki/Well-known_text_representation_of_coordinate_reference_systems) version 2, also known as **WKT2**. WKT2 has several revisions; this specification supports the revisions from [2015](http://docs.opengeospatial.org/is/12-063r5/12-063r5.html) and [2019](https://docs.opengeospatial.org/is/18-010r7/18-010r7.html): WKT2_2015, WKT2_2015_SIMPLIFIED, WKT2_2019, and WKT2_2019_SIMPLIFIED.

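One quick way to tell a WKT1 string from a WKT2 string is the root keyword, for example `GEOGCS` in WKT1 versus `GEOGCRS` in WKT2. The sketch below is a heuristic under that assumption only: the keyword sets are an illustrative subset, the function name `guess_wkt_flavor` is hypothetical, and it does not validate the rest of the string or distinguish the 2015 and 2019 revisions.

```python
import re

# Illustrative subsets of root keywords, not exhaustive lists.
WKT2_ROOTS = {"GEODCRS", "GEOGCRS", "PROJCRS", "VERTCRS",
              "ENGCRS", "COMPOUNDCRS", "BOUNDCRS"}
WKT1_ROOTS = {"GEOGCS", "PROJCS", "GEOCCS", "VERT_CS",
              "COMPD_CS", "LOCAL_CS"}


def guess_wkt_flavor(crs):
    """Guess 'WKT1', 'WKT2', or 'unknown' from the root keyword alone."""
    # The root keyword is the identifier before the first opening bracket.
    match = re.match(r"\s*([A-Za-z_0-9]+)\s*[\[(]", crs)
    if match is None:
        return "unknown"
    root = match.group(1).upper()
    if root in WKT2_ROOTS:
        return "WKT2"
    if root in WKT1_ROOTS:
        return "WKT1"
    return "unknown"


print(guess_wkt_flavor('GEOGCRS["WGS 84",ID["EPSG",4326]]'))
print(guess_wkt_flavor('GEOGCS["WGS 84",AUTHORITY["EPSG","4326"]]'))
```

A real implementation would more likely hand the string to a CRS library for parsing; the keyword check is only a cheap first filter.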

As the most common CRS for datasets is latitude/longitude, for the widest interoperability we recommend [EPSG:4326](https://spatialreference.org/ref/epsg/wgs-84) for all data, so in most cases the value of the crs should be:
Member

I feel like we should mention here that our axis order overrides this?

Collaborator Author

@alasarr, Mar 4, 2022

We already have a specific section for coordinate order where we specify that. Did you see it? Or do you just want to emphasize here too?

Member

Yeah, I saw it - I think we can refer to that section. I just think we need to call it out here. Acknowledge it's a bit confusing. We say here 'use the crs for latitude/longitude', and then below we say 'but do it as longitude/latitude'.

I suppose alternatively we don't actually mention lat/long or long/lat here - we just say that we recommend 4326 for widest interoperability.

Collaborator

> I suppose alternatively we don't actually mention lat/long or long/lat here - we just say that we recommend 4326 for widest interoperability.

+1

Collaborator Author

I see your point now 👍

alasarr marked this conversation as resolved.

```
GEOGCS["WGS 84",
DATUM["WGS_1984",
SPHEROID["WGS 84",6378137,298.257223563,
AUTHORITY["EPSG","7030"]],
AUTHORITY["EPSG","6326"]],
PRIMEM["Greenwich",0,
AUTHORITY["EPSG","8901"]],
UNIT["degree",0.01745329251994328,
AUTHORITY["EPSG","9122"]],
AUTHORITY["EPSG","4326"]]
GEOGCRS["WGS 84",
ENSEMBLE["World Geodetic System 1984 ensemble",
MEMBER["World Geodetic System 1984 (Transit)"],
MEMBER["World Geodetic System 1984 (G730)"],
MEMBER["World Geodetic System 1984 (G873)"],
MEMBER["World Geodetic System 1984 (G1150)"],
MEMBER["World Geodetic System 1984 (G1674)"],
MEMBER["World Geodetic System 1984 (G1762)"],
MEMBER["World Geodetic System 1984 (G2139)"],
ELLIPSOID["WGS 84",6378137,298.257223563],
ENSEMBLEACCURACY[2.0]],
CS[ellipsoidal,2],
AXIS["geodetic latitude (Lat)",north],
AXIS["geodetic longitude (Lon)",east],
UNIT["degree",0.0174532925199433],
USAGE[
SCOPE["Horizontal component of 3D system."],
AREA["World."],
BBOX[-90,-180,90,180]],
ID["EPSG",4326]]
```

Data that is better served in particular projections can choose to use an alternate coordinate reference system.
Due to the large number of CRSes available and the difficulty of implementing all of them, we expect that a number of implementations will, at least initially, support only a single CRS. To maximize interoperability we strongly recommend that GeoParquet tool providers always implement support for [EPSG:4326](https://spatialreference.org/ref/epsg/wgs-84).
We recommend that users store their data in EPSG:4326 so that it works with the widest range of tools. Data that is better served by a particular projection can use an alternate coordinate reference system. We expect many tools will support alternate CRSes, but encourage users to check.

#### encoding

This is the binary format that the geometry is encoded in. The string 'WKB' to represent
[Well Known Binary](https://en.wikipedia.org/wiki/Well-known_text_representation_of_geometry#Well-known_binary) is the only current option, but future versions
of the spec may support alternative encodings. This should be the ["standard"](https://libgeos.org/specifications/wkb/#standard-wkb) WKB representation.
The string 'WKB', signifying [Well Known Binary](https://en.wikipedia.org/wiki/Well-known_text_representation_of_geometry#Well-known_binary), is the only current option, but future versions
of the spec may support alternative encodings. This should be the ["standard"](https://libgeos.org/specifications/wkb/#standard-wkb) WKB
representation. This means 3D coordinates are not supported in this version of GeoParquet, but we expect
this to come in a future version.

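To illustrate what "standard" WKB implies for 3D data, the sketch below reads the geometry type code from a WKB header and reports whether it is one of the seven 2D types (codes 1 to 7). ISO WKB marks Z geometries with codes such as 1001, and EWKB sets high-bit flags such as `0x80000000`, so both fall outside that range. The function names are hypothetical, and only the outermost geometry header is inspected.

```python
import struct

# Point, LineString, Polygon, MultiPoint, MultiLineString,
# MultiPolygon, GeometryCollection.
STANDARD_2D_TYPES = range(1, 8)


def wkb_type_code(wkb):
    """Read the geometry type code from the first five bytes of a WKB value."""
    byte_order = wkb[0]
    if byte_order not in (0, 1):
        raise ValueError(f"invalid WKB byte order flag: {byte_order}")
    # 1 means little-endian, 0 means big-endian.
    fmt = "<I" if byte_order == 1 else ">I"
    (code,) = struct.unpack_from(fmt, wkb, 1)
    return code


def is_standard_2d_wkb(wkb):
    """True if the outermost geometry uses a plain 2D type code."""
    return wkb_type_code(wkb) in STANDARD_2D_TYPES


point_2d = struct.pack("<BIdd", 1, 1, 1.0, 2.0)
point_z_iso = struct.pack("<BIddd", 1, 1001, 1.0, 2.0, 3.0)
print(is_standard_2d_wkb(point_2d), is_standard_2d_wkb(point_z_iso))
```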
#### Coordinate Order

The axis order in WKB stored in a GeoParquet file follows the de facto standard for axis order in WKB and is therefore always (x,y{,z}{,m}), where x is easting or longitude, y is northing or latitude, z is optional elevation, and m is optional measure. This ordering explicitly overrides the axis order as specified in the CRS.

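A minimal sketch of the (x, y) ordering for a single 2D point, using only the standard library. The helper names and the sample coordinates are illustrative. The point is written with longitude first and latitude second, even though the EPSG:4326 definition lists latitude as its first axis.

```python
import struct


def encode_point_wkb(longitude, latitude):
    """Encode a 2D point as little-endian WKB with x=longitude, y=latitude."""
    # Byte order flag 1 (little-endian), geometry type 1 (Point), then x, y.
    return struct.pack("<BIdd", 1, 1, longitude, latitude)


def decode_point_wkb(wkb):
    """Decode a 2D WKB point and return (x, y)."""
    (byte_order,) = struct.unpack_from("B", wkb, 0)
    prefix = "<" if byte_order == 1 else ">"
    geometry_type, x, y = struct.unpack_from(prefix + "Idd", wkb, 1)
    if geometry_type != 1:
        raise ValueError(f"not a 2D point: type code {geometry_type}")
    return x, y


# Sample coordinates chosen for illustration.
wkb = encode_point_wkb(longitude=-3.7038, latitude=40.4168)
x, y = decode_point_wkb(wkb)
print(x, y)
```

Readers that rely on the CRS axis order instead would swap the two values, which is the mismatch the override above is meant to rule out.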
### Additional information
