Migration of the GeoParquet specification to the official OGC document template (#206)
m-mohr committed May 19, 2024
1 parent 6e01093 commit afff519
Showing 42 changed files with 735 additions and 0 deletions.
6 changes: 6 additions & 0 deletions .gitignore
@@ -1,2 +1,8 @@
/scripts/data/
/scripts/__pycache__/
/format-specs/relaton/
/format-specs/iev/
.DS_Store
format-specs/document.err.html
format-specs/document.presentation.xml
format-specs/document.html
2 changes: 2 additions & 0 deletions README.md
@@ -10,6 +10,8 @@ Initial work started in the [geo-arrow-spec](https://github.com/geoarrow/geoarro
Arrow work in a compatible way, with this specification focused solely on Parquet. We are in the process of becoming an [OGC](https://ogc.org) official
[Standards Working Group](https://portal.ogc.org/files/103450) and are on the path to be a full OGC standard.

**The OGC candidate Standard is at [https://docs.ogc.org/DRAFTS/24-013.html](https://docs.ogc.org/DRAFTS/24-013.html)**. The candidate Standard remains in draft form until it is approved as a Standard by the OGC Membership.

**The latest [stable specification](https://geoparquet.org/releases/v1.0.0/) and [JSON schema](https://geoparquet.org/releases/v1.0.0/schema.json) are published at [geoparquet.org/releases/](https://geoparquet.org/releases/).**

The 'dev' versions of the spec are available in this repo:
4 changes: 4 additions & 0 deletions format-specs/Gemfile
@@ -0,0 +1,4 @@
source "https://rubygems.org"

gem "metanorma-cli"
gem "relaton-cli"
26 changes: 26 additions & 0 deletions format-specs/README.adoc
@@ -0,0 +1,26 @@
= Standard template in Metanorma

== Content

This repository contains the content for an OGC standard.

* `document.adoc` - the main standard document with references to all sections
* remaining ``adoc``s - each section of the standard document is in a separate document: follow directions in each document to populate
* `figures` - figures go here
* `images` - Image files for graphics go here. Image files for figures go in the `figures` directory. Place here only images that are not used in figures (e.g., as parts of tables, as logos, etc.)
* `requirements` - directory for requirements and requirement classes to be referenced in `clause_7_normative_text.adoc`
* `code` - sample code to accompany the standard, if desired
* `abstract_tests` - the Abstract Test Suite comprising one test for every requirement, optional
* `UML` - UML diagrams, if applicable

More information about the document template is available https://github.com/opengeospatial/templates/tree/master/standard#readme[here].

An authoring guide is available at https://www.metanorma.org/author/ogc/authoring-guide/[metanorma.org].

== Building

Run `docker run -v "$(pwd)":/metanorma -v ${HOME}/.fontist/fonts/:/config/fonts metanorma/metanorma metanorma compile --agree-to-terms -t ogc -x html document.adoc`.

== Auto built document

A daily built document is available at https://docs.ogc.org/DRAFTS/[OGC Document DRAFTS].
48 changes: 48 additions & 0 deletions format-specs/abstract_tests/ATS_class_core.adoc
@@ -0,0 +1,48 @@
[[ats_core]]
[conformance_class]
====
[%metadata]
identifier:: /conf/core
subject:: <<rc_table-core>>
classification:: Target Type:Apache Parquet file
conformance-test:: /conf/core/geometry-columns
conformance-test:: /conf/core/nesting
conformance-test:: /conf/core/repetition
conformance-test:: /conf/core/metadata
conformance-test:: /conf/core/crs
conformance-test:: /conf/core/epoch
conformance-test:: /conf/core/orientation
conformance-test:: /conf/core/bbox
====

==== Geometry columns

include::./TEST001.adoc[]

==== Nesting

include::./TEST002.adoc[]

==== Repetition

include::./TEST003.adoc[]

==== Metadata

include::./TEST004.adoc[]

==== CRS

include::./TEST005.adoc[]

==== Epoch

include::./TEST006.adoc[]

==== Orientation

include::./TEST007.adoc[]

==== Bounding Box

include::./TEST008.adoc[]
5 changes: 5 additions & 0 deletions format-specs/abstract_tests/README.adoc
@@ -0,0 +1,5 @@
This folder contains the Abstract Test Suite.

The test is expressed according to this pattern:

NOTE: for each test, there should be a corresponding requirement in the "requirements" folder.
15 changes: 15 additions & 0 deletions format-specs/abstract_tests/TEST001.adoc
@@ -0,0 +1,15 @@

[abstract_test]
====
[%metadata]
identifier:: /conf/core/geometry-columns
target:: /req/core/geometry-columns
test-purpose:: Validate that geometry columns are stored using the BYTE_ARRAY parquet type and that geometries are encoded as WKB.
test-method::
+
--
1. Verify that geometry columns are stored using the BYTE_ARRAY parquet type.
2. Verify that geometries are encoded as WKB.
--
====
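
The WKB part of this test can be approximated without a Parquet reader once the raw column values are available as bytes. The following is a minimal sketch, not a normative check: the function names `read_wkb_header` and `is_plausible_wkb` are hypothetical, and the sketch assumes that only the 2D type codes 1 to 7 and the 3D codes 1001 to 1007 are acceptable. It inspects only the WKB header and does not parse the coordinates that follow.

```python
import struct

# Geometry type codes accepted by this sketch (an assumption of the example):
# 1-7 for 2D geometries and 1001-1007 for their 3D (Z) counterparts.
ALLOWED_TYPE_CODES = set(range(1, 8)) | set(range(1001, 1008))


def read_wkb_header(wkb):
    """Return (byte_order, type_code) read from the first 5 bytes of a WKB value."""
    if len(wkb) < 5:
        raise ValueError("WKB value is shorter than its 5-byte header")
    if wkb[0] == 0:
        order, fmt = "big", ">I"
    elif wkb[0] == 1:
        order, fmt = "little", "<I"
    else:
        raise ValueError("unknown WKB byte-order flag: %d" % wkb[0])
    (type_code,) = struct.unpack(fmt, wkb[1:5])
    return order, type_code


def is_plausible_wkb(value):
    """Check that a column value is binary and starts with an accepted WKB header."""
    if not isinstance(value, (bytes, bytearray)):
        return False
    try:
        _, type_code = read_wkb_header(bytes(value))
    except ValueError:
        return False
    return type_code in ALLOWED_TYPE_CODES
```

A value that passes this check may still be malformed further in, so a full validator would also decode the geometry body.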
16 changes: 16 additions & 0 deletions format-specs/abstract_tests/TEST002.adoc
@@ -0,0 +1,16 @@

[abstract_test]
====
[%metadata]
identifier:: /conf/core/nesting
target:: /req/core/nesting
test-purpose:: Validate that geometries are not contained in complex or nested types such as structs, lists, arrays, or map types.
test-method::
+
--
1. Verify that geometry columns are at the root of the schema.
2. Verify that no geometry is a group field or nested in a group.
--
====
16 changes: 16 additions & 0 deletions format-specs/abstract_tests/TEST003.adoc
@@ -0,0 +1,16 @@

[abstract_test]
====
[%metadata]
identifier:: /conf/core/repetition
target:: /req/core/repetition
test-purpose:: Validate the cardinality of geometry columns.
test-method::
+
--
1. Verify that the cardinality for all geometry columns is “required” (exactly one) or “optional” (zero or one).
2. Verify that no geometry column is repeated.
--
====
19 changes: 19 additions & 0 deletions format-specs/abstract_tests/TEST004.adoc
@@ -0,0 +1,19 @@

[abstract_test]
====
[%metadata]
identifier:: /conf/core/metadata
target:: /req/core/metadata
test-purpose:: Validate the metadata keys contained in the GeoParquet file.
test-method::
+
--
1. Verify that the GeoParquet file includes a geo key in the Parquet metadata (see FileMetaData::key_value_metadata).
2. Verify that the value of this key is a JSON-encoded UTF-8 string representing the file and column metadata that validates against the GeoParquet metadata schema.
3. Verify that each geometry column in the dataset is included in the columns field (specified in <<tbl_file_and_column_metadata_fields>>) with the content specified in <<tbl_column_metadata>>, keyed by the column name.
--
====
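
The three steps of this test can be sketched as a small function that works on key/value metadata already extracted from the file. This is an illustrative sketch under stated assumptions: `check_geo_metadata` is a hypothetical name, the metadata is assumed to be a plain dict with string keys, and the list of geometry column names is supplied by the caller. It checks only the structure named in the steps and does not validate against the full GeoParquet metadata schema.

```python
import json


def check_geo_metadata(key_value_metadata, geometry_columns):
    """Return a list of problems found in the 'geo' metadata (empty if none)."""
    raw = key_value_metadata.get("geo")
    if raw is None:
        return ["missing 'geo' key"]

    # The value must be a UTF-8 string; accept bytes and decode them.
    if isinstance(raw, (bytes, bytearray)):
        try:
            raw = bytes(raw).decode("utf-8")
        except UnicodeDecodeError:
            return ["'geo' value is not valid UTF-8"]

    try:
        meta = json.loads(raw)
    except ValueError:
        return ["'geo' value is not valid JSON"]

    if not isinstance(meta, dict) or not isinstance(meta.get("columns"), dict):
        return ["'geo' value has no 'columns' object"]

    # Every geometry column must appear in 'columns', keyed by its name.
    return [
        "column '%s' missing from 'columns'" % name
        for name in geometry_columns
        if name not in meta["columns"]
    ]
```

Returning a list of problems rather than a boolean keeps the sketch usable for reporting, since a test runner can print every failure at once.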
17 changes: 17 additions & 0 deletions format-specs/abstract_tests/TEST005.adoc
@@ -0,0 +1,17 @@

[abstract_test]
====
[%metadata]
identifier:: /conf/core/crs
target:: /req/core/crs
test-purpose:: Validate that the CRS is correctly specified.
test-method::
+
--
1. If CRS is provided, verify that the CRS is provided in https://proj.org/specifications/projjson.html[PROJJSON] format.
2. If CRS is not provided, verify that all coordinates in the geometries use longitude, latitude based on the WGS84 datum, and that the default value is https://www.opengis.net/def/crs/OGC/1.3/CRS84[OGC:CRS84] for CRS-aware implementations.
--
====
15 changes: 15 additions & 0 deletions format-specs/abstract_tests/TEST006.adoc
@@ -0,0 +1,15 @@

[abstract_test]
====
[%metadata]
identifier:: /conf/core/epoch
target:: /req/core/epoch
test-purpose:: If the crs field defines a dynamic CRS, validate that the coordinates are qualified with the epoch at which they are valid.
test-method::
+
--
1. If the crs field defines a dynamic CRS, verify that the coordinates are qualified with the epoch at which they are valid.
--
====
17 changes: 17 additions & 0 deletions format-specs/abstract_tests/TEST007.adoc
@@ -0,0 +1,17 @@

[abstract_test]
====
[%metadata]
identifier:: /conf/core/orientation
target:: /req/core/orientation
test-purpose:: Validate the winding order of polygons.
test-method::
+
--
1. Verify that all vertices of exterior polygon rings are ordered in the counterclockwise direction.
2. Verify that all interior rings are ordered in the clockwise direction.
--
====
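
For planar coordinates, one common way to check winding order is the shoelace formula: a ring's signed area is positive when its vertices run counterclockwise and negative when they run clockwise. The sketch below assumes that approach; the names `signed_area` and `has_ccw_orientation` are hypothetical, rings are assumed to be closed (first vertex repeated at the end), and a polygon is assumed to be a list of rings with the exterior ring first. It does not cover spherical edges.

```python
def signed_area(ring):
    """Signed area of a closed planar ring via the shoelace formula.

    Positive for counterclockwise vertex order, negative for clockwise.
    """
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


def has_ccw_orientation(polygon):
    """True if the exterior ring is counterclockwise and all interior rings clockwise."""
    exterior, interiors = polygon[0], polygon[1:]
    if signed_area(exterior) <= 0:
        return False
    return all(signed_area(ring) < 0 for ring in interiors)
```

Degenerate rings with zero area are reported as failing here, which is a choice of this sketch rather than something the test text prescribes.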
14 changes: 14 additions & 0 deletions format-specs/abstract_tests/TEST008.adoc
@@ -0,0 +1,14 @@

[abstract_test]
====
[%metadata]
identifier:: /conf/core/bbox
target:: /req/core/bbox
test-purpose:: Validate that the bounding boxes are constructed correctly.
test-method::
+
--
1. Verify that the bbox, if specified, is encoded with an array representing the range of values for each dimension in the geometry coordinates.
--
====
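
As an illustration of "the range of values for each dimension", the sketch below derives a bounding box from a flat sequence of coordinate tuples. The function name `compute_bbox` is hypothetical, and the output layout (all minimums first, then all maximums, e.g. `[xmin, ymin, xmax, ymax]`) is an assumption of this example that should be confirmed against the metadata tables of the specification. The sketch does not handle boxes that cross the antimeridian.

```python
def compute_bbox(coords):
    """Return per-dimension minimums followed by per-dimension maximums.

    `coords` is an iterable of equal-length tuples such as (x, y) or (x, y, z).
    """
    coords = list(coords)
    if not coords:
        raise ValueError("cannot compute a bounding box without coordinates")
    dims = len(coords[0])
    if any(len(c) != dims for c in coords):
        raise ValueError("all coordinates must have the same number of dimensions")
    mins = [min(c[i] for c in coords) for i in range(dims)]
    maxs = [max(c[i] for c in coords) for i in range(dims)]
    return mins + maxs
```

A declared bbox could then be compared against the computed one, allowing a small tolerance for floating-point differences.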
1 change: 1 addition & 0 deletions format-specs/code/README.adoc
@@ -0,0 +1 @@
Sample code may be stored in this folder, organized as you see fit.
60 changes: 60 additions & 0 deletions format-specs/document.adoc
@@ -0,0 +1,60 @@
= GeoParquet Specification
:doctype: standard
:encoding: utf-8
:lang: en
:status: draft
:committee: technical
:draft: 3.0
:external-id: http://www.opengis.net/doc/IS/geoparquet/1.0
:docnumber: 24-013
:received-date: 2029-03-30
:issued-date: 2029-03-30
:published-date: 2029-03-30
:fullname: Chris Holmes
:fullname_2: Tim Schaub
:fullname_3: Joris Van den Bossche
:fullname_4: Kyle Barron
:fullname_5: Javier de la Torre
:docsubtype: Interface
:keywords: ogcdoc, OGC document, geoparquet, parquet, columnar, cloud
:submitting-organizations: Planet; CARTO; Wherobots; Foursquare Labs
:mn-document-class: ogc
:mn-output-extensions: xml,html,doc,pdf
:local-cache-only:
:data-uri-image:
:pdf-uri: ./document.pdf
:xml-uri: ./document.xml
:doc-uri: ./document.doc
:edition: 1.0.0

////
Make sure to complete each included document
////
include::sections/clause_0_front_material.adoc[]

include::sections/clause_1_scope.adoc[]

include::sections/clause_2_conformance.adoc[]

include::sections/clause_3_references.adoc[]

include::sections/clause_4_terms_and_definitions.adoc[]

include::sections/clause_5_conventions.adoc[]

include::sections/clause_6_normative_text.adoc[]


////
add or remove annexes after "A" as necessary
////

include::sections/annex-a.adoc[]

////
Revision History should be the last annex before the Bibliography
Bibliography should be the last annex
////
include::sections/annex-history.adoc[]

include::sections/annex-bibliography.adoc[]
5 changes: 5 additions & 0 deletions format-specs/figures/README.adoc
@@ -0,0 +1,5 @@
Figures go here.

Each figure is a separate file with the naming convention:

"FIGn.xxx" where "n" is a number with leading zeroes appropriate for the total number of figures and "xxx" is the appropriate extension for the file type.
5 changes: 5 additions & 0 deletions format-specs/images/README.adoc
@@ -0,0 +1,5 @@
Image files for graphics go here. Image files for figures go in the "figures" directory. Place here only images that are not used in figures (e.g., as parts of tables, as logos, etc.)

Each graphic is a separate file with the naming convention:

"GRPn.xxx" where "n" is a sequential number with leading zeroes appropriate for the total number of graphics and "xxx" is the appropriate extension for the file type.
3 changes: 3 additions & 0 deletions format-specs/notes.txt
@@ -0,0 +1,3 @@
Confirm the target type of the Abstract Test Suite. Presumably it is the Parquet file.

Confirm the editors, submitters and contributors.
6 changes: 6 additions & 0 deletions format-specs/recommendations/recommendation001.adoc
@@ -0,0 +1,6 @@
[recommendation]
====
[%metadata]
identifier:: /rec/core/encoding
part:: The geometry encoding SHOULD be the https://portal.ogc.org/files/?artifact_id=18241[OpenGIS® Implementation Specification for Geographic information — Simple feature access — Part 1: Common architecture] WKB representation (using codes for 3D geometry types in the [1001,1007] range).
====
6 changes: 6 additions & 0 deletions format-specs/recommendations/recommendation002.adoc
@@ -0,0 +1,6 @@
[recommendation]
====
[%metadata]
identifier:: /rec/core/orientation-spherical-edges
part:: If edges is “spherical”, the orientation SHOULD always be set to counterclockwise.
====
6 changes: 6 additions & 0 deletions format-specs/recommendations/recommendation003.adoc
@@ -0,0 +1,6 @@
[recommendation]
====
[%metadata]
identifier:: /rec/core/feature-identifiers
part:: If you are using GeoParquet to serialize geospatial data with feature identifiers, you SHOULD create your own https://github.com/apache/parquet-format#metadata[file key/value metadata] to indicate the column that represents this identifier.
====
15 changes: 15 additions & 0 deletions format-specs/requirements/README.adoc
@@ -0,0 +1,15 @@
This folder contains the requirement descriptions.

Each file is a single requirement. The naming convention for these files is:

"REQn.adoc" where "n" corresponds to the requirement number. Numbers should have leading zeros appropriate for the total number of requirements in the project (e.g., the first requirement could be REQ001 if fewer than 1000 requirements are anticipated).

The requirement files are integrated into the main document as links.

The requirement is expressed according to this pattern:

NOTE: for each requirement, there should be a corresponding Abstract Test in the "abstract_tests" folder.

NOTE: sample code may reference one or more requirements and should state which requirements are included in the code by adding the following line to the Extended Description:

"#REQS: reqnum1,reqnum2,...reqnumn"
7 changes: 7 additions & 0 deletions format-specs/requirements/requirement001.adoc
@@ -0,0 +1,7 @@
[requirement]
====
[%metadata]
identifier:: /req/core/geometry-columns
part:: Geometry columns SHALL be stored using the BYTE_ARRAY parquet type.
part:: Geometries SHALL be encoded as https://en.wikipedia.org/wiki/Well-known_text_representation_of_geometry#Well-known_binary[Well Known Binary (WKB)].
====
7 changes: 7 additions & 0 deletions format-specs/requirements/requirement002.adoc
@@ -0,0 +1,7 @@
[requirement]
====
[%metadata]
identifier:: /req/core/nesting
part:: Geometry columns SHALL be at the root of the schema.
part:: A geometry SHALL NOT be a group field or nested in a group.
====
7 changes: 7 additions & 0 deletions format-specs/requirements/requirement003.adoc
@@ -0,0 +1,7 @@
[requirement]
====
[%metadata]
identifier:: /req/core/repetition
part:: The repetition for all geometry columns SHALL be “required” (exactly one) or “optional” (zero or one).
part:: A geometry column SHALL NOT be repeated.
====