From 1ad7f9d95bffca56abfe09c725e9c1aa12b2326a Mon Sep 17 00:00:00 2001 From: Chris Holmes Date: Mon, 11 Sep 2023 15:58:43 -0700 Subject: [PATCH 1/7] updated readme and version numbers --- README.md | 27 +++++++++++++++++---------- examples/example.parquet | Bin 27814 -> 27798 bytes examples/example_metadata.json | 2 +- format-specs/geoparquet.md | 2 +- format-specs/schema.json | 2 +- 5 files changed, 20 insertions(+), 13 deletions(-) diff --git a/README.md b/README.md index c67862b..371a255 100644 --- a/README.md +++ b/README.md @@ -2,9 +2,7 @@ ## About -This repository defines a [specification](https://geoparquet.org/releases/) for how to store geospatial [vector data](https://gisgeography.com/spatial-data-types-vector-raster/) (point, lines, polygons) in [Apache Parquet](https://parquet.apache.org/), a popular columnar storage format for tabular data - see [this vendor explanation](https://databricks.com/glossary/what-is-parquet) for more on what that means. Our goal is to standardize how geospatial data is represented in Parquet to further geospatial interoperability among tools using Parquet today, and hopefully help push forward what's possible with 'cloud-native geospatial' workflows. There are now more than 10 different tools and libraries in 6 different languages that support GeoParquet, you can learn more at [geoparquet.org](https://geoparquet.org). - -**Note:** This specification is currently in 1.0 'release candidate' status, which means the community is proposing the current version to be 1.0.0, and if no blocking negative feedback is made until end of August 2023 then it will become 1.0.0. This means breaking changes are still possible, but quite unlikely - see the [versioning](#versioning) section below for more info. 
+This repository defines a [specification](https://geoparquet.org/releases/) for how to store geospatial [vector data](https://gisgeography.com/spatial-data-types-vector-raster/) (point, lines, polygons) in [Apache Parquet](https://parquet.apache.org/), a popular columnar storage format for tabular data - see [this vendor explanation](https://databricks.com/glossary/what-is-parquet) for more on what that means. Our goal is to standardize how geospatial data is represented in Parquet to further geospatial interoperability among tools using Parquet today, and hopefully help push forward what's possible with 'cloud-native geospatial' workflows. There are now more than 20 different tools and libraries in 6 different languages that support GeoParquet, you can learn more at [geoparquet.org](https://geoparquet.org). Early contributors include developers from GeoPandas, GeoTrellis, OpenLayers, Vis.gl, Voltron Data, Microsoft, Carto, Azavea, Planet & Unfolded. Anyone is welcome to join the project, by building implementations, trying it out, giving feedback through issues and contributing to the spec via pull requests. @@ -12,10 +10,21 @@ Initial work started in the [geo-arrow-spec](https://github.com/geoarrow/geoarro Arrow work in a compatible way, with this specification focused solely on Parquet. We are in the process of becoming an [OGC](https://ogc.org) official [Standards Working Group](https://portal.ogc.org/files/103450) and are on the path to be a full OGC standard. -- [**Specification**](format-specs/geoparquet.md) +The latest [stable specification](https://geoparquet.org/releases/v1.0.0) and [JSON schema](https://geoparquet.org/releases/v1.0.0/schema.json) are published at [geoparquet.org/releases/](https://geoparquet.org/releases/). 
+ +The 'dev' versions of the spec are available in this repo: + +- [**Specification**](format-specs/geoparquet.md) (dev version - not stable; see [geoparquet.org/releases/](https://geoparquet.org/releases/) for the latest stable release) - [JSON Schema](format-specs/schema.json) - [Examples](examples/) +## Validating GeoParquet + +There are two tools that validate the metadata and the actual data. It is recommended to use one of them to ensure any GeoParquet you produce or are given is completely valid according to the specification: + +* **[GPQ](https://github.com/planetlabs/gpq)** - the `validate` command generates a report with `gpq validate example.parquet`. +* **[GDAL/OGR Validation Script](https://gdal.org/drivers/vector/parquet.html#validation-script)** - a python scrip that can check with `python3 validate_geoparquet.py --check-data my_geo.parquet` + ## Goals There are a few core goals driving the initial development. @@ -53,16 +62,14 @@ will work much better if it is backing a system that is constantly updating the ## Roadmap -Our aim is to get to a 1.0.0 final by the end of August 2023. The goal of 1.0.0 is to establish a baseline of interoperability for geospatial information in Parquet. For 1.0.0 -the only geometry encoding option is Well Known Binary, but we made it an option to allow other encodings. The main goal of 1.1.0 will be to incorporate a more columnar-oriented +The goal of 1.0.0 was to establish a baseline of interoperability for geospatial information in Parquet. For 1.0.0 +the only geometry encoding option is Well Known Binary, but there is an option to allow other encodings. The main goal of 1.1.0 will be to incorporate a more columnar-oriented geometry format, which is currently being worked on as part of the [GeoArrow spec](https://github.com/geoarrow/geoarrow). Once that gets finalized we will add the option to -GeoParquet. In general 1.1.0 will further explore spatial optimization, spatial indices and spatial partitioning to improve GeoParquet's performance. +GeoParquet.
In general 1.1.0 will further explore spatial optimization, spatial indices and spatial partitioning to improve GeoParquet's. ## Versioning -After we reach version 1.0 we will follow [SemVer](https://semver.org/), so at that point any breaking change will require the spec to go to 2.0.0. -Currently implementors should expect breaking changes, though at some point, hopefully relatively soon (0.4?), we will declare that we don't *think* there -will be any more potential breaking changes. Though the full commitment to that won't be made until 1.0.0. +As of version 1.0 the specification follows [Semantic Versioning](https://semver.org/), so at that point any breaking change will require the spec to go to 2.0.0. ## Current Implementations & Examples diff --git a/examples/example.parquet b/examples/example.parquet index 6550284ab14f8bdf66fdf082356913ed11b3900a..287c81fd6c57c480669cd2dbd058f756cbd25340 100644 GIT binary patch delta 1815 zcmZ`(&92)-6z(mzh>9y#?FJT9m!Abl#deamzLgMTCvof>J2%OV$97j9|HMvgCv6ft zHWGOQHWoYrO9W!YD!-vnp*YEy({}Vd)sQ$|R^tM*izqs!o+R_{$3Bge23d|m> znUi>#u_i)09>plur|n|6K=DvUu{Y4x1oz9p6cf25 zE7eEpU=zQ3K8nFcYVH8Dbn&5+sf*QIP5j|FRyV4wWeyg%jy0i1xWJMUrlh}#XNtH9 zu%D9#XZU!}xu%qpL6T8tt!Xh=Q=_~wstt^VtC(AR3Qow{${$P$uUh59mQ)g9wIeMp z8y;FmkfrDT0_}`gEWA{F?q&uU>QuH6B?R|&&6+vLoMKKd{e%d9X1mZ~r%crM%;~Le^ z!;6za{aJ8vYAigi{sO>e{bTU-)D!fWftjW|>RzvwfPQ??i;=~PO=`^zQIf(F#I++g z@A@B+Haro`t!s)~8?SRO5u-p!nD@p)Z;iUW$|&hmB|ASgOHVLEC*4QzJPCn4hjON? 
zc1Ci$G~T=RN`0a8#=;OO0c~t7d`pY$*0LdI+%D=b{q_Mgmqx}nAVJK%bg~BTIL8#L zz%n=ob8vyO+u-eN((#lTQ{dqJ#bFn@r zinw!xg!AzvQR89;;q+rW(-J>TB40@;ODXGg%1r9Hqa>w+4Y(}s7*kd=Ht%V zrzw@SrGtII&EgQ)TpQ6lJ*w@-%>P=ktwz(ot#ooPo2=?@qqh&dPnALj7NjL54TC$m zFH0{-Dz-@6IBHv3ax57ON@U~6$Nf~xgX6mc2Y6qJLZb9J7ntlc+1!q<5oK)4fs9+i zJ}0Ta_6O#YRl`Rzw;y#GfYu)8_Qeg9!9RfRvS69Q{rda(-W>o-V*O|E?%`^}HI+qs z#<8+!wY3dd6Io0t*bq#{F*TDmHts}T(f*p7;aFC09Bl*)I-%)Q6^R*I zugzg^+>s@I0_RGI8{!9W<3_o0M~EN5fp=yS8dMx*V!!ju`|-^4zWMge-gj^IUVr@Z z&L6kFI@nBKEz|Sj;P}CT_*i%>Jc_5A?+;$i-v98|+n>{Co3!7AU%vMowZDW9Z$0fM z2z3!Wde!dXySr7OyONPsiBWXxa3ap5g3NqLS;iA_ZDi+(k;eo}B8dlPhv9O2g9q*k z7glfyRgjqzb$pbAk=h1*fFI z(jfzmJ$D)FWQ3({CyFgIadHV|hWN%ncLgNldTzyhxv>WTnA_L}l z-x{A)?Vr=*J*WL__Qk$0+iLrFtn7*HPx0~1zHDd4_ykxcX{KjsQt<^;IHcMOtBr0s;Nx{yvS^wz7w3vXvQ)J@%W~DQGm;GGBIq3A@9znHg(KZU34c?)=8e0)>M> zpt89LC!Kk$yR%3p4g6iv10c>Oo|wYa>k=+*cZqdUMUqkmy0+w3tR03P*CpWB{zT;1 z1#JX}fG#sb-)?Xom+`<|1gFN**s12+&&@jUf)Yw8sgS8K$2y3YQj$Aml)D)m-r$t! zi};=meL*|VUBkUHJai*y&Lu_xln<6E&b1W;pj|}uxCp?vUw0{8Z-YbZ-Tw#Icz}0< zuYn;UR3UTK6md=j&W1w**(AZKsdyf{bLb$Yx=i2%%cCTeI-3}}=aQ*9;2pS2U9RZM z4Ku}Z!bMWpo7i*CJ!uQqL7+1nM(1$6sHoUu$!%gySz@VEgq|Y?tXt5^8YgTYLo3Jn ze1@T`uI+a$m9py2BiVtyPn2~ay>(6x&raJPlY6_d5chbpbSCl6rQLke8>KT|BnkAG zFoCLO1##BC$LAH0x|yo9370&d3;|&R#6G0CZBKd#^j9S(J!@^ljIdl$0iSfdB2>Yz z9`6XQkWRn*@PjOBRbd7r6w{?QqAP3w_61C9x;+{(0Ks%9F%}M;V82aA+Zvpl*c>hN8!xprQ5fk8l+8<~=OH%a>@xs!$=;ql iyK(118NRw$PgjqkvV0W2Y7_G52fy9EapMJY#D4(G<#ho7 diff --git a/examples/example_metadata.json b/examples/example_metadata.json index 92204dc..6043282 100644 --- a/examples/example_metadata.json +++ b/examples/example_metadata.json @@ -115,6 +115,6 @@ } }, "primary_column": "geometry", - "version": "1.0.0-dev" + "version": "1.0.0" } } \ No newline at end of file diff --git a/format-specs/geoparquet.md b/format-specs/geoparquet.md index 3837ddd..bbaa114 100644 --- a/format-specs/geoparquet.md +++ b/format-specs/geoparquet.md @@ -8,7 +8,7 @@ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 
"SHOULD", "S ## Version and schema -This is version 1.0.0-dev of the GeoParquet specification. See the [JSON Schema](schema.json) to validate metadata for this version. +This is version 1.0.0 of the GeoParquet specification. See the [JSON Schema](schema.json) to validate metadata for this version. ## Geometry columns diff --git a/format-specs/schema.json b/format-specs/schema.json index 8133a46..ad2fd94 100644 --- a/format-specs/schema.json +++ b/format-specs/schema.json @@ -7,7 +7,7 @@ "properties": { "version": { "type": "string", - "const": "1.0.0-dev" + "const": "1.0.0" }, "primary_column": { "type": "string", From 289db249efa6b0f9a56f48bfe9e01c0458be6745 Mon Sep 17 00:00:00 2001 From: Chris Holmes Date: Wed, 13 Sep 2023 07:40:12 -0700 Subject: [PATCH 2/7] Update README.md Co-authored-by: Tim Schaub --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 371a255..1b05190 100644 --- a/README.md +++ b/README.md @@ -65,7 +65,7 @@ will work much better if it is backing a system that is constantly updating the The goal of 1.0.0 was to establish a baseline of interoperability for geospatial information in Parquet. For 1.0.0 the only geometry encoding option is Well Known Binary, but there is an option to allow other encodings. The main goal of 1.1.0 will be to incorporate a more columnar-oriented geometry format, which is currently being worked on as part of the [GeoArrow spec](https://github.com/geoarrow/geoarrow). Once that gets finalized we will add the option to -GeoParquet. In general 1.1.0 will further explore spatial optimization, spatial indices and spatial partitioning to improve GeoParquet's. +GeoParquet. In general 1.1.0 will further explore spatial optimization, spatial indices and spatial partitioning to improve performance reading spatial subsets. 
## Versioning From 4d75a416f04612f68d3006fb4d4769f1e9636ffd Mon Sep 17 00:00:00 2001 From: Chris Holmes Date: Wed, 13 Sep 2023 07:40:22 -0700 Subject: [PATCH 3/7] Update README.md Co-authored-by: Tim Schaub --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 1b05190..a1aef57 100644 --- a/README.md +++ b/README.md @@ -10,7 +10,7 @@ Initial work started in the [geo-arrow-spec](https://github.com/geoarrow/geoarro Arrow work in a compatible way, with this specification focused solely on Parquet. We are in the process of becoming an [OGC](https://ogc.org) official [Standards Working Group](https://portal.ogc.org/files/103450) and are on the path to be a full OGC standard. -The latest [stable specification](https://geoparquet.org/releases/v1.0.0) and [JSON schema](https://geoparquet.org/releases/v1.0.0/schema.json) are published at [geoparquet.org/releases/](https://geoparquet.org/releases/). +The latest [stable specification](https://geoparquet.org/releases/v1.0.0/) and [JSON schema](https://geoparquet.org/releases/v1.0.0/schema.json) are published at [geoparquet.org/releases/](https://geoparquet.org/releases/). The 'dev' versions of the spec are available in this repo: From 415462dd350a8d27228190f85d26d66e5a590d95 Mon Sep 17 00:00:00 2001 From: Chris Holmes Date: Wed, 13 Sep 2023 07:40:34 -0700 Subject: [PATCH 4/7] Update README.md Co-authored-by: Tim Schaub --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index a1aef57..fb85195 100644 --- a/README.md +++ b/README.md @@ -23,7 +23,7 @@ The 'dev' versions of the spec are available in this repo: There are two tools that validate the metadata and the actual data. 
It is recommended to use one of them to ensure any GeoParquet you produce or are given is completely valid according to the specification: * **[GPQ](https://github.com/planetlabs/gpq)** - the `validate` command generates a report with `gpq validate example.parquet`. -* **[GDAL/OGR Validation Script](https://gdal.org/drivers/vector/parquet.html#validation-script)** - a python scrip that can check with `python3 validate_geoparquet.py --check-data my_geo.parquet` +* **[GDAL/OGR Validation Script](https://gdal.org/drivers/vector/parquet.html#validation-script)** - a Python script that can check compliance with `python3 validate_geoparquet.py --check-data my_geo.parquet` ## Goals From 5e001d6509b2c82bb85613cca59256258f581bd3 Mon Sep 17 00:00:00 2001 From: Chris Holmes Date: Wed, 13 Sep 2023 07:44:50 -0700 Subject: [PATCH 5/7] removed validator, as discussed in #184 --- validator/.gitignore | 160 ------------------ validator/README.md | 7 - validator/python/README.md | 40 ----- .../python/geoparquet_validator/__init__.py | 92 ---------- .../python/geoparquet_validator/schema.json | 1 - validator/python/setup.py | 28 --- 6 files changed, 328 deletions(-) delete mode 100644 validator/.gitignore delete mode 100644 validator/README.md delete mode 100644 validator/python/README.md delete mode 100755 validator/python/geoparquet_validator/__init__.py delete mode 120000 validator/python/geoparquet_validator/schema.json delete mode 100644 validator/python/setup.py diff --git a/validator/.gitignore b/validator/.gitignore deleted file mode 100644 index 68bc17f..0000000 --- a/validator/.gitignore +++ /dev/null @@ -1,160 +0,0 @@ -# Byte-compiled / optimized / DLL files -__pycache__/ -*.py[cod] -*$py.class - -# C extensions -*.so - -# Distribution / packaging -.Python -build/ -develop-eggs/ -dist/ -downloads/ -eggs/ -.eggs/ -lib/ -lib64/ -parts/ -sdist/ -var/ -wheels/ -share/python-wheels/ -*.egg-info/ -.installed.cfg -*.egg -MANIFEST - -# PyInstaller -# Usually these files are 
written by a python script from a template -# before PyInstaller builds the exe, so as to inject date/other infos into it. -*.manifest -*.spec - -# Installer logs -pip-log.txt -pip-delete-this-directory.txt - -# Unit test / coverage reports -htmlcov/ -.tox/ -.nox/ -.coverage -.coverage.* -.cache -nosetests.xml -coverage.xml -*.cover -*.py,cover -.hypothesis/ -.pytest_cache/ -cover/ - -# Translations -*.mo -*.pot - -# Django stuff: -*.log -local_settings.py -db.sqlite3 -db.sqlite3-journal - -# Flask stuff: -instance/ -.webassets-cache - -# Scrapy stuff: -.scrapy - -# Sphinx documentation -docs/_build/ - -# PyBuilder -.pybuilder/ -target/ - -# Jupyter Notebook -.ipynb_checkpoints - -# IPython -profile_default/ -ipython_config.py - -# pyenv -# For a library or package, you might want to ignore these files since the code is -# intended to run in multiple environments; otherwise, check them in: -# .python-version - -# pipenv -# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. -# However, in case of collaboration, if having platform-specific dependencies or dependencies -# having no cross-platform support, pipenv may install dependencies that don't work, or not -# install all needed dependencies. -#Pipfile.lock - -# poetry -# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control. -# This is especially recommended for binary packages to ensure reproducibility, and is more -# commonly ignored for libraries. -# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control -#poetry.lock - -# pdm -# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control. -#pdm.lock -# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it -# in version control. -# https://pdm.fming.dev/#use-with-ide -.pdm.toml - -# PEP 582; used by e.g. 
github.com/David-OConnor/pyflow and github.com/pdm-project/pdm -__pypackages__/ - -# Celery stuff -celerybeat-schedule -celerybeat.pid - -# SageMath parsed files -*.sage.py - -# Environments -.env -.venv -env/ -venv/ -ENV/ -env.bak/ -venv.bak/ - -# Spyder project settings -.spyderproject -.spyproject - -# Rope project settings -.ropeproject - -# mkdocs documentation -/site - -# mypy -.mypy_cache/ -.dmypy.json -dmypy.json - -# Pyre type checker -.pyre/ - -# pytype static type analyzer -.pytype/ - -# Cython debug symbols -cython_debug/ - -# PyCharm -# JetBrains specific template is maintained in a separate JetBrains.gitignore that can -# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore -# and can be added to the global gitignore or merged into this file. For a more nuclear -# option (not recommended) you can uncomment the following to ignore the entire idea folder. -#.idea/ diff --git a/validator/README.md b/validator/README.md deleted file mode 100644 index 092746d..0000000 --- a/validator/README.md +++ /dev/null @@ -1,7 +0,0 @@ -# GeoParquet validator - -Command-line tools to validate a GeoParquet file. Using [JSON Schema](https://json-schema.org/). - -## Flavors - -- [GeoParquet validator - Python](./python) diff --git a/validator/python/README.md b/validator/python/README.md deleted file mode 100644 index 7a40cc6..0000000 --- a/validator/python/README.md +++ /dev/null @@ -1,40 +0,0 @@ -# GeoParquet validator - Python - -Command-line tool to validate a GeoParquet file. Written in Python. Using [JSON Schema](https://json-schema.org/). - -## Installation - -``` -pip install --no-binary geoparquet_validator . -``` - -**Update** - -``` -pip install --no-binary geoparquet_validator -U . -``` - -**Development** - -``` -pip install -e . 
-``` - -**Uninstall** - -``` -pip uninstall geoparquet_validator -``` - -## Usage - -``` -geoparquet_validator ../../examples/example.parquet -geoparquet_validator https://storage.googleapis.com/open-geodata/linz-examples/nz-buildings-outlines.parquet -``` - -The validator also supports remote files. - -- `http://` or `https://`: no further configuration is needed. -- `s3://`: `s3fs` needs to be installed (run `pip install .[s3]`) and you may need to set environment variables. Refer [here](https://s3fs.readthedocs.io/en/latest/#credentials) for how to define credentials. -- `gs://`: `gcsfs` needs to be installed (run `pip install .[gcs]`). By default, `gcsfs` will attempt to use your default gcloud credentials or, attempt to get credentials from the google metadata service, or fall back to anonymous access. diff --git a/validator/python/geoparquet_validator/__init__.py b/validator/python/geoparquet_validator/__init__.py deleted file mode 100755 index b19103d..0000000 --- a/validator/python/geoparquet_validator/__init__.py +++ /dev/null @@ -1,92 +0,0 @@ -import json -import click -import pyarrow.parquet as pq - -from pprint import pprint -from urllib.parse import urlparse -from importlib_resources import files -from jsonschema.validators import Draft7Validator -from pyarrow.fs import FSSpecHandler, PyFileSystem -from fsspec import AbstractFileSystem -from fsspec.implementations.http import HTTPFileSystem -from fsspec.implementations.local import LocalFileSystem - - -def choose_fsspec_fs(url_or_path: str) -> AbstractFileSystem: - """Choose fsspec filesystem by sniffing input url""" - parsed = urlparse(url_or_path) - - if parsed.scheme.startswith("http"): - return HTTPFileSystem() - - if parsed.scheme == "s3": - from s3fs import S3FileSystem - - return S3FileSystem() - - if parsed.scheme == "gs": - from gcsfs import GCSFileSystem - - return GCSFileSystem() - - # TODO: Add Azure - return LocalFileSystem() - - -def load_parquet_schema(url_or_path: str) -> 
pq.ParquetSchema: - """Load schema from local or remote Parquet file""" - fsspec_fs = choose_fsspec_fs(url_or_path) - pyarrow_fs = PyFileSystem(FSSpecHandler(fsspec_fs)) - return pq.read_schema(pyarrow_fs.open_input_file(url_or_path)) - - -def log(text: str, status="info"): - status_color = { - "info": "white", - "warning": "yellow", - "error": "red", - "success": "green"} - click.echo(click.style(text, fg=status_color[status])) - - -@click.command() -@click.argument("input_file") -def main(input_file): - schema_source = files("geoparquet_validator").joinpath("schema.json") - schema = json.loads(schema_source.read_text()) - - parquet_schema = load_parquet_schema(input_file) - - if b"geo" not in parquet_schema.metadata: - log("Parquet file schema does not have 'geo' key", "error") - exit(1) - - metadata = json.loads(parquet_schema.metadata[b"geo"]) - log("Metadata loaded from file:") - pprint(metadata) - - valid = True - log("Validating file...") - - errors = Draft7Validator(schema).iter_errors(metadata) - - for error in errors: - valid = False - log(f" - {error.json_path}: {error.message}", "warning") - if "description" in error.schema: - log(f" \"{error.schema['description']}\"", "warning") - - # Extra errors - if (metadata["primary_column"] not in metadata["columns"]): - valid = False - log("- $.primary_column: must be in $.columns", "warning") - - if valid: - log("This is a valid GeoParquet file.\n", "success") - else: - log("This is an invalid GeoParquet file.\n", "error") - exit(1) - - -if __name__ == "__main__": - main() diff --git a/validator/python/geoparquet_validator/schema.json b/validator/python/geoparquet_validator/schema.json deleted file mode 120000 index b667c23..0000000 --- a/validator/python/geoparquet_validator/schema.json +++ /dev/null @@ -1 +0,0 @@ -../../../format-specs/schema.json \ No newline at end of file diff --git a/validator/python/setup.py b/validator/python/setup.py deleted file mode 100644 index 330a16e..0000000 --- 
a/validator/python/setup.py +++ /dev/null @@ -1,28 +0,0 @@ -from setuptools import setup, find_packages - -setup( - name="geoparquet_validator", - version="0.0.1", - install_requires=[ - "jsonschema>=4.4", - "pyarrow>=7.0", - "fsspec>=2022.3", - "requests>=2.27", - "aiohttp>=3.8", - "click>=8.1", - "colorama>=0.4" - ], - extras_require={ - "s3": ["s3fs"], - "gcs": ["gcsfs"] - }, - packages=find_packages(), - package_data={ - "geoparquet_validator": ["schema.json"] - }, - entry_points={ - "console_scripts": [ - "geoparquet_validator=geoparquet_validator:main" - ] - } -) From 53d90c6423af82c3d12273ff4d8083dac7453948 Mon Sep 17 00:00:00 2001 From: Chris Holmes Date: Wed, 13 Sep 2023 19:15:44 -0700 Subject: [PATCH 6/7] removed validator job since the validator was removed --- .github/workflows/scripts.yml | 21 --------------------- 1 file changed, 21 deletions(-) diff --git a/.github/workflows/scripts.yml b/.github/workflows/scripts.yml index e8649f2..c694152 100644 --- a/.github/workflows/scripts.yml +++ b/.github/workflows/scripts.yml @@ -7,27 +7,6 @@ on: pull_request: jobs: - validate-examples: - runs-on: ubuntu-latest - steps: - - uses: actions/checkout@v2 - - - name: Set up Python 3.8 - uses: actions/setup-python@v2 - with: - python-version: 3.8 - - - name: Install validator - run: | - cd validator/python - python -m pip install --no-binary geoparquet_validator . 
- - - name: Run validator - run: | - for example in $(ls examples/*.parquet); do - echo $example; - geoparquet_validator $example || exit 1; - done test-json-metadata: runs-on: ubuntu-latest From af3301e41ab237e3b94e5db005a821cbad8cf521 Mon Sep 17 00:00:00 2001 From: Chris Holmes Date: Mon, 18 Sep 2023 07:03:03 -0700 Subject: [PATCH 7/7] Develop on 1.1.0-dev --- examples/example.parquet | Bin 27798 -> 27814 bytes examples/example_metadata.json | 2 +- format-specs/geoparquet.md | 2 +- format-specs/schema.json | 2 +- 4 files changed, 3 insertions(+), 3 deletions(-) diff --git a/examples/example.parquet b/examples/example.parquet index 287c81fd6c57c480669cd2dbd058f756cbd25340..481e66b38952aac5ee9e2737bc454da3fc1c2794 100644 GIT binary patch delta 1910 zcmZuy&u`pB6mC$8D5^M7sV7=Rl@Ny%|H!s=P?2W6Yp>n4({(mp+dX;gwb$OUv&pXQ zwK?pKJF>)|z_}9QhB$KPM!9iEh(CY>Z^j7?DvmO--+c3aeDl4RZ{O^F_h#?)N3ZVx zardi(a(=l^&hvxghX?#){xNqL&dcu)UPbTU`SsrC^x4MkH~yFJy+G|R{-e9k`Y}R% z1dm#^d-(otJ5JHYQ*A*@s#weWwLjw*K~5r9kk;XhFSPVL*0PXbfhXZe?=f6vmUv`u zaBd_gN&o48_5nYyb%Fuosig2~?TIGq&MBuQV;Lo$V3&Aib2LUc+L6k0m=mm#J~$;g z7LFKj>e%a0B@--cd!8T_8sXO9C=ITICRXDL3xa1uNCGAP$l`*L@nl>}Rgg&+<4mf( zTDrn0%~*7+aq5=ANH6_t)j*sg*a_yj2NjPy0}C9EAgNQ9^~jq^d5|fUuab1a>(j3K z-U83gSWue_r`jOqG4G)X5PM@OZ%eHpw7Y@o#n`EjBASsnB#q8 zdRDc6&X4!3_Os}VeOEN)_U};IqL@P#|$JM5A+ zfki-*sitmAoQ11!WG_J~Yi%d%3pdki&+%4J%8H7ZigK!ga4993wF)vjg~KIIn6`-P zn9vin>+AyVmEoZiL2owEa-e*)PH?7dZfJ4ofp53&Q@Y*;huFLS53Y0p-VI&=L!2)| z=CaP?jBp(r7V$)#c&EDLIPA`$d!*}TIKkp1_Jv*p^Zaz5n<@4 z8~Z&&rmWhFK(t`*W2x|jx6bL|=(PPY{$w}h!aT8l6Y!Jm_TK{ zf;ffibh8RbE$1?A!gZHVihwW)HA=n5NveF@W=Zcir6IHE?xUfG*WS4i}F5HsU2g+tPi4tn_R zqXn}r>B^=&dcq7Z=;{epfI5ZH7QNKbOL(%1oE;4GrUOw2dw0*}c9bwpj zbxp%u{F~+(%Z-6=Jt zFzM<5uL`1Rx-?x#n~`=+P9JTl1!{q(tDC+dMq!{zP()zT@&5oIb9D>= delta 1815 zcmZ`(&92)-6z(mzh>9y#?FJT9m!Abl#deamzLgMTCvof>J2%OV$97j9|HMvgCv6ft zHWGOQHWoYrO9W!YD!-vnp*YEy({}Vd)sQ$|R^tM*izqs!o+R_{$3Bge23d|m> znUi>#u_i)09>plur|n|6K=DvUu{Y4x1oz9p6cf25 
zE7eEpU=zQ3K8nFcYVH8Dbn&5+sf*QIP5j|FRyV4wWeyg%jy0i1xWJMUrlh}#XNtH9 zu%D9#XZU!}xu%qpL6T8tt!Xh=Q=_~wstt^VtC(AR3Qow{${$P$uUh59mQ)g9wIeMp z8y;FmkfrDT0_}`gEWA{F?q&uU>QuH6B?R|&&6+vLoMKKd{e%d9X1mZ~r%crM%;~Le^ z!;6za{aJ8vYAigi{sO>e{bTU-)D!fWftjW|>RzvwfPQ??i;=~PO=`^zQIf(F#I++g z@A@B+Haro`t!s)~8?SRO5u-p!nD@p)Z;iUW$|&hmB|ASgOHVLEC*4QzJPCn4hjON? zc1Ci$G~T=RN`0a8#=;OO0c~t7d`pY$*0LdI+%D=b{q_Mgmqx}nAVJK%bg~BTIL8#L zz%n=ob8vyO+u-eN((#lTQ{dqJ#bFn@r zinw!xg!AzvQR89;;q+rW(-J>TB40@;ODXGg%1r9Hqa>w+4Y(}s7*kd=Ht%V zrzw@SrGtII&EgQ)TpQ6lJ*w@-%>P=ktwz(ot#ooPo2=?@qqh&dPnALj7NjL54TC$m zFH0{-Dz-@6IBHv3ax57ON@U~6$Nf~xgX6mc2Y6qJLZb9J7ntlc+1!q<5oK)4fs9+i zJ}0Ta_6O#YRl`Rzw;y#GfYu)8_Qeg9!9RfRvS69Q{rda(-W>o-V*O|E?%`^}HI+qs z#<8+!wY3dd6Io0t*bq#{F*TDmHts}T(
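The roadmap in these patches notes that Well Known Binary is the only geometry encoding option in 1.0.0. As a rough illustration of what a WKB value stored in a GeoParquet geometry column looks like, here is a minimal encoder/decoder for 2D points using only the standard library. The geometry-type code 1 for Point and the byte-order flag come from the WKB standard, not from this patch series, and the function names are invented for the example:

```python
import struct

# WKB layout for a 2D point: 1-byte byte-order flag, uint32 geometry type,
# then two float64 coordinates.
WKB_POINT = 1  # geometry type code for Point in the WKB standard

def encode_point(x: float, y: float) -> bytes:
    """Encode a 2D point as little-endian WKB (flag byte 1 = little-endian)."""
    return struct.pack("<BIdd", 1, WKB_POINT, x, y)

def decode_point(wkb: bytes) -> tuple:
    """Decode a little-endian WKB point back to (x, y)."""
    byte_order, geom_type, x, y = struct.unpack("<BIdd", wkb)
    if byte_order != 1 or geom_type != WKB_POINT:
        raise ValueError("not a little-endian WKB point")
    return (x, y)

wkb = encode_point(-122.4, 37.8)
print(len(wkb))           # 21 bytes: 1 + 4 + 8 + 8
print(decode_point(wkb))  # → (-122.4, 37.8)
```

Real implementations would of course use an existing WKB library (e.g. Shapely, GEOS, or GDAL) rather than hand-rolling this; the sketch only shows why WKB was a pragmatic baseline encoding before the more columnar GeoArrow option planned for 1.1.0.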