Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce bounding box column definition #191
Introduce bounding box column definition #191
Changes from 1 commit
0882e85
e74af7d
8790bb4
388f743
7a02a39
bf95d7d
6b3fc80
8bfb5b7
44097bf
e0b3d47
40ebb37
4a3c59c
c540208
f3edcaf
eb6d3e0
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am confused about the semantics in case of spherical geometries.
"range" suggests to me that xmin should be always <= xmax, but this is not true in the spherical case, right? How to represent a bbox that crosses the 180.0° line of longitude or contains a pole? Or such a bbox cannot be represented?
It would be nice to add some guidance/warning about interpreting in the spherical case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the bounding box at the file-level metadata (https://github.com/opengeospatial/geoparquet/blob/main/format-specs/geoparquet.md#bbox), we refer to the GeoJSON spec:
Would that work here as well / provide sufficient information?
Of course, in contrast with the bbox metadata for the full file which is just a JSON array of 4 numbers, here we need to give the numbers explicit field names. The current proposal uses "xmin", "xmax", etc, which is not ideal for the geographical case.
(in general a simple bbox might not be ideal for geographical data anyway, and the current proposal leaves it open to add other "covering" types later)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, @csringhofer.
I agree with @jorisvandenbossche that we'd just defer to how GeoJSON defines bbox for anti-meridian crossings. (There's another discussion about also adopting GeoJSON's recommendation to split geometries at the anti-meridian but that's probably further out).
I suppose we could rename the bbox fields to "south", "west", etc. but it feels off to me and I think xmin, xmax is still the right name with the caveats Joris listed. It's also worth mentioning that naive queries against the bbox won't be effective for anti-meridian crossings, including row group filtering optimizations. But I think engines with specific knowledge of geospatial data could handle anti-meridian crossings appropriately including row group filtering
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this sentence to the docs:
As with the top-level [
bbox](#bbox) column, the values follow the GeoJSON specification (RFC 7946, section 5), which also describes how to represent the bbox for geometries that cross the antimeridian.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume that would require the geometries crossing the anti-meridian are in dedicated row groups. If you start mixing geometries crossing the A.M. and geometries not crossing it, then the min(minx), min(miny), max(maxx), max(maxy) statistics aren't going to make any sense.
e.g if you have a geometry [-10,-10,10,10] and a [170,-10,-170,10] (crossing A-M) in a single row group, then the row group stats are going to be [-10,-10,10,10] and thus the geometry crossing A-M will not be selected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, yes, that's a good point. Silently not selecting a row if you are not aware of this, doesn't sound good.
Shall we for now just say that this feature (bbox column) doesn't support A-M crossing geometries, and thus cannot be used for data including such geometries?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, @rouault. We discussed this at the last GeoParquet meeting and decided that we'll just say that antimeridian crossings aren't supported for now. I'll also make an issue to track that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created #198 to continue that discussion.