Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better extract #565

Merged
merged 17 commits into from
Dec 2, 2023
Merged

Better extract #565

merged 17 commits into from
Dec 2, 2023

Conversation

rafaqz
Copy link
Owner

@rafaqz rafaqz commented Nov 24, 2023

This PR mostly adds new keywords to extract

@tiemvanderdeure feel like reviewing this thing?

I made it return a vector - this isn't breaking as its still an iterable and we never promised the exact type or iterable.

I took the opportunity to polish the (pretty basic) implementation a little. As well as the new keywords, extract now properly handles flattening multiple polygons/feature collections/nested vectors of geoms so you always get a single "table" return value, and handles empty or all-missing inputs. The only known bug is mixed points and geometries in one vector or will fail. This is noted in the doc, and its kinda rare and not worth the effort right now.

Also once I squash the commits together you wont just be on that one empty commit but the whole thing, its just to get your name on the list.

@codecov-commenter
Copy link

codecov-commenter commented Nov 24, 2023

Codecov Report

Attention: 17 lines in your changes are missing coverage. Please review.

Comparison is base (ffa56ec) 80.61% compared to head (d9570c4) 80.60%.
Report is 4 commits behind head on main.

❗ Current head d9570c4 differs from pull request most recent head 3f27f18. Consider uploading reports for the commit 3f27f18 to get more accurate results

Files Patch % Lines
src/methods/extract.jl 76.05% 17 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #565      +/-   ##
==========================================
- Coverage   80.61%   80.60%   -0.02%     
==========================================
  Files          57       57              
  Lines        4127     4212      +85     
==========================================
+ Hits         3327     3395      +68     
- Misses        800      817      +17     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@rafaqz
Copy link
Owner Author

rafaqz commented Nov 25, 2023

@visr have you seen this anywhere else? we are getting failures in GDAL.jl gdalsetprojection only on macosx:

 Got exception outside of a @test
  GDALError (CE_Failure, code 1):
  	Invalid TOWGS84 node

See here: https://github.com/rafaqz/Rasters.jl/actions/runs/6697436007/job/18560020921#step:5:949

Linux tests are passing fine.

@visr
Copy link

visr commented Nov 25, 2023

Odd that this only happens on MacOS. I haven't seen this error before.

Comment on lines 39 to 41
# use `extract` to get values for all layers at each observation point.
# We `collect` to get a `Vector` from the lazy iterator.
collect(extract(st, pnts))
collect(extract(st, pnts; skipmissing=true))
Copy link
Contributor

@tiemvanderdeure tiemvanderdeure Nov 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No point in using collect on a Vector, right?

Suggested change
# use `extract` to get values for all layers at each observation point.
# We `collect` to get a `Vector` from the lazy iterator.
collect(extract(st, pnts))
collect(extract(st, pnts; skipmissing=true))
# use `extract` to get values for all layers at each observation point.
extract(st, pnts; skipmissing=true)

@tiemvanderdeure
Copy link
Contributor

Handling missings seems to work wonderfully when giving missing, but behaves unpredictably with NamedTuples containing missing.

E.g.

using Rasters
myrast = Raster(rand(X(10), Y(10)))
extract(myrast, ((X = missing , Y = missing), (X= 1, Y = 10))) # work but return Vector{Any}
extract(myrast, ((X= 1, Y = 10), (X = missing , Y = missing))) # errors
extract(myrast, ((X = missing , Y = 10),)) # Gives a StackOverflowError!

Or is this also what you mean by mixed geometries? Any point data that comes from a table and has some missing coordinates will look like this.

Maybe a solution could be to use a function that also skips geoms that contain missing instead of Base.skipmissing? It should be reasonable to treat something like (X = missing, Y = missing) in the exact same way as missing

I'll have time to look at this more tomorrow.

@rafaqz
Copy link
Owner Author

rafaqz commented Nov 25, 2023

We should force the return type to be at least Vector{NamedTuple{keys,Any}}

We are not actually using Base.skipmissing, but filter doing pretty much what you suggest whenskipmissing=true is passed in. Otherwise I think people can just filter them later as a dataframe however they like.

@rafaqz
Copy link
Owner Author

rafaqz commented Nov 25, 2023

@tiemvanderdeure handling this isn't possible because your points with missing coordinates are not valid GeoInterface geometries:

extract(myrast, ((X = missing , Y = missing), (X= 1, Y = 10)))

And to throw a better error there we need to disallow arbitrary objects like you tuple of points, its limited to an AbstractArray now

test/methods.jl Outdated Show resolved Hide resolved
@rafaqz rafaqz merged commit 19a53cc into main Dec 2, 2023
10 of 11 checks passed
@rafaqz rafaqz deleted the better_extract branch December 2, 2023 09:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants