-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Line and Polygon extract
optimisation
#824
base: main
Are you sure you want to change the base?
Conversation
Blazing fast and so clean! Don't understand all of the code yet, but I tried a little example just to profile and look at the code. It does seem like it's still allocating somewhere. using Rasters, GeoInterface, Rasters.Lookups
ras = rand(X(Sampled(1.0:90.0, sampling = Intervals(End()))), Y(Sampled(1.0:90.0, sampling = Intervals(End())))) |> Raster
linestrings = [GeoInterface.LineString([(rand(1.0:90.0), rand(1.0:90.0)), (rand(1.0:90), rand(1:90.0))]) for _ in 1:20_000]
@time extract(ras, linestrings);
20 thousand linestrings extracted in 0.1 second is still really really good |
In the example above Another minor comment is that I think progress could be false by default. For 99% of users extracting will just take a fraction of a second. |
I think the return types will be a union because the options change it? The top level isn't type stable. And the progress won't show at all if it's fast enough? I put it by default as you only usually know after you try something and see how slow it is. For the allocations it has to allocate all the vectors as it goes once it knows the Manhattan line distance of each line. But it does seem like a few too many allocations still. It can be at minimum ~20k in this example. My next idea is to use this line burning in |
You're right.
You're right again. The exact example I was trying was giving me a I understand each linestring will need its own vector allocation even if it's flattened after. Still seems there is a little too much red here for this to be quite right? |
Yeah it seems a bit much. There is some allocation because the union is too complicated and indexing needs a |
From here on out it's just perfectionism :) |
"Profile golf" |
This PR adds a number of optimizations to make line and polygon extract faster.
RasterStack
in combination with fixes in DDShould end up an order of magnitude better for rasters with threading, and two orders for larger stacks that had type instability.
This will need the latest DD to pass.
Also adds a few more keywords like
flatten
that allows you to opt-out of flattening extracted points to a single vector withflatten=false
.@tiemvanderdeure if you have time to have a look that would help, also the optimisations may be interesting!
(Proably will update DD properly elsewhere, but its needed here)