Lazy rectilinear interpolator #6084
Nice to see this progressing @fnattino! Did you notice that CI is failing on this pull request?
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

    @@           Coverage Diff           @@
    ##             main    #6084   +/-   ##
    =======================================
      Coverage   89.82%   89.83%
    =======================================
      Files          88       88
      Lines       23150    23180   +30
      Branches     5043     5043
    =======================================
    + Hits        20794    20823   +29
    + Misses       1624     1622    -2
    - Partials      732      735    +3
@fnattino just want to reassure you that I have been looking at this, but since I have never worked with our interpolators before it is slow progress. Having another go this afternoon with help from some coffee ☕
No worries @trexfeathers, but thanks for the heads-up! :)
Hi @fnattino, thank you for your hard work on this.
Here is a partial review. I have left myself a couple of TODO comments. But the suggestions I have already might take some time, and may change the code significantly - mainly #6084 (comment) - so it seemed important to get these suggestions to you as soon as possible.
Also thank you to @HarriKeenan for helping me with this review last week 🤜🤛
    def _interpolated_dtype(dtype, method):
        """Determine the minimum base dtype required by the underlying interpolator."""
        if method == "nearest":
            result = dtype
        else:
            result = np.result_type(_DEFAULT_DTYPE, dtype)
        return result
If this needs to stay (see my other comment about `args=[self]` - #6084 (comment)), then I'd be interested in us unifying this function with `RectilinearInterpolator._interpolated_dtype()`.
Do you mean to put this back as a (static)method of the `RectilinearInterpolator`? Or to merge the body of this function with `RectilinearInterpolator._interpolate`?
I mean find any appropriate way for there to be only 1 `_interpolated_dtype()` in this file, which can be used by both:
- `RectilinearInterpolator._interpolate()`
- `RectilinearInterpolator._points()`

Up to you how this is achieved.
Co-authored-by: Martin Yeo <[email protected]>
Co-authored-by: Martin Yeo <[email protected]>
…iris into lazy-rectilinearinterpolator-2
@trexfeathers thanks a lot for the review and apologies for the scattered response.
As I have tried to explain in reply to your comments, I am a bit hesitant to implement the solution that would copy the current instance of the `RectilinearInterpolator`, but I am very curious to hear your thoughts on this!
OK I got my head back in the space and I understand your logic better now. Nearly there.
If you're busy with other things please let me know and I can try actioning the remaining stuff myself.
I tried writing a benchmark to demonstrate the benefits of this but everything either stayed the same or got slower.
Could you share with me the results you have seen? Especially if it's something I/we could turn into a benchmark. Thanks
Clicked the wrong button with my previous review!
Thanks a lot for the effort of getting back to this @trexfeathers! I should have time to get back to this next week, but I will keep you posted!
🚀 Pull Request
Description
A different take on enabling the rectilinear interpolator to run lazily (#6002).

Trying to address the same issue as #6006, but there I made the underlying `_RegularGridInterpolator` from `_scipy_interpolate` run on lazy data, which required switching from `scipy.sparse` to `sparse` (not ideal, since it would add numba as a dependency).

Here I have instead tried to implement an approach similar to the one used for regridding, which also works on lazy data. The downside is that the chunks in the dimensions we are interpolating over need to be merged, but at least we can run the interpolation in parallel over the chunks in the other dimensions (and we do not need to add extra dependencies to iris).
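The chunking trade-off described above can be illustrated with a small NumPy-only sketch. This is not the PR's implementation: a plain list of array blocks stands in for dask chunks, `np.interp` stands in for the real interpolator, and the helper name `interp_along_axis0` is hypothetical. The interpolated dimension (axis 0) is kept whole inside every block, while the other dimension is split, so each block can be processed independently.

```python
import numpy as np


def interp_along_axis0(block, src_x, tgt_x):
    """Linearly interpolate a 2-D block along axis 0, one column at a time."""
    out = np.empty((len(tgt_x), block.shape[1]), dtype=np.float64)
    for j in range(block.shape[1]):
        out[:, j] = np.interp(tgt_x, src_x, block[:, j])
    return out


src_x = np.array([0.0, 1.0, 2.0, 3.0])  # source coordinate on axis 0
tgt_x = np.array([0.5, 1.5, 2.5])       # target sample points
data = np.arange(24, dtype=np.float64).reshape(4, 6)

# Split only the non-interpolated dimension (axis 1) into 3 blocks.
# Axis 0 stays as a single "chunk" in every block.
blocks = np.split(data, 3, axis=1)

# Each block is independent of the others, so this loop could run in parallel.
chunked = np.concatenate(
    [interp_along_axis0(b, src_x, tgt_x) for b in blocks], axis=1
)

# Reference: interpolate the whole array in one go.
whole = interp_along_axis0(data, src_x, tgt_x)
```

Here `chunked` and `whole` agree, which is the property that allows the work to be distributed over chunks in the non-interpolated dimensions. Splitting along axis 0 instead would not work in general, because a target point may need source values that live in a neighbouring block.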