Add fast path for ArgMin / ArgMax when axis is contiguous #411

Merged: 3 commits, Nov 16, 2024

Conversation

robertknight
Owner

Add a fast path for the common case where the min/max index is taken along a contiguous (stride-1) axis.

  • Add Lane::as_slice method for getting the remainder of a lane yielded by tensor.lanes(axis), if the lane has stride 1
  • Add fast path in ArgMin/ArgMax which leverages Lane::as_slice
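The two changes above could fit together roughly as in the following sketch. It is not the code from this PR: the `Lane` struct here is a hypothetical stand-in with made-up fields (`data`, `stride`, `len`), and `arg_max` / `arg_max_of` are illustrative helpers. Only the idea that `as_slice` succeeds when the lane has stride 1 comes from the description. NaN handling is deliberately left out.

```rust
/// Hypothetical stand-in for a lane (a 1D view along one axis of a tensor).
/// The field names are assumptions for illustration, not the library's type.
struct Lane<'a, T> {
    data: &'a [T], // storage, starting at the lane's first element
    stride: usize, // distance in elements between consecutive lane items
    len: usize,    // number of items in the lane
}

impl<'a, T: Copy> Lane<'a, T> {
    /// Return the lane as a slice if its items are adjacent in memory.
    fn as_slice(&self) -> Option<&'a [T]> {
        if self.stride == 1 {
            self.data.get(..self.len)
        } else {
            None
        }
    }

    /// Generic strided iteration, used when the lane is not contiguous.
    fn iter(&self) -> impl Iterator<Item = T> + '_ {
        (0..self.len).map(move |i| self.data[i * self.stride])
    }
}

/// Index of the first maximum, or `None` for an empty sequence.
/// NaN handling is not modeled in this sketch.
fn arg_max_of<I: Iterator<Item = f32>>(items: I) -> Option<usize> {
    let mut best: Option<(usize, f32)> = None;
    for (i, x) in items.enumerate() {
        if best.map_or(true, |(_, b)| x > b) {
            best = Some((i, x));
        }
    }
    best.map(|(i, _)| i)
}

/// ArgMax over one lane, preferring the contiguous fast path.
fn arg_max(lane: &Lane<f32>) -> Option<usize> {
    match lane.as_slice() {
        // Fast path: a plain slice iterator, with no per-item stride arithmetic.
        Some(slice) => arg_max_of(slice.iter().copied()),
        // Slow path: strided access for non-contiguous lanes.
        None => arg_max_of(lane.iter()),
    }
}
```

For a row-major 2x3 matrix stored as `[1, 5, 2, 7, 0, 3]`, a row is a stride-1 lane and takes the fast path, while a column has stride 3 and falls back to strided iteration. Both paths return the same index for the same logical data.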

Add a method to get contiguous lanes of a tensor as a slice. This is useful for implementing fast paths for the case where the lane is contiguous.

Add an `IsNaN` trait which calls `f32::is_nan` to test for NaN-ness instead of testing whether `self.partial_cmp(self)` is None.
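
As a rough illustration of the `IsNaN` change, the sketch below contrasts the two ways of testing for NaN. The trait's method name, its signature, and the `i32` impl are assumptions made for this example; the commit message only states that the trait calls `f32::is_nan` in place of the `partial_cmp` check.

```rust
use std::cmp::PartialOrd;

/// Hypothetical shape of an `IsNaN` trait; the actual definition may differ.
trait IsNaN {
    fn is_nan(&self) -> bool;
}

impl IsNaN for f32 {
    fn is_nan(&self) -> bool {
        // Delegate to the inherent float method.
        f32::is_nan(*self)
    }
}

/// Assumed impl for an integer type, which can never hold a NaN.
impl IsNaN for i32 {
    fn is_nan(&self) -> bool {
        false
    }
}

/// The generic test being replaced: a value that is not comparable
/// with itself is treated as NaN.
fn is_nan_via_partial_cmp<T: PartialOrd>(x: &T) -> bool {
    x.partial_cmp(x).is_none()
}
```

Both tests agree for `f32`. The trait version states the intent directly and lets types without a NaN value return a constant `false`.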
@robertknight robertknight merged commit cbcfa9f into main Nov 16, 2024
2 checks passed
@robertknight robertknight deleted the arg-min-max-fast-path branch November 16, 2024 08:43