GH-37484: [Python] Add a FixedSizeTensorScalar class #37533
Conversation
Would the numpy array API or https://data-apis.org/array-api/latest/purpose_and_scope.html add any value here?
@alippai This PR would effectively implement a …
Thanks for working on this! Added two suggestions, otherwise the Python part LGTM.
I am very much hoping we could implement DLPack in Arrow: #33984. Especially for the new tensor arrays, it would be very beneficial!
+1
One more thing, can the change in the C++ code (…)
Didn't yet look in detail, but added some quick drive-by comments. And thanks for working on this!
Can you also add some tests for the new Scalar class?
Currently, for the Python bindings, you added a `get_tensor(i)` method on the array class, but wouldn't it make sense to (also/instead) add a `to_tensor()` method on the scalar class, since this is to get a Tensor for a single element (scalar) of the array?
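To make the suggestion concrete, here is a minimal hedged sketch of the two accessor shapes (the construction uses pyarrow's `fixed_shape_tensor` type; the accessor names are the ones discussed in this thread, not a confirmed API):

```python
import pyarrow as pa

# Build a fixed shape tensor array of two 2x2 tensors from its storage array.
tensor_type = pa.fixed_shape_tensor(pa.int32(), (2, 2))
storage = pa.array([[1, 2, 3, 4], [5, 6, 7, 8]], pa.list_(pa.int32(), 4))
arr = pa.ExtensionArray.from_storage(tensor_type, storage)

# Array-level accessor in the current diff (hypothetical, per this thread):
# tensor = arr.get_tensor(0)
# Scalar-level accessor suggested here (hypothetical, per this thread):
# tensor = arr[0].to_tensor()
```

A scalar-level accessor keeps element access consistent with indexing, since `arr[i]` already returns a scalar.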
Thanks for the helpful review @pitrou, I'm happy to see this moving forward! I've addressed your points, please let me know if more changes are needed.
```python
shape = obj.shape[1:]
size = obj.size / obj.shape[0]
shape = np.take(obj.shape, permutation)
values = np.ravel(obj, order="K")
```
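For readers skimming the diff, a standalone numpy illustration (not the PR code itself; the batch shape and permutation are made up, and integer division is used for clarity) of what these calls compute:

```python
import numpy as np

# A made-up batch of two 3x4 tensors; the first axis indexes array elements.
obj = np.arange(24).reshape(2, 3, 4)
permutation = [0, 1, 2]                    # hypothetical axis order (identity)

shape = obj.shape[1:]                      # (3, 4): shape of each tensor
size = obj.size // obj.shape[0]            # 12: values per tensor
shape = np.take(obj.shape, permutation)    # full shape reordered by the permutation
values = np.ravel(obj, order="K")          # values flattened in memory order
```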
`as_strided` can be a later PR if desired. The docstring addition is good for now!
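Since `as_strided` is only floated as a possible follow-up, here is a hedged sketch of the kind of zero-copy view it could provide (illustrative buffer, shape, and strides, not code from this PR):

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

# View a flat buffer of 24 values as two 3x4 tensors without copying.
flat = np.arange(24, dtype=np.int64)
n = flat.itemsize                          # 8 bytes per int64
stack = as_strided(flat, shape=(2, 3, 4), strides=(12 * n, 4 * n, n))
assert stack[1, 0, 0] == 12                # second tensor starts at element 12
```

Because `as_strided` makes it easy to create out-of-bounds views, deferring it to a separate PR seems reasonable.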
@github-actions crossbow submit -g python -g wheel
It may be nice to later add a doc section for tensors here:
Revision: bf2ca0e Submitted crossbow builds: ursacomputing/crossbow @ actions-2a16c8cab9
Added an issue for the docs: #39998
```
and the rest of the dimensions will match the permuted shape of the fixed
shape tensor.

The conversion is zero-copy.
```
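As a concrete illustration of the docstring wording (a minimal sketch assuming pyarrow's `FixedShapeTensorArray` helpers, with no permutation set):

```python
import numpy as np
import pyarrow as pa

# Two 2x3 tensors stored in one array; the converted ndarray keeps the
# element count as its first dimension and the tensor shape after that.
data = np.arange(12, dtype=np.float32).reshape(2, 2, 3)
arr = pa.FixedShapeTensorArray.from_numpy_ndarray(data)
assert arr.to_numpy_ndarray().shape == (2, 2, 3)
```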
Small nit: this is only if the conversion to numpy is zero-copy (i.e. primitive numeric data without nulls)
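For context, a quick illustration of that caveat with plain pyarrow arrays (illustrative values only):

```python
import pyarrow as pa

# Primitive numeric data without nulls converts to numpy without a copy...
pa.array([1, 2, 3], pa.int32()).to_numpy(zero_copy_only=True)
# ...while the presence of nulls forces a copy (and a different dtype).
pa.array([1, None, 3], pa.int32()).to_numpy(zero_copy_only=False)
```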
Good point, added to VariableShapeTensor PR 8ca3bf7
After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit 026188e. There were 9 benchmark results indicating a performance regression:
The full Conbench report has more details. It also includes information about 7 possible false positives for unstable benchmarks that are known to sometimes produce them.
### Rationale for this change

When working with `FixedSizeTensorArray` we want to access individual tensors. This would be enabled by adding:

```python
class FixedSizeTensorScalar(pa.ExtensionScalar):
    def to_numpy_ndarray(self):
        ...
```

See apache#37484.

### What changes are included in this PR?

This adds `FixedSizeTensorScalar` and tests for it.

### Are there any user-facing changes?

Yes, when calling `FixedSizeTensorArray[i]` we would get back `FixedSizeTensorScalar` instead of `ExtensionScalar`.

* Closes: apache#37484

Lead-authored-by: Rok Mihevc <[email protected]>
Co-authored-by: Antoine Pitrou <[email protected]>
Co-authored-by: Joris Van den Bossche <[email protected]>
Co-authored-by: Alenka Frim <[email protected]>
Signed-off-by: Antoine Pitrou <[email protected]>
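A hedged sketch of what the user-facing change looks like in practice (the scalar-side accessor name follows the PR description and may differ in the released API):

```python
import numpy as np
import pyarrow as pa

# Indexing the extension array is expected to return a tensor scalar rather
# than a plain ExtensionScalar after this PR.
arr = pa.FixedShapeTensorArray.from_numpy_ndarray(
    np.arange(8, dtype=np.int32).reshape(2, 2, 2))
scalar = arr[0]
# Per-element tensor access, name as given in the description (may differ):
# scalar.to_numpy_ndarray()
```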