
Use Cython's array to back Py_ssize_t[::1] #504

Merged
merged 1 commit into from
Oct 22, 2024

Conversation

jakirkham
Member

In `Array`, `Py_ssize_t[::1]` objects are currently backed by CPython `array`s, with some internal bits expressed in Cython. However, these are not compatible with Python's Limited API and Stable ABI. To address that, switch to Cython's own `array` type. As it is built into Cython and does not rely on anything special, it is compatible with Python's Limited API and Stable ABI.

xref: rapidsai/build-planning#42
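For orientation (a hypothetical pure-Python illustration, not code from this PR): the stdlib `array` type exposes the buffer protocol, which is how these `Py_ssize_t[::1]` memoryviews have been backed so far. The `"q"` typecode (signed long long) matches `Py_ssize_t` on common 64-bit CPython builds.

```python
from array import array

# A CPython array of signed long longs; on typical 64-bit CPython builds
# this has the same item size as Py_ssize_t.
buf = array("q", [0] * 4)

# The buffer protocol lets a memoryview (and, in Cython, a typed
# memoryview) wrap the array's storage directly.
mv = memoryview(buf)
assert mv.format == "q"
assert mv.contiguous

# Writes through the view are visible in the backing array.
mv[2] = 7
assert buf[2] == 7
```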

@jakirkham jakirkham added improvement Improves an existing functionality non-breaking Introduces a non-breaking change labels Oct 22, 2024
@jakirkham jakirkham requested a review from a team as a code owner October 22, 2024 05:56
https://cython.readthedocs.io/en/latest/src/userguide/memoryviews.html#cython-arrays
@jakirkham
Member Author

The performance of the two approaches can be compared with the following Cython test code:

# filename: new_arr.pyx

from cpython.array cimport array, newarrayobject
from cpython.mem cimport PyMem_Malloc, PyMem_Free
from cpython.object cimport PyObject

from cython.view cimport array as cvarray


# Template array used to stamp out new CPython arrays; "q" is the
# signed long long typecode, matching Py_ssize_t on common platforms.
cdef array array_Py_ssize_t = array("q")


cpdef Py_ssize_t[::1] new_Py_ssize_t_array_0(Py_ssize_t n):
    # CPython array approach: allocate a new n-element array reusing the
    # template's type and descriptor (not Limited API compatible).
    return newarrayobject(
        (<PyObject*>array_Py_ssize_t).ob_type, n, array_Py_ssize_t.ob_descr
    )


cdef sizeof_Py_ssize_t = sizeof(Py_ssize_t)


cpdef Py_ssize_t[::1] new_Py_ssize_t_array_1(Py_ssize_t n):
    # Cython array approach: create a C-contiguous view with no buffer
    # allocated, then attach a PyMem-allocated buffer and a matching
    # free callback so the memory is released with the array.
    cdef cvarray a = cvarray((n,), sizeof_Py_ssize_t, b"q", "c", False)
    a.data = <char*>PyMem_Malloc(n * sizeof(Py_ssize_t))
    a.callback_free_data = PyMem_Free
    return a

Running it in IPython shows results like this:

In [1]: from new_arr import new_Py_ssize_t_array_0, new_Py_ssize_t_array_1

In [2]: %timeit new_Py_ssize_t_array_0(1)
179 ns ± 0.952 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

In [3]: %timeit new_Py_ssize_t_array_1(1)
232 ns ± 1.11 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In [4]: %timeit new_Py_ssize_t_array_0(5)
178 ns ± 0.74 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

In [5]: %timeit new_Py_ssize_t_array_1(5)
235 ns ± 1.95 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In [6]: %timeit new_Py_ssize_t_array_0(1_000)
213 ns ± 2.5 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In [7]: %timeit new_Py_ssize_t_array_1(1_000)
284 ns ± 5.85 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

The Cython array approach is roughly 1/3 slower across these cases. That said, the absolute difference is only ~60 ns, roughly the cost of one or two attribute accesses on a Python object.
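For a rough sense of that scale (a hypothetical illustration, not part of the PR): the cost of a single Python-level attribute access can be estimated with `timeit`. The `Holder` class below is made up purely for the measurement; absolute numbers vary by machine and interpreter version.

```python
import timeit


class Holder:
    """Minimal object with one instance attribute, used only for timing."""

    def __init__(self):
        self.value = 1


h = Holder()

# Time one million attribute accesses, then convert to nanoseconds per
# access. On recent CPython this tends to land in the tens of ns, the
# same ballpark as the ~60 ns gap measured above.
n = 1_000_000
per_access_ns = timeit.timeit("h.value", globals={"h": h}, number=n) / n * 1e9
print(f"~{per_access_ns:.0f} ns per attribute access")
```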

Member

@madsbk madsbk left a comment


Looks good, thanks @jakirkham

@jakirkham
Member Author

/merge

@rapids-bot rapids-bot bot merged commit ed2d6d0 into rapidsai:branch-24.12 Oct 22, 2024
54 checks passed
@jakirkham jakirkham deleted the use_cvarray_arr branch October 22, 2024 16:18
@jakirkham
Member Author

Thanks Mads! 🙏
