Defining Pairwise Interactions #3
Comments
Could you provide a couple of concrete examples where special knowledge/handling of wrapped duck arrays is required? |
I'm not sure I understand your question in relation to this issue on formalizing agreed-upon interaction priorities. What do you mean by "special," and by who is this knowledge/handling required (array wrapping libraries, array utilizing libraries, library users, etc.)? |
I think what I am asking is: In many simple cases having an array implementation implement, e.g., |
Ah, I think then there is a misunderstanding here. This issue is indeed about such simple cases of implementing and delegating to lower-level libraries! The problem is resolving what "lower-level" means in an unambiguous (directed graph must be acyclic) and fully generalized (can insert any wrapping array library into the graph; don't want it to just be xarray, dask, and pint) way. I'm not aware of any use cases where the standard set of |
As summarized in #1, the interactions between duck array libraries cannot be sufficiently described by a (linked-)list of priorities (as can arise from `__array_priority__`), but are instead best described as a directed graph. For the dispatch between types to work out consistently and unambiguously, this graph needs to be acyclic, which in turn requires agreement/coordination between duck array libraries.
See also: dask/dask#6635
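To make the acyclicity requirement concrete, here is a minimal sketch of how a wrapping graph could be checked and queried. Everything in it is an assumption for illustration: the `CAN_WRAP` edges are example data rather than the agreed-upon DAG, and `is_acyclic`, `reachable`, and `resolve` are hypothetical helper names, not an existing API. Only the standard-library `graphlib` module (Python 3.9+) is used.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical edges: each key may wrap (and so takes precedence over) the
# types in its value set. Plain strings are used for illustration only.
CAN_WRAP = {
    "xarray.DataArray": {"pint.Quantity", "dask.array.Array", "numpy.ndarray"},
    "pint.Quantity": {"dask.array.Array", "numpy.ndarray"},
    "dask.array.Array": {"numpy.ndarray"},
    "numpy.ndarray": set(),
}


def is_acyclic(graph):
    """Return True if the wrapping graph has no cycles."""
    try:
        # Consuming static_order() raises CycleError when a cycle exists.
        list(TopologicalSorter(graph).static_order())
    except CycleError:
        return False
    return True


def reachable(graph, start):
    """Return every type that `start` can wrap, directly or transitively."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for child in graph.get(node, ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def resolve(graph, types):
    """Return the one type in `types` that sits above all the others.

    Raises TypeError when no such type exists, i.e. the graph does not
    define how the given types interact.
    """
    types = set(types)
    for candidate in types:
        if types - {candidate} <= reachable(graph, candidate):
            return candidate
    raise TypeError(f"no unambiguous handler among {sorted(types)}")
```

With these example edges, `resolve(CAN_WRAP, ["dask.array.Array", "pint.Quantity"])` picks `"pint.Quantity"`. If an edge were added that let `numpy.ndarray` wrap `xarray.DataArray`, `is_acyclic` would return `False`, which is the situation that coordination between libraries is meant to rule out.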
Current State
Presently, this coordination has been informally done through independent/ad-hoc implementations in each duck array library. Two main approaches have arisen:
For a limited set of commonly-used array types in the pydata stack, this has often worked out in practice so far. However, as the number of duck array libraries increases, maintaining agreement between libraries through the existing independent approaches becomes difficult.
As an example of what this type casting hierarchy looks like in practice, Pint has summarized the consensus DAG between several common array types (as of 2020) as follows:
Furthermore, these interactions often play out implicitly via protocols like `__array_ufunc__` and `__array_function__`. In contrast, an explicit strategy like NEP 37 may be a preferable way to define these pairwise interactions.
Specific Goals
Key Points Raised at Coordination Meeting
`__array_module__`?
Suggested Paths Forward
Duck Array DAG Library
Discussion (in this issue hopefully) on working out the details of a shared type resolution DAG library is needed! To get the conversation started, here are the points I'm aware of that need resolution:
`pydata` is the most likely home for this library, but what to name it and who should lead its maintenance?
`__array_module__`?
`__array_ufunc__`/`__array_function__`/array function modules
Once these are resolved, more detailed discussions (such as API creation) can presumably take place on this new library's repo.
Changes to Participating Libraries
Libraries currently using a "deny list"/"accept all" approach (namely, xarray and pint) may need to change to an "allow list" approach to meet the community consensus, which brings with it backwards compatibility concerns. However, it makes the most sense (to me at least) to make any such changes only once the aforementioned DAG library is also adopted and, until then, to at most issue warnings for unknown but still-handled types.
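As a rough illustration of what an "allow list" means for dispatch, the sketch below uses Python's ordinary binary-operator protocol instead of NumPy so that it stays self-contained. The classes `Wrapped`, `Unknown`, and `AllowListWrapper` are hypothetical stand-ins, not code from xarray or pint. The same decision (handle the operand or return `NotImplemented`) is what a wrapping library would make inside `__array_ufunc__` or `__array_function__`.

```python
class Wrapped:
    """Stand-in for a lower-level array type that the wrapper knows about."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Wrapped):
            return Wrapped(self.value + other.value)
        return NotImplemented


class Unknown:
    """Stand-in for an array type the wrapper has never heard of."""

    def __init__(self, value):
        self.value = value

    def __radd__(self, other):
        # Claims the operation once the left operand has deferred.
        return "handled by Unknown"


class AllowListWrapper:
    """Hypothetical wrapper that only wraps explicitly allowed types."""

    _ALLOWED = (Wrapped, int, float)

    def __init__(self, data):
        self.data = data

    def __add__(self, other):
        if isinstance(other, AllowListWrapper):
            return AllowListWrapper(self.data + other.data)
        if isinstance(other, self._ALLOWED):
            return AllowListWrapper(self.data + other)
        # Not on the allow list: defer so the other operand can try.
        return NotImplemented
```

Here `AllowListWrapper(Wrapped(1)) + Wrapped(2)` stays inside the wrapper, while `AllowListWrapper(1) + Unknown(2)` is deferred and ends up handled by `Unknown.__radd__`. A "deny list" wrapper would instead try to wrap `Unknown` itself unless that type had been explicitly excluded, which is where two libraries can each end up trying to wrap the other.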