Standard interface for vectors in S7 (vs vctrs) #514

Open
mjskay opened this issue Dec 18, 2024 · 0 comments
mjskay commented Dec 18, 2024

I am curious if there is a plan to create a canonical notion of a "vector" in S7. I think this would be incredibly valuable to have.

{vctrs} has demonstrated the value of having such a thing: it formalizes coercion rules and also makes a bunch of informal conventions around slicing and binding explicit. It's great.

However, after building a fairly complicated type on top of {vctrs} (posterior::rvar) and observing its use in practice over the past few years, there are several drawbacks of {vctrs} that I think could be rectified by a "canonical" approach adopted in S7:

  1. The definition of a vector in vctrs is fairly limiting: it is basically atomic vectors, or classes implementing vctrs::vec_proxy(). This excludes plenty of types that actually are vectors in a general sense, like S4 vectors.
  2. Since {vctrs} is not used everywhere, even implementing its interface does not guarantee your datatype will be considered a vector by everyone else.
  3. The multiple-dispatch-on-S3 stuff in {vctrs} is an obvious candidate to be done more cleanly in S7 (I assume y'all are already thinking about that).
  4. The requirement to implement vctrs::vec_proxy() to be considered a vector is costly for complex datatypes that do not have a constant-time translation to an atomic vector type (see r-lib/vctrs#1411, "Bypass vec_proxy if it cannot be implemented in constant time?"). The result is either dirty hacks that cause other downstream problems (see stan-dev/posterior#307, "Efficient saving/reading of data frames containing rvars?") or accepting that algorithms built on vctrs may have high computational complexity when they use operations that should be constant-time (e.g. a single slice) but are not, because the proxy is not constant-time.

I think there's an opportunity with S7 to address these issues. It would be awesome to have a "canonical" notion of a vector built on a small, explicitly-defined interface, rather than proxies. Perhaps it would be a trait (#34)?
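
To make the idea concrete, here is a rough sketch of what a small, explicit interface might look like using ordinary S7 generics. Everything named here (`vector_length`, `vector_slice`, `vector_head`, and the toy `draws_vec` class) is hypothetical and only for illustration; none of it exists in S7 or {vctrs}, and a real design would need more operations (binding, coercion, and so on).

```r
library(S7)

# Hypothetical generics forming a minimal "vector" interface.
# These names are illustrative only; they are not part of S7 or vctrs.
vector_length <- new_generic("vector_length", "x")
vector_slice <- new_generic("vector_slice", "x", function(x, i, ...) S7_dispatch())

# A toy type that stores one set of draws per element, loosely inspired by
# (but much simpler than) posterior::rvar.
draws_vec <- new_class("draws_vec", properties = list(draws = class_list))

# The type implements the interface directly, with no proxy object.
method(vector_length, draws_vec) <- function(x, ...) length(x@draws)
method(vector_slice, draws_vec) <- function(x, i, ...) draws_vec(draws = x@draws[i])

# Base vectors can opt in to the same interface.
method(vector_length, class_numeric) <- function(x, ...) length(x)
method(vector_slice, class_numeric) <- function(x, i, ...) x[i]

# Generic code written only against the interface.
vector_head <- function(x, n = 2L) {
  vector_slice(x, seq_len(min(n, vector_length(x))))
}

v <- draws_vec(draws = list(rnorm(4), rnorm(4), rnorm(4)))
vector_length(vector_head(v))
vector_head(c(1.5, 2.5, 3.5))
```

The point of the sketch is that slicing `draws_vec` only touches the elements requested, so its cost is decided by the type itself rather than by the cost of building a proxy.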
