Add and/or compute functions #481
let data = PrimitiveArray::from(mem::take(&mut self.data));
pub fn finish(mut self, dtype: DType) -> VarBinArray {
let offsets = PrimitiveArray::from(self.offsets);
let data = PrimitiveArray::from(Vec::from(self.data.freeze()));
This isn't zero copy is it? We can add a function to PrimitiveArray to construct directly from a buffer (for PType==u8?)
Alternatively, @robert3005 did we discuss VarBin data being a buffer instead of a child array?
I think we thought about it but then we wanted to have something like ZstdEncoding
I’m starting to think general purpose compression can be configured on buffers at write-time though; using the layouts mechanism.
Not zero-copy, but still pretty cheap IMO. Constructing things from Bytes is hard because there's no guarantee the instance is exclusive.
But PrimitiveArray wraps a vortex-buffer, which itself wraps Bytes. So this copy is purely because the right API isn't exposed / isn't used
True, fixing. Added a from_bytes function.
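To make the trade-off in this thread concrete, here is a minimal sketch of the difference between wrapping a shared buffer directly and round-tripping it through a Vec. Everything in it is hypothetical: SharedBuffer and U8Array are stand-ins built on std's Arc, not the real Bytes, vortex-buffer, or PrimitiveArray types, and from_buffer is not the from_bytes function added in this PR.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a reference-counted byte buffer
/// (not `bytes::Bytes` and not the vortex-buffer type).
#[derive(Clone, Debug)]
pub struct SharedBuffer {
    data: Arc<Vec<u8>>,
}

impl SharedBuffer {
    pub fn new(data: Vec<u8>) -> Self {
        Self { data: Arc::new(data) }
    }

    /// Recover the `Vec<u8>`. This avoids a copy only when we hold the
    /// last reference; a shared buffer has to be cloned byte by byte.
    pub fn into_vec(self) -> Vec<u8> {
        match Arc::try_unwrap(self.data) {
            Ok(vec) => vec,                  // exclusive: no copy
            Err(shared) => (*shared).clone(), // shared: copies the bytes
        }
    }

    /// Address of the first byte, used here only to observe copies.
    pub fn as_ptr(&self) -> *const u8 {
        self.data.as_slice().as_ptr()
    }
}

/// Hypothetical u8 array that stores the buffer directly.
pub struct U8Array {
    buffer: SharedBuffer,
}

impl U8Array {
    /// Zero-copy: takes the buffer as-is, shared or not.
    pub fn from_buffer(buffer: SharedBuffer) -> Self {
        Self { buffer }
    }

    /// Goes through `Vec<u8>` first, so a shared buffer gets copied.
    pub fn from_vec_roundtrip(buffer: SharedBuffer) -> Self {
        Self { buffer: SharedBuffer::new(buffer.into_vec()) }
    }

    pub fn as_slice(&self) -> &[u8] {
        self.buffer.data.as_slice()
    }

    pub fn as_ptr(&self) -> *const u8 {
        self.buffer.as_ptr()
    }
}
```

In this sketch, from_buffer never needs exclusivity because the array keeps sharing the allocation, while from_vec_roundtrip is only free when no other handle to the buffer is alive.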
&& constant_array.len() == other.len()
{
if let Ok(array) = ConstantArray::try_from(other.clone()) {
let lhs = constant_array.scalar().value().as_bool()?;
Maybe first check if either scalar is_null, and then I think the more canonical conversion would be bool::try_from(array.scalar())?
We should chat through the scalar API next week, it's a bit weird, you might have some ideas to improve
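As a rough illustration of the suggestion above (check for nulls first, then convert with bool::try_from), here is a sketch using a hypothetical Scalar enum. It is not the Vortex scalar API, and returning null whenever either side is null is an assumption made for brevity, not something this PR specifies.

```rust
use std::convert::TryFrom;

/// Hypothetical scalar type (not the Vortex `Scalar`).
#[derive(Clone, Debug, PartialEq)]
pub enum Scalar {
    Null,
    Bool(bool),
    Int(i64),
}

impl Scalar {
    pub fn is_null(&self) -> bool {
        matches!(self, Scalar::Null)
    }
}

/// Conversion that fails for anything that is not a non-null bool.
impl<'a> TryFrom<&'a Scalar> for bool {
    type Error = String;

    fn try_from(scalar: &'a Scalar) -> Result<bool, String> {
        match scalar {
            Scalar::Bool(b) => Ok(*b),
            other => Err(format!("expected bool scalar, got {:?}", other)),
        }
    }
}

/// AND of two constant scalars. Assumption: any null input yields null.
pub fn and_constants(lhs: &Scalar, rhs: &Scalar) -> Result<Scalar, String> {
    if lhs.is_null() || rhs.is_null() {
        return Ok(Scalar::Null);
    }
    let l = bool::try_from(lhs)?;
    let r = bool::try_from(rhs)?;
    Ok(Scalar::Bool(l && r))
}
```

Handling null up front keeps the conversion itself strict: a non-boolean scalar surfaces as an error instead of being silently treated as a missing value.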
Ok(ConstantArray::new(scalar, constant_array.len()).into_array())
} else {
AndFn::and(&constant_array.clone().into_bool()?, other)
Either this fallback will fail (into_bool when it's not?), or into_bool does an expensive expansion of the constant.
I think the best thing is to have the convention that the constant goes on the RHS, so we should fall back to and(other, constant_array) to give the RHS encoding a chance to be efficient, e.g. if it's run-end it only needs to run over the values.
I can't actually fall back into AndFn here because this might also be an or. I'm trying to think of a better way of generalizing boolean ops here (I do agree with trying to let rhs give us a more efficient implementation). Found a solution I'm happy with here.
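The convention discussed in this thread (keep the constant on the right and let the other side's encoding do the work, for both and and or) might be sketched as follows. The BoolArray and BoolOp types and the binary_boolean function are hypothetical simplifications that ignore nulls; they are not the Vortex encodings or the solution that landed in this PR.

```rust
/// Hypothetical, simplified boolean encodings (no nulls, not Vortex types).
#[derive(Clone, Debug, PartialEq)]
pub enum BoolArray {
    /// One value repeated `len` times.
    Constant { value: bool, len: usize },
    /// Run-end encoded: `values[i]` covers rows up to (exclusive) `ends[i]`.
    RunEnd { values: Vec<bool>, ends: Vec<usize> },
    /// One bool per row.
    Plain(Vec<bool>),
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum BoolOp {
    And,
    Or,
}

impl BoolOp {
    fn apply(self, a: bool, b: bool) -> bool {
        match self {
            BoolOp::And => a && b,
            BoolOp::Or => a || b,
        }
    }
}

impl BoolArray {
    pub fn len(&self) -> usize {
        match self {
            BoolArray::Constant { len, .. } => *len,
            BoolArray::RunEnd { ends, .. } => ends.last().copied().unwrap_or(0),
            BoolArray::Plain(values) => values.len(),
        }
    }

    /// Expand to one bool per row (the expensive path).
    pub fn to_vec(&self) -> Vec<bool> {
        match self {
            BoolArray::Constant { value, len } => vec![*value; *len],
            BoolArray::RunEnd { values, ends } => {
                let mut out = Vec::new();
                for (value, end) in values.iter().zip(ends.iter()) {
                    out.resize(*end, *value);
                }
                out
            }
            BoolArray::Plain(values) => values.clone(),
        }
    }
}

/// Apply `op` with a constant on the right; each encoding stays compact.
fn op_with_constant(lhs: &BoolArray, op: BoolOp, rhs: bool) -> BoolArray {
    match lhs {
        BoolArray::Constant { value, len } => BoolArray::Constant {
            value: op.apply(*value, rhs),
            len: *len,
        },
        // Run-end only touches one value per run, never one per row.
        BoolArray::RunEnd { values, ends } => BoolArray::RunEnd {
            values: values.iter().map(|v| op.apply(*v, rhs)).collect(),
            ends: ends.clone(),
        },
        BoolArray::Plain(values) => {
            BoolArray::Plain(values.iter().map(|v| op.apply(*v, rhs)).collect())
        }
    }
}

/// One entry point for both `and` and `or`.
pub fn binary_boolean(lhs: &BoolArray, rhs: &BoolArray, op: BoolOp) -> Result<BoolArray, String> {
    if lhs.len() != rhs.len() {
        return Err(format!("length mismatch: {} vs {}", lhs.len(), rhs.len()));
    }
    match (lhs, rhs) {
        (_, BoolArray::Constant { value, .. }) => Ok(op_with_constant(lhs, op, *value)),
        // Both ops are commutative, so swap to put the constant on the right.
        (BoolArray::Constant { value, .. }, _) => Ok(op_with_constant(rhs, op, *value)),
        _ => Ok(BoolArray::Plain(
            lhs.to_vec()
                .into_iter()
                .zip(rhs.to_vec())
                .map(|(a, b)| op.apply(a, b))
                .collect(),
        )),
    }
}
```

Passing the operation as a value is one way to avoid hard-coding AndFn in the fallback; the swap relies on and and or being commutative, which holds in this null-free sketch.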
Left a couple of small comments.
let rhs = array.clone().into_canonical()?.into_arrow();
let rhs = rhs.as_boolean();
Should this maybe dispatch on the right hand side instead of converting to arrow?
Definitely possible, but I think we might be missing some larger abstraction here to solve the whole left/right issue.
Doesn't quite get us where we want performance-wise, but does seem much better. Follow up of #481. --------- Co-authored-by: Robert Kruszewski <[email protected]>