Commit
Include stats in IPC messages (#302)
- Modifies the flatbuffer array type to include an ArrayStats table
- Array Table contains optional fields corresponding to each currently available stats type
- Current implementation populates all stats
- Implementation of the Statistics trait for ArrayView
- Implementation does no allocations or computation, only references values that exist in the underlying flatbuffer
- For demonstrative purposes, I've [written (and removed) an implementation](de4077c) that allocates only if someone calls set(stat, value) to populate additional, possibly missing, stats. I've removed this because we don't currently have a use for it, but it's easy enough to do without any unsafe shenanigans.
- Callers can specify which stats should be included with a serialized IPC array when constructing a ViewContext. By default, all stats are included.
- Tests demonstrating the presence of correct stats after a round-trip through IPC for primitive and chunked arrays

~I've included a mechanism to configure all of the statistics by default here because the overhead they add to the flatbuffer message is relatively small, given that the arrays themselves are sufficiently large. I considered adding a mechanism to check the length of the arrays [here](https://github.com/spiraldb/vortex/pull/302/files#diff-b7cc44a4bd1e1c769cb029b5ecaa98f080fdb7aa48b79566a9c8bb1306b84149R212) to choose a subset of stats based on the size of the array (probably just drop the two frequency arrays, because they're much larger than everything else), but decided against it for now. I don't think we expect to frequently see arrays small enough that these stats would add a relatively significant amount of wire overhead~

---------

Co-authored-by: Nicholas Gates <[email protected]>
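The stat-selection behaviour described above (callers choose which stats travel with a serialized array, and everything is included by default) can be sketched roughly as follows. This is an illustration only, not the commit's code: `StatsSelection`, `StatValue`, and the trimmed-down `Stat` enum are hypothetical stand-ins, and the real configuration lives on `ViewContext`, whose API is not shown on this page.

```rust
use std::collections::BTreeMap;

/// Hypothetical, trimmed-down stand-in for the stat kinds named in the commit.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Stat {
    Min,
    Max,
    IsSorted,
    NullCount,
    BitWidthFreq,
    TrailingZeroFreq,
}

impl Stat {
    pub const ALL: [Stat; 6] = [
        Stat::Min,
        Stat::Max,
        Stat::IsSorted,
        Stat::NullCount,
        Stat::BitWidthFreq,
        Stat::TrailingZeroFreq,
    ];
}

/// Hypothetical value type; the real crate stores scalars instead.
#[derive(Debug, Clone, PartialEq)]
pub enum StatValue {
    Int(i64),
    Bool(bool),
    Freq(Vec<u64>),
}

/// Hypothetical configuration object: which stats to write with each array.
pub struct StatsSelection {
    include: Vec<Stat>,
}

impl Default for StatsSelection {
    /// Mirrors the commit's stated default: include every stat.
    fn default() -> Self {
        Self { include: Stat::ALL.to_vec() }
    }
}

impl StatsSelection {
    pub fn with_stats(stats: &[Stat]) -> Self {
        Self { include: stats.to_vec() }
    }

    /// Keep only stats that are both selected and already computed.
    /// Values are borrowed, so selecting copies nothing.
    pub fn select<'a>(
        &self,
        computed: &'a BTreeMap<Stat, StatValue>,
    ) -> BTreeMap<Stat, &'a StatValue> {
        self.include
            .iter()
            .filter_map(|stat| computed.get(stat).map(|value| (*stat, value)))
            .collect()
    }
}
```

A caller that wants to avoid the two frequency vectors, which the message notes are much larger than the other stats, would build the selection with `StatsSelection::with_stats(&[Stat::Min, Stat::Max, Stat::IsSorted, Stat::NullCount])`. A stat that is selected but was never computed is simply left out.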
Showing 9 changed files with 296 additions and 13 deletions.
@@ -0,0 +1,49 @@
+use flatbuffers::{FlatBufferBuilder, WIPOffset};
+use itertools::Itertools;
+use vortex_flatbuffers::WriteFlatBuffer;
+
+use crate::stats::{Stat, Statistics};
+
+impl WriteFlatBuffer for &dyn Statistics {
+    type Target<'t> = crate::flatbuffers::ArrayStats<'t>;
+
+    fn write_flatbuffer<'fb>(
+        &self,
+        fbb: &mut FlatBufferBuilder<'fb>,
+    ) -> WIPOffset<Self::Target<'fb>> {
+        let trailing_zero_freq = self
+            .get_as::<Vec<u64>>(Stat::TrailingZeroFreq)
+            .ok()
+            .map(|v| v.iter().copied().collect_vec())
+            .map(|v| fbb.create_vector(v.as_slice()));
+
+        let bit_width_freq = self
+            .get_as::<Vec<u64>>(Stat::BitWidthFreq)
+            .ok()
+            .map(|v| v.iter().copied().collect_vec())
+            .map(|v| fbb.create_vector(v.as_slice()));
+
+        let min = self
+            .get(Stat::Min)
+            .map(|min| min.value().write_flatbuffer(fbb));
+
+        let max = self
+            .get(Stat::Max)
+            .map(|max| max.value().write_flatbuffer(fbb));
+
+        let stat_args = &crate::flatbuffers::ArrayStatsArgs {
+            min,
+            max,
+            is_sorted: self.get_as::<bool>(Stat::IsSorted).ok(),
+            is_strict_sorted: self.get_as::<bool>(Stat::IsStrictSorted).ok(),
+            is_constant: self.get_as::<bool>(Stat::IsConstant).ok(),
+            run_count: self.get_as_cast::<u64>(Stat::RunCount).ok(),
+            true_count: self.get_as_cast::<u64>(Stat::TrueCount).ok(),
+            null_count: self.get_as_cast::<u64>(Stat::NullCount).ok(),
+            bit_width_freq,
+            trailing_zero_freq,
+        };
+
+        crate::flatbuffers::ArrayStats::create(fbb, stat_args)
+    }
+}
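The writer above leans on one pattern throughout: each stat lookup is converted with `.ok()` so that an absent stat becomes `None` and the corresponding optional flatbuffer field is left unset. The sketch below isolates that pattern with standard-library types only. It assumes, as the `.ok()` calls suggest, that the typed getters return a `Result`; `Stats`, `Scalar`, `StatError`, and `StatsArgs` are hypothetical stand-ins rather than the crate's real types.

```rust
use std::collections::HashMap;

/// Hypothetical subset of the stat kinds used in the diff.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum Stat {
    IsSorted,
    NullCount,
    RunCount,
}

/// Hypothetical scalar type holding a computed stat value.
#[derive(Debug, Clone, PartialEq)]
enum Scalar {
    Bool(bool),
    U64(u64),
}

/// Hypothetical error for a failed typed lookup.
#[derive(Debug, PartialEq)]
enum StatError {
    Missing(Stat),
    WrongType(Stat),
}

struct Stats(HashMap<Stat, Scalar>);

impl Stats {
    /// Typed getter: fails if the stat is absent or holds another type.
    fn get_bool(&self, stat: Stat) -> Result<bool, StatError> {
        match self.0.get(&stat) {
            Some(Scalar::Bool(b)) => Ok(*b),
            Some(_) => Err(StatError::WrongType(stat)),
            None => Err(StatError::Missing(stat)),
        }
    }

    fn get_u64(&self, stat: Stat) -> Result<u64, StatError> {
        match self.0.get(&stat) {
            Some(Scalar::U64(n)) => Ok(*n),
            Some(_) => Err(StatError::WrongType(stat)),
            None => Err(StatError::Missing(stat)),
        }
    }
}

/// Stand-in for the generated `ArrayStatsArgs`: every field is optional.
#[derive(Debug, Default, PartialEq)]
struct StatsArgs {
    is_sorted: Option<bool>,
    null_count: Option<u64>,
    run_count: Option<u64>,
}

/// `Result::ok` discards the error, so any failed lookup becomes `None`.
fn to_args(stats: &Stats) -> StatsArgs {
    StatsArgs {
        is_sorted: stats.get_bool(Stat::IsSorted).ok(),
        null_count: stats.get_u64(Stat::NullCount).ok(),
        run_count: stats.get_u64(Stat::RunCount).ok(),
    }
}
```

One consequence of this design is that a type mismatch is dropped as quietly as a missing stat. That keeps serialization infallible, at the cost of hiding a stat stored under an unexpected type.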