-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix chunked array stat merging #303
Conversation
@jdcasale this is probably the way it should have been. Before we assumed that empty was completely empty but that is a weird assumption to make and can lead to hard to notice bugs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
merge_sortedness_stat is wrong too, fixed this and added some more tests in #306
If that merges into this branch, lgtm
Entry::Vacant(e) => { | ||
if let Some(min) = other.get(Stat::Max) { | ||
e.insert(min.clone()); | ||
fn merge_ordered<F: Fn(&Scalar, &Scalar) -> bool>(&mut self, stat: Stat, other: &Self, cmp: F) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe worth a docstring, eg
/// Merges stats if both are present, if either stat is not present, drops the stat from the result set. For example, if we know the minimums of two arrays, the minimum of their union is the minimum-of-minimums, but if we only know the minimum of one of the two arrays, we do not know the minimum of their union.
e.insert(min.clone()); | ||
} | ||
if let Entry::Occupied(mut e) = self.values.entry(stat) { | ||
if let Some(other_value) = other.get_as::<usize>(stat) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it matters because negative values will panic on the coercion to usize, but this will mutate the PType of the scalars to be merged -- ie if we merge an u32 with an u32, the result of this will be a u64.
Also maybe worth renaming this method to merge_count_stat
@@ -213,3 +193,29 @@ impl IntoIterator for StatsSet { | |||
self.values.into_iter() | |||
} | |||
} | |||
|
|||
#[cfg(test)] | |||
mod test { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably want tests for the other codepaths as well
some additional fixes & tests for #303
Stats merging conflated min and max statistics and would wrongly merge max
values. If statistics were not computed for an array then merging would result
in incorrect results only spanning one part of the array