-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mastic aggregator and collector implementation #1107
base: main
Are you sure you want to change the base?
Conversation
5a8d144
to
2c36004
Compare
Implements aggregator and collector functionality for the Mastic protocol for weighted heavy-hitters and attribute-based metrics.
2c36004
to
d7d7fb8
Compare
d7d7fb8
to
f952b4e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments through prep_init()
. Great start!
fa2c390
to
7995ed1
Compare
e66a7fb
to
69749ff
Compare
69749ff
to
27d2b2f
Compare
dafe718
to
4d47513
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments up to prepare_init()
.
} | ||
} | ||
|
||
/// Vidpf evaluation state | ||
/// | ||
/// Contains the values produced during input evaluation at a given level. | ||
#[derive(Debug)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@divergentdave can you check if Debug
should be fenced by a feature flag? In the past we discussed using this pattern to prevent secrets from being logged in production environments. I'm not sure what the state of play is now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't currently gate any Debug
impls on that feature flag. A number of structs currently derive Debug
, like Seed
, Prio3InputShare
, Poplar1InputShare
.
e0d95bd
to
0ea57b9
Compare
0ea57b9
to
5a46f32
Compare
6c6e284
to
7ecf3ca
Compare
7ecf3ca
to
4f487df
Compare
part.encode(bytes)? | ||
}; | ||
if let Some(ref part) = self.helper_joint_rand_part_opt { | ||
if let Some(ref part) = self.joint_rand_part_opt { | ||
part.encode(bytes)? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: rust convention (AFAIK)
part.encode(bytes)? | |
part.encode(bytes)?; |
pub type SzkQueryState<const SEED_SIZE: usize> = Option<Seed<SEED_SIZE>>; | ||
|
||
/// Verifier type for the SZK proof. | ||
pub type SzkVerifier<F> = Vec<F>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we can now delete this type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Notably, the two encode/decode implementations are unused, and since they are on a type alias, they would bleed outside of the module to any use of a Vec<T>
.
.zip(helper_share.flp_verifier) | ||
{ | ||
*x += y; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
General control flow tip: Avoid nesting if statements. E.g., instead of
if cond1 {
do_this();
if cond2 {
do_that();
} else {
handle_error();
}
} else {
handle_error();
}
do
if !cond1 {
handle_error();
}
do_this();
if !cond2 {
handle_error();
}
do_that();
This makes code easier to read and maintain.
leader_share.joint_rand_part_opt, | ||
helper_share.joint_rand_part_opt, | ||
) { | ||
(Some(leader_part), Some(helper_part)) => Ok(Some([leader_part, helper_part])), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we could already combine the parts into the seed. The advantage is we reduce the amount of bits on the wire by half. (Plus, IIRC, this is what the current spec says to do.)
} else { | ||
Err(SzkError::Decide("failed to verify FLP proof".to_string())) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
} | |
} | |
} | ||
} | ||
|
||
fn aggregate<M: IntoIterator<Item = Self::OutputShare>>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Convention: Use the concrete type here
fn aggregate<M: IntoIterator<Item = Self::OutputShare>>( | |
fn aggregate<M: IntoIterator<Item = MasticOutputShare<T::Field>>>( |
agg_shares: M, | ||
_num_measurements: usize, | ||
) -> Result<Self::AggregateResult, VdafError> { | ||
let n = agg_param.level_and_prefixes.prefixes().len(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let n = agg_param.level_and_prefixes.prefixes().len(); | |
let num_prefixes = agg_param.level_and_prefixes.prefixes().len(); |
for i in 0..n { | ||
let encoded_result = &agg_final.0 | ||
[i * self.vidpf.weight_parameter..(i + 1) * self.vidpf.weight_parameter]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use chunks()
to get an iterator instead: https://doc.rust-lang.org/std/slice/struct.Chunks.html#example
This is easier to read and allows the compiler to do some more optimization.
result.push( | ||
self.szk | ||
.typ | ||
.decode_result(&self.szk.typ.truncate(encoded_result.to_vec())?[..], 1)?, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You've hard coded the number of measurements to be 1
. In fact, this is supposed to be the number of measurements that have the prefix in common. See https://github.com/jimouris/draft-mouris-cfrg-mastic/blob/482903a99a52cfb61ec06d979ade2fc0512e8c3f/poc/mastic.py#L351.
Rather than address it in this PR, I suggest leaving it as a TODO for #947 and taking care of it later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we need to add the counter prefix to the encoded measurement, from sharding through to here. Once that's addressed in a follow-up, I think we'll want convenience constructors for Mastic with particular circuits, to take care of constructing the Szk
and Vidpf
objects, and calculating the VIDPF weight parameter from szk.typ.output_len() + 1
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also noticed that the spec currently calls truncate during preparation, before producing an output share, as opposed to during unsharding.
let first_input = VidpfInput::from_bytes(&[240u8, 0u8, 1u8, 4u8][..]); | ||
let second_input = VidpfInput::from_bytes(&[112u8, 0u8, 1u8, 4u8][..]); | ||
let third_input = VidpfInput::from_bytes(&[48u8, 0u8, 1u8, 4u8][..]); | ||
let fourth_input = VidpfInput::from_bytes(&[32u8, 0u8, 1u8, 4u8][..]); | ||
let fifth_input = VidpfInput::from_bytes(&[0u8, 0u8, 1u8, 4u8][..]); | ||
let first_prefix = VidpfInput::from_bools(&[false, false, true]); | ||
let second_prefix = VidpfInput::from_bools(&[false]); | ||
let third_prefix = VidpfInput::from_bools(&[true]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put this in an array and reference the indices of the array below.
let first_input = VidpfInput::from_bytes(&[240u8, 0u8, 1u8, 4u8][..]); | |
let second_input = VidpfInput::from_bytes(&[112u8, 0u8, 1u8, 4u8][..]); | |
let third_input = VidpfInput::from_bytes(&[48u8, 0u8, 1u8, 4u8][..]); | |
let fourth_input = VidpfInput::from_bytes(&[32u8, 0u8, 1u8, 4u8][..]); | |
let fifth_input = VidpfInput::from_bytes(&[0u8, 0u8, 1u8, 4u8][..]); | |
let first_prefix = VidpfInput::from_bools(&[false, false, true]); | |
let second_prefix = VidpfInput::from_bools(&[false]); | |
let third_prefix = VidpfInput::from_bools(&[true]); | |
let prefixes = [ | |
VidpfInput::from_bytes(&[240u8, 0u8, 1u8, 4u8][..]), | |
VidpfInput::from_bytes(&[112u8, 0u8, 1u8, 4u8][..]), | |
... and so on | |
]; |
let (public_share, input_shares) = | ||
mastic.shard(&(first_input.clone(), false), &nonce).unwrap(); | ||
run_vdaf_prepare( | ||
&mastic, | ||
&verify_key, | ||
&first_agg_param, | ||
&nonce, | ||
public_share, | ||
input_shares, | ||
) | ||
.unwrap(); | ||
|
||
let (public_share, input_shares) = mastic.shard(&(first_input, true), &nonce).unwrap(); | ||
run_vdaf_prepare( | ||
&mastic, | ||
&verify_key, | ||
&first_agg_param, | ||
&nonce, | ||
public_share, | ||
input_shares, | ||
) | ||
.unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should use run_vdaf()
to run the whole thing end-to-end, including aggregation and collection:
Line 486 in d2fe428
pub fn run_vdaf<V, M, const SEED_SIZE: usize>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Partial review, of the SZK module:
} | ||
} | ||
|
||
/// Vidpf evaluation state | ||
/// | ||
/// Contains the values produced during input evaluation at a given level. | ||
#[derive(Debug)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't currently gate any Debug
impls on that feature flag. A number of structs currently derive Debug
, like Seed
, Prio3InputShare
, Poplar1InputShare
.
/// Returned when a user fails to store the length of the verifier so it | ||
/// can be properly read upon receipt. | ||
#[error("Part of Szk query state not stored")] | ||
InvalidState(String), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This error variant is no longer used for this purpose. It's now used when the leader share and helper share have a mismatched presence/absence of joint randomness parts. Since that error condition is not related to the prepare state, I think we can remove this InvalidState
variant.
} | ||
} | ||
|
||
/// Joint share type for the SZK proof. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this needs some further explanation, since this is a novel term.
/// Joint share type for the SZK proof. | |
/// Joint share type for the SZK proof. | |
/// | |
/// This is produced by [`Szk::merge_verifiers`] as the result of combining two query shares. | |
/// It contains the joint randomness parts from each aggregator, if applicable. It is consumed by [`Szk::decide`]. |
Or, taking https://github.com/divviup/libprio-rs/pull/1107/files#r1823527498 into account:
/// Joint share type for the SZK proof. | |
/// Joint share type for the SZK proof. | |
/// | |
/// This is produced by [`Szk::merge_verifiers`] as the result of combining two query shares. | |
/// It contains the re-computed joint randomness seed, if applicable. It is consumed by [`Szk::decide`]. |
if !check_flp_proof { | ||
return Ok(false); | ||
} | ||
joint_share: &[Seed<SEED_SIZE>; 2], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could better encapsulate the SZK logic if this took in a &SzkJointShare
instead. This would let us remove the match block from Mastic::prepare_next()
.
) { | ||
(Some(leader_part), Some(helper_part)) => Ok(Some([leader_part, helper_part])), | ||
(None, None) => Ok(None), | ||
_ => Err(SzkError::InvalidState( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned above, this error case is no longer about the prepare state, since this just depends on query shares.
@@ -5,19 +5,22 @@ | |||
//! [draft-mouris-cfrg-mastic-01]: https://www.ietf.org/archive/id/draft-mouris-cfrg-mastic-01.html | |||
|
|||
use crate::{ | |||
bt::{BinaryTree, Path}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove #![allow(dead_code)]
etc. from the bt
module now that this is using it?
Relatedly, I think we ought to make the bt
module public now, since BinaryTree
appears in the arguments of the public Vidpf::eval_with_cache()
method.
if cache_tree.root.is_none() { | ||
cache_tree.root = Some(Box::new(Node::new(VidpfEvalCache { | ||
state: VidpfEvalState::init_from_key(id, key), | ||
share: W::zero(&self.weight_parameter), // not used | ||
}))); | ||
} | ||
|
||
let mut sub_tree = cache_tree.root.as_mut().expect("root was visited"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can combine these and remove the .expect()
by using Option::get_or_insert_with()
if cache_tree.root.is_none() { | |
cache_tree.root = Some(Box::new(Node::new(VidpfEvalCache { | |
state: VidpfEvalState::init_from_key(id, key), | |
share: W::zero(&self.weight_parameter), // not used | |
}))); | |
} | |
let mut sub_tree = cache_tree.root.as_mut().expect("root was visited"); | |
let mut sub_tree = cache_tree.root.get_or_insert_with(|| { | |
Box::new(Node::new(VidpfEvalCache { | |
state: VidpfEvalState::init_from_key(id, key), | |
share: W::zero(&self.weight_parameter), // not used | |
})) | |
}); |
} | ||
let mut cs = Vec::<VidpfProof>::with_capacity(*bits); | ||
for _ in 0..*bits { | ||
let mut proof = [0u8; 32]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: another place where we can use this constant
let mut proof = [0u8; 32]; | |
let mut proof = [0u8; VIDPF_PROOF_SIZE]; |
let mut bytes = vec![]; | ||
public.encode(&mut bytes).unwrap(); | ||
|
||
assert_eq!(public.encoded_len().unwrap(), bytes.len()); | ||
|
||
let decoded = VidpfPublicShare::<TestWeight>::decode_with_param( | ||
&(8, TEST_WEIGHT_LEN), | ||
&mut Cursor::new(&bytes), | ||
) | ||
.unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Using the get_... helper methods can save some boilerplate, and gets us a check for unused bytes for free.
let mut bytes = vec![]; | |
public.encode(&mut bytes).unwrap(); | |
assert_eq!(public.encoded_len().unwrap(), bytes.len()); | |
let decoded = VidpfPublicShare::<TestWeight>::decode_with_param( | |
&(8, TEST_WEIGHT_LEN), | |
&mut Cursor::new(&bytes), | |
) | |
.unwrap(); | |
let bytes = public.get_encode().unwrap(); | |
assert_eq!(public.encoded_len().unwrap(), bytes.len()); | |
let decoded = VidpfPublicShare::<TestWeight>::get_decoded_with_param( | |
&(8, TEST_WEIGHT_LEN), | |
&bytes, | |
) | |
.unwrap(); |
/// Mastic prepare state. | ||
/// | ||
/// State held by an aggregator between rounds of Mastic preparation. Includes intermediate | ||
/// state for [`Szk``] verification, the output shares currently being validated, and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: typo
/// state for [`Szk``] verification, the output shares currently being validated, and | |
/// state for [`Szk`] verification, the output shares currently being validated, and |
@@ -226,6 +225,57 @@ impl<W: VidpfValue, const NONCE_SIZE: usize> Vidpf<W, NONCE_SIZE> { | |||
}) | |||
} | |||
|
|||
/// [`Vidpf::eval_with_cache`] evaluates the entire `input` and produces a share of the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here and elsewhere, we can drop the name of methods from their documentation, since this will appear just below the full method signature.
/// [`Vidpf::eval_with_cache`] evaluates the entire `input` and produces a share of the | |
/// Evaluates the entire `input` and produces a share of the |
let second_prefix = VidpfInput::from_bools(&[false]); | ||
let third_prefix = VidpfInput::from_bools(&[true]); | ||
let mastic = Mastic::new(algorithm_id, szk, sum_vidpf, 32); | ||
let first_agg_param = MasticAggregationParam::new(vec![first_prefix], true).unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add some test cases covering require_weight_check = false
.
|
||
/// Mastic prepare share. | ||
/// | ||
/// Broadcast message from an aggregator between rounds of Mastic. Includes the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a suggested rephrasing
/// Broadcast message from an aggregator between rounds of Mastic. Includes the | |
/// Broadcast message from an aggregator during preparation. Includes the |
None | ||
}; | ||
let (prep_share, prep_state) = | ||
if let Some((szk_query_share, szk_query_state)) = szk_verify_opt { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this code could be further simplified by fusing this if-else and the if-else above. szk_verify_opt
is constructed by the previous if-else block, and immediately destructured here, with no other uses
result.push( | ||
self.szk | ||
.typ | ||
.decode_result(&self.szk.typ.truncate(encoded_result.to_vec())?[..], 1)?, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we need to add the counter prefix to the encoded measurement, from sharding through to here. Once that's addressed in a follow-up, I think we'll want convenience constructors for Mastic with particular circuits, to take care of constructing the Szk
and Vidpf
objects, and calculating the VIDPF weight parameter from szk.typ.output_len() + 1
.
result.push( | ||
self.szk | ||
.typ | ||
.decode_result(&self.szk.typ.truncate(encoded_result.to_vec())?[..], 1)?, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also noticed that the spec currently calls truncate during preparation, before producing an output share, as opposed to during unsharding.
Implements preparation, aggregation, and unsharding for the Mastic VDAF, and adjusts the VIDPF and SZK modules in support of this development.