Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TestWorld for sharded environments #982

Merged
merged 9 commits into from
Mar 20, 2024

Conversation

akoshelev
Copy link
Collaborator

This change introduces the ability to run very simple circuits on multiple shards in parallel. It is not possible to communicate between shards yet, but it is possible to use the same test infrastructure to create multiple shards and use PRSS inside them as well as provide the input for each shard and consume their output.

The next and hopefully final change will bring the ability to communicate across shards.

This change introduces the ability to run very simple circuits on multiple
shards in parallel. It is not possible to communicate between shards yet,
but it is possible to use the same test infrastructure to create multiple shards
and use PRSS inside them as well as provide the input for each shard and consume
their output.

The next and hopefully final change will bring the ability to communicate across
shards.
@akoshelev akoshelev requested a review from andyleiserson March 18, 2024 06:13
Copy link

codecov bot commented Mar 18, 2024

Codecov Report

Attention: Patch coverage is 88.11321% with 63 lines in your changes are missing coverage. Please review.

Project coverage is 89.17%. Comparing base (c531015) to head (50f1068).
Report is 12 commits behind head on main.

❗ Current head 50f1068 differs from pull request most recent head 2c6fc8e. Consider uploading reports for the commit 2c6fc8e to get more accurate results

Files Patch % Lines
ipa-core/src/protocol/context/semi_honest.rs 34.48% 19 Missing ⚠️
...a-core/src/helpers/transport/in_memory/handlers.rs 65.90% 15 Missing ⚠️
ipa-core/src/test_fixture/world.rs 95.29% 12 Missing ⚠️
ipa-core/src/protocol/context/validator.rs 25.00% 6 Missing ⚠️
ipa-core/src/sharding.rs 57.14% 6 Missing ⚠️
ipa-core/src/protocol/context/mod.rs 75.00% 3 Missing ⚠️
...pa-core/src/helpers/transport/in_memory/routing.rs 96.55% 1 Missing ⚠️
...a-core/src/helpers/transport/in_memory/sharding.rs 98.93% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #982      +/-   ##
==========================================
- Coverage   89.43%   89.17%   -0.26%     
==========================================
  Files         163      167       +4     
  Lines       21897    22805     +908     
==========================================
+ Hits        19584    20337     +753     
- Misses       2313     2468     +155     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Collaborator

@andyleiserson andyleiserson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might not be necessary to provide type safety for shard index vs. helper identity in the transport layer, since we have the context layer above it to provide safer APIs to protocols.

Do you imagine it will be possible to cast a sharded context into a non sharded context? Because of the potential of shard locality to leak information, I feel like protocols written against a sharded context are more dangerous, and call for more scrutiny, than protocols written against a non-sharded context.

ipa-core/src/helpers/mod.rs Outdated Show resolved Hide resolved
/// Helper trait to bind in-memory request handlers to transport identity.
pub trait IdentityHandlerExt: TransportIdentity {
type Handler: RequestHandler<Self>;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do these new traits really need to be different for in-memory vs. network transport?

One strategy to simplify might be to move TransportCallbacks up a level. Instead of having five special purpose Box<dyn> handlers in TransportCallbacks, use Box<dyn Handler> (or Box<dyn TransportMessageHandler> for a more specific name) as the type of the callbacks: I::Handler argument.

The relationship amongst these traits also seems complicated to me. It might be possible to define something like:

trait Transport {
   type Identity;
   type Handler;
}

impl Transport for ShardTransport {
   type Identity = ShardIndex;
   type Handler = ...;
}

impl Transport for MpcTransport {
   type Identity = HelperIdentity;
   type Handler = HelperRequestHandler;
}

(This doesn't by itself do anything about the duplication between net and memory transports, however.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of moving generic parameter to associated type. I also agree that this stuff is complicated, don't see a clear way how to make it easier to follow.

The main idea here was to keep listen method the same for both shard and MPC in-memory transports. This method handles all incoming requests, but for shard traffic only Addr::Records request is valid, everything else only applies to MPC-MPC or client-MPC.

The handler stuff does not apply to net transports as callbacks are handled at the higher level. So adding Handler associated type to Transport may make HttpTransport implementation look clunky.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was hoping that using dynamic dispatch might make it easier to make pieces of this the same across transport types (helper, and shard, and ideally HTTP as well).

Maybe something like fn listen takes Option<&dyn Handler> as the callbacks argument? (The HTTP transport doesn't have fn listen, but it does use TransportCallbacks.) The shard transport can omit the handler entirely (because it only needs to handle records, at least for now), and for all transport types they no longer need to enumerate the management request types ("prepare query", "receive query", etc.), which feels like something that transports don't need to know about, and that shouldn't need to behave differently across the different transport types.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed IdentityHandlerExt in favour of ListenerSetup - makes it a little less confusing imo.
I am not sure I fully understand your proposal, will sync up with you offline

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a look at ListenerSetup and I like it 👍.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed it offline and agreed that TransportCallbacks needs to go away. A better interface for handlers would be similar to this one.

It is still worth to encapsulate the delivery method inside Transport interface, so messages delivered via HTTP or potentially CF workers channel have the same structure when handled by application logic. We may have a good abstraction for it already - Addr struct used inside in-memory network.

So our handler interface may look like this

trait Handler<I: TransportIdentity> {
   fn handle(&self, _: &Addr<I>) -> Result<Response>
}

HTTP transport can parse HTTP requests into Addr struct and application will have a unified logic to process requests based on Addr:RouteId field

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This issue is actually blocking now because prepare_inputs handler will require all transports to properly initialize the gateway. Naive approach would make MPC transport (that handles callbacks) to depend on Shard transport implementation (just to pass it to Gateway) which would make things way more complicated.

I am thinking that I would implement something like this

flowchart LR
  HelperApp --> Transports
  HelperApp --> QueryProcessor
  HelperApp --> Handler
  Handler --> Transports
  Handler --> QueryProcessor
Loading


/// A unique zero-based index of the helper shard.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct ShardIndex(u32);

#[derive(Debug, Copy, Clone)]
pub struct Shard {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might call this Sharded? I like how SomeType<Sharded> reads better than SomeType<Shard>. Right now it's purely shard identity information, but Context<Sharded> feels like a better home for a get-channel-to-shard method than Context<Shard>.

(NoSharding could be renamed to NotSharded for symmetry, but that's a super nit.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind renaming it, We already have Sharded struct inside test infrastructure, but it may not be a big deal to have two

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Too annoying to have two, renamed Sharded struct defined inside TestWorld to WithShards

inner: Inner<'a>,
gate: Gate,
total_records: TotalRecords,
sharding: B,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this make more sense as part of Gateway, to avoid frequent cloning of static information?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was imagining Gateway to be the same for sharded vs non-sharded world because that struct is complicated enough with the stall detection wrapper around it. It would just reference both transports and provide means to open shard-to-shard and mpc channels.

Depending on the context used, shard-to-shard channels will or will not be be accessible by MPC circuits.

I agree that cloning this stuff often may be less ideal - one way to mitigate that is to move this to Inner struct - the downside will be that this will have to be done for all context types separately. I am going to add a comment that we may need to do it later (shouldn't be too hard)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I am wondering why the shard identity should be returned out of the context, whereas the role is returned by asking the gateway:

    fn role(&self) -> Role {
        self.inner.gateway.role()
    }

Putting stuff in Inner doesn't change when it gets cloned -- I'm not sure why Base and Inner are two different things.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggested that this also exist on the gateway.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point now. Yea, I think I can just return it from shard transport and when creating a gateway for non-sharded execution provide a panicking shard transport to get a runtime error. Maybe parametrizing Gateway can turn this into a compile error, but for some reason I seem to prefer not to do it

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started changing Gateway in the next PR, in this change I won't be able to get it, so I'll follow up on it

@akoshelev
Copy link
Collaborator Author

It might not be necessary to provide type safety for shard index vs. helper identity in the transport layer, since we have the context layer above it to provide safer APIs to protocols.

I tried to re-use the existing functionality that we implemented for in-memory infrastructure and it seems that the only difference between MPC and shard channels is the identities used to authenticate peers, hence the change at the Transport trait. Gateway needs to understand that there are two different transport interfaces used and generic type seemed like the best way to provide that, but I am certainly open for suggestions here.

Do you imagine it will be possible to cast a sharded context into a non sharded context? Because of the potential of shard locality to leak information, I feel like protocols written against a sharded context are more dangerous, and call for more scrutiny, than protocols written against a non-sharded context.

I agree, that's why I spent quite a few hours modifying the Runner and parametrizing it with context type. Sharded circuits must be given an instance of ShardedContext and that thing cannot be converted to regular semi-honest or malicious context.

@andyleiserson
Copy link
Collaborator

I agree, that's why I spent quite a few hours modifying the Runner and parametrizing it with context type. Sharded circuits must be given an instance of ShardedContext and that thing cannot be converted to regular semi-honest or malicious context.

I was thinking of something like the malicious::Upgraded::as_base method. If the shuffle happens within oprf_ipa, then oprf_ipa is going to need to take a sharded context. But only the shuffle (and maybe the aggregation) actually needs the sharded context -- everything else that is called withing oprf_ipa can use a non-sharded context.

.iter()
.collect::<Vec<_>>()
.try_into()
.unwrap_or_else(|_| unreachable!())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that just unwrap() will achieve nearly the same thing. Is this due to a lack of a Debug impl?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, exactly

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been using .ok().unwrap() in this situation. (Not worth an edit just to change that, more of a side note.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok().unwrap() is strictly better, worth changing it

replace it with `ListenerSetup` trait that is hopefully less confusing to use
@akoshelev
Copy link
Collaborator Author

I was thinking of something like the malicious::Upgraded::as_base method. If the shuffle happens within oprf_ipa, then oprf_ipa is going to need to take a sharded context. But only the shuffle (and maybe the aggregation) actually needs the sharded context -- everything else that is called withing oprf_ipa can use a non-sharded context.

I am not sure if we want to make that distinction. ShardedContext should be a fully-functional semi-honest/malicious context that allows to run MPC with the same shards on other helpers. I am not sure I see a reason why we want to hide the fact that it is sharded from attribution and intra-shard aggregation steps (note that OPRF resharding also needs sharded context)

It feels a bit error prone to change the context types during the execution at least for now when we have the same shard instance running the whole OPRF IPA

@akoshelev akoshelev merged commit 4949f06 into private-attribution:main Mar 20, 2024
9 checks passed
@akoshelev akoshelev deleted the sharded-test-world branch March 20, 2024 22:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants