Make n-t-tcp fully asynchronous... Maybe.. #428

hyperthunk · 2018-11-23T23:23:28Z

Folks whom I've assigned this to - I'm just looking for feedback and suggestions as to where I can look in the code to potentially fix this issue. I know you're all off doing other things!

Perhaps an issue for network-transport to discuss initially!?

There are a number of fundamental issues that make things a struggle in Cloud Haskell land...

In distributed-process, the node controller can block when remote nodes are unavailable, which is less than ideal. This is the first issue that needs to be solved, since the NC blocking on network calls is a massive bottleneck for CH.

I am very aware that we do not wish to introduce unbounded buffers into the pipeline. It's a design point I completely agree with.

How can we go about removing this potential bottleneck, and what other implementations out there ought I to be looking at?

facundominguez · 2018-11-26T20:15:31Z

IIRC, connect followed by send would need to be non-blocking.

One part of it is making connect non-blocking.

And another part of it is making send non-blocking. Perhaps it can be fixed by having a bounded queue of messages in unreliable connections. If a send is started while the connection is still being established, we add the message to the queue and have send return control to the caller. If the queue is full, the message is discarded. If the connection is established and the send buffer is full, we also discard the message. There is probably no need to report an error to the caller when messages are discarded, since the connection is unreliable.
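To make the idea concrete, here is a minimal sketch of such a drop-on-full queue, built on `TBQueue` from the `stm` package. The names `SendQueue`, `newSendQueue`, `trySend` and `nextOutgoing` are hypothetical and not part of network-transport; the payload type is left polymorphic, whereas real n-t code would carry `[ByteString]`.

```haskell
import Control.Concurrent.STM
  ( TBQueue, atomically, isFullTBQueue, newTBQueueIO
  , readTBQueue, writeTBQueue )

-- | Hypothetical outbound buffer for one unreliable connection.
newtype SendQueue a = SendQueue (TBQueue a)

-- | Create a queue that holds at most 'capacity' pending messages.
newSendQueue :: Int -> IO (SendQueue a)
newSendQueue capacity = SendQueue <$> newTBQueueIO (fromIntegral capacity)

-- | Never blocks the caller: enqueue the message if there is room,
-- otherwise drop it. The result says whether the message was accepted;
-- a caller on an unreliable connection is free to ignore it.
trySend :: SendQueue a -> a -> IO Bool
trySend (SendQueue q) msg = atomically $ do
  full <- isFullTBQueue q
  if full
    then pure False
    else writeTBQueue q msg >> pure True

-- | Blocks until a message is available. Intended for the single
-- writer thread that drains the queue once the connection is up.
nextOutgoing :: SendQueue a -> IO a
nextOutgoing (SendQueue q) = atomically (readTBQueue q)
```

With this shape the queue stays bounded, so nothing unbounded sneaks into the pipeline, and the only thread that ever blocks is the writer draining the queue.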

One change that might be necessary in n-t is modifying the type of connect to:

- , connect :: EndPointAddress -> Reliability -> ConnectHints -> IO (Either (TransportError ConnectErrorCode) Connection)
+ , connect :: EndPointAddress -> Reliability -> ConnectHints -> IO Connection

and then add methods to Connection to poll or wait on the status.
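A rough sketch of what "poll or wait on the status" could look like, using a `TVar` that a background thread updates once the handshake finishes. Everything here (`ConnStatus`, `AsyncConn`, `connectAsync`, `pollStatus`, `waitEstablished`) is hypothetical naming for illustration, not existing n-t API, and the handshake is passed in as a plain `IO` action that is assumed to report failure through `Either` rather than by throwing.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.STM
  ( TVar, atomically, newTVarIO, readTVar, readTVarIO, retry, writeTVar )

-- | Hypothetical lifecycle of an asynchronously established connection.
data ConnStatus err = Connecting | Established | Failed err

-- | Hypothetical handle returned immediately by a non-blocking connect.
newtype AsyncConn err = AsyncConn (TVar (ConnStatus err))

-- | Start the handshake in a background thread and return at once.
-- Assumption: the handshake signals failure via 'Left' and does not throw.
connectAsync :: IO (Either err ()) -> IO (AsyncConn err)
connectAsync handshake = do
  var <- newTVarIO Connecting
  _ <- forkIO $ do
    result <- handshake
    atomically (writeTVar var (either Failed (const Established) result))
  pure (AsyncConn var)

-- | Non-blocking poll of the current status.
pollStatus :: AsyncConn err -> IO (ConnStatus err)
pollStatus (AsyncConn var) = readTVarIO var

-- | Block until the handshake has either succeeded or failed.
waitEstablished :: AsyncConn err -> IO (Either err ())
waitEstablished (AsyncConn var) = atomically $ do
  status <- readTVar var
  case status of
    Connecting  -> retry
    Established -> pure (Right ())
    Failed e    -> pure (Left e)
```

A caller such as the node controller could then use `pollStatus` and move on, while code that really needs the connection can still opt into blocking via `waitEstablished`.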

That said, when I think that n-t-tcp wants to establish TCP connections to do unreliable communication, it makes me ponder whether this is the best strategy for communication between node controllers.

hyperthunk · 2018-11-27T00:13:14Z

Thank you! This is very helpful...

> That said, when I think that n-t-tcp wants to establish TCP connections to do unreliable communication, it makes me ponder whether this is the best strategy for communication between node controllers.

Erlang has the distribution carrier spawn a port driver (mapped to a process) for each inter-node connection. There is also the net_kernel process, which monitors remote nodes and performs bookkeeping, manages keepalive (not TCP keepalive, but the net_tick), and interacts with epmd. All of this infrastructure code is fully supervised...

I'm not sure what the answer to this is either, but it's definitely food for thought...

hyperthunk assigned hyperthunk, mboes, qnikst, edsko, facundominguez and dcoutts Nov 23, 2018

LaurentRDC added the network-transport-tcp label Sep 3, 2024

LaurentRDC transferred this issue from haskell-distributed/network-transport-tcp Sep 3, 2024
