
Make n-t-tcp fully asynchronous... Maybe.. #428

Open
hyperthunk opened this issue Nov 23, 2018 · 2 comments

@hyperthunk
Member

Folks whom I've assigned this to - I'm just looking for feedback and suggestions as to where I can look in the code to potentially fix this issue. I know you're all off doing other things!

Perhaps an issue for network-transport to discuss initially!?

There are a number of fundamental issues that make things a struggle in Cloud Haskell land...

In distributed-process, the node controller can block when remote nodes are unavailable, which is less than ideal. This is the first issue that needs to be solved, since the NC blocking on network calls is a massive bottleneck for CH.

I am very aware that we do not wish to introduce unbounded buffers into the pipeline. It's a design point I completely agree with.

How can I go about removing this potential bottleneck, and what other implementations out there ought I to be looking at?

@facundominguez
Contributor

IIRC, connect followed by send would need to be non-blocking.

One part of it is making connect non-blocking.

And another part of it is making send non-blocking. Perhaps it can be fixed by having a bounded queue of messages in unreliable connections. If a send is started while the connection is still being established, we add the message to the queue and have send return control to the caller. If the queue is full, the message is discarded. If the connection is established and the send buffer is full, we also discard the message. There is probably no need to report an error to the caller when messages are discarded, since the connection is unreliable.
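
To make the bounded-queue idea concrete, here is a minimal sketch using `TBQueue` from the stm package. It is not code from network-transport-tcp: `SendOutcome`, `trySend` and `drainPending` are hypothetical names, and the sketch returns an outcome only so the drop is visible, even though a real send on an unreliable connection could simply ignore it.

```haskell
import Control.Concurrent.STM

-- | Hypothetical outcome of a non-blocking send (not part of network-transport).
data SendOutcome = Enqueued | Dropped
  deriving (Eq, Show)

-- | Enqueue a message without ever blocking. If the bounded queue is full,
-- the message is discarded, which is acceptable on an unreliable connection.
trySend :: TBQueue a -> a -> STM SendOutcome
trySend queue msg = do
  full <- isFullTBQueue queue
  if full
    then pure Dropped
    else writeTBQueue queue msg >> pure Enqueued

-- | Take everything queued so far, in FIFO order. A writer thread would call
-- this once the connection is established and push the messages to the socket.
drainPending :: TBQueue a -> STM [a]
drainPending = flushTBQueue
```

A caller would create the queue with something like `newTBQueueIO 64` and run `atomically (trySend queue msg)`. Because the check and the write happen in one STM transaction, the queue never grows past its bound and the caller never waits.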

One change that might be necessary in n-t is modifying the type of connect to:

- , connect :: EndPointAddress -> Reliability -> ConnectHints -> IO (Either (TransportError ConnectErrorCode) Connection)
+ , connect :: EndPointAddress -> Reliability -> ConnectHints -> IO Connection

and then add methods to Connection to poll or wait on the status.
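
A rough sketch of what "return a Connection immediately, then poll or wait on its status" could look like, again using only base and stm. Everything here is an assumption for illustration: `ConnStatus`, `AsyncConnection`, `connectAsync`, `pollStatus` and `waitStatus` do not exist in network-transport, and the handshake is passed in as a plain `IO ()` action rather than real socket code.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.STM
import Control.Exception (SomeException, try)

-- | Hypothetical connection states (not part of network-transport).
data ConnStatus = Connecting | Established | Failed String
  deriving (Eq, Show)

-- | Hypothetical handle that is returned before the handshake completes.
newtype AsyncConnection = AsyncConnection (TVar ConnStatus)

-- | Return a handle immediately and run the (possibly slow) handshake on a
-- background thread, recording success or failure in the status variable.
connectAsync :: IO () -> IO AsyncConnection
connectAsync handshake = do
  status <- newTVarIO Connecting
  _ <- forkIO $ do
    result <- try handshake :: IO (Either SomeException ())
    atomically . writeTVar status $
      either (Failed . show) (const Established) result
  pure (AsyncConnection status)

-- | Read the current status without blocking.
pollStatus :: AsyncConnection -> IO ConnStatus
pollStatus (AsyncConnection status) = readTVarIO status

-- | Block until the handshake has finished, successfully or not.
waitStatus :: AsyncConnection -> IO ConnStatus
waitStatus (AsyncConnection status) = atomically $ do
  s <- readTVar status
  case s of
    Connecting -> retry
    _          -> pure s
```

With this shape the node controller would only ever call `connectAsync` and `pollStatus`, so an unreachable remote node cannot stall it. Connection errors surface through the status instead of the return value of connect.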

That said, when I think that n-t-tcp wants to establish TCP connections to do unreliable communication, it makes me ponder whether this is the best strategy for communication between node controllers.

@hyperthunk
Member Author

Thank you! This is very helpful...

> That said, when I think that n-t-tcp wants to establish TCP connections to do unreliable communication, it makes me ponder whether this is the best strategy for communication between node controllers.

Erlang has the distribution carrier spawn a port driver (mapped to a process) for each inter-node connection. There is also the net_kernel process, which monitors remote nodes, performs bookkeeping, manages keep-alive (not TCP keep-alive, but the net_tick), and interacts with epmd. All of this infrastructure code is fully supervised...

I'm not sure what the answer to this is either, but it's definitely food for thought...

@LaurentRDC LaurentRDC transferred this issue from haskell-distributed/network-transport-tcp Sep 3, 2024