Using TCP/IP for DSM #2
Hi,

Unfortunately, we haven't completed this part of the code. If you do want to use TCP, here's some advice:

Multiple vCPU threads share one communication channel (backed by TCP or RDMA) to another node. When a thread sends a request over the channel, you need to make sure the response it receives actually belongs to that thread. A mutex that protects the whole send->receive sequence is not recommended, unless you want to drown in a swamp of deadlocks.

What we do for RDMA is associate each send->receive pair with a transaction id (tx_add->txid). The code in ivy.c guarantees that whenever the DSM software issues a network transmission, a txid is generated in send, and the DSM software then tries to retrieve the response from receive with that txid. You may need to manage a buffer in ktcp.c. Consider how TCP itself handles out-of-order packets.

In addition, you may be disappointed to find that TCP is too slow to boot a vanilla Linux such as Ubuntu (lightweight experimental OSes like sv6 and Barrelfish are okay). The swap device may time out during boot, soft lockups may be triggered, etc. You probably know why few people research DSM in the 21st century.
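To make the buffer idea concrete, here is a minimal user-space sketch of a response buffer keyed by txid. It is not the project's code: `rx_buffer`, `rx_store`, `rx_take`, `RX_SLOTS` and `RX_MAX_BYTES` are hypothetical names and sizes chosen for illustration. It also leaves out locking and blocking, which a real `ktcp.c` shared by several vCPU threads would need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RX_SLOTS     64    /* hypothetical: max responses parked at once */
#define RX_MAX_BYTES 4096  /* hypothetical: largest payload, one page */

struct rx_slot {
    int           used;
    uint64_t      txid;
    size_t        len;
    unsigned char data[RX_MAX_BYTES];
};

struct rx_buffer {
    struct rx_slot slots[RX_SLOTS];
};

void rx_init(struct rx_buffer *b)
{
    assert(b != NULL);
    memset(b, 0, sizeof(*b));
}

/* Park a response that was just read off the socket.
 * Returns 0 on success, -1 if the payload is too large or no slot is free. */
int rx_store(struct rx_buffer *b, uint64_t txid, const void *data, size_t len)
{
    assert(b != NULL && data != NULL);
    if (len > RX_MAX_BYTES)
        return -1;
    for (int i = 0; i < RX_SLOTS; i++) {
        struct rx_slot *s = &b->slots[i];
        if (!s->used) {
            s->used = 1;
            s->txid = txid;
            s->len = len;
            memcpy(s->data, data, len);
            return 0;
        }
    }
    return -1;
}

/* Claim the response for one txid, regardless of arrival order.
 * Returns the payload length, -1 if that txid has not arrived yet,
 * or -2 if `out` is too small (the response stays parked). */
long rx_take(struct rx_buffer *b, uint64_t txid, void *out, size_t cap)
{
    assert(b != NULL && out != NULL);
    for (int i = 0; i < RX_SLOTS; i++) {
        struct rx_slot *s = &b->slots[i];
        if (s->used && s->txid == txid) {
            if (s->len > cap)
                return -2;
            memcpy(out, s->data, s->len);
            s->used = 0;  /* free the slot for later responses */
            return (long)s->len;
        }
    }
    return -1;
}
```

With this shape, a thread that sent txid 7 can call `rx_take(&buf, 7, ...)` and never sees the response for txid 9, even if that one arrived first. In a kernel implementation the `-1` case would presumably become a sleep until the receiver thread stores the matching txid, rather than a return to the caller.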
TCP shouldn't be that much slower... is this just a current implementation limitation?
Well, for a single packet delivery, TCP is ~10 times slower than RDMA. And the end-to-end results might be even worse (think about queuing theory). Some time-sensitive services in Linux (e.g., waiting for some devices) may fail without hacking the guest.
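One rough way to see why the end-to-end gap can exceed the per-packet ratio is a textbook M/M/1 queue, where the mean time a request spends in the system is $W = 1/(\mu - \lambda)$ for service rate $\mu$ and arrival rate $\lambda$. This is only a crude model of the shared channel, and the rates below are made-up illustrative numbers, not measurements from this project.

```latex
\begin{aligned}
W &= \frac{1}{\mu - \lambda} \\[4pt]
\text{RDMA (assumed } 10\,\mu\text{s per request):}\quad
  \mu &= 100{,}000\ \text{s}^{-1},\ \lambda = 8{,}000\ \text{s}^{-1}
  \;\Rightarrow\; W = \frac{1}{92{,}000}\ \text{s} \approx 10.9\,\mu\text{s} \\[4pt]
\text{TCP (assumed } 100\,\mu\text{s per request):}\quad
  \mu &= 10{,}000\ \text{s}^{-1},\ \lambda = 8{,}000\ \text{s}^{-1}
  \;\Rightarrow\; W = \frac{1}{2{,}000}\ \text{s} = 500\,\mu\text{s}
\end{aligned}
```

Under these assumptions a 10x slower service time turns into roughly a 46x longer time in the system, because the slower channel is much closer to saturation at the same request rate.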
I assume you are referring to latency, then?
Hi,
I am trying to compile with TCP/IP for network communication. However, when I compile, I get this error:
Any idea what the issue is?