Enabling ECN for Datacenter Networks with RTT Variations (CoNEXT 19)
Resilient Datacenter Load Balancing in the Wild (SIGCOMM 17)
Please cite either of the following papers if you are using our simulator. Thanks! :P
@inproceedings{zhang2017resilient,
title={Resilient datacenter load balancing in the wild},
author={Zhang, Hong and Zhang, Junxue and Bai, Wei and Chen, Kai and Chowdhury, Mosharaf},
booktitle={Proceedings of the Conference of the ACM Special Interest Group on Data Communication},
pages={253--266},
year={2017},
organization={ACM}
}
@inproceedings{zhang2019enabling,
title={Enabling ECN for datacenter networks with RTT variations},
author={Zhang, Junxue and Bai, Wei and Chen, Kai},
booktitle={Proceedings of the 15th International Conference on Emerging Networking Experiments And Technologies},
pages={233--245},
year={2019},
organization={ACM}
}
- Ubuntu + gcc-4.9 has been verified to be compatible with the project.
docker run -it gcc:4.9
- Clone the project.
git clone git@github.com:snowzjx/ns3-ecn-sharp.git
- Configuration.
cd ns3-ecn-sharp
./waf -d optimized --enable-examples configure
- To enable debug mode for logging, pass -d debug to the configuration step.
./waf -d debug --enable-examples configure
- Compile the simulator.
./waf
You can also directly use our docker image for this simulator.
docker run -it snowzjx/ns3-ecn-sharp:optimized
cd ~/ns3-ecn-sharp
The implementation of ECN# (ECN Sharp) is here:
The sojourn time is measured using ECNSharpTimestampTag.
When a packet is enqueued, we attach a timestamp tag to it.
ECNSharpTimestampTag tag;
p->AddPacketTag (tag);
GetInternalQueue (0)->Enqueue (item);
When a packet is dequeued, we calculate the sojourn time by subtracting the enqueue timestamp from the current timestamp.
Ptr<QueueDiscItem> item = StaticCast<QueueDiscItem> (GetInternalQueue (0)->Dequeue ());
Ptr<Packet> p = item->GetPacket ();
ECNSharpTimestampTag tag;
bool found = p->RemovePacketTag (tag);
if (!found)
{
NS_LOG_ERROR ("Cannot find the ECNSharp Timestamp Tag");
return NULL;
}
Time sojournTime = now - tag.GetTxTime ();
if (sojournTime > m_instantMarkingThreshold)
{
instantaneousMarking = true;
}
bool okToMark = OkToMark (p, sojournTime, now);
if (m_marking)
{
if (!okToMark)
{
m_marking = false;
}
else if (now >= m_markNext)
{
m_markCount ++;
m_markNext = now + ECNSharpQueueDisc::ControlLaw ();
persistentMarking = true;
}
}
else
{
if (okToMark)
{
m_marking = true;
m_markCount = 1;
m_markNext = now + m_persistentMarkingInterval;
persistentMarking = true;
}
}
if (instantaneousMarking || persistentMarking)
{
if (!ECNSharpQueueDisc::MarkingECN (item))
{
NS_LOG_ERROR ("Cannot mark ECN");
return item; // Hey buddy, if the packet is not ECN supported, we should never drop it
}
}
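The enqueue and dequeue logic above can be modelled outside ns-3 to see how the two marking modes interact. The following is a minimal standalone sketch in Python, not the simulator's code. The class `EcnSharpSketch`, the `Packet` dataclass, and the bodies of `_ok_to_mark` and `_control_law` are assumptions made for illustration: the excerpt above does not show `OkToMark` or `ControlLaw`, so CoDel-style versions are swapped in here. Times are plain numbers, taken as microseconds in the usage lines.

```python
import math
from collections import deque
from dataclasses import dataclass


@dataclass
class Packet:
    enqueue_time: float = 0.0  # plays the role of ECNSharpTimestampTag
    ecn_capable: bool = True
    ce_marked: bool = False


class EcnSharpSketch:
    """Hypothetical standalone model of sojourn-time-based ECN marking."""

    def __init__(self, instant_threshold, persistent_target, persistent_interval):
        self.instant_threshold = instant_threshold
        self.persistent_target = persistent_target
        self.persistent_interval = persistent_interval
        self.queue = deque()
        self.first_above_time = None  # time at which the target counts as persistently exceeded
        self.marking = False
        self.mark_count = 0
        self.mark_next = 0.0

    def enqueue(self, pkt, now):
        pkt.enqueue_time = now  # timestamp the packet on arrival
        self.queue.append(pkt)

    def _ok_to_mark(self, sojourn, now):
        # Assumption: CoDel-style test, true once the sojourn time has stayed
        # at or above the target for a full interval.
        if sojourn < self.persistent_target:
            self.first_above_time = None
            return False
        if self.first_above_time is None:
            self.first_above_time = now + self.persistent_interval
            return False
        return now >= self.first_above_time

    def _control_law(self):
        # Assumption: CoDel-style spacing between marks, interval / sqrt(count).
        return self.persistent_interval / math.sqrt(self.mark_count)

    def dequeue(self, now):
        if not self.queue:
            return None
        pkt = self.queue.popleft()
        sojourn = now - pkt.enqueue_time
        instantaneous = sojourn > self.instant_threshold
        persistent = False
        ok = self._ok_to_mark(sojourn, now)
        if self.marking:
            if not ok:
                self.marking = False
            elif now >= self.mark_next:
                self.mark_count += 1
                self.mark_next = now + self._control_law()
                persistent = True
        elif ok:
            self.marking = True
            self.mark_count = 1
            self.mark_next = now + self.persistent_interval
            persistent = True
        if (instantaneous or persistent) and pkt.ecn_capable:
            pkt.ce_marked = True
        return pkt  # the packet is returned even when it cannot be marked


# Usage: a steady sojourn time of 20 stays below the instantaneous threshold
# (70) but above the persistent target (10) for longer than one interval (70).
q = EcnSharpSketch(instant_threshold=70, persistent_target=10, persistent_interval=70)
for t in range(0, 110, 10):
    q.enqueue(Packet(), t)
    pkt = q.dequeue(t + 20)
    print(t + 20, pkt.ce_marked)
```

In this model a short burst is marked only if a packet waits longer than the instantaneous threshold, while a small but long-lived queue is marked through the persistent path.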
Run the large-scale program for this experiment:
./waf --run "large-scale --help"
Please note that the default simulation time is very short. Tune it by setting --EndTime and --FlowLaunchEndTime to obtain results similar to those in our paper.
In this program, TCN is identical to RED because here we only use one queue.
You should run:
./waf --run "large-scale --randomSeed=233 --load=0.6 --ID=TCN_High --AQM=TCN --TCNThreshold=70"
./waf --run "large-scale --randomSeed=233 --load=0.6 --ID=TCN_Low --AQM=TCN --TCNThreshold=30"
./waf --run "large-scale --randomSeed=233 --load=0.6 --ID=ECNSharp --AQM=ECNSharp --ECNShaprInterval=70 --ECNSharpTarget=10 --ECNShaprMarkingThreshold=70"
to compare ECN# against RED with marking thresholds calculated from the tail RTT and from the average RTT.
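If you want to repeat the comparison at several loads, the three command lines can be generated rather than typed by hand. The helper below is a small sketch, and `build_commands` is a hypothetical name that is not part of the repository. It only reuses the flags and values shown in the commands above, and it prints the commands instead of running them.

```python
def build_commands(load, seed=233):
    """Return the three ./waf command lines for one load value."""
    base = f"large-scale --randomSeed={seed} --load={load}"
    # Flag spellings are copied verbatim from the commands in this README.
    variants = {
        "TCN_High": "--AQM=TCN --TCNThreshold=70",
        "TCN_Low": "--AQM=TCN --TCNThreshold=30",
        "ECNSharp": "--AQM=ECNSharp --ECNShaprInterval=70 "
                    "--ECNSharpTarget=10 --ECNShaprMarkingThreshold=70",
    }
    return [
        f'./waf --run "{base} --ID={name} {opts}"'
        for name, opts in variants.items()
    ]


# Usage: print the commands for two load levels.
for load in (0.4, 0.6):
    for cmd in build_commands(load):
        print(cmd)
```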
After the simulation finishes, you will get a flow monitor file. The file is in XML format and can be parsed by the fct_parser.py script. Please note that our flow monitor is slightly different from the original version (some bugs are fixed).
python examples/rtt-variations/fct_parser.py Large_Scale_TCN_High_4X4_TCN_DcTcp_0.6.xml
python examples/rtt-variations/fct_parser.py Large_Scale_TCN_Low_4X4_TCN_DcTcp_0.6.xml
python examples/rtt-variations/fct_parser.py Large_Scale_ECNSharp_4X4_ECNSharp_DcTcp_0.6.xml
You can obtain the results as follows. Here we give a sample with default parameters (short simulation time) only to demonstrate the trends.
For ECN#:
...
AVG FCT: 0.009724
AVG Large flow FCT: 0.075342
AVG Small flow FCT: 0.001556
AVG Small flow 99 FCT: 0.008763
...
For RED (TCN) with marking threshold calculated based on high percentile RTT:
...
AVG FCT: 0.010133
AVG Large flow FCT: 0.073115
AVG Small flow FCT: 0.002375
AVG Small flow 99 FCT: 0.009398
...
For RED (TCN) with marking threshold calculated based on average RTT:
...
AVG FCT: 0.009593
AVG Large flow FCT: 0.081166
AVG Small flow FCT: 0.001483
AVG Small flow 99 FCT: 0.008892
...
We can see that RED suffers from either throughput loss (poor FCT for all flows and for large flows) or increased latency (poor average and tail FCT for short flows).
ECN# simultaneously delivers high throughput and low latency.
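To make the summary lines above (AVG FCT, large flow FCT, small flow FCT, small flow 99th percentile) concrete, here is a rough sketch of how such statistics can be derived from a flow monitor file. It is not fct_parser.py. The element and attribute names follow the stock ns-3 FlowMonitor XML output, and the modified flow monitor in this repository may differ. The 100 KB and 10 MB size cut-offs and the inline `SAMPLE` data are assumptions chosen only for illustration.

```python
import math
import xml.etree.ElementTree as ET

# Hypothetical sample in the style of stock ns-3 FlowMonitor output.
SAMPLE = """<FlowMonitor>
  <FlowStats>
    <Flow flowId="1" timeFirstTxPacket="+1000000.0ns" timeLastRxPacket="+2000000.0ns" rxBytes="50000" />
    <Flow flowId="2" timeFirstTxPacket="+1000000.0ns" timeLastRxPacket="+4000000.0ns" rxBytes="80000" />
    <Flow flowId="3" timeFirstTxPacket="+0.0ns" timeLastRxPacket="+100000000.0ns" rxBytes="20000000" />
  </FlowStats>
</FlowMonitor>"""


def parse_ns(value):
    """Convert a string such as '+1000000.0ns' to seconds."""
    if not value.endswith("ns"):
        raise ValueError(f"unexpected time format: {value}")
    return float(value[:-2]) * 1e-9


def mean(values):
    return sum(values) / len(values) if values else float("nan")


def percentile(values, pct):
    """Nearest-rank percentile; returns NaN for an empty list."""
    if not values:
        return float("nan")
    ordered = sorted(values)
    rank = math.ceil(pct / 100.0 * len(ordered))
    return ordered[max(rank, 1) - 1]


def summarize(xml_text, small_bytes=100_000, large_bytes=10_000_000):
    """Compute FCT statistics; the size cut-offs are assumed, not the script's."""
    root = ET.fromstring(xml_text)
    all_fct, small, large = [], [], []
    for flow in root.iter("Flow"):
        if "timeFirstTxPacket" not in flow.attrib:
            continue  # skip Flow elements that carry no timing statistics
        fct = parse_ns(flow.get("timeLastRxPacket")) - parse_ns(flow.get("timeFirstTxPacket"))
        size = int(flow.get("rxBytes"))
        all_fct.append(fct)
        if size < small_bytes:
            small.append(fct)
        elif size > large_bytes:
            large.append(fct)
    return {
        "avg": mean(all_fct),
        "avg_large": mean(large),
        "avg_small": mean(small),
        "p99_small": percentile(small, 99),
    }


for name, value in summarize(SAMPLE).items():
    print(f"{name}: {value:.6f}")
```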
Run the queue-track program for this experiment:
./waf --run "queue-track --help"
You can use gnuplot to plot the queue length.
gnuplot Queue_Track_ ... .plt
The results are as follows. We can see that ECN# simultaneously mitigates persistent queue buildups and tolerates traffic burstiness.
Run the mq program for this experiment:
./waf --run "mq --help"
Both TCN and ECN# will output the throughput of all 3 flows. The results should be similar to the following, which shows that both strategies preserve the packet scheduling policy.
...
Flow: 0, throughput (Gbps): 9.57376
Flow: 1, throughput (Gbps): 0
Flow: 2, throughput (Gbps): 0
...
Flow: 0, throughput (Gbps): 6.55424
Flow: 1, throughput (Gbps): 3.01392
Flow: 2, throughput (Gbps): 0
...
Flow: 0, throughput (Gbps): 4.86864
Flow: 1, throughput (Gbps): 2.42928
Flow: 2, throughput (Gbps): 2.2372
...
When analyzing the FCT of all flows, we should obtain the following results. They show that ECN# achieves much better FCT for short flows by mitigating unnecessary persistent queue buildups.
For ECN#:
...
AVG Small flow FCT: 0.001329
AVG Small flow 99 FCT: 0.001694
...
For TCN:
...
AVG Small flow FCT: 0.002105
AVG Small flow 99 FCT: 0.005068
...
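The size of the gap is easier to read as a relative reduction. The short calculation below uses only the four small-flow numbers reported above. The helper name `reduction` is introduced here for illustration.

```python
def reduction(baseline, improved):
    """Relative reduction of `improved` with respect to `baseline`, in percent."""
    return (baseline - improved) / baseline * 100.0


# Small-flow FCT values copied from the sample output above.
tcn = {"AVG Small flow FCT": 0.002105, "AVG Small flow 99 FCT": 0.005068}
ecn_sharp = {"AVG Small flow FCT": 0.001329, "AVG Small flow 99 FCT": 0.001694}

for metric, baseline in tcn.items():
    pct = reduction(baseline, ecn_sharp[metric])
    print(f"{metric}: {pct:.1f}% lower with ECN# than with TCN")
```

With these sample values the average small-flow FCT is roughly a third lower and the 99th percentile roughly two thirds lower. The exact figures depend on the simulation parameters.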
We have implemented the following transport protocols, tc modules, and load balancing schemes in this simulator.
- ECN#
- RED
- TCN
- CoDel (with ECN)
- DWRR
- WFQ