Merge pull request #522 from ywc689/performance
Performance
Showing 159 changed files with 5,869 additions and 162,236 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,284 @@ | ||
# DPVS Worker Performance Tuning | ||
|
||
This document explains how to achieve the best performance by tuning the CPU cores on which the DPVS process runs. | ||
|
||
### About DPVS Workers | ||
|
||
DPVS is a multi-threaded DPDK application. It is based on a "polling" framework: each thread enters an infinite loop and processes the jobs registered for it during the initialization stage. We call each such DPVS thread a DPVS Worker. There are currently three DPVS Worker types. | ||
|
||
* **Master Worker**: in charge of all the jobs from the control plane; | ||
* **Forwarding Worker**: the data plane worker in charge of packet receiving, processing, forwarding and all other jobs in the data plane; | ||
* **Isolate Receiving Worker**: an optional worker that takes over packet receiving from a Forwarding Worker to reduce imissed packets on the NIC. | ||
|
||
Like all other DPDK applications, each DPVS Worker is bound to a distinct CPU core so that the workers do not interfere with each other. By default, DPVS Workers are bound to the first N CPUs of the system. Performance may suffer when the system schedules many other workloads onto these CPUs. For example, CPU0, the first CPU core in the system, is generally much busier than the other CPU cores, because many processes, interrupts, and kernel threads run on it by default. The rest of this document explains how to offload unrelated workloads from the CPUs used by DPVS Workers. | ||
|
||
### When do you need to consider this performance tuning? | ||
|
||
Consider this performance tuning in any of the following situations. | ||
|
||
* The number of imissed packets from the NIC keeps increasing. You can get the imiss data using the following commands. | ||
|
||
``` | ||
dpip link show -s | ||
dpip link show [dpvs-nic-name] -s -i 2 -C | ||
``` | ||
|
||
* The DPVS log reports frequent msg timeouts. | ||
|
||
> Another possible cause of msg timeouts is that the configuration "ctrl_defs/lcore_msg/sync_msg_timeout_us" is too small. The default value of this configuration is 2000 us. | ||
* There are big worker loops. | ||
|
||
> To observe the worker loop time, uncomment the macro "CONFIG_RECORD_BIG_LOOP" in src/config.mk, recompile DPVS, and run it. | ||
> | ||
> Besides, the macros "BIG_LOOP_THRESH_SLAVE" and "BIG_LOOP_THRESH_MASTER" define the threshold time of a worker loop. Modify them if needed. | ||
### Optimize work load on CPUs for DPVS Worker | ||
|
||
**Optimize Kernel Command Line Parameters (Boot Options)** | ||
|
||
``` | ||
default_hugepagesz=1G hugepagesz=1G hugepages=[nr-hugepages] isolcpus=[cpu-list] nohz_full=[cpu-list] nowatchdog nmi_watchdog=0 rcu_nocbs=[cpu-list] rcu_nocb_poll | ||
``` | ||
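The `[cpu-list]` placeholder uses the kernel's cpu-list syntax, a comma-separated list of CPU IDs and ranges such as `1-5,11-15`. As a minimal sketch, the Python helpers below collapse a list of CPU IDs into that syntax and fill in the boot options shown above. The function names and the hugepage count used in the usage note are illustrative assumptions, not part of DPVS.

```python
def format_cpu_list(cpus):
    """Collapse CPU IDs into cpu-list syntax, e.g. [1, 2, 3, 11] -> "1-3,11"."""
    ids = sorted(set(cpus))
    parts = []
    i = 0
    while i < len(ids):
        j = i
        # Extend the range while the next ID is consecutive.
        while j + 1 < len(ids) and ids[j + 1] == ids[j] + 1:
            j += 1
        parts.append(str(ids[i]) if i == j else f"{ids[i]}-{ids[j]}")
        i = j + 1
    return ",".join(parts)


def boot_options(cpus, nr_hugepages):
    """Fill the boot option template from this document (hypothetical helper)."""
    cpu_list = format_cpu_list(cpus)
    return (
        f"default_hugepagesz=1G hugepagesz=1G hugepages={nr_hugepages} "
        f"isolcpus={cpu_list} nohz_full={cpu_list} nowatchdog nmi_watchdog=0 "
        f"rcu_nocbs={cpu_list} rcu_nocb_poll"
    )
```

For instance, `boot_options([1, 2, 3, 4, 5, 11, 12, 13, 14, 15], 16)` yields a line containing `isolcpus=1-5,11-15`; the value 16 for the hugepage count is only a placeholder.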
|
||
Refer to [Linux kernel parameters document](https://www.kernel.org/doc/Documentation/admin-guide/kernel-parameters.txt) for detailed explanations of the above boot options. | ||
|
||
Refer to [DPDK performance report](https://fast.dpdk.org/doc/perf/) for more boot and BIOS settings for high-performance DPDK applications. | ||
|
||
Set the boot options and reboot your system, then check the hardware and software interrupts on each CPU in the proc files `/proc/interrupts` and `/proc/softirqs`. The interrupts on the CPUs specified by "cpu-list" should be reduced dramatically. | ||
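One way to compare CPUs is to sum the per-CPU counters in `/proc/softirqs` and look at how much they grow between two snapshots. The sketch below assumes the usual layout of that file (a header row of `CPUn` columns followed by one row per softirq type); the function names are hypothetical.

```python
def softirq_totals(text):
    """Sum all softirq counters per CPU from the text of /proc/softirqs."""
    lines = text.strip().splitlines()
    cpus = lines[0].split()              # header row: CPU0 CPU1 ...
    totals = dict.fromkeys(cpus, 0)
    for line in lines[1:]:
        _name, *counts = line.split()    # e.g. "TIMER:", "123", "45", ...
        for cpu, count in zip(cpus, counts):
            totals[cpu] += int(count)
    return totals


def softirq_delta(before, after):
    """Per-CPU growth of the softirq totals between two snapshots."""
    old, new = softirq_totals(before), softirq_totals(after)
    return {cpu: new[cpu] - old.get(cpu, 0) for cpu in new}

# Usage sketch: read /proc/softirqs, sleep a few seconds, read it again,
# and pass both texts to softirq_delta(). Isolated CPUs should show a
# much smaller delta than the others.
```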
|
||
**How to choose CPU cores for DPVS Workers?** | ||
|
||
The [cpu-list] in the above kernel command line parameters is the list of CPU cores for DPVS Workers. | ||
|
||
Generally speaking, the following practical rules help to choose the CPU core list. | ||
|
||
* Avoid using the first CPU core, including both HT (Hyper-Threading) CPU cores of the first physical CPU core. | ||
* Turn off HT (Hyper-Threading) in the system if possible. | ||
* Do not use both HT CPU cores of one physical CPU core, unless the two workers are a "forwarding worker" and "isolate receiving worker" pair. | ||
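The rules above can be sketched as a small selection function. It assumes the HT sibling groups are already known, for example from the `thread_siblings_list` files under `/sys/devices/system/cpu/cpuN/topology/`, and it treats the physical core that contains CPU0 as the "first CPU core". The function name and the sample layout in the usage note are hypothetical.

```python
def pick_worker_cpus(sibling_groups, count):
    """Pick `count` CPUs, at most one per physical core, skipping the first core.

    sibling_groups: one list of CPU IDs per physical core,
                    e.g. [[0, 4], [1, 5], [2, 6], [3, 7]].
    """
    candidates = []
    for group in sorted(sorted(g) for g in sibling_groups):
        if 0 in group:
            continue                   # skip the first core and its HT sibling
        candidates.append(group[0])    # use only one HT CPU per physical core
    if len(candidates) < count:
        raise ValueError("not enough physical cores for the requested workers")
    return candidates[:count]
```

With the sample layout `[[0, 4], [1, 5], [2, 6], [3, 7]]`, asking for two CPUs returns `[1, 2]`. Pairing a forwarding worker with an isolate receiving worker on the two HT CPUs of one core is the exception named in the last rule and is not covered by this sketch.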
|
||
You can get the CPU layout of your system with the `cpu_layout.py` script provided by DPDK, as shown in the example below. | ||
|
||
``` | ||
[root@~ dpdk]# python dpdk-stable-18.11.2/usertools/cpu_layout.py | ||
====================================================================== | ||
Core and Socket Information (as reported by '/sys/devices/system/cpu') | ||
====================================================================== | ||
cores = [0, 1, 2, 3, 4, 8, 9, 10, 11, 12] | ||
sockets = [0, 1] | ||
Socket 0 Socket 1 | ||
-------- -------- | ||
Core 0 [0] [10] | ||
Core 1 [1] [11] | ||
Core 2 [2] [12] | ||
Core 3 [3] [13] | ||
Core 4 [4] [14] | ||
Core 8 [5] [15] | ||
Core 9 [6] [16] | ||
Core 10 [7] [17] | ||
Core 11 [8] [18] | ||
Core 12 [9] [19] | ||
``` | ||
|
||
There are 20 HT CPU cores, derived from 10 physical CPU cores, in my system. The following examples are based on this environment. | ||
|
||
**Example #1: Run DPVS with 1 master worker and 4 forwarding workers** | ||
|
||
* Master Worker: Cpu1 | ||
|
||
* Forwarding Worker: Cpu2 - Cpu5 | ||
|
||
* Boot Option Cpu List: Cpu1 - Cpu5, Cpu11 - Cpu15 | ||
|
||
Use the DPDK EAL common option `--lcores` to map system CPU IDs to DPVS Worker IDs. Boot DPVS with the command: | ||
|
||
``` | ||
./bin/dpvs -- --lcores 0@1,1@2,2@3,3@4,4@5 | ||
``` | ||
|
||
> The above command tells DPVS to map the system's Cpu1 - Cpu5 to DPVS Worker 0 - 4, respectively. | ||
Note: | ||
|
||
1. The DPVS Worker ID is specified in the DPVS configuration file as "worker_defs/worker [name]/cpu_id". | ||
2. The EAL common options `-c COREMASK` and `-l CORELIST` are not supported by DPVS, because they conflict with the DPVS Worker ID. | ||
|
||
``` | ||
### dpvs.conf worker config | ||
worker_defs { | ||
<init> worker cpu0 { | ||
type master | ||
cpu_id 0 # DPVS Master Worker ID | ||
} | ||
<init> worker cpu1 { | ||
type slave | ||
cpu_id 1 # DPVS Forwarding Worker ID | ||
port dpdk0 { | ||
rx_queue_ids 0 | ||
tx_queue_ids 0 | ||
} | ||
} | ||
<init> worker cpu2 { | ||
type slave | ||
cpu_id 2 # DPVS Forwarding Worker ID | ||
port dpdk0 { | ||
rx_queue_ids 1 | ||
tx_queue_ids 1 | ||
} | ||
} | ||
<init> worker cpu3 { | ||
type slave | ||
cpu_id 3 # DPVS Forwarding Worker ID | ||
port dpdk0 { | ||
rx_queue_ids 2 | ||
tx_queue_ids 2 | ||
} | ||
} | ||
<init> worker cpu4 { | ||
type slave | ||
cpu_id 4 # DPVS Forwarding Worker ID | ||
port dpdk0 { | ||
rx_queue_ids 3 | ||
tx_queue_ids 3 | ||
} | ||
} | ||
} | ||
``` | ||
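The relationship between system CPU IDs and DPVS Worker IDs in this example can be sketched as follows: worker IDs are assigned in order starting from 0, and each is pinned to the CPU at the same position in the list. The helper name is a hypothetical illustration; only the `worker_id@cpu_id` form used in this document is produced.

```python
def build_lcores(system_cpus):
    """Map DPVS Worker IDs 0..N-1 onto the given system CPU IDs, in order."""
    return ",".join(
        f"{worker_id}@{cpu}" for worker_id, cpu in enumerate(system_cpus)
    )


def dpvs_command(system_cpus):
    """Assemble the DPVS boot command in the form used in this document."""
    return f"./bin/dpvs -- --lcores {build_lcores(system_cpus)}"
```

Calling `dpvs_command([1, 2, 3, 4, 5])` reproduces the command of Example #1. The `cpu_id` values in `dpvs.conf` then refer to the left-hand side of each `worker_id@cpu_id` pair, not to the system CPU ID.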
|
||
Check CPU work load using the `top` command. The zero-idle CPUs are occupied by DPVS Workers. | ||
|
||
``` | ||
top - 17:46:28 up 2 days, 21:29, 1 user, load average: 8.35, 14.57, 16.19 | ||
Tasks: 244 total, 2 running, 241 sleeping, 0 stopped, 1 zombie | ||
%Cpu0 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu1 : 40.9 us, 59.1 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu2 :100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu3 :100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu4 :100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu5 :100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu6 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu7 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu8 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu9 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu10 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu11 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu12 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu13 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu14 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu15 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu16 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu17 : 0.3 us, 0.3 sy, 0.0 ni, 99.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu18 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu19 : 1.3 us, 0.3 sy, 0.0 ni, 98.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
KiB Mem : 65685352 total, 15709580 free, 48904136 used, 1071636 buff/cache | ||
KiB Swap: 4194300 total, 4194300 free, 0 used. 16183592 avail Mem | ||
``` | ||
|
||
**Example #2: Run DPVS with 1 master worker, 4 forwarding workers, and 4 isolate receiving workers** | ||
|
||
- Master Worker: Cpu3 | ||
- Forwarding Worker: Cpu1, Cpu2, Cpu4, Cpu5 | ||
- Isolate Receiving Worker: Cpu11, Cpu12, Cpu14, Cpu15 | ||
|
||
- Boot Option Cpu List: Cpu1 - Cpu5, Cpu11 - Cpu15 | ||
|
||
In this case, we move the Master Worker to the 3rd Worker (Worker ID 2), just to show how to assign the DPVS Master Worker to any DPVS Worker we want. Use the DPDK EAL common option `--master-lcore` to specify the Master Worker ID, and the DPDK EAL common option `--lcores` to map system CPU IDs to DPVS Worker IDs. Boot DPVS with the command: | ||
|
||
``` | ||
./bin/dpvs -- --lcores 0@1,1@2,2@3,3@4,4@5,5@11,6@12,7@14,8@15 --master-lcore 2 | ||
``` | ||
|
||
> The above command tells DPVS to map the system's Cpu1 - Cpu5, Cpu11, Cpu12, Cpu14, and Cpu15 to DPVS Worker IDs 0 - 8, respectively, and to use the worker with ID 2 as the DPVS Master Worker. We skip Cpu13 because it resides on the same physical CPU core as Cpu3, on which the Master Worker runs. | ||
The DPVS worker configuration for this case is shown below. Note the values of the "cpu_id" fields. | ||
|
||
``` | ||
### dpvs.conf worker config | ||
worker_defs { | ||
<init> worker cpu0 { | ||
type master | ||
cpu_id 2 # DPVS Master Worker ID | ||
} | ||
<init> worker cpu1 { | ||
type slave | ||
cpu_id 0 # DPVS Forwarding Worker ID | ||
port dpdk0 { | ||
rx_queue_ids 0 | ||
tx_queue_ids 0 | ||
isol_rx_cpu_ids 5 # DPVS Isolate Receiving Worker ID | ||
isol_rxq_ring_sz 1048576 | ||
} | ||
} | ||
<init> worker cpu2 { | ||
type slave | ||
cpu_id 1 # DPVS Forwarding Worker ID | ||
port dpdk0 { | ||
rx_queue_ids 1 | ||
tx_queue_ids 1 | ||
isol_rx_cpu_ids 6 # DPVS Isolate Receiving Worker ID | ||
isol_rxq_ring_sz 1048576 | ||
} | ||
} | ||
<init> worker cpu3 { | ||
type slave | ||
cpu_id 3 # DPVS Forwarding Worker ID | ||
port dpdk0 { | ||
rx_queue_ids 2 | ||
tx_queue_ids 2 | ||
isol_rx_cpu_ids 7 # DPVS Isolate Receiving Worker ID | ||
isol_rxq_ring_sz 1048576 | ||
} | ||
} | ||
<init> worker cpu4 { | ||
type slave | ||
cpu_id 4 # DPVS Forwarding Worker ID | ||
port dpdk0 { | ||
rx_queue_ids 3 | ||
tx_queue_ids 3 | ||
isol_rx_cpu_ids 8 # DPVS Isolate Receiving Worker ID | ||
isol_rxq_ring_sz 1048576 | ||
} | ||
} | ||
} | ||
``` | ||
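Because the `cpu_id` and `isol_rx_cpu_ids` fields hold DPVS Worker IDs rather than system CPU IDs, it can help to cross-check them against the `--lcores` argument. The sketch below parses only the simple `worker_id@cpu_id` form used in this document, not the full EAL `--lcores` syntax, and the function names are hypothetical.

```python
def parse_lcores(spec):
    """Parse "0@1,1@2,..." into {worker_id: system_cpu_id}."""
    mapping = {}
    for item in spec.split(","):
        worker_id, cpu = item.split("@")
        mapping[int(worker_id)] = int(cpu)
    return mapping


def check_worker_ids(spec, master_id, forwarding_ids, isolate_rx_ids):
    """Return a list of problems found; an empty list means consistent."""
    mapping = parse_lcores(spec)
    used = [master_id, *forwarding_ids, *isolate_rx_ids]
    problems = [
        f"worker ID {wid} is not mapped by --lcores"
        for wid in used
        if wid not in mapping
    ]
    if len(set(used)) != len(used):
        problems.append("a worker ID is used more than once")
    return problems
```

For the configuration above, `check_worker_ids("0@1,1@2,2@3,3@4,4@5,5@11,6@12,7@14,8@15", 2, [0, 1, 3, 4], [5, 6, 7, 8])` returns an empty list, and `parse_lcores(...)[2]` gives 3, the system CPU that runs the Master Worker.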
|
||
Check CPU work load using the `top` command. The zero-idle CPUs are occupied by DPVS Workers. | ||
|
||
``` | ||
top - 19:38:15 up 2 days, 23:20, 1 user, load average: 6.19, 1.89, 0.75 | ||
Tasks: 249 total, 2 running, 246 sleeping, 0 stopped, 1 zombie | ||
%Cpu0 : 0.0 us, 0.4 sy, 0.0 ni, 99.6 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu1 :100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu2 :100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu3 : 63.3 us, 36.7 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu4 :100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu5 :100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu6 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu7 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu8 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu9 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu10 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu11 :100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu12 :100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu13 :100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu14 :100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu15 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu16 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu17 : 0.0 us, 0.3 sy, 0.0 ni, 99.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu18 : 0.0 us, 0.3 sy, 0.0 ni, 99.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
%Cpu19 : 1.0 us, 0.3 sy, 0.0 ni, 98.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st | ||
KiB Mem : 65685352 total, 15695352 free, 48908988 used, 1081012 buff/cache | ||
KiB Swap: 4194300 total, 4194300 free, 0 used. 16171432 avail Mem | ||
``` | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
Header files in "./tools/keepalived/keepalived/include/conf/" | ||
are a mirror of "./include/conf/". | ||
|
||
Always keep the header files of the two paths the same! |