trex_book.asciidoc revision a92ac3fe
1TRex 
2====
3:author: hhaim 
4:email: <hhaim@cisco.com> 
5:revnumber: 2.1
6:quotes.++:
7:numbered:
8:web_server_url: http://trex-tgn.cisco.com/trex
9:local_web_server_url: csi-wiki-01:8181/trex
10:toclevels: 4
11
12include::trex_ga.asciidoc[]
13
14
15== Introduction
16
17=== A word on traffic generators
18
19Traditionally, routers have been tested using commercial traffic generators, while performance
20typically has been measured using packets per second (PPS) metrics. As router functionality and
21services became more complex, stateful traffic generators now need to provide more realistic traffic scenarios.
22
23Advantages of realistic traffic generators:
24
25* Accurate performance metrics.
26* Discovering bottlenecks in realistic traffic scenarios.
27
28==== Current Challenges:
29
30* *Cost*: Commercial stateful traffic generators are very expensive.
31* *Scale*: Bandwidth does not scale up well with feature complexity.
32* *Standardization*: Lack of standardization of traffic patterns and methodologies.
33* *Flexibility*: Commercial tools do not allow agility when flexibility and changes are needed.
34
35==== Implications 
36
37* High capital expenditure (capex) spent by different teams.
38* Testing in low scale and extrapolation became a common practice. This is non-ideal and fails to indicate bottlenecks that appear in real-world scenarios.
39* Teams use different benchmark methodologies, so results are not standardized.
40* Delays in development and testing due to dependence on testing tool features.
41* Resource and effort investment in developing different ad hoc tools and test methodologies.
42
43=== Overview of TRex
44
45TRex addresses these problems through an innovative and extendable software implementation and by leveraging standard and open software and x86/UCS hardware.
46
47* Generates and analyzes L4-7 traffic. In one package, provides capabilities of commercial L7 tools.
48* Stateful traffic generator based on pre-processing and smart replay of real traffic templates.
49* Generates and *amplifies* both client and server side traffic.
50* Customized functionality can be added.
51* Scales to 200Gb/sec for one UCS (using Intel 40Gb/sec NICs).
52* Low cost.
53* Self-contained package that can be easily installed and deployed.
54* Virtual interface support enables TRex to be used in a fully virtual environment without physical NICs. Example use cases:
55** Amazon AWS
56** Cisco LaaS
57// Which LaaS is this? Location as a service? Linux?
58** TRex on your laptop
59
60
61
62.TRex Hardware 
63[options="header",cols="1^,1^"]
64|=================
65|Cisco UCS Platform | Intel NIC 
66| image:images/ucs200_2.png[title="generator"] | image:images/Intel520.png[title="generator"]  
67|=================
68
69=== Purpose of this guide
70
71This guide explains the use of TRex internals and the use of TRex together with Cisco ASR1000 Series routers. The examples illustrate novel traffic generation techniques made possible by TRex. 
72
73== Download and installation
74
75=== Hardware recommendations
76
77TRex operates in a Linux application environment, interacting with Linux kernel modules. 
78TRex curretly works on x86 architecture and can operate well on Cisco UCS hardware. The following platforms have been tested and are recommended for operating TRex. 
79
80[NOTE]
81=====================================
82 A high-end UCS platform is not required for operating TRex in its current version, but may be required for future versions.
83=====================================
84
85[NOTE]
86=====================================
87 Not all supported DPDK interfaces are supported by TRex
88=====================================
89 
90
91.Preferred UCS hardware
92[options="header",cols="1,3"]
93|=================
94| UCS Type | Comments 
95| UCS C220 Mx  | *Preferred Low-End*. Supports up to 40Gb/sec with 540-D2. With newer Intel NIC (recommended), supports 80Gb/sec with 1RU. See table below describing components.
96| UCS C200| Early UCS model.
97| UCS C210 Mx | Supports up to 40Gb/sec PCIe3.0.
98| UCS C240 Mx | *Preferred, High-End* Supports up to 200Gb/sec. 6x XL710 NICS (PCIex8) or 2xFM10K (PCIex16). See table below describing components.
99| UCS C260M2 | Supports up to 30Gb/sec (limited by V2 PCIe).
100|=================
101
102.Low-End UCS C220 Mx - Internal components
103[options="header",cols="1,2",width="60%"]
104|=================
105| Components |  Details 
106| CPU  | 2x E5-2620 @ 2.0 GHz. 
107| CPU Configuration | 2-Socket CPU configurations (also works with 1 CPU).
108| Memory | 2x4 banks f.or each CPU. Total of 32GB in 8 banks.	 
109| RAID | No RAID.
110|=================
111
112.High-End C240 Mx - Internal components 
113[options="header",cols="1,2",width="60%"]
114|=================
115| Components |  Details 
116| CPU  | 2x E5-2667 @ 3.20 GHz.
117| PCIe | 1x Riser PCI expansion card option A PID UCSC-PCI-1A-240M4 enables 2 PCIex16.
118| CPU Configuration | 2-Socket CPU configurations (also works with 1 CPU).
119| Memory | 2x4 banks for each CPU. Total of 32GB in 8 banks. 
120| RAID | No RAID.
121| Riser 1/2 | both left and right should support x16 PCIe. Right (Riser1) should be from option A x16 and Left (Riser2) should be x16. need to order both  
122|=================
123 
124.Supported NICs
125[options="header",cols="1,1,4",width="90%"]
126|=================
127| Chipset              | Bandwidth  (Gb/sec)  |  Example
128| Intel I350           | 1   | Intel 4x1GE 350-T4 NIC
129| Intel 82599          | 10  |   Cisco part ID:N2XX-AIPCI01 Intel x520-D2, Intel X520 Dual Port 10Gb SFP+ Adapter
130| Intel 82599 VF       | x  | 
131| Intel X710           | 10  | Cisco part ID:UCSC-PCIE-IQ10GF link:https://en.wikipedia.org/wiki/Small_form-factor_pluggable_transceiver[SFP+], *Preferred* support per stream stats in hardware link:http://www.silicom-usa.com/PE310G4i71L_Quad_Port_Fiber_SFP+_10_Gigabit_Ethernet_PCI_Express_Server_Adapter_49[Silicom PE310G4i71L]
132| Intel XL710          | 40  | Cisco part ID:UCSC-PCIE-ID40GF, link:https://en.wikipedia.org/wiki/QSFP[QSFP+] (copper/optical)
133| Intel XL710/X710 VF  | x  | 
134| Intel FM10420        | 25/100 | QSFP28,  by Silicom link:http://www.silicom-usa.com/100_Gigabit_Dual_Port_Fiber_Ethernet_PCI_Express_PE3100G2DQiR_96[Silicom PE3100G2DQiR_96] (*in development*) 
135| Mellanox ConnectX-4  | 25/40/50/56/100 | QSFP28, link:http://www.mellanox.com/page/products_dyn?product_family=201&[ConnectX-4] link:http://www.mellanox.com/related-docs/prod_adapter_cards/PB_ConnectX-4_VPI_Card.pdf[ConnectX-4-brief] (copper/optical) supported from v2.11 more details xref:connectx_support[TRex Support]
136| Mellanox ConnectX-5  | 25/40/50/56/100 | Not supported yet 
137| Cisco 1300 series    | 40              | QSFP+, VIC 1380,  VIC 1385, VIC 1387 see more xref:ciscovic_support[TRex Support]
138| VMXNET / +
139VMXNET3 (see notes) | VMware paravirtualized  | Connect using VMware vSwitch
140| E1000    | paravirtualized  | VMware/KVM/VirtualBox 
141| Virtio | paravirtualized  | KVM
142|=================
143
144// in table above, is it correct to list "paravirtualized" as chipset? Also, what is QSFP28? It does not appear on the lined URL. Clarify: is Intel X710 the preferred NIC?
145
146.SFP+ support 
147[options="header",cols="2,1,1,1",width="90%"]
148|=================
149| link:https://en.wikipedia.org/wiki/Small_form-factor_pluggable_transceiver[SFP+]  | Intel Ethernet Converged X710-DAX |  Silicom link:http://www.silicom-usa.com/PE310G4i71L_Quad_Port_Fiber_SFP+_10_Gigabit_Ethernet_PCI_Express_Server_Adapter_49[PE310G4i71L] (Open optic) | 82599EB 10-Gigabit
150| link:http://www.cisco.com/c/en/us/products/collateral/interfaces-modules/transceiver-modules/data_sheet_c78-455693.html[Cisco SFP-10G-SR] | Does not work     | [green]*works* | [green]*works*
151| link:http://www.cisco.com/c/en/us/products/collateral/interfaces-modules/transceiver-modules/data_sheet_c78-455693.html[Cisco SFP-10G-LR] | Does not work     | [green]*works* | [green]*works*
152| link:http://www.cisco.com/c/en/us/products/collateral/interfaces-modules/transceiver-modules/data_sheet_c78-455693.html[Cisco SFP-H10GB-CU1M]| [green]*works* | [green]*works* | [green]*works*
153| link:http://www.cisco.com/c/en/us/products/collateral/interfaces-modules/transceiver-modules/data_sheet_c78-455693.html[Cisco SFP-10G-AOC1M] | [green]*works* | [green]*works* | [green]*works*
154|=================
155
156[NOTE]
157=====================================
158 Intel X710 NIC (example: FH X710DA4FHBLK) operates *only* with Intel SFP+. For open optic, use the link:http://www.silicom-usa.com/PE310G4i71L_Quad_Port_Fiber_SFP+_10_Gigabit_Ethernet_PCI_Express_Server_Adapter_49[Silicom PE310G4i71L] NIC.
159=====================================
160
161// clarify above table and note
162
163.XL710 NIC base QSFP+ support 
164[options="header",cols="1,1,1",width="90%"]
165|=================
166| link:https://en.wikipedia.org/wiki/QSFP[QSFP+]             |  Intel Ethernet Converged XL710-QDAX | Silicom link:http://www.silicom-usa.com/Dual_Port_Fiber_40_Gigabit_Ethernet_PCI_Express_Server_Adapter_PE340G2Qi71_83[PE340G2Qi71] Open optic 
167| QSFP+ SR4 optics  |  APPROVED OPTICS [green]*works*,  Cisco QSFP-40G-SR4-S does *not* work   | Cisco QSFP-40G-SR4-S [green]*works* 
168| QSFP+ LR-4 Optics |   APPROVED OPTICS [green]*works*, Cisco QSFP-40G-LR4-S does *not* work   | Cisco QSFP-40G-LR4-S [green]*works* 
169| QSFP Active Optical Cables (AoC) | Cisco QSFP-H40G-AOC [green]*works*  | Cisco QSFP-H40G-AOC [green]*works* 
170| QSFP+ Intel Ethernet Modular Optics |    N/A            |  N/A
171| QSFP+ DA twin-ax cables | N/A  | N/A  
172| Active QSFP+ Copper Cables | Cisco QSFP-4SFP10G-CU [green]*works*  | Cisco QSFP-4SFP10G-CU [green]*works*
173|=================
174
175[NOTE]
176=====================================
177 For Intel XL710 NICs, Cisco SR4/LR QSFP+ does not operate. Use Silicom with Open Optic.
178=====================================
179
180
181.ConnectX-4 NIC base QSFP28 support (100gb)
182[options="header",cols="1,2",width="90%"]
183|=================
184| link:https://en.wikipedia.org/wiki/QSFP[QSFP28]             |  ConnectX-4
185| QSFP28 SR4 optics  |  N/A
186| QSFP28 LR-4 Optics |  N/A 
187| QSFP28 (AoC)       | Cisco QSFP-100G-AOCxM  [green]*works*  
188| QSFP28 DA twin-ax cables | Cisco QSFP-100G-CUxM [green]*works*  
189|=================
190
191.Cisco VIC NIC base QSFP+ support 
192[options="header",cols="1,2",width="90%"]
193|=================
194| link:https://en.wikipedia.org/wiki/QSFP[QSFP+]             |  Intel Ethernet Converged XL710-QDAX 
195| QSFP+ SR4 optics  |  N/A
196| QSFP+ LR-4 Optics |   N/A
197| QSFP Active Optical Cables (AoC) | Cisco QSFP-H40G-AOC [green]*works*  
198| QSFP+ Intel Ethernet Modular Optics |    N/A            
199| QSFP+ DA twin-ax cables | N/A  | N/A  
200| Active QSFP+ Copper Cables | N/A
201|=================
202
203
204// clarify above table and note. let's discuss.
205.FM10K QSFP28 support 
206[options="header",cols="1,1",width="70%"]
207|=================
208| QSFP28             | Example 
209| todo  |  todo
210|=================
211
212// do we want to show "todo"? maybe "pending"
213
214
215[IMPORTANT]
216=====================================
217* Intel SFP+ 10Gb/sec is the only one supported by default on the standard Linux driver. TRex also supports Cisco 10Gb/sec SFP+. 
218// above, replace "only one" with "only mode"?
219* For operating high speed throughput (example: several Intel XL710 40Gb/sec), use different link:https://en.wikipedia.org/wiki/Non-uniform_memory_access[NUMA] nodes for different NICs. +
220    To verify NUMA and NIC topology: `lstopo (yum install hwloc)` +
221    To display CPU info, including NUMA node: `lscpu` +
222    NUMA usage xref:numa-example[example]
223* For Intel XL710 NICs, verify that the NVM is v5.04 . xref:xl710-firmware[Info].
224**  `> sudo ./t-rex-64 -f cap2/dns.yaml -d 0 *-v 6* --nc | grep NVM` +
225    `PMD:  FW 5.0 API 1.5 NVM 05.00.04  eetrack 800013fc` 
226=====================================
227
228// above, maybe rename the bullet points "NIC usage notes"? should we create a subsection for NICs? Maybe it would be under "2.1 Hardware recommendations" as a subsection.
229
230
231.Sample order for recommended low-end Cisco UCSC-C220-M3S with 4x10Gb ports
232[options="header",cols="1,1",width="70%"]
233|=================
234| Component | Quantity 
235| UCSC-C220-M3S    |  1
236| UCS-CPU-E5-2650  |  2
237| UCS-MR-1X041RY-A |  8
238| A03-D500GC3      |  1
239| N2XX-AIPCI01     |  2
240| UCSC-PSU-650W    |  1
241| SFS-250V-10A-IS  |  1
242| UCSC-CMA1        |  1
243| UCSC-HS-C220M3   |  2
244| N20-BBLKD        |  7
245| UCSC-PSU-BLKP    |  1
246| UCSC-RAIL1       |  1
247|=================
248
249// should table above say "low-end Cisco UCS C220 M3S" instead of "low-end USCS-C220-M3S"?
250
251NOTE: Purchase the 10Gb/sec SFP+ separately. Cisco would be fine with TRex (but not for plain Linux driver).
252// does note above mean "TRex operates with 10Gb/sec SFP+ components, but plain Linux does not provide drivers."? if so, how does purchasing separately solve this? where do they get drivers?
253
254=== Installing OS 
255
256==== Supported versions
257
258Supported Linux versions:
259
260* Fedora 20-23, 64-bit kernel (not 32-bit)
261* Ubuntu 14.04.1 LTS, 64-bit kernel (not 32-bit)
262* Ubuntu 16.xx LTS, 64-bit kernel (not 32-bit) -- not fully supported
263* CentOs/RedHat 7.2 LTS, 64-bit kernel (not 32-bit)   -- The only working option for ConnectX-4
264
265NOTE: Additional OS version may be supported by compiling the necessary drivers.
266
267To check whether a kernel is 64-bit, verify that the ouput of the following command is `x86_64`.
268
269[source,bash]
270----
271$uname -m
272x86_64 
273----
274
275
276==== Download Linux
277
278ISO images for supported Linux releases can be downloaded from:
279
280.Supported Linux ISO image links
281[options="header",cols="1^,2^",width="50%"]
282|======================================
283| Distribution | SHA256 Checksum
284| link:http://archives.fedoraproject.org/pub/archive/fedora/linux/releases/20/Fedora/x86_64/iso/Fedora-20-x86_64-DVD.iso[Fedora 20]
285    | link:http://archives.fedoraproject.org/pub/archive/fedora/linux/releases/20/Fedora/x86_64/iso/Fedora-20-x86_64-CHECKSUM[Fedora 20 CHECKSUM] 
286| link:http://fedora-mirror01.rbc.ru/pub/fedora/linux/releases/21/Server/x86_64/iso/Fedora-Server-DVD-x86_64-21.iso[Fedora 21]
287    | link:http://fedora-mirror01.rbc.ru/pub/fedora/linux/releases/21/Server/x86_64/iso/Fedora-Server-21-x86_64-CHECKSUM[Fedora 21 CHECKSUM] 
288| link:http://old-releases.ubuntu.com/releases/14.04.1/ubuntu-14.04-desktop-amd64.iso[Ubuntu 14.04.1]
289    | http://old-releases.ubuntu.com/releases/14.04.1/SHA256SUMS[Ubuntu 14.04* CHECKSUMs]
290| link:http://releases.ubuntu.com/16.04.1/ubuntu-16.04.1-server-amd64.iso[Ubuntu 16.04.1]
291    | http://releases.ubuntu.com/16.04.1/SHA256SUMS[Ubuntu 16.04* CHECKSUMs]
292
293|======================================
294
295For Fedora downloads...
296
297* Select a mirror close to your location: +
298https://admin.fedoraproject.org/mirrormanager/mirrors/Fedora +
299Choose: "Fedora Linux http" -> releases -> <version number> -> Server -> x86_64 -> iso -> Fedora-Server-DVD-x86_64-<version number>.iso
300
301* Verify the checksum of the downloaded file matches the linked checksum values with the `sha256sum` command. Example:
302
303[source,bash]
304----
305$sha256sum Fedora-18-x86_64-DVD.iso
30691c5f0aca391acf76a047e284144f90d66d3d5f5dcd26b01f368a43236832c03 #<1>
307----
308<1> Should be equal to the link:https://en.wikipedia.org/wiki/SHA-2[SHA-256] values described in the linked checksum files. 
309
310
311==== Install Linux 
312
313Ask your lab admin to install the Linux using CIMC, assign an IP, and set the DNS. Request the sudo or super user password to enable you to ping and SSH.
314
315xref:fedora21_example[Example of installing Fedora 21 Server]
316
317[NOTE]
318=====================================
319 * To use TRex, you should have sudo on the machine or the root password.
320 * Upgrading the linux Kernel using `yum upgrade` requires building the TRex drivers. 
321 * In Ubuntu 16, auto-updater is enabled by default. It's advised to turn it off as with update of Kernel need to compile again the DPDK .ko file. +
322Command to remove it: +
323 > sudo apt-get remove unattended-upgrades
324=====================================
325
326==== Verify Intel NIC installation
327
328Use `lspci` to verify the NIC installation. 
329
330Example 4x 10Gb/sec TRex configuration (see output below):
331
332* I350 management port
333
334* 4x Intel Ethernet Converged Network Adapter model x520-D2 (82599 chipset)
335
336[source,bash]
337----
338$[root@trex]lspci | grep Ethernet
33901:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)                #<1>
34001:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)                #<2>
34103:00.0 Ethernet controller: Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection (rev 01) #<3>
34203:00.1 Ethernet controller: Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection (rev 01)
34382:00.0 Ethernet controller: Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection (rev 01)
34482:00.1 Ethernet controller: Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection (rev 01)
345----
346<1> Management port 
347<2> CIMC port
348<3> 10Gb/sec traffic ports (Intel 82599EB) 
349
350=== Obtaining the TRex package
351
352Connect using `ssh` to the TRex machine and execute the commands described below.
353
354NOTE: Prerequisite: *$WEB_URL* is *{web_server_url}* or *{local_web_server_url}* (Cisco internal)
355
356Latest release:
357[source,bash]
358----
359$mkdir trex
360$cd trex
361$wget --no-cache $WEB_URL/release/latest 
362$tar -xzvf latest 
363----
364
365
366Bleeding edge version:
367[source,bash]
368----
369$wget --no-cache $WEB_URL/release/be_latest 
370----
371
372To obtain a specific version, do the following: 
373[source,bash]
374----
375$wget --no-cache $WEB_URL/release/vX.XX.tar.gz #<1>
376----
377
378<1> X.XX = Version number
379
380== First time Running
381
382=== Configuring for loopback
383
384Before connecting TRex to your DUT, it is strongly advised to verify that TRex and the NICs work correctly in loopback. +
385To get best performance, it is advised to loopback interfaces on the same NUMA (controlled by the same physical processor). If you do not know how to check this, you can ignore this advice for now. +
386
387[NOTE]
388=====================================================================
389If you are using 10Gbs NIC based on Intel 520-D2 NICs, and you loopback ports on the same NIC, using SFP+, it might not sync, and you will fail to get link up. +
390We checked many types of SFP+ (Intel/Cisco/SR/LR) and it worked for us. +
391If you still encounter link issues, you can either try to loopback interfaces from different NICs, or use link:http://www.fiberopticshare.com/tag/cisco-10g-twinax[Cisco twinax copper cable].
392=====================================================================
393
394.Loopback example
395image:images/loopback_example.png[title="Loopback example"] 
396
397==== Identify the ports 
398
399[source,bash]
400----
401 $>sudo ./dpdk_setup_ports.py -s
402
403 Network devices using DPDK-compatible driver
404 ============================================
405
406 Network devices using kernel driver
407 ===================================
408 0000:03:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection' drv= unused=ixgb #<1>
409 0000:03:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection' drv= unused=ixgb
410 0000:13:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection' drv= unused=ixgb
411 0000:13:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection' drv= unused=ixgb
412 0000:02:00.0 '82545EM Gigabit Ethernet Controller (Copper)' if=eth2 drv=e1000 unused=igb_uio *Active* #<2>
413
414 Other network devices
415 =====================
416 <none>
417----
418
419<1> If you did not run any DPDK application, you will see list of interfaces binded to the kernel, or not binded at all.
420<2> Interface marked as 'active' is the one used by your ssh connection. *Never* put it in TRex config file.
421
422Choose ports to use and follow the instructions in the next section to create configuration file.
423
424==== Creating minimum configuration file
425
426Default configuration file name is: `/etc/trex_cfg.yaml`.
427
428You can copy basic configuration file from cfg folder
429
430[source,bash]
431----
432$cp  cfg/simple_cfg.yaml /etc/trex_cfg.yaml
433----
434
435Then, edit the configuration file and put your interface's and IP addresses details.
436
437Example:
438
439[source,bash]
440----
441<none>
442- port_limit      : 2
443  version         : 2
444#List of interfaces. Change to suit your setup. Use ./dpdk_setup_ports.py -s to see available options
445interfaces    : ["03:00.0", "03:00.1"]  #<1>
446 port_info       :  # Port IPs. Change to suit your needs. In case of loopback, you can leave as is.
447          - ip         : 1.1.1.1
448            default_gw : 2.2.2.2
449          - ip         : 2.2.2.2
450            default_gw : 1.1.1.1
451----
452<1> You need to edit this line to match the interfaces you are using.
453Notice that all NICs you are using should have the same type. You cannot mix different NIC types in one config file. For more info, see link:http://trex-tgn.cisco.com/youtrack/issue/trex-201[trex-201].
454
455You can find xref:trex_config[here] full list of configuration file options.
456
457=== Script for creating config file
458
459To help starting with basic configuration file that suits your needs, there a script that can automate this process.
460The script helps you getting started, and you can then edit the file and add advanced options from xref:trex_config[here]
461if needed. +
462There are two ways to run the script. Interactively (script will pormpt you for parameters), or providing all parameters
463using command line options.
464
465==== Interactive mode
466
467[source,bash]
468----
469sudo ./dpdk_setup_ports.py -i
470----
471
472You will see a list of available interfaces with their related information +
473Just follow the instructions to get basic config file.
474
475==== Specifying input arguments using command line options
476
477First, run this command to see the list of all interfaces and their related information:
478
479[source,bash]
480----
481sudo ./dpdk_setup_ports.py -t
482----
483
484* In case of *Loopback* and/or only *L1-L2 Switches* on the way, you do not need to provide IPs or destination MACs. +
485The script Will assume the following interface connections: 0&#8596;1, 2&#8596;3 etc. +
486Just run:
487
488[source,bash]
489----
490sudo ./dpdk_setup_ports.py -c <TRex interface 0> <TRex interface 1> ...
491----
492
493* In case of *Router* (or other next hop device, such as *L3 Switch*), you should specify the TRex IPs and default gateways, or
494MACs of the router as described below.
495
496.Additional arguments to creating script (dpdk_setup_ports.py -c)
497[options="header",cols="2,5,3",width="100%"]
498|=================
499| Arg | Description | Example
500| -c  | Create a configuration file by specified interfaces (PCI address or Linux names: eth1 etc.) | -c 03:00.1 eth1 eth4 84:00.0
501| --dump | Dump created config to screen. |
502| -o | Output the config to this file. | -o /etc/trex_cfg.yaml
503| --dest-macs | Destination MACs to be used per each interface. Specify this option if you want MAC based config instead of IP based one. You must not set it together with --ip and --def_gw | --dest-macs 11:11:11:11:11:11 22:22:22:22:22:22
504| --ip | List of IPs to use for each interface. If this option and --dest-macs is not specified, script assumes loopback connections (0&#8596;1, 2&#8596;3 etc.) | --ip 1.2.3.4 5.6.7.8
505|--def-gw | List of default gateways to use for each interface. If --ip given, you must provide --def_gw as well | --def-gw 3.4.5.6 7.8.9.10
506| --ci | Cores include: White list of cores to use. Make sure there is enough for each NUMA. | --ci 0 2 4 5 6
507| --ce | Cores exclude: Black list of cores to exclude. Make sure there will be enough for each NUMA. | --ci 10 11 12
508| --no-ht | No HyperThreading: Use only one thread of each Core in created config yaml. |
509| --prefix | Advanced option: prefix to be used in TRex config in case of parallel instances. | --prefix first_instance
510| --zmq-pub-port | Advanced option: ZMQ Publisher port to be used in TRex config in case of parallel instances. | --zmq-pub-port 4000
511| --zmq-rpc-port | Advanced option: ZMQ RPC port to be used in TRex config in case of parallel instances. | --zmq-rpc-port
512| --ignore-numa | Advanced option: Ignore NUMAs for config creation. Use this option only if you have to, as it might reduce performance. For example, if you have pair of interfaces at different NUMAs |
513|=================
514
515=== Configuring ESXi for running TRex
516
517To get best performance, it is advised to run TRex on bare metal hardware, and not use any kind of VM.
518Bandwidth on VM might be limited, and IPv6 might not be fully supported.
519Having said that, there are sometimes benefits for running on VM. +
520These include: +
521    * Virtual NICs can be used to bridge between TRex and NICs not supported by TRex. +
522    * If you already have VM installed, and do not require high performance. +
523
5241. Click the host machine, enter Configuration -> Networking.
525
526a. One of the NICs should be connected to the main vSwitch network to get an "outside" connection, for the TRex client and ssh: +
527image:images/vSwitch_main.png[title="vSwitch_main"]
528
529b. Other NICs that are used for TRex traffic should be in distinguish vSwitch: +
530image:images/vSwitch_loopback.png[title="vSwitch_loopback"]
531
5322. Right-click guest machine -> Edit settings -> Ensure the NICs are set to their networks: +
533image:images/vSwitch_networks.png[title="vSwitch_networks"]
534
535[NOTE]
536=====================================================================
537Before version 2.10, the following command did not function as expected:
538[subs="quotes"]
539....
540sudo ./t-rex-64 -f cap2/dns.yaml *--lm 1 --lo* -l 1000 -d 100
541....
542The vSwitch did not "know" where to route the packet. Was solved on version 2.10 when TRex started to support ARP.
543=====================================================================
544
545* Pass-through is the way to use directly the NICs from host machine inside the VM. Has no limitations except the NIC/hardware itself. The only difference via bare-metal OS is occasional spikes of latency (~10ms). Passthrough settings cannot be saved to OVA.
546
5471. Click on the host machine. Enter Configuration -> Advanced settings -> Edit. Mark the desired NICs. Reboot the ESXi to apply. +
548image:images/passthrough_marking.png[title="passthrough_marking"]
549
5502. Right click on guest machine. Edit settings -> Add -> *PCI device* -> Choose the NICs one by one. +
551image:images/passthrough_adding.png[title="passthrough_adding"]
552
553=== Configuring for running with router (or other L3 device) as DUT
554
555You can follow link:trex_config_guide.html[this] presentation for an example of how to configure router as DUT.
556
557=== Running TRex
558
559When all is set, use the following command to start basic TRex run for 10 seconds
560(it will use the default config file name /etc/trex_cfg.yaml):
561[source,bash]
562----
563$sudo ./t-rex-64 -f cap2/dns.yaml -c 4 -m 1 -d 10  -l 1000
564----
565
566If successful, the output will be similar to the following:
567
568[source,python]
569----
570$ sudo ./t-rex-64 -f cap2/dns.yaml -d 10 -l 1000
571Starting  TRex 2.09 please wait  ...
572zmq publisher at: tcp://*:4500 
573 number of ports found : 4
574  port : 0 
575  ------------
576  link         :  link : Link Up - speed 10000 Mbps - full-duplex      <1>
577  promiscuous  : 0 
578  port : 1 
579  ------------
580  link         :  link : Link Up - speed 10000 Mbps - full-duplex
581  promiscuous  : 0 
582  port : 2 
583  ------------
584  link         :  link : Link Up - speed 10000 Mbps - full-duplex
585  promiscuous  : 0 
586  port : 3 
587  ------------
588  link         :  link : Link Up - speed 10000 Mbps - full-duplex
589  promiscuous  : 0 
590
591
592 -Per port stats table 
593      ports |               0 |               1 |               2 |               3 
594 -------------------------------------------------------------------------------------
595   opackets |            1003 |            1003 |            1002 |            1002 
596     obytes |           66213 |           66229 |           66132 |           66132 
597   ipackets |            1003 |            1003 |            1002 |            1002 
598     ibytes |           66225 |           66209 |           66132 |           66132 
599    ierrors |               0 |               0 |               0 |               0 
600    oerrors |               0 |               0 |               0 |               0 
601      Tx Bw |     217.09 Kbps |     217.14 Kbps |     216.83 Kbps |     216.83 Kbps 
602
603 -Global stats enabled 
604 Cpu Utilization : 0.0  % <2>  29.7 Gb/core <3>
605 Platform_factor : 1.0    
606 Total-Tx        :     867.89 Kbps                                             <4>
607 Total-Rx        :     867.86 Kbps                                             <5>
608 Total-PPS       :       1.64 Kpps  
609 Total-CPS       :       0.50  cps  
610
611 Expected-PPS    :       2.00  pps   <6>
612 Expected-CPS    :       1.00  cps   <7>
613 Expected-BPS    :       1.36 Kbps   <8>
614
615 Active-flows    :        0 <9> Clients :      510   Socket-util  : 0.0000 %
616 Open-flows      :        1 <10> Servers :      254   Socket   :        1  Socket/Clients :  0.0
617 drop-rate       :       0.00  bps   <11>
618 current time    : 5.3 sec  
619 test duration   : 94.7 sec  
620
621 -Latency stats enabled 
622 Cpu Utilization : 0.2 %  <12>
623 if|   tx_ok , rx_ok  , rx   ,error,    average   ,   max         , Jitter ,  max window 
624   |         ,        , check,     , latency(usec),latency (usec) ,(usec)  ,             
625 --------------------------------------------------------------------------------------------------
626 0 |     1002,    1002,         0,   0,         51  ,      69,       0      |   0  69  67    <13>
627 1 |     1002,    1002,         0,   0,         53  ,     196,       0      |   0  196  53
628 2 |     1002,    1002,         0,   0,         54  ,      71,       0      |   0  71  69 
629 3 |     1002,    1002,         0,   0,         53  ,     193,       0      |   0  193  52 
630----
631<1> Link must be up for TRex to work.
632<2> Average CPU utilization of transmitters threads. For best results it should be lower than 80%.
633<3> Gb/sec generated per core of DP. Higher is better.
634<4> Total Tx must be the same as Rx at the end of the run
635<5> Total Rx must be the same as Tx at the end of the run
636<6> Expected number of packets per second (calculated without latency packets).
637<7> Expected number of connections per second (calculated without latency packets).
638<8> Expected number of bits per second (calculated without latency packets).
639<9> Number of TRex active "flows". Could be different than the number of router flows, due to aging issues. Usualy the TRex number of active flows is much lower than that of the router because the router ages flows slower.
640<10> Total number of TRex flows opened since startup (including active ones, and ones already closed).
641<11> Drop rate.
642<12> Rx and latency thread CPU utilization.
643<13> Tx_ok on port 0 should equal Rx_ok on port 1, and vice versa.
644
645More statistics information:
646
647*socket*::  Same as the active flows.  
648
649*Socket/Clients*:: Average of active flows per client, calculated as active_flows/#clients.
650
651*Socket-util*:: Estimation of number of L4 ports (sockets) used per client IP. This is approximately (100*active_flows/#clients)/64K, calculated as (average active flows per client*100/64K). Utilization of more than 50% means that TRex is generating too many flows per single client, and that more clients must be added in the generator config.
652// clarify above, especially the formula
653
654*Max window*:: Momentary maximum latency for a time window of 500 msec. There are few numbers shown per port.
655 The newest number (last 500msec) is on the right. Oldest on the left. This can help identifying spikes of high latency clearing after some time. Maximum latency is the total maximum over the entire test duration. To best understand this,
656 run TRex with latency option (-l) and watch the results with this section in mind.
657
658*Platform_factor*:: There are cases in which we duplicate the traffic using splitter/switch and we would like all numbers displayed by TRex to be multiplied by this factor, so that TRex counters will match the DUT counters.
659
660WARNING: If you don't see rx packets, revisit your MAC address configuration.
661
662include::trex_book_basic.asciidoc[]
663
664== Advanced features
665
666=== VLAN Trunk support 
667
668anchor:trex_vlan[]
669
670The VLAN Trunk TRex feature attempts to solve the router port bandwidth limitation when the traffic profile is asymmetric. Example: Asymmetric SFR profile.
671This feature converts asymmetric traffic to symmetric, from the port perspective, using router sub-interfaces.
672This requires TRex to send the traffic on two VLANs, as described below.
673
674.YAML format 
675[source,python]
676----
677  vlan       : { enable : 1  ,  vlan0 : 100 , vlan1 : 200 }
678----
679
680
681.Example
682[source,python]
683----
684- duration : 0.1
685  vlan       : { enable : 1  ,  vlan0 : 100 , vlan1 : 200 }   <1>
686----
687<1> Enable VLAN feature, vlan0==100 , vlan1==200
688        
689*Problem definition:*::
690
691Scenario: TRex with two ports and an SFR traffic profile.
692
693.Without VLAN/sub interfaces 
694[source,python]
695----
6960 ( client) -> [  ] - 1 ( server)
697----
698Without VLAN support the traffic is asymmetric. 10% of the traffic is sent from port 0 (client side), 90% is from port 1 (server). Port 1 become the bottlneck (10Gb/s limit) before port 0.
699
700.With VLAN/sub interfaces 
701[source,python]
702----
703port 0 ( client VLAN0) <->  |  | <-> port 1 ( server-VLAN0)
704port 0 ( server VLAN1) <->  |  | <-> port 1 ( client-VLAN1)
705----
706
707In this case both ports have the same amount of traffic.
708
709*Router configuation:*::
710[source,python]
711----
712        !
713        interface TenGigabitEthernet1/0/0      <1>
714         mac-address 0000.0001.0000   
715         mtu 4000
716         no ip address
717         load-interval 30
718        !
719        i
720        interface TenGigabitEthernet1/0/0.100
721         encapsulation dot1Q 100               <2> 
722         ip address 11.77.11.1 255.255.255.0
723         ip nbar protocol-discovery
724         ip policy route-map vlan_100_p1_to_p2 <3> 
725        !
726        interface TenGigabitEthernet1/0/0.200
727         encapsulation dot1Q 200               <4>
728         ip address 11.88.11.1 255.255.255.0
729         ip nbar protocol-discovery
730         ip policy route-map vlan_200_p1_to_p2 <5> 
731        !
732        interface TenGigabitEthernet1/1/0
733         mac-address 0000.0001.0000
734         mtu 4000
735         no ip address
736         load-interval 30
737        !
738        interface TenGigabitEthernet1/1/0.100
739         encapsulation dot1Q 100
740         ip address 22.77.11.1 255.255.255.0
741         ip nbar protocol-discovery
742         ip policy route-map vlan_100_p2_to_p1
743        !
744        interface TenGigabitEthernet1/1/0.200
745         encapsulation dot1Q 200
746         ip address 22.88.11.1 255.255.255.0
747         ip nbar protocol-discovery
748         ip policy route-map vlan_200_p2_to_p1
749        !
750        
751        arp 11.77.11.12 0000.0001.0000 ARPA      <6>
752        arp 22.77.11.12 0000.0001.0000 ARPA
753        
754        route-map vlan_100_p1_to_p2 permit 10    <7>
755         set ip next-hop 22.77.11.12
756        !
757        route-map vlan_100_p2_to_p1 permit 10
758         set ip next-hop 11.77.11.12
759        !
760        
761        route-map vlan_200_p1_to_p2 permit 10
762         set ip next-hop 22.88.11.12
763        !
764        route-map vlan_200_p2_to_p1 permit 10
765         set ip next-hop 11.88.11.12
766        !
767----
768<1> Disable the IP on the main port it is important.
769// above, clarify what's important
770<2> Enable VLAN1
771<3> PBR configuration
772<4> Enable VLAN2  
773<5> PBR configuration
774<6> TRex destination port MAC address
775<7> PBR configuration rules
776
777=== Static source MAC address setting  
778
779With this feature, TRex replaces the source MAC address with the client IP address.
780
781 Note: This feature was requested by the Cisco ISG group. 
782
783
784*YAML:*::
785[source,python]
786----
787 mac_override_by_ip : true
788----
789
790.Example
791[source,python]
792----
793- duration : 0.1
794 ..
795  mac_override_by_ip : true <1>
796----
797<1> In this case, the client side MAC address looks like this:
798SRC_MAC = IPV4(IP) + 00:00  
799
800=== IPv6 support 
801
802Support for IPv6 includes:
803
8041. Support for pcap files containing IPv6 packets
8052. Ability to generate IPv6 traffic from pcap files containing IPv4 packets
806The following command line option enables this feature: `--ipv6`
807The keywords (`src_ipv6` and `dst_ipv6`) specify the most significant 96 bits of the IPv6 address - for example:
808
809[source,python]
810----
811      src_ipv6 : [0xFE80,0x0232,0x1002,0x0051,0x0000,0x0000]
812      dst_ipv6 : [0x2001,0x0DB8,0x0003,0x0004,0x0000,0x0000]
813----
814      
815The IPv6 address is formed by placing what would typically be the IPv4
816address into the least significant 32 bits and copying the value provided
817in the src_ipv6/dst_ipv6 keywords into the most signficant 96 bits.
818If src_ipv6 and dst_ipv6 are not specified, the default
819is to form IPv4-compatible addresses (most signifcant 96 bits are zero).
820
821There is support for all plugins.
822  
823*Example:*::
824[source,bash]
825----
826$sudo ./t-rex-64 -f cap2l/sfr_delay_10_1g.yaml -c 4 -p -l 100 -d 100000 -m 30  --ipv6
827----
828
829*Limitations:*::
830
831* TRex cannot generate both IPv4 and IPv6 traffic.
832* The `--ipv6` switch must be specified even when using pcap file containing only IPv6 packets.
833
834
835*Router configuration:*::
836
837[source,python]
838----
839interface TenGigabitEthernet1/0/0
840 mac-address 0000.0001.0000
841 mtu 4000
842 ip address 11.11.11.11 255.255.255.0
843 ip policy route-map p1_to_p2
844 load-interval 30
845 ipv6 enable   ==> IPv6
846 ipv6 address 2001:DB8:1111:2222::1/64                  <1>
847 ipv6 policy route-map ipv6_p1_to_p2                    <2>
848!
849
850
851ipv6 unicast-routing                                    <3>
852
853ipv6 neighbor 3001::2 TenGigabitEthernet0/1/0 0000.0002.0002   <4>
854ipv6 neighbor 2001::2 TenGigabitEthernet0/0/0 0000.0003.0002
855
856route-map ipv6_p1_to_p2 permit 10                              <5>
857 set ipv6 next-hop 2001::2
858!
859route-map ipv6_p2_to_p1 permit 10
860 set ipv6 next-hop 3001::2
861!
862
863
864asr1k(config)#ipv6 route 4000::/64 2001::2                 
865asr1k(config)#ipv6 route 5000::/64 3001::2 
866----
867<1> Enable IPv6 
868<2> Add pbr 
869<3> Enable IPv6 routing 
870<4> MAC address setting. Should be TRex MAC.
871<5> PBR configuraion
872
873
874=== Client clustering configuration
875TRex supports testing complex topologies, with more than one DUT, using a feature called "client clustering".
876This feature allows specifying the distribution of clients TRex emulates.
877
878Let's look at the following topology:
879
880.Topology Example 
881image:images/topology.png[title="Client Clustering",width=850]
882
883We have two clusters of DUTs.
884Using config file, you can partition TRex emulated clients to groups, and define
885how they will be spread between the DUT clusters.
886
887Group configuration includes:
888
889* IP start range.
890* IP end range.
891* Initiator side configuration. - These are the parameters affecting packets sent from client side.
892* Responder side configuration. - These are the parameters affecting packets sent from server side.
893
894[NOTE]
895It is important to understand that this is *complimentary* to the client generator
896configured per profile - it only defines how the clients will be spread between clusters.
897
898Let's look at an example.
899
900We have a profile defining client generator.
901
902[source,bash]
903----
904$cat cap2/dns.yaml 
905- duration : 10.0
906  generator :  
907          distribution : "seq"           
908          clients_start : "16.0.0.1"
909          clients_end   : "16.0.0.255"   
910          servers_start : "48.0.0.1"
911          servers_end   : "48.0.0.255"   
912          dual_port_mask : "1.0.0.0" 
913  cap_info : 
914     - name: cap2/dns.pcap
915       cps : 1.0          
916       ipg : 10000        
917       rtt : 10000        
918       w   : 1            
919----
920
921We want to create two clusters with 4 and 3 devices respectively.
922We also want to send *80%* of the traffic to the upper cluster and *20%* to the lower cluster.
923We can specify to which DUT the packet will be sent by MAC address or IP. We will present a MAC
924based example, and then see how to change to be IP based.
925
926We will create the following cluster configuration file.
927
928[source,bash]
929----
930#
931# Client configuration example file
932# The file must contain the following fields
933#
934# 'vlan'   - if the entire configuration uses VLAN,
935#            each client group must include vlan
936#            configuration
937#
938# 'groups' - each client group must contain range of IPs
939#            and initiator and responder section
940#            'count' represents the number of different DUTs
941#            in the group.
942#
943
944# 'true' means each group must contain VLAN configuration. 'false' means no VLAN config allowed.
945vlan: true
946
947groups:
948
949-    ip_start  : 16.0.0.1
950     ip_end    : 16.0.0.204
951     initiator :
952                 vlan    : 100
953                 dst_mac : "00:00:00:01:00:00"
954     responder :
955                 vlan    : 200
956                 dst_mac : "00:00:00:02:00:00"
957
958     count     : 4
959
960-    ip_start  : 16.0.0.205
961     ip_end    : 16.0.0.255
962     initiator :
963                 vlan    : 101
964                 dst_mac : "00:00:01:00:00:00"
965
966     responder:
967                 vlan    : 201
968                 dst_mac : "00:00:02:00:00:00"
969
970     count     : 3
971
972----
973
974The above configuration will divide the generator range of 255 clients to two clusters. The range
975of IPs in all groups in the client config file together, must cover the entire range of client IPs
976from the traffic profile file.
977
978MACs will be allocated incrementally, with a wrap around after ``count'' addresses.
979
980e.g.
981
982*Initiator side: (packets with source in 16.x.x.x net)*
983
984* 16.0.0.1 -> 48.x.x.x - dst_mac: 00:00:00:01:00:00  vlan: 100 
985* 16.0.0.2 -> 48.x.x.x - dst_mac: 00:00:00:01:00:01  vlan: 100 
986* 16.0.0.3 -> 48.x.x.x - dst_mac: 00:00:00:01:00:02  vlan: 100 
987* 16.0.0.4 -> 48.x.x.x - dst_mac: 00:00:00:01:00:03  vlan: 100 
988* 16.0.0.5 -> 48.x.x.x - dst_mac: 00:00:00:01:00:00  vlan: 100 
989* 16.0.0.6 -> 48.x.x.x - dst_mac: 00:00:00:01:00:01  vlan: 100 
990
991*responder side: (packets with source in 48.x.x.x net)* 
992
993* 48.x.x.x -> 16.0.0.1  - dst_mac(from responder) : "00:00:00:02:00:00" , vlan:200
994* 48.x.x.x -> 16.0.0.2  - dst_mac(from responder) : "00:00:00:02:00:01" , vlan:200
995
996and so on. +
997 +
998This means that the MAC addresses of DUTs must be changed to be sequential. Other option is to
999specify instead of ``dst_mac'', ip address, using ``next_hop''. +
1000For example, config file first group will look like:
1001
1002[source,bash]
1003----
1004-    ip_start  : 16.0.0.1
1005     ip_end    : 16.0.0.204
1006     initiator :
1007                 vlan     : 100
1008                 next_hop : 1.1.1.1
1009                 src_ip   : 1.1.1.100
1010     responder :
1011                 vlan     : 200
1012                 next_hop : 2.2.2.1
1013                 src_ip   : 2.2.2.100
1014
1015     count     : 4
1016----
1017
1018In this case, TRex will try to resolve using ARP requests the addresses
10191.1.1.1, 1.1.1.2, 1.1.1.3, 1.1.1.4 (and the range 2.2.2.1-2.2.2.4). If not all IPs are resolved,
1020TRex will exit with an error message. ``src_ip'' will be used for sending gratitues ARP, and
1021for filling relevant fields in ARP request. If no ``src_ip'' given, TRex will look for source
1022IP in the relevant port section in the platform config file (/etc/trex_cfg.yaml). If none is found, TRex
1023will exit with an error message. +
1024If client config file is given, the ``dest_mac'' and ``default_gw'' parameters from the platform config
1025file are ignored.
1026
1027Now, streams will look like: +
1028*Initiator side: (packets with source in 16.x.x.x net)*
1029
1030* 16.0.0.1 -> 48.x.x.x - dst_mac: MAC of 1.1.1.1  vlan: 100
1031* 16.0.0.2 -> 48.x.x.x - dst_mac: MAC of 1.1.1.2  vlan: 100
1032* 16.0.0.3 -> 48.x.x.x - dst_mac: MAC of 1.1.1.3  vlan: 100
1033* 16.0.0.4 -> 48.x.x.x - dst_mac: MAC of 1.1.1.4  vlan: 100
1034* 16.0.0.5 -> 48.x.x.x - dst_mac: MAC of 1.1.1.1  vlan: 100
1035* 16.0.0.6 -> 48.x.x.x - dst_mac: MAC of 1.1.1.2  vlan: 100
1036
1037*responder side: (packets with source in 48.x.x.x net)*
1038
1039* 48.x.x.x -> 16.0.0.1  - dst_mac: MAC of 2.2.2.1 , vlan:200
1040* 48.x.x.x -> 16.0.0.2  - dst_mac: MAC of 2.2.2.2 , vlan:200
1041
1042
1043[NOTE]
1044It is important to understand that the ip to MAC coupling (both with MAC based config or IP based)
1045is done at the beginning and never changes. Meaning, for example, for the MAC case, packets
1046with source IP 16.0.0.2 will always have VLAN 100 and dst MAC 00:00:00:01:00:01.
1047Packets with destination IP 16.0.0.2 will always have VLAN 200 and dst MAC "00:00:00:02:00:01.
1048This way, you can predict exactly which packet (and how many packets) will go to each DUT.
1049
1050*Usage:*
1051
1052[source,bash]
1053----
1054sudo ./t-rex-64 -f cap2/dns.yaml --client_cfg my_cfg.yaml
1055----
1056
1057=== NAT support 
1058
1059TRex can learn dynamic NAT/PAT translation. To enable this feature add `--learn-mode <mode>` to the command line.
1060To learn the NAT translation, TRex must embed information describing the flow a packet belongs to, in the first
1061packet of each flow. This can be done in different methods, depending on the chosen <mode>.
1062
1063*mode 1:*::
1064
1065In case of TCP flow, flow info is embedded in the ACK of the first TCP SYN. +
1066In case of UDP flow, flow info is embedded in the IP identification field of the first packet in the flow. +
1067This mode was developed for testing NAT with firewalls (which usually do not work with mode 2).
1068In this mode, TRex also learn and compensate for TCP sequence number randomization that might be done by the DUT.
1069TRex can learn and compensate for seq num randomization in both directions of the connection.
1070
1071*mode 2:*::
1072
1073Flow info is added in a special IPv4 option header (8 bytes long 0x10 id). The option is added only to the first packet in the flow.
1074This mode does not work with DUTs that drop packets with IP options (for example, Cisco ASA firewall).
1075
1076*mode 3:*::
1077
1078This is like mode 1, with the only change being that TRex does not learn the seq num randomization in the server->client direction.
1079This mode can give much better connections per second performance than mode 1 (still, for all existing firewalls, mode 1 cps rate is more than enough).
1080
1081==== Examples
1082
1083*simple HTTP traffic*
1084
1085[source,bash]
1086----
1087$sudo ./t-rex-64 -f cap2/http_simple.yaml -c 4  -l 1000 -d 100000 -m 30  --learn-mode 1
1088----
1089
1090*SFR traffic without bundling/ALG support*
1091
1092[source,bash]
1093----
1094$sudo ./t-rex-64 -f avl/sfr_delay_10_1g_no_bundling.yaml -c 4  -l 1000 -d 100000 -m 10  --learn-mode 2
1095----
1096
1097*NAT terminal counters:*::
1098
1099[source,python]
1100----
1101-Global stats enabled 
1102 Cpu Utilization : 0.6  %  33.4 Gb/core 
1103 Platform_factor : 1.0
1104 Total-Tx        :       3.77 Gbps   NAT time out    :      917 <1> (0 in wait for syn+ack) <5>
1105 Total-Rx        :       3.77 Gbps   NAT aged flow id:        0 <2>
1106 Total-PPS       :     505.72 Kpps   Total NAT active:      163 <3> (12 waiting for syn) <6>
1107 Total-CPS       :      13.43 Kcps   Total NAT opened:    82677 <4>
1108----
1109<1> Number of connections for which TRex had to send the next packet in the flow, but did not learn the NAT translation yet. Should be 0. Usually, value different than 0 is seen if the DUT drops the flow (probably because it can't handle the number of connections)
1110<2> Number of flows for which when we got the translation info, flow was aged out already. Non 0 value here should be very rare. Can occur only when there is huge latency in the DUT input/output queue.
1111<3> Number of flows for which we sent the first packet, but did not learn the NAT translation yet. Value seen depends on the connection per second rate and round trip time.
1112<4> Total number of translations over the lifetime of the TRex instance. May be different from the total number of flows if template is uni-directional (and consequently does not need translation).
1113<5> Out of the timed out flows, how many were timed out while waiting to learn the TCP seq num randomization of the server->client from the SYN+ACK packet (Seen only in --learn-mode 1)
1114<6> Out of the active NAT sessions, how many are waiting to learn the client->server translation from the SYN packet (others are waiting for SYN+ACK from server) (Seen only in --learn-mode 1)
1115
1116*Configuration for Cisco ASR1000 Series:*::
1117
1118This feature was tested with the following configuration and  sfr_delay_10_1g_no_bundling. yaml traffic profile.
1119Client address range is 16.0.0.1 to 16.0.0.255 
1120
1121[source,python]
1122----
1123interface TenGigabitEthernet1/0/0            <1>
1124 mac-address 0000.0001.0000
1125 mtu 4000
1126 ip address 11.11.11.11 255.255.255.0
1127 ip policy route-map p1_to_p2
1128 ip nat inside                               <2>
1129 load-interval 30
1130!
1131
1132interface TenGigabitEthernet1/1/0
1133 mac-address 0000.0001.0000
1134 mtu 4000
1135 ip address 11.11.11.11 255.255.255.0
1136 ip policy route-map p1_to_p2
1137 ip nat outside                              <3>
1138 load-interval 30
1139
1140ip  nat pool my 200.0.0.0 200.0.0.255 netmask 255.255.255.0  <4>
1141
1142ip nat inside source list 7 pool my overload 
1143access-list 7 permit 16.0.0.0 0.0.0.255                      <5>
1144
1145ip nat inside source list 8 pool my overload                 <6>
1146access-list 8 permit 17.0.0.0 0.0.0.255                      
1147----
1148<1> Must be connected to TRex Client port (router inside port)
1149<2> NAT inside 
1150<3> NAT outside
1151<4> Pool of outside address with overload
1152<5> Match TRex YAML client range
1153<6> In case of dual port TRex
1154
1155// verify 1 and 5 above; rephrased
1156
1157
1158*Limitations:*::
1159
1160. The IPv6-IPv6 NAT feature does not exist on routers, so this feature can work only with IPv4.
1161. Does not support NAT64. 
1162. Bundling/plugin is not fully supported. Consequently, sfr_delay_10.yaml does not work. Use sfr_delay_10_no_bundling.yaml instead.
1163
1164[NOTE]
1165=====================================================================
1166* `--learn-verify` is a TRex debug mechanism for testing the TRex learn mechanism.
1167* Need to run it when DUT is configured without NAT. It will verify that the inside_ip==outside_ip and inside_port==outside_port.
1168=====================================================================
1169
1170=== Flow order/latency verification 
1171
1172In normal mode (without this feature enabled), received traffic is not checked by software. Hardware (Intel NIC) testing for dropped packets occurs at the end of the test. The only exception is the Latency/Jitter packets.
1173This is one reason that with TRex, you *cannot* check features that terminate traffic (for example TCP Proxy).
1174To enable this feature, add `--rx-check <sample>` to the command line options, where <sample> is the sample rate. 
1175The number of flows that will be sent to the software for verification is (1/(sample_rate). For 40Gb/sec traffic you can use a sample rate of 1/128. Watch for Rx CPU% utilization.
1176
1177[NOTE]
1178============
1179This feature changes the TTL of the sampled flows to 255 and expects to receive packets with TTL 254 or 255 (one routing hop). If you have more than one hop in your setup, use `--hops` to change it to a higher value. More than one hop is possible if there are number of routers betwean TRex client side and TRex server side.
1180============
1181
1182This feature ensures that:
1183
1184* Packets get out of DUT in order (from each flow perspective).
1185* There are no packet drops (no need to wait for the end of the test). Without this flag, you must wait for the end of the test in order to identify packet drops, because there is always a difference between TX and Rx, due to RTT.
1186
1187
1188.Full example 
1189[source,bash]
1190----
1191$sudo ./t-rex-64 -f avl/sfr_delay_10_1g.yaml -c 4 -p -l 100 -d 100000 -m 30  --rx-check 128
1192----
1193
1194[source,python]
1195----
1196Cpu Utilization : 0.1 %                                                                       <1>
1197 if|   tx_ok , rx_ok  , rx   ,error,    average   ,   max         , Jitter ,  max window 
1198   |         ,        , check,     , latency(usec),latency (usec) ,(usec)  ,             
1199 --------------------------------------------------------------------------------
1200 0 |     1002,    1002,      2501,   0,         61  ,      70,       3      |  60
1201 1 |     1002,    1002,      2012,   0,         56  ,      63,       2      |  50
1202 2 |     1002,    1002,      2322,   0,         66  ,      74,       5      |  68
1203 3 |     1002,    1002,      1727,   0,         58  ,      68,       2      |  52
1204
1205 Rx Check stats enabled                                                                       <2>
1206 -------------------------------------------------------------------------------------------
1207 rx check:  avg/max/jitter latency,       94  ,     744,       49      |  252  287  309       <3>
1208 
1209 active flows: <6>      10, fif: <5>     308,  drop:        0, errors:        0                <4>
1210 -------------------------------------------------------------------------------------------
1211----
1212<1> CPU% of the Rx thread. If it is too high, *increase* the sample rate.
1213<2> Rx Check section. For more detailed info, press 'r' during the test or at the end of the test.
1214<3> Average latency, max latency, jitter on the template flows in microseconds. This is usually *higher* than the latency check packet because the feature works more on this packet.
1215<4> Drop counters and errors counter should be zero. If not, press 'r' to see the full report or view the report at the end of the test.
1216<5> fif - First in flow. Number of new flows handled by the Rx thread.
1217<6> active flows - number of active flows handled by rx thread 
1218
1219.Press R to Display Full Report
1220[source,python]
1221----
1222 m_total_rx                              : 2 
1223 m_lookup                                : 2 
1224 m_found                                 : 1 
1225 m_fif                                   : 1 
1226 m_add                                   : 1 
1227 m_remove                                : 1 
1228 m_active                                : 0 
1229                                                        <1>
1230 0  0  0  0  1041  0  0  0  0  0  0  0  0  min_delta  : 10 usec 
1231 cnt        : 2 
1232 high_cnt   : 2 
1233 max_d_time : 1041 usec
1234 sliding_average    : 1 usec                            <2>
1235 precent    : 100.0 %
1236 histogram 
1237 -----------
1238 h[1000]  :  2 
1239 tempate_id_ 0 , errors:       0,  jitter: 61           <3>
1240 tempate_id_ 1 , errors:       0,  jitter: 0 
1241 tempate_id_ 2 , errors:       0,  jitter: 0 
1242 tempate_id_ 3 , errors:       0,  jitter: 0 
1243 tempate_id_ 4 , errors:       0,  jitter: 0 
1244 tempate_id_ 5 , errors:       0,  jitter: 0 
1245 tempate_id_ 6 , errors:       0,  jitter: 0 
1246 tempate_id_ 7 , errors:       0,  jitter: 0 
1247 tempate_id_ 8 , errors:       0,  jitter: 0 
1248 tempate_id_ 9 , errors:       0,  jitter: 0 
1249 tempate_id_10 , errors:       0,  jitter: 0 
1250 tempate_id_11 , errors:       0,  jitter: 0 
1251 tempate_id_12 , errors:       0,  jitter: 0 
1252 tempate_id_13 , errors:       0,  jitter: 0 
1253 tempate_id_14 , errors:       0,  jitter: 0 
1254 tempate_id_15 , errors:       0,  jitter: 0 
1255 ager :
1256 m_st_alloc                                 : 1 
1257 m_st_free                                  : 0 
1258 m_st_start                                 : 2 
1259 m_st_stop                                  : 1 
1260 m_st_handle                                : 0 
1261----
1262<1> Errors, if any, shown here
1263<2> Low pass filter on the active average of latency events 
1264<3> Error per template info
1265
1266// IGNORE: this line added to help rendition. Without this line, the "Notes and Limitations" section below does not appear.
1267
1268*Notes and Limitations:*::
1269
1270** To receive the packets TRex does the following:
1271*** Changes the TTL to 0xff and expects 0xFF (loopback) or oxFE (route). (Use `--hop` to configure this value.)
1272*** Adds 24 bytes of metadata as ipv4/ipv6 option header.
1273// clarify "ipv4/ipv6 option header" above
1274
1275== Reference
1276
1277=== Traffic YAML (parameter of -f option)
1278
1279==== Global Traffic YAML section 
1280
1281[source,python]
1282----
1283- duration : 10.0                          <1>
1284  generator :                              <2>
1285          distribution : "seq"           
1286          clients_start : "16.0.0.1"     
1287          clients_end   : "16.0.0.255"   
1288          servers_start : "48.0.0.1"     
1289          servers_end   : "48.0.0.255"   
1290          clients_per_gb : 201
1291          min_clients    : 101
1292          dual_port_mask : "1.0.0.0" 
1293          tcp_aging      : 1
1294          udp_aging      : 1
1295  mac        : [0x00,0x00,0x00,0x01,0x00,0x00] <3>
1296  cap_ipg    : true                            <4>
1297  cap_ipg_min    : 30                          <5>
1298  cap_override_ipg    : 200                    <6>
1299  vlan       : { enable : 1  ,  vlan0 : 100 , vlan1 : 200 } <7>
1300  mac_override_by_ip : true  <8>
1301----
1302<1> Test duration (seconds). Can be overridden using the `-d` option.
1303<2> See the link:trex_manual.html#_clients_servers_ip_allocation_scheme[generator] section.
1304// what does note 2 mean? see somewhere else? isn't this simply the generator section?
1305<3> Default source/destination MAC address. The configuration YAML can override this.
1306<4> true (default) indicates that the IPG is taken from the cap file (also taking into account cap_ipg_min and cap_override_ipg if they exist). false indicates that IPG is taken from per template section.
1307<5> The following two options can set the min ipg in microseconds: (if (pkt_ipg<cap_ipg_min) { pkt_ipg=cap_override_ipg} )
1308<6> Value to override (microseconds), as described in note above.
1309<7> Enable vlan feature. See xref:trex_vlan[trex_vlan section] for info.
1310<8> Enable MAC address replacement by client IP.
1311
1312
1313==== Timer Wheel section configuration  
1314
1315(from v2.13)
1316see xref:timer_w[Timer Wheel section] 
1317
1318==== Per template section 
1319// clarify "per template"  
1320
1321[source,python]
1322----
1323     - name: cap2/dns.pcap <1>
1324       cps : 10.0          <2>
1325       ipg : 10000         <3>
1326       rtt : 10000         <4>
1327       w   : 1             <5>
1328       server_addr : "48.0.0.7"    <6>
1329       one_app_server : true       <7>
1330       
1331----
1332<1> The name of the template pcap file. Can be relative path from the t-rex-64 image directory, or an absolute path. The pcap file should include only one flow. (Exception: in case of plug-ins).
1333<2> Connection per second. This is the value that will be used if specifying -m 1 from command line (giving -m x will multiply this 
1334<3> If the global section of the YAML file includes `cap_ipg    : false`, this line sets the inter-packet gap in microseconds. 
1335<4> Should be set to the same value as ipg (microseconds). 
1336<5> Default value: w=1. This indicates to the IP generator how to generate the flows. If w=2, two flows from the same template will be generated in a burst (more for HTTP that has burst of flows).
1337<6> If `one_app_server` is set to true, then all templates will use the same server.
1338<7> If the same server address is required, set this value to true.
1339
1340
1341
1342=== Configuration YAML (parameter of --cfg option)
1343
1344anchor:trex_config[]
1345
1346The configuration file, in YAML format, configures TRex behavior, including:
1347 
1348- IP address or MAC address for each port (source and destination).
1349- Masked interfaces, to ensure that TRex does not try to use the management ports as traffic ports.
1350- Changing the zmq/telnet TCP port.
1351
1352You specify which config file to use by adding --cfg <file name> to the command line arguments. +
1353If no --cfg given, the default `/etc/trex_cfg.yaml` is used. +
1354Configuration file examples can be found in the `$TREX_ROOT/scripts/cfg` folder.
1355
1356==== Basic Configurations
1357
1358[source,python]
1359----
1360     - port_limit    : 2    #mandatory <1>
1361       version       : 2    #mandatory <2>
1362       interfaces    : ["03:00.0", "03:00.1"]   #mandatory <3>
1363       #enable_zmq_pub  : true #optional <4>
1364       #zmq_pub_port    : 4500 #optional <5>
1365       #prefix          : setup1 #optional <6>
1366       #limit_memory    : 1024 #optional <7>
1367       c               : 4 #optional <8>
1368       port_bandwidth_gb : 10 #optional <9>
1369       port_info       :  # set eh mac addr  mandatory
1370            - default_gw : 1.1.1.1   # port 0 <10>
1371              dest_mac   : '00:00:00:01:00:00' # Either default_gw or dest_mac is mandatory <10>
1372              src_mac    : '00:00:00:02:00:00' # optional <11>
1373              ip         : 2.2.2.2 # optional <12>
1374              vlan       : 15 # optional <13>
1375            - dest_mac   : '00:00:00:03:00:00'  # port 1
1376              src_mac    : '00:00:00:04:00:00'
1377            - dest_mac   : '00:00:00:05:00:00'  # port 2
1378              src_mac    : '00:00:00:06:00:00'
1379            - dest_mac   :   [0x0,0x0,0x0,0x7,0x0,0x01]  # port 3 <14>
1380              src_mac    :   [0x0,0x0,0x0,0x8,0x0,0x02] # <14>
1381----
1382<1>  Number of ports. Should be equal to the number of interfaces listed in 3. - mandatory
1383<2>  Must be set to 2. - mandatory
1384<3>  List of interfaces to use. Run `sudo ./dpdk_setup_ports.py --show` to see the list you can choose from. - mandatory
1385<4>  Enable the ZMQ publisher for stats data, default is true. 
1386<5>  ZMQ port number. Default value is good. If running two TRex instances on the same machine, each should be given distinct number. Otherwise, can remove this line.
1387<6>  If running two TRex instances on the same machine, each should be given distinct name. Otherwise, can remove this line. ( Passed to DPDK as --file-prefix arg)
1388<7>  Limit the amount of packet memory used. (Passed to dpdk as -m arg)
1389<8> Number of threads (cores) TRex will use per interface pair ( Can be overridden by -c command line option )
1390<9> The bandwidth of each interface in Gbs. In this example we have 10Gbs interfaces. For VM, put 1. Used to tune the amount of memory allocated by TRex.
1391<10> TRex need to know the destination MAC address to use on each port. You can specify this in one of two ways: +
1392Specify dest_mac directly. +
1393Specify default_gw (since version 2.10). In this case (only if no dest_mac given), TRex will issue ARP request to this IP, and will use
1394the result as dest MAC. If no dest_mac given, and no ARP response received, TRex will exit.
1395
1396<11> Source MAC to use when sending packets from this interface. If not given (since version 2.10), MAC address of the port will be used.
1397<12> If given (since version 2.10), TRex will issue gratitues ARP for the ip + src MAC pair on appropriate port. In stateful mode,
1398gratitues ARP for each ip will be sent every 120 seconds (Can be changed using --arp-refresh-period argument).
1399<13> If given, gratitues ARP and ARP request will be sent using the given VLAN tag.
1400<14> Old MAC address format. New format is supported since version v2.09.
1401
1402[NOTE]
1403=========================================================================================
1404If you use version earlier than 2.10, or choose to omit the ``ip''
1405and have mac based configuration, be aware that TRex will not send any
1406gratitues ARP and will not answer ARP requests. In this case, you must configure static
1407ARP entries pointing to TRex port on your DUT. For an example config, you can look
1408xref:trex_config[here].
1409=========================================================================================
1410
1411To find out which interfaces (NIC ports) can be used, perform the following:
1412
1413[source,bash]
1414----
1415 $>sudo ./dpdk_setup_ports.py --show
1416
1417 Network devices using DPDK-compatible driver
1418 ============================================
1419
1420 Network devices using kernel driver
1421 ===================================
1422 0000:02:00.0 '82545EM Gigabit Ethernet Controller' if=eth2 drv=e1000 unused=igb_uio *Active* #<1>
1423 0000:03:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection' drv= unused=ixgb #<2>
1424 0000:03:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection' drv= unused=ixgb
1425 0000:13:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection' drv= unused=ixgb
1426 0000:13:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection' drv= unused=ixgb
1427
1428 Other network devices
1429 =====================
1430 <none>
1431----
1432<1> We see that 02:00.0 is active (our management port).
1433<2> All other NIC ports (03:00.0, 03:00.1, 13:00.0, 13:00.1) can be used.
1434
1435minimum configuration file is:
1436
1437[source,bash]
1438----
1439<none>
1440- port_limit    : 4         
1441  version       : 2
1442  interfaces    : ["03:00.0","03:00.1","13:00.1","13:00.0"]
1443----
1444
1445==== Memory section configuration  
1446
1447The memory section is optional. It is used when there is a need to tune the amount of memory used by TRex packet manager.
1448Default values (from the TRex source code), are usually good for most users. Unless you have some unusual needs, you can
1449eliminate this section.
1450
1451[source,python]
1452----
1453        - port_limit      : 2                                                                           
1454          version       : 2                                                                             
1455          interfaces    : ["03:00.0","03:00.1"]                                                         
1456          memory    :                                           <1>
1457             mbuf_64     : 16380                                <2>
1458             mbuf_128    : 8190
1459             mbuf_256    : 8190
1460             mbuf_512    : 8190
1461             mbuf_1024   : 8190
1462             mbuf_2048   : 4096
1463             traffic_mbuf_64     : 16380                        <3>
1464             traffic_mbuf_128    : 8190
1465             traffic_mbuf_256    : 8190
1466             traffic_mbuf_512    : 8190
1467             traffic_mbuf_1024   : 8190
1468             traffic_mbuf_2048   : 4096
1469             dp_flows    : 1048576                              <4>
1470             global_flows : 10240                               <5>
1471----
1472<1> Memory section header
1473<2> Numbers of memory buffers allocated for packets in transit, per port pair. Numbers are specified per packet size.
1474<3> Numbers of memory buffers allocated for holding the part of the packet which is remained unchanged per template.
1475You should increase numbers here, only if you have very large amount of templates.
1476<4> Number of TRex flow objects allocated (To get best performance they are allocated upfront, and not dynamically).
1477If you expect more concurrent flows than the default (1048576), enlarge this.
1478<5> Number objects TRex allocates for holding NAT ``in transit'' connections. In stateful mode, TRex learn NAT
1479translation by looking at the address changes done by the DUT to the first packet of each flow. So, these are the
1480number of flows for which TRex sent the first flow packet, but did not learn the translation yet. Again, default
1481here (10240) should be good. Increase only if you use NAT and see issues.
1482
1483
1484==== Platform section configuration  
1485
1486The platform section is optional. It is used to tune the performance and allocate the cores to the right NUMA  
1487a configuration file now has the folowing struct to support multi instance 
1488
1489[source,python]
1490----
1491- version       : 2
1492  interfaces    : ["03:00.0","03:00.1"]
1493  port_limit    : 2 
1494....
1495  platform :                                                    <1>
1496        master_thread_id  : 0                                   <2>
1497        latency_thread_id : 5                                   <3>
1498        dual_if   :                                             <4>
1499             - socket   : 0                                     <5>
1500               threads  : [1,2,3,4]                             <6>
1501----
1502<1> Platform section header.
1503<2> Hardware thread_id for control thread.
1504<3> Hardware thread_id for RX thread.
1505<4> ``dual_if'' section defines info for interface pairs (according to the order in ``interfaces'' list).
1506each section, starting with ``- socket'' defines info for different interface pair.
1507<5> The NUMA node from which memory will be allocated for use by the interface pair.
1508<6> Hardware threads to be used for sending packets for the interface pair. Threads are pinned to cores, so specifying threads
1509actually determines the hardware cores.
1510
1511*Real example:* anchor:numa-example[]
1512
1513We connected 2 Intel XL710 NICs close to each other on the motherboard. They shared the same NUMA:
1514
1515image:images/same_numa.png[title="2_NICSs_same_NUMA"]
1516
1517CPU utilization was very high ~100%, with c=2 and c=4 the results were same.
1518
1519Then, we moved the cards to different NUMAs:
1520
1521image:images/different_numa.png[title="2_NICSs_different_NUMAs"]
1522
1523*+*
1524We added configuration to the /etc/trex_cfg.yaml:
1525
1526[source,python]
1527  platform :
1528        master_thread_id  : 0
1529        latency_thread_id : 8
1530        dual_if   :
1531             - socket   : 0
1532               threads  : [1, 2, 3, 4, 5, 6, 7]
1533             - socket   : 1
1534               threads  : [9, 10, 11, 12, 13, 14, 15]
1535
1536This gave best results: with *\~98 Gb/s* TX BW and c=7, CPU utilization became *~21%*! (40% with c=4)
1537
1538
1539==== Timer Wheeel  section configuration  
1540
1541anchor:timer_w[]
1542
1543The memory section is optional. It is used when there is a need to tune the amount of memory used by TRex packet manager.
1544Default values (from the TRex source code), are usually good for most users. Unless you have some unusual needs, you can
1545eliminate this section.
1546
1547==== Timer Wheel section configuration  
1548The flow scheduler uses timer wheel to schedule flows. To tune it for a large number of flows it is possible to change the default values.
1549This is an advance configuration, don't use it if you don't know what you are doing. it can be configure in trex_cfg file and trex traffic profile.
1550
1551[source,python]
1552----
1553  tw :                             
1554     buckets : 1024                <1>
1555     levels  : 3                   <2>
1556     bucket_time_usec : 20.0       <3>
1557----
1558<1> the number of buckets in each level, higher number will improve performance, but will reduce the maximum levels.
1559<2> how many levels.  
1560<3> bucket time in usec. higher number will create more bursts  
1561
1562
1563=== Command line options 
1564
1565anchor:cml-line[]
1566
1567*--allow-coredump*::
1568Allow creation of core dump.
1569
1570*--arp-refresh-period <num>*::
1571Period in seconds between sending of gratuitous ARP for our addresses. Value of 0 means ``never send``.
1572
1573*-c <num>*::
1574Number of hardware threads to use per interface pair. Use at least 4 for TRex 40Gbs. +
1575TRex uses 2 threads for inner needs. Rest of the threads can be used. Maximum number here, can be number of free threads
1576divided by number of interface pairs. +
1577For virtual NICs on VM, we always use one thread per interface pair.
1578
1579*--cfg <file name>*::
1580TRex configuration file to use. See relevant manual section for all config file options.
1581
1582*--checksum-offload*::
1583Enable IP, TCP and UDP tx checksum offloading, using DPDK. This requires all used interfaces to support this.
1584
1585*--client_cfg <file>*::
1586YAML file describing clients configuration. Look link:trex_manual.html#_client_clustering_configuration[here] for details.
1587
1588*-d <num>*::
1589Duration of the test in seconds.
1590
1591*-e*::
1592  Same as `-p`, but change the src/dst IP according to the port. Using this, you will get all the packets of the
1593  same flow from the same port, and with the same src/dst IP. +
1594  It will not work good with NBAR as it expects all clients ip to be sent from same direction.
1595
1596*-f <yaml file>*::
1597Specify traffic YAML configuration file to use. Mandatory option for stateful mode.
1598
1599*--hops <num>*::
1600   Provide number of hops in the setup (default is one hop). Relevant only if the Rx check is enabled.
1601   Look link:trex_manual.html#_flow_order_latency_verification[here] for details.
1602
1603*--iom <mode>*::
1604        I/O mode. Possible values: 0 (silent), 1 (normal), 2 (short).
1605
1606*--ipv6*::
1607       Convert templates to IPv6 mode.
1608
1609*-k <num>*::
1610   Run ``warm up'' traffic for num seconds before starting the test. This is needed if TRex is connected to switch running
1611   spanning tree. You want the switch to see traffic from all relevant source MAC addresses before starting to send real
1612   data. Traffic sent is the same used for the latency test (-l option) +
1613   Current limitation (holds for TRex version 1.82): does not work properly on VM.
1614
1615*-l <rate>*::
1616    In parallel to the test, run latency check, sending packets at rate/sec from each interface.
1617
1618*--learn-mode <mode>*::
1619    Learn the dynamic NAT translation. Look link:trex_manual.html#_nat_support[here] for details.
1620
1621*--learn-verify*::
1622   Used for testing the NAT learning mechanism. Do the learning as if DUT is doing NAT, but verify that packets
1623   are not actually changed.
1624
1625*--limit-ports <port num>*::
1626   Limit the number of ports used. Overrides the ``port_limit'' from config file.
1627
1628*--lm <hex bit mask>*::
1629Mask specifying which ports will send traffic. For example, 0x1 - Only port 0 will send. 0x4 - only port 2 will send.
1630This can be used to verify port connectivity. You can send packets from one port, and look at counters on the DUT.
1631
1632*--lo*::
1633   Latency only - Send only latency packets. Do not send packets from the templates/pcap files.
1634
1635*-m <num>*::
1636   Rate multiplier. TRex will multiply the CPS rate of each template by num.
1637
1638*--nc*::
1639    If set, will terminate exacly at the end of the specified duration.
1640    This provides faster, more accurate TRex termination.
1641    By default (without this option), TRex waits for all flows to terminate gracefully. In case of a very long flow, termination might prolong. 
1642
1643*--no-flow-control-change*::
1644  Prevents TRex from changing flow control. By default (without this option), TRex disables flow control at startup for all cards, except for the Intel XL710 40G card.
1645
1646*--no-key*:: Daemon mode, don't get input from keyboard.
1647
1648*--no-watchdog*:: Disable watchdog.
1649    
1650*-p*::
1651Send all packets of the same flow from the same direction. For each flow, TRex will randomly choose between client port and
1652server port, and send all the packets from this port. src/dst IPs keep their values as if packets are sent from two ports.
1653Meaning, we get on the same port packets from client to server, and from server to client. +
1654If you are using this with a router, you can not relay on routing rules to pass traffic to TRex, you must configure policy
1655based routes to pass all traffic from one DUT port to the other. +
1656
1657*-pm <num>*::
1658   Platform factor. If the setup includes splitter, you can multiply all statistic number displayed by TRex by this factor, so that they will match the DUT counters.
1659  
1660*-pubd*::
1661  Disable ZMQ monitor's publishers.
1662
1663*--rx-check <sample rate>*::
1664        Enable Rx check module. Using this, each thread randomly samples 1/sample_rate of the flows and checks packet order, latency, and additional statistics for the sampled flows.
1665        Note: This feature works on the RX thread.
1666
1667*-v <verbosity level>*::
1668   Show debug info. Value of 1 shows debug info on startup. Value of 3, shows debug info during run at some cases. Might slow down operation.
1669
1670*--vlan*:: Relevant only for stateless mode with Intel 82599 10G NIC.
1671   When configuring flow stat and latency per stream rules, assume all streams uses VLAN.
1672
1673*-w <num seconds>*::
1674   Wait additional time between NICs initialization and sending traffic. Can be useful if DUT needs extra setup time. Default is 1 second.
1675
1676*--active-flows*::            
1677    An experimental switch to scale up or down the number of active flows.
1678    It is not accurate due to the quantization of flow scheduler and in some case does not work. 
1679    Example --active-flows 500000 wil set the ballpark of the active flow to be ~0.5M 
1680
1681
1682ifndef::backend-docbook[]
1683
1684
1685endif::backend-docbook[]
1686
1687== Appendix
1688
1689=== Simulator 
1690 
1691The TRex simulator is a linux application (no DPDK needed) that can run on any Linux (it can also run on TRex machine itself).
1692you can create output pcap file from input of traffic YAML.
1693
1694====  Simulator 
1695
1696
1697[source,bash]
1698----
1699
1700$./bp-sim-64-debug -f avl/sfr_delay_10_1g.yaml -v 1
1701
1702 -- loading cap file avl/delay_10_http_get_0.pcap 
1703 -- loading cap file avl/delay_10_http_post_0.pcap 
1704 -- loading cap file avl/delay_10_https_0.pcap 
1705 -- loading cap file avl/delay_10_http_browsing_0.pcap 
1706 -- loading cap file avl/delay_10_exchange_0.pcap 
1707 -- loading cap file avl/delay_10_mail_pop_0.pcap 
1708 -- loading cap file avl/delay_10_mail_pop_1.pcap 
1709 -- loading cap file avl/delay_10_mail_pop_2.pcap 
1710 -- loading cap file avl/delay_10_oracle_0.pcap 
1711 -- loading cap file avl/delay_10_rtp_160k_full.pcap 
1712 -- loading cap file avl/delay_10_rtp_250k_full.pcap 
1713 -- loading cap file avl/delay_10_smtp_0.pcap 
1714 -- loading cap file avl/delay_10_smtp_1.pcap 
1715 -- loading cap file avl/delay_10_smtp_2.pcap 
1716 -- loading cap file avl/delay_10_video_call_0.pcap 
1717 -- loading cap file avl/delay_10_sip_video_call_full.pcap 
1718 -- loading cap file avl/delay_10_citrix_0.pcap 
1719 -- loading cap file avl/delay_10_dns_0.pcap 
1720 id,name                                    , tps, cps,f-pkts,f-bytes, duration,   Mb/sec,   MB/sec,   c-flows,  PPS,total-Mbytes-duration,errors,flows    #<2>
1721 00, avl/delay_10_http_get_0.pcap             ,404.52,404.52,    44 ,   37830 ,   0.17 , 122.42 ,   15.30 ,         67 , 17799 ,       2 , 0 , 1 
1722 01, avl/delay_10_http_post_0.pcap            ,404.52,404.52,    54 ,   48468 ,   0.21 , 156.85 ,   19.61 ,         85 , 21844 ,       2 , 0 , 1 
1723 02, avl/delay_10_https_0.pcap                ,130.87,130.87,    96 ,   91619 ,   0.22 ,  95.92 ,   11.99 ,         29 , 12564 ,       1 , 0 , 1 
1724 03, avl/delay_10_http_browsing_0.pcap        ,709.89,709.89,    37 ,   34425 ,   0.13 , 195.50 ,   24.44 ,         94 , 26266 ,       2 , 0 , 1 
1725 04, avl/delay_10_exchange_0.pcap             ,253.81,253.81,    43 ,    9848 ,   1.57 ,  20.00 ,    2.50 ,        400 , 10914 ,       0 , 0 , 1 
1726 05, avl/delay_10_mail_pop_0.pcap             ,4.76,4.76,    20 ,    5603 ,   0.17 ,   0.21 ,    0.03 ,          1 ,    95 ,       0 , 0 , 1 
1727 06, avl/delay_10_mail_pop_1.pcap             ,4.76,4.76,   114 ,  101517 ,   0.25 ,   3.86 ,    0.48 ,          1 ,   543 ,       0 , 0 , 1 
1728 07, avl/delay_10_mail_pop_2.pcap             ,4.76,4.76,    30 ,   15630 ,   0.19 ,   0.60 ,    0.07 ,          1 ,   143 ,       0 , 0 , 1 
1729 08, avl/delay_10_oracle_0.pcap               ,79.32,79.32,   302 ,   56131 ,   6.86 ,  35.62 ,    4.45 ,        544 , 23954 ,       0 , 0 , 1 
1730 09, avl/delay_10_rtp_160k_full.pcap          ,2.78,8.33,  1354 , 1232757 ,  61.24 ,  27.38 ,    3.42 ,        170 ,  3759 ,       0 , 0 , 3 
1731 10, avl/delay_10_rtp_250k_full.pcap          ,1.98,5.95,  2069 , 1922000 ,  61.38 ,  30.48 ,    3.81 ,        122 ,  4101 ,       0 , 0 , 3 
1732 11, avl/delay_10_smtp_0.pcap                 ,7.34,7.34,    22 ,    5618 ,   0.19 ,   0.33 ,    0.04 ,          1 ,   161 ,       0 , 0 , 1 
1733 12, avl/delay_10_smtp_1.pcap                 ,7.34,7.34,    35 ,   18344 ,   0.21 ,   1.08 ,    0.13 ,          2 ,   257 ,       0 , 0 , 1 
1734 13, avl/delay_10_smtp_2.pcap                 ,7.34,7.34,   110 ,   96544 ,   0.27 ,   5.67 ,    0.71 ,          2 ,   807 ,       0 , 0 , 1 
1735 14, avl/delay_10_video_call_0.pcap           ,11.90,11.90,  2325 , 2532577 ,  36.56 , 241.05 ,   30.13 ,        435 , 27662 ,       3 , 0 , 1 
1736 15, avl/delay_10_sip_video_call_full.pcap    ,29.35,58.69,  1651 ,  120315 ,  24.56 ,  28.25 ,    3.53 ,        721 , 48452 ,       0 , 0 , 2 
1737 16, avl/delay_10_citrix_0.pcap               ,43.62,43.62,   272 ,   84553 ,   6.23 ,  29.51 ,    3.69 ,        272 , 11866 ,       0 , 0 , 1 
1738 17, avl/delay_10_dns_0.pcap                  ,1975.02,1975.02,     2 ,     162 ,   0.01 ,   2.56 ,    0.32 ,         22 ,  3950 ,       0 , 0 , 1 
1739
1740 00, sum                                      ,4083.86,93928.84,  8580 , 6413941 ,   0.00 , 997.28 ,  124.66 ,       2966 , 215136 ,      12 , 0 , 23 
1741 Memory usage 
1742 size_64        : 1687 
1743 size_128       : 222 
1744 size_256       : 798 
1745 size_512       : 1028 
1746 size_1024      : 86 
1747 size_2048      : 4086 
1748 Total    :       8.89 Mbytes  159% util #<1>
1749
1750----
1751<1> the memory usage of the templates 
1752<2> CSV for all the templates 
1753
1754
1755=== firmware update to XL710/X710 
1756anchor:xl710-firmware[]
1757 
1758To upgrade the firmware  follow  this
1759
1760==== Download the driver 
1761
1762*Download driver i40e from link:https://downloadcenter.intel.com/download/24411/Network-Adapter-Driver-for-PCI-E-40-Gigabit-Network-Connections-under-Linux-[here]
1763*Build the kernel module
1764
1765[source,bash]
1766----
1767$tar -xvzf i40e-1.3.47
1768$cd i40e-1.3.47/src
1769$make 
1770$sudo insmod  i40e.ko
1771----
1772
1773
1774====  Bind the NIC to Linux
1775
1776In this stage we bind the NIC to Linux (take it from DPDK)
1777
1778[source,bash]
1779----
1780$sudo ./dpdk_nic_bind.py --status # show the ports 
1781
1782Network devices using DPDK-compatible driver
1783============================================
17840000:02:00.0 'Device 1583' drv=igb_uio unused=      #<1>
17850000:02:00.1 'Device 1583' drv=igb_uio unused=      #<2>
17860000:87:00.0 'Device 1583' drv=igb_uio unused=
17870000:87:00.1 'Device 1583' drv=igb_uio unused=
1788
1789$sudo dpdk_nic_bind.py -u 02:00.0  02:00.1          #<3> 
1790
1791$sudo dpdk_nic_bind.py -b i40e 02:00.0 02:00.1      #<4>
1792
1793$ethtool -i p1p2                                    #<5>  
1794
1795driver: i40e
1796version: 1.3.47
1797firmware-version: 4.24 0x800013fc 0.0.0             #<6>
1798bus-info: 0000:02:00.1
1799supports-statistics: yes
1800supports-test: yes
1801supports-eeprom-access: yes
1802supports-register-dump: yes
1803supports-priv-flags: yes
1804
1805   
1806$ethtool -S p1p2 
1807$lspci -s 02:00.0 -vvv                              #<7>
1808
1809
1810----
1811<1> XL710 ports that need to unbind from DPDK
1812<2> XL710 ports that need to unbind from DPDK
1813<3> Unbind from DPDK using this command
1814<4> Bind to linux to i40e driver 
1815<5> Show firmware version throw linux driver 
1816<6> Firmare version
1817<7> More info 
1818
1819
1820====  Upgrade 
1821
1822Download NVMUpdatePackage.zip from Intel site link:http://downloadcenter.intel.com/download/24769/NVM-Update-Utility-for-Intel-Ethernet-Converged-Network-Adapter-XL710-X710-Series[here]  
1823It includes the utility `nvmupdate64e`
1824
1825Run this:
1826
1827[source,bash]
1828----
1829$sudo ./nvmupdate64e  
1830----
1831
1832You might need a power cycle and to run this command a few times to get the latest firmware  
1833
1834====  QSFP+ support for XL710
1835
1836see link:https://www.google.co.il/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&uact=8&ved=0ahUKEwjJhPSH3b3LAhUp7nIKHSkACUYQFggaMAA&url=http%3A%2F%2Fwww.intel.co.id%2Fcontent%2Fdam%2Fwww%2Fpublic%2Fus%2Fen%2Fdocuments%2Frelease-notes%2Fxl710-ethernet-controller-feature-matrix.pdf&usg=AFQjCNFhwozfz-XuKGMOy9_MJDbetw15Og&sig2=ce7YU9F9Et6xf6KvqSFBxg&bvm=bv.116636494,d.bGs[QSFP+ support] for QSFP+ support and Firmware requirement for XL710
1837
1838
1839=== TRex with ASA 5585
1840
1841When running TRex aginst ASA 5585, you have to notice following things:
1842
1843* ASA can't forward ipv4 options, so there is a need to use --learn-mode 1 (or 3) in case of NAT. In this mode, bidirectional UDP flows are not supported.
1844--learn-mode 1 support TCP sequence number randomization in both sides of the connection (client to server and server client). For this to work, TRex must learn
1845the translation of packets from both sides, so this mode reduce the amount of connections per second TRex can generate (The number is still high enough to test
1846any existing firewall). If you need higher cps rate, you can use --learn-mode 3. This mode handles sequence number randomization on client->server side only.
1847* Latency should be tested using ICMP with `--l-pkt-mode 2`
1848
1849====  ASA 5585 sample configuration
1850
1851[source,bash]
1852----
1853ciscoasa# show running-config 
1854: Saved
1855
1856: 
1857: Serial Number: JAD194801KX
1858: Hardware:   ASA5585-SSP-10, 6144 MB RAM, CPU Xeon 5500 series 2000 MHz, 1 CPU (4 cores)
1859:
1860ASA Version 9.5(2) 
1861!
1862hostname ciscoasa
1863enable password 8Ry2YjIyt7RRXU24 encrypted
1864passwd 2KFQnbNIdI.2KYOU encrypted
1865names
1866!
1867interface Management0/0
1868 management-only
1869 nameif management
1870 security-level 100
1871 ip address 10.56.216.106 255.255.255.0 
1872!             
1873interface TenGigabitEthernet0/8
1874 nameif inside
1875 security-level 100
1876 ip address 15.0.0.1 255.255.255.0 
1877!             
1878interface TenGigabitEthernet0/9
1879 nameif outside
1880 security-level 0
1881 ip address 40.0.0.1 255.255.255.0 
1882!             
1883boot system disk0:/asa952-smp-k8.bin
1884ftp mode passive
1885pager lines 24
1886logging asdm informational
1887mtu management 1500
1888mtu inside 9000
1889mtu outside 9000
1890no failover   
1891no monitor-interface service-module 
1892icmp unreachable rate-limit 1 burst-size 1
1893no asdm history enable
1894arp outside 40.0.0.2 90e2.baae.87d1 
1895arp inside 15.0.0.2 90e2.baae.87d0 
1896arp timeout 14400
1897no arp permit-nonconnected
1898route management 0.0.0.0 0.0.0.0 10.56.216.1 1
1899route inside 16.0.0.0 255.0.0.0 15.0.0.2 1
1900route outside 48.0.0.0 255.0.0.0 40.0.0.2 1
1901timeout xlate 3:00:00
1902timeout pat-xlate 0:00:30
1903timeout conn 1:00:00 half-closed 0:10:00 udp 0:02:00 sctp 0:02:00 icmp 0:00:02
1904timeout sunrpc 0:10:00 h323 0:05:00 h225 1:00:00 mgcp 0:05:00 mgcp-pat 0:05:00
1905timeout sip 0:30:00 sip_media 0:02:00 sip-invite 0:03:00 sip-disconnect 0:02:00
1906timeout sip-provisional-media 0:02:00 uauth 0:05:00 absolute
1907timeout tcp-proxy-reassembly 0:01:00
1908timeout floating-conn 0:00:00
1909user-identity default-domain LOCAL
1910http server enable
1911http 192.168.1.0 255.255.255.0 management
1912no snmp-server location
1913no snmp-server contact
1914crypto ipsec security-association pmtu-aging infinite
1915crypto ca trustpool policy
1916telnet 0.0.0.0 0.0.0.0 management
1917telnet timeout 5
1918ssh stricthostkeycheck
1919ssh timeout 5 
1920ssh key-exchange group dh-group1-sha1
1921console timeout 0
1922!             
1923tls-proxy maximum-session 1000
1924!             
1925threat-detection basic-threat
1926threat-detection statistics access-list
1927no threat-detection statistics tcp-intercept
1928dynamic-access-policy-record DfltAccessPolicy
1929!             
1930class-map icmp-class
1931 match default-inspection-traffic
1932class-map inspection_default
1933 match default-inspection-traffic
1934!             
1935!             
1936policy-map type inspect dns preset_dns_map
1937 parameters   
1938  message-length maximum client auto
1939  message-length maximum 512
1940policy-map icmp_policy
1941 class icmp-class
1942  inspect icmp 
1943policy-map global_policy
1944 class inspection_default
1945  inspect dns preset_dns_map 
1946  inspect ftp 
1947  inspect h323 h225 
1948  inspect h323 ras 
1949  inspect rsh 
1950  inspect rtsp 
1951  inspect esmtp 
1952  inspect sqlnet 
1953  inspect skinny  
1954  inspect sunrpc 
1955  inspect xdmcp 
1956  inspect sip  
1957  inspect netbios 
1958  inspect tftp 
1959  inspect ip-options 
1960!             
1961service-policy global_policy global
1962service-policy icmp_policy interface outside
1963prompt hostname context 
1964!             
1965jumbo-frame reservation
1966!             
1967no call-home reporting anonymous
1968: end         
1969ciscoasa# 
1970----
1971
1972====  TRex commands example 
1973
1974Using these commands the configuration is:
1975
19761. NAT learn mode (TCP-ACK)
19772. Delay of 1 second at start up (-k 1). It was added because ASA drops the first packets.
19783. Latency is configured to ICMP reply mode (--l-pkt-mode 2).
1979
1980
1981*Simple HTTP:*::
1982[source,bash]
1983----
1984$sudo ./t-rex-64 -f cap2/http_simple.yaml -d 1000 -l 1000 --l-pkt-mode 2 -m 1000  --learn-mode 1 -k 1
1985----
1986
1987This is more realistic traffic for enterprise (we removed from SFR file the bidirectional UDP traffic templates, which (as described above), are not supported in this mode).
1988
1989*Enterprise profile:*::
1990[source,bash]
1991----
1992$sudo ./t-rex-64 -f avl/sfr_delay_10_1g_asa_nat.yaml -d 1000 -l 1000 --l-pkt-mode 2 -m 4 --learn-mode 1 -k 1
1993----
1994
1995The TRex output
1996
1997[source,bash]
1998----
1999-Per port stats table 
2000      ports |               0 |               1 
2001 -----------------------------------------------------------------------------------------
2002   opackets |       106347896 |       118369678 
2003     obytes |     33508291818 |    118433748567 
2004   ipackets |       118378757 |       106338782 
2005     ibytes |    118434305375 |     33507698915 
2006    ierrors |               0 |               0 
2007    oerrors |               0 |               0 
2008      Tx Bw |     656.26 Mbps |       2.27 Gbps 
2009
2010-Global stats enabled 
2011 Cpu Utilization : 18.4  %  31.7 Gb/core 
2012 Platform_factor : 1.0  
2013 Total-Tx        :       2.92 Gbps   NAT time out    :        0 #<1> (0 in wait for syn+ack) #<1>
2014 Total-Rx        :       2.92 Gbps   NAT aged flow id:        0 #<1>
2015 Total-PPS       :     542.29 Kpps   Total NAT active:      163  (12 waiting for syn)
2016 Total-CPS       :       8.30 Kcps   Nat_learn_errors:        0
2017
2018 Expected-PPS    :     539.85 Kpps
2019 Expected-CPS    :       8.29 Kcps  
2020 Expected-BPS    :       2.90 Gbps  
2021
2022 Active-flows    :     7860  Clients :      255   Socket-util : 0.0489 %    
2023 Open-flows      :  3481234  Servers :     5375   Socket :     7860 Socket/Clients :  30.8 
2024 drop-rate       :       0.00  bps   #<1>
2025 current time    : 425.1 sec  
2026 test duration   : 574.9 sec  
2027
2028-Latency stats enabled 
2029 Cpu Utilization : 0.3 %  
2030 if|   tx_ok , rx_ok  , rx   ,error,    average   ,   max         , Jitter ,  max window 
2031   |         ,        , check,     , latency(usec),latency (usec) ,(usec)  ,             
2032 ---------------------------------------------------------------------------------------------------------------- 
2033 0 |   420510,  420495,         0,   1,         58  ,    1555,      14      |  240  257  258  258  219  930  732  896  830  472  190  207  729 
2034 1 |   420496,  420509,         0,   1,         51  ,    1551,      13      |  234  253  257  258  214  926  727  893  826  468  187  204  724
2035----
2036<1>  These counters should be zero 
2037
2038anchor:fedora21_example[]
2039
2040=== Fedora 21 Server installation
2041
2042Download the .iso file from link above, boot with it using Hypervisor or CIMC console. +
2043Troubleshooting -> install in basic graphics mode
2044
2045* In packages selection, choose:
2046
2047** C Development Tools and Libraries
2048
2049** Development Tools
2050
2051** System Tools
2052
2053* Set Ethernet configuration if needed
2054
2055* Use default hard-drive partitions, reclaim space if needed
2056
2057* After installation, edit file /etc/selinux/config +
2058set: +
2059SELINUX=disabled
2060
2061* Run: +
2062systemctl disable firewalld 
2063
2064* Edit file /etc/yum.repos.d/fedora-updates.repo +
2065set everywhere: +
2066enabled=0
2067
2068* Reboot
2069
2070=== Configure Linux host as network emulator
2071
2072There are lots of Linux tutorials on the web, so this will not be full tutorial, only highlighting some key points. Commands
2073were checked on Ubuntu system.
2074
2075For this example:
2076
20771. TRex Client side network is 16.0.0.x 
20782. TRex Server side network is 48.0.0.x 
20793. Linux Client side network eth0 is configured with IPv4 as 172.168.0.1 
20804. Linux Server side network eth1 is configured with IPv4 as 10.0.0.1 
2081
2082[source,bash]
2083----
2084
2085  TRex-0 (16.0.0.1->48.0.0.1 )   <-->
2086
2087                ( 172.168.0.1/255.255.0.0)-eth0 [linux] -( 10.0.0.1/255.255.0.0)-eth1 
2088
2089                <--> TRex-1 (16.0.0.1<-48.0.0.1)
2090  
2091----
2092
2093
2094==== Enable forwarding
2095One time (will be discarded after reboot): +
2096
2097[source,bash]
2098----
2099echo 1 > /proc/sys/net/ipv4/ip_forward
2100----
2101To make this permanent, add the following line to the file /etc/sysctl.conf: +
2102----
2103net.ipv4.ip_forward=1
2104----
2105
2106==== Add static routes
2107Example if for the default TRex networks, 48.0.0.0 and 16.0.0.0.
2108
2109Routing all traffic from 48.0.0.0 to the gateway 10.0.0.100
2110[source,bash]
2111----
2112route add -net 48.0.0.0 netmask 255.255.0.0 gw 10.0.0.100
2113----
2114
2115Routing all traffic from 16.0.0.0 to the gateway 172.168.0.100
2116[source,bash]
2117----
2118route add -net 16.0.0.0 netmask 255.255.0.0 gw 172.168.0.100
2119----
2120If you use stateless mode, and decide to add route only in one direction, remember to disable reverse path check. +
2121For example, to disable on all interfaces:
2122[source,bash]
2123----
2124for i in /proc/sys/net/ipv4/conf/*/rp_filter ; do
2125  echo 0 > $i 
2126done
2127----
2128
2129Alternatively, you can edit /etc/network/interfaces, and add something like this for both ports connected to TRex.
2130This will take effect, only after restarting networking (rebooting the machine in an alternative also).
2131----
2132auto eth1
2133iface eth1 inet static
2134address 16.0.0.100
2135netmask 255.0.0.0
2136network 16.0.0.0
2137broadcast 16.255.255.255
2138... same for 48.0.0.0
2139----
2140
2141==== Add static ARP entries
2142[source,bash]
2143----
2144sudo arp -s 10.0.0.100 <Second TRex port MAC>
2145sudo arp -s 172.168.0.100 <TRex side the NICs are not visible to ifconfig, run:
2146----
2147
2148=== Configure Linux to use VF on Intel X710 and 82599 NICs
2149
2150TRex supports paravirtualized interfaces such as VMXNET3/virtio/E1000 however when connected to a vSwitch, the vSwitch limits the performance. VPP or OVS-DPDK can improve the performance but require more software resources to handle the rate.
2151SR-IOV can accelerate the performance and reduce CPU resource usage as well as latency by utilizing NIC hardware switch capability (the switching  is done by hardware).
2152TRex version 2.15 now includes SR-IOV support for XL710 and X710.
2153The following diagram compares between vSwitch and SR-IOV.
2154
2155image:images/sr_iov_vswitch.png[title="vSwitch_main",width=850]
2156
2157One use case which shows the performance gain that can be acheived by using SR-IOV is when a user wants to create a pool of TRex VMs that tests a pool of virtual DUTs (e.g. ASAv,CSR etc.)
2158When using newly supported SR-IOV, compute, storage and networking resources can be controlled dynamically (e.g by using OpenStack)
2159 
2160image:images/sr_iov_trex.png[title="vSwitch_main",width=850]
2161
2162The above diagram is an example of one server with two NICS. TRex VMs can be allocated on one NIC while the DUTs can be allocated on another.
2163
2164
2165Following are some links we used and lessons we learned while putting up an environment for testing TRex with VF interfaces (using SR-IOV).
2166This is by no means a full toturial of VF usage, and different Linux distributions might need slightly different handling.
2167
2168==== Links and resources
2169link:http://www.intel.com/content/dam/www/public/us/en/documents/technology-briefs/xl710-sr-iov-config-guide-gbe-linux-brief.pdf[This]
2170is a good tutorial by Intel of SR-IOV and how to configure. +
2171
2172link:http://dpdk.org/doc/guides/nics/intel_vf.html[This] is a tutorial from DPDK documentation. +
2173
2174==== Linux configuration
2175First, need to verify BIOS support for the feature. Can consult link:http://kpanic.de/node/8[this link] for directions. +
2176Second, need to make sure you have the correct kernel options. +
2177We added the following options to the kernel boot command on Grub: ``iommu=pt intel_iommu=on pci_pt_e820_access=on''. This
2178was needed on Fedora and Ubuntu. When using Centos, adding these options was not needed. +
2179To load the kernel module with the correct VF parameters after reboot add following line ``options i40e max_vfs=1,1'' to some file in ``/etc/modprobe.d/'' +
2180On Centos, we needed to also add the following file (example for x710 ): +
2181
2182[source,bash]
2183----
2184cat /etc/sysconfig/modules/i40e.modules
2185#!/bin/sh
2186rmmod i40e >/dev/null 2>&1
2187exec /sbin/modprobe i40e >/dev/null 2>&1
2188----
2189
2190==== x710 specific instructions
2191For x710 (i40e driver), we needed to download latest kernel driver. On all distributions we were using, existing driver was not new enough. +
2192To make the system use your new compiled driver with the correct parameters: +
2193Copy the .ko file to /lib/modules/Your kernel version as seen by uname -r/kernel/drivers/net/ethernet/intel/i40e/i40e.ko +
2194
2195==== 82599 specific instructions
2196In order to make VF interfaces work correctly, we had to increase mtu on related PF interfaces. +
2197For example, if you run with max_vfs=1,1 (one VF per PF), you will have something like this:
2198
2199[source,bash]
2200----
2201sudo ./dpdk_nic_bind.py -s
2202Network devices using DPDK-compatible driver
2203============================================
22040000:03:10.0 '82599 Ethernet Controller Virtual Function' drv=igb_uio unused=
22050000:03:10.1 '82599 Ethernet Controller Virtual Function' drv=igb_uio unused=
2206
2207Network devices using kernel driver
2208===================================
22090000:01:00.0 'I350 Gigabit Network Connection' if=eth0 drv=igb unused=igb_uio *Active*
22100000:03:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection' if=eth2 drv=ixgbe unused=igb_uio
22110000:03:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection' if=eth3 drv=ixgbe unused=igb_uio
2212----
2213
2214In order to work with 0000:03:10.0 and 0000:03:10.1, you will have to run the following +
2215[source,bash]
2216----
2217sudo ifconfig eth3 up mtu 9000
2218sudo ifconfig eth2 up mtu 9000
2219----
2220
2221
2222TRex stateful performance::
2223
2224Using the following command, running on x710 card with VF driver, we can see that TRex can reach 30GBps, using only one core. We can also see that the average latency is around 20 usec, which is pretty much the same value we get on loopback ports with x710 physical function without VF.
2225$sudo ./t-rex-64 -f cap2/http_simple.yaml -m 40000 -l 100 -c 1 -p
2226
2227[source,python]
2228----
2229
2230$sudo ./t-rex-64 -f cap2/http_simple.yaml -m 40000 -l 100 -c 1 -p
2231
2232  -Per port stats table
2233      ports |               0 |               1
2234  -----------------------------------------------------------------------------------------
2235   opackets |       106573954 |       107433792
2236     obytes |     99570878833 |    100374254956
2237   ipackets |       107413075 |       106594490
2238     ibytes | 100354899813    |     99590070585
2239    ierrors |            1038 |            1027
2240    oerrors |               0 |               0
2241      Tx Bw |      15.33 Gbps |      15.45 Gbps
2242 
2243-Global stats enabled
2244Cpu Utilization : 91.5  %  67.3 Gb/core
2245Platform_factor : 1.0
2246Total-Tx :      30.79 Gbps
2247Total-Rx :      30.79 Gbps
2248Total-PPS :       4.12 Mpps
2249Total-CPS :     111.32 Kcps
2250 
2251Expected-PPS :       4.11 Mpps
2252Expected-CPS :     111.04 Kcps
2253Expected-BPS :      30.71 Gbps
2254 
2255Active-flows :    14651  Clients : 255   Socket-util : 0.0912 %
2256Open-flows :  5795073  Servers : 65535   Socket :    14652 Socket/Clients :  57.5
2257drop-rate :       0.00 bps
2258current time : 53.9 sec
2259test duration : 3546.1 sec
2260 
2261 -Latency stats enabled
2262Cpu Utilization : 23.4 %
2263if| tx_ok , rx_ok  , rx check ,error,       latency (usec) ,    Jitter         
2264   | ,        ,          , ,   average   , max  ,    (usec)
2265  -------------------------------------------------------------------------------
22660 | 5233,    5233,         0, 0,         19  , 580,       5      | 37  37  37 4
22671 | 5233,    5233,         0, 0,         22  , 577,       5      | 38  40  39 3
2268----
2269
2270TRex stateless performance::
2271
2272[source,python]
2273----
2274
2275$sudo ./t-rex-64 -i -c 1
2276
2277trex>portattr
2278Port Status
2279 
2280     port |          0           |          1
2281  -------------------------------------------------------------
2282  driver          | net_i40e_vf      |     net_i40e_vf
2283  description     | XL710/X710 Virtual  |  XL710/X710 Virtual
2284 
2285With the console command:
2286start -f stl/imix.py -m 8mpps --force --port 0
2287We can see, that we can reach 8M packet per second, which in this case is around 24.28 Gbit/second.
2288 
2289Global Statistics
2290 
2291connection   : localhost, Port 4501                  total_tx_L2  : 24.28 Gb/sec
2292version      : v2.15 total_tx_L1  : 25.55 Gb/sec
2293cpu_util.    : 80.6% @ 1 cores (1 per port)          total_rx     : 24.28 Gb/sec
2294rx_cpu_util. : 66.8%                                 total_pps    : 7.99 Mpkt/sec
2295async_util.  : 0.18% / 1.84 KB/sec                   drop_rate    : 0.00 b/sec
2296queue_full   : 3,467 pkts
2297 
2298Port Statistics
2299 
2300   port    |         0         |         1         | total
2301  ----------------------------------------------------------------------
2302  owner      |           ibarnea |           ibarnea |
2303  link       |                UP |                UP |
2304  state      | TRANSMITTING      |              IDLE |
2305  speed      |           40 Gb/s |           40 Gb/s |
2306  CPU util.  | 80.6%             |              0.0% |
2307  --         |                   |                   |
2308  Tx bps L2  | 24.28 Gbps        |          0.00 bps |        24.28 Gbps
2309  Tx bps L1  | 25.55 Gbps        |             0 bps |        25.55 Gbps
2310  Tx pps     | 7.99 Mpps         |          0.00 pps |         7.99 Mpps
2311  Line Util. |           63.89 % |            0.00 % |
2312  ---        |                   |                   |
2313  Rx bps     | 0.00 bps          |        24.28 Gbps |        24.28 Gbps
2314  Rx pps     | 0.00 pps          |         7.99 Mpps |         7.99 Mpps
2315  ----       |                   |                   |
2316  opackets   | 658532501         |                 0 |         658532501
2317  ipackets   |                 0 |         658612569 |         658612569
2318  obytes     | 250039721918      |                 0 |      250039721918
2319  ibytes     |                 0 |      250070124150 |      250070124150
2320  tx-bytes   | 250.04 GB         |               0 B |         250.04 GB
2321  rx-bytes   |               0 B |         250.07 GB |         250.07 GB
2322  tx-pkts    | 658.53 Mpkts      |            0 pkts |      658.53 Mpkts
2323  rx-pkts    | 0 pkts            |      658.61 Mpkts |      658.61 Mpkts
2324  -----      |                   |                   |
2325  oerrors    |                 0 |                 0 |                 0
2326  ierrors    |                 0 |            15,539 |            15,539
2327----
2328
2329
2330==== Performance
2331See the performance tests we did link:trex_vm_bench.html[here]
2332
2333=== Mellanox ConnectX-4 support 
2334
2335anchor:connectx_support[]
2336
2337Mellanox ConnectX-4 adapter family supports 100/56/40/25/10 Gb/s Ethernet speeds. 
2338Its DPDK support is a bit different from Intel DPDK support, more information can be found link:http://dpdk.org/doc/guides/nics/mlx5.html[here].
2339Intel NICs do not require additional kernel drivers (except for igb_uio which is already supported in most distributions). ConnectX-4 works on top of Infiniband API (verbs) and requires special kernel modules/user space libs. 
2340This means that it is required to install OFED package to be able to work with this NIC.
2341Installing the full OFED package is the simplest way to make it work (trying to install part of the package can work too but didn't work for us).
2342The advantage of this model is that you can control it using standard Linux tools (ethtool and ifconfig will work).
2343The disadvantage is the OFED dependency.
2344
2345==== Installation  
2346
2347==== Install Linux 
2348
2349We tested the following distro with TRex and OFED. Others might work too.
2350
2351* CentOS 7.2 
2352
2353Following distro was tested and did *not* work for us.
2354
2355* Fedora 21 (3.17.4-301.fc21.x86_64)  
2356* Ubuntu 14.04.3 LTS (GNU/Linux 3.19.0-25-generic x86_64)  -- crash when RSS was enabled link:https://trex-tgn.cisco.com/youtrack/issue/trex-294[MLX RSS issue]
2357
2358==== Install OFED 
2359
2360Information was taken from  link:http://www.mellanox.com/page/products_dyn?product_family=26&mtag=linux_sw_drivers[Install OFED]
2361
2362* Download 3.4-2 OFED tar for your distro  
2363
2364[IMPORTANT]
2365=====================================
2366The version must be *MLNX_OFED_LINUX-3.4-2* 
2367=====================================
2368
2369[IMPORTANT]
2370=====================================
2371Make sure you have an internet connection without firewalls for HTTPS/HTTP - required by yum/apt-get
2372=====================================
2373
2374.Verify md5
2375[source,bash]
2376----
2377$md5sum md5sum MLNX_OFED_LINUX-3.4-2.0.0.0-rhel7.2-x86_64.tgz
237858b9fb369d7c62cedbc855661a89a9fd  MLNX_OFED_LINUX-3.4-2.0.0.0-rhel7.2-x86_64.tgz
2379----
2380
2381.Open the tar
2382[source,bash]
2383----
2384$tar -xvzf MLNX_OFED_LINUX-3.4-2.0.0.0-rhel7.2-x86_64.tgz
2385$cd MLNX_OFED_LINUX-3.4-2.0.0.0-rhel7.2-x86_64
2386----
2387
2388.Run Install script
2389[source,bash]
2390----
2391$sudo ./mlnxofedinstall  
2392
2393
2394Log: /tmp/ofed.build.log
2395Logs dir: /tmp/MLNX_OFED_LINUX.10406.logs
2396
2397Below is the list of MLNX_OFED_LINUX packages that you have chosen
2398(some may have been added by the installer due to package dependencies):
2399
2400ofed-scripts
2401mlnx-ofed-kernel-utils
2402mlnx-ofed-kernel-dkms
2403iser-dkms
2404srp-dkms
2405mlnx-sdp-dkms
2406mlnx-rds-dkms
2407mlnx-nfsrdma-dkms
2408libibverbs1
2409ibverbs-utils
2410libibverbs-dev
2411libibverbs1-dbg
2412libmlx4-1
2413libmlx4-dev
2414libmlx4-1-dbg
2415libmlx5-1
2416libmlx5-dev
2417libmlx5-1-dbg
2418libibumad
2419libibumad-static
2420libibumad-devel
2421ibacm
2422ibacm-dev
2423librdmacm1
2424librdmacm-utils
2425librdmacm-dev
2426mstflint
2427ibdump
2428libibmad
2429libibmad-static
2430libibmad-devel
2431libopensm
2432opensm
2433opensm-doc
2434libopensm-devel
2435infiniband-diags
2436infiniband-diags-compat
2437mft
2438kernel-mft-dkms
2439libibcm1
2440libibcm-dev
2441perftest
2442ibutils2
2443libibdm1
2444ibutils
2445cc-mgr
2446ar-mgr
2447dump-pr
2448ibsim
2449ibsim-doc
2450knem-dkms
2451mxm
2452fca
2453sharp
2454hcoll
2455openmpi
2456mpitests
2457knem
2458rds-tools
2459libdapl2
2460dapl2-utils
2461libdapl-dev
2462srptools
2463mlnx-ethtool
2464libsdp1
2465libsdp-dev
2466sdpnetstat
2467
2468This program will install the MLNX_OFED_LINUX package on your machine.
2469Note that all other Mellanox, OEM, OFED, or Distribution IB packages will be removed.
2470Do you want to continue?[y/N]:y
2471
2472Checking SW Requirements...
2473
2474One or more required packages for installing MLNX_OFED_LINUX are missing.
2475Attempting to install the following missing packages:
2476autotools-dev tcl debhelper dkms tk8.4 libgfortran3 graphviz chrpath automake dpatch flex bison autoconf quilt m4 tcl8.4 libltdl-dev pkg-config pytho
2477bxml2 tk swig gfortran libnl1
2478
2479..
2480
2481Removing old packages...
2482Installing new packages
2483Installing ofed-scripts-3.4...
2484Installing mlnx-ofed-kernel-utils-3.4...
2485Installing mlnx-ofed-kernel-dkms-3.4...
2486
2487Removing old packages...
2488Installing new packages
2489Installing ofed-scripts-3.4...
2490Installing mlnx-ofed-kernel-utils-3.4...
2491Installing mlnx-ofed-kernel-dkms-3.4...
2492Installing iser-dkms-1.8.1...
2493Installing srp-dkms-1.6.1...
2494Installing mlnx-sdp-dkms-3.4...
2495Installing mlnx-rds-dkms-3.4...
2496Installing mlnx-nfsrdma-dkms-3.4...
2497Installing libibverbs1-1.2.1mlnx1...
2498Installing ibverbs-utils-1.2.1mlnx1...
2499Installing libibverbs-dev-1.2.1mlnx1...
2500Installing libibverbs1-dbg-1.2.1mlnx1...
2501Installing libmlx4-1-1.2.1mlnx1...
2502Installing libmlx4-dev-1.2.1mlnx1...
2503Installing libmlx4-1-dbg-1.2.1mlnx1...
2504Installing libmlx5-1-1.2.1mlnx1...
2505Installing libmlx5-dev-1.2.1mlnx1...
2506Installing libmlx5-1-dbg-1.2.1mlnx1...
2507Installing libibumad-1.3.10.2.MLNX20150406.966500d...
2508Installing libibumad-static-1.3.10.2.MLNX20150406.966500d...
2509Installing libibumad-devel-1.3.10.2.MLNX20150406.966500d...
2510Installing ibacm-1.2.1mlnx1...
2511Installing ibacm-dev-1.2.1mlnx1...
2512Installing librdmacm1-1.1.0mlnx...
2513Installing librdmacm-utils-1.1.0mlnx...
2514Installing librdmacm-dev-1.1.0mlnx...
2515Installing mstflint-4.5.0...
2516Installing ibdump-4.0.0...
2517Installing libibmad-1.3.12.MLNX20160814.4f078cc...
2518Installing libibmad-static-1.3.12.MLNX20160814.4f078cc...
2519Installing libibmad-devel-1.3.12.MLNX20160814.4f078cc...
2520Installing libopensm-4.8.0.MLNX20160906.32a95b6...
2521Installing opensm-4.8.0.MLNX20160906.32a95b6...
2522Installing opensm-doc-4.8.0.MLNX20160906.32a95b6...
2523Installing libopensm-devel-4.8.0.MLNX20160906.32a95b6...
2524Installing infiniband-diags-1.6.6.MLNX20160814.999c7b2...
2525Installing infiniband-diags-compat-1.6.6.MLNX20160814.999c7b2...
2526Installing mft-4.5.0...
2527Installing kernel-mft-dkms-4.5.0...
2528Installing libibcm1-1.0.5mlnx2...
2529Installing libibcm-dev-1.0.5mlnx2...
2530Installing perftest-3.0...
2531Installing ibutils2-2.1.1...
2532Installing libibdm1-1.5.7.1...
2533Installing ibutils-1.5.7.1...
2534Installing cc-mgr-1.0...
2535Installing ar-mgr-1.0...
2536Installing dump-pr-1.0...
2537Installing ibsim-0.6...
2538Installing ibsim-doc-0.6...
2539Installing knem-dkms-1.1.2.90mlnx1...
2540Installing mxm-3.5.220c57f...
2541Installing fca-2.5.2431...
2542Installing sharp-1.1.1.MLNX20160915.8763a35...
2543Installing hcoll-3.6.1228...
2544Installing openmpi-1.10.5a1...
2545Installing mpitests-3.2.18...
2546Installing knem-1.1.2.90mlnx1...
2547Installing rds-tools-2.0.7...
2548Installing libdapl2-2.1.9mlnx...
2549Installing dapl2-utils-2.1.9mlnx...
2550Installing libdapl-dev-2.1.9mlnx...
2551Installing srptools-1.0.3...
2552Installing mlnx-ethtool-4.2...
2553Installing libsdp1-1.1.108...
2554Installing libsdp-dev-1.1.108...
2555Installing sdpnetstat-1.60...
2556Selecting previously unselected package mlnx-fw-updater.
2557(Reading database ... 70592 files and directories currently installed.)
2558Preparing to unpack .../mlnx-fw-updater_3.4-1.0.0.0_amd64.deb ...
2559Unpacking mlnx-fw-updater (3.4-1.0.0.0) ...
2560Setting up mlnx-fw-updater (3.4-1.0.0.0) ...
2561
2562Added RUN_FW_UPDATER_ONBOOT=no to /etc/infiniband/openib.conf
2563
2564Attempting to perform Firmware update...
2565Querying Mellanox devices firmware ...
2566
2567Device #1:
2568
2569  Device Type:      ConnectX4
2570  Part Number:      MCX416A-CCA_Ax
2571  Description:      ConnectX-4 EN network interface card; 100GbE dual-port QSFP28; PCIe3.0 x16; ROHS R6
2572  PSID:             MT_2150110033
2573  PCI Device Name:  03:00.0
2574  Base GUID:        248a07030014fc60
2575  Base MAC:         0000248a0714fc60
2576  Versions:         Current        Available     
2577     FW             12.16.1006     12.17.1010    
2578     PXE            3.4.0812       3.4.0903      
2579
2580  Status:           Update required
2581
2582
2583Found 1 device(s) requiring firmware update...
2584
2585Device #1: Updating FW ... Done
2586
2587Restart needed for updates to take effect.
2588Log File: /tmp/MLNX_OFED_LINUX.16084.logs/fw_update.log
2589Please reboot your system for the changes to take effect.
2590Device (03:00.0):
2591        03:00.0 Ethernet controller: Mellanox Technologies MT27620 Family
2592        Link Width: x16
2593        PCI Link Speed: 8GT/s
2594
2595Device (03:00.1):
2596        03:00.1 Ethernet controller: Mellanox Technologies MT27620 Family
2597        Link Width: x16
2598        PCI Link Speed: 8GT/s
2599
2600Installation passed successfully
2601To load the new driver, run:
2602/etc/init.d/openibd restart
2603-----
2604
2605
2606.Reboot 
2607[source,bash]
2608----
2609$sudo  reboot 
2610----
2611
2612
2613.After reboot 
2614[source,bash]
2615----
2616$ibv_devinfo 
2617hca_id: mlx5_1
2618        transport:                      InfiniBand (0)
2619        fw_ver:                         12.17.1010             << 12.17.00
2620        node_guid:                      248a:0703:0014:fc61
2621        sys_image_guid:                 248a:0703:0014:fc60
2622        vendor_id:                      0x02c9
2623        vendor_part_id:                 4115
2624        hw_ver:                         0x0
2625        board_id:                       MT_2150110033
2626        phys_port_cnt:                  1
2627        Device ports:
2628                port:   1
2629                        state:                  PORT_DOWN (1)
2630                        max_mtu:                4096 (5)
2631                        active_mtu:             1024 (3)
2632                        sm_lid:                 0
2633                        port_lid:               0
2634                        port_lmc:               0x00
2635                        link_layer:             Ethernet
2636
2637hca_id: mlx5_0
2638        transport:                      InfiniBand (0)
2639        fw_ver:                         12.17.1010
2640        node_guid:                      248a:0703:0014:fc60
2641        sys_image_guid:                 248a:0703:0014:fc60
2642        vendor_id:                      0x02c9
2643        vendor_part_id:                 4115
2644        hw_ver:                         0x0
2645        board_id:                       MT_2150110033
2646        phys_port_cnt:                  1
2647        Device ports:
2648                port:   1
2649                        state:                  PORT_DOWN (1)
2650                        max_mtu:                4096 (5)
2651                        active_mtu:             1024 (3)
2652                        sm_lid:                 0
2653                        port_lid:               0
2654                        port_lmc:               0x00
2655                        link_layer:             Ethernet
2656
2657----
2658
2659.ibdev2netdev
2660[source,bash]
2661-----
2662$ibdev2netdev
2663mlx5_0 port 1 ==> eth6 (Down)
2664mlx5_1 port 1 ==> eth7 (Down)
2665-----
2666
2667.find the ports 
2668[source,bash]
2669-----
2670
2671        $sudo ./dpdk_setup_ports.py -t
2672  +----+------+---------++---------------------------------------------
2673  | ID | NUMA |   PCI   ||                      Name     |  Driver   | 
2674  +====+======+=========++===============================+===========+=
2675  | 0  | 0    | 06:00.0 || VIC Ethernet NIC              | enic      | 
2676  +----+------+---------++-------------------------------+-----------+-
2677  | 1  | 0    | 07:00.0 || VIC Ethernet NIC              | enic      | 
2678  +----+------+---------++-------------------------------+-----------+-
2679  | 2  | 0    | 0a:00.0 || 82599ES 10-Gigabit SFI/SFP+ Ne| ixgbe     | 
2680  +----+------+---------++-------------------------------+-----------+-
2681  | 3  | 0    | 0a:00.1 || 82599ES 10-Gigabit SFI/SFP+ Ne| ixgbe     | 
2682  +----+------+---------++-------------------------------+-----------+-
2683  | 4  | 0    | 0d:00.0 || Device 15d0                   |           | 
2684  +----+------+---------++-------------------------------+-----------+-
2685  | 5  | 0    | 10:00.0 || I350 Gigabit Network Connectio| igb       | 
2686  +----+------+---------++-------------------------------+-----------+-
2687  | 6  | 0    | 10:00.1 || I350 Gigabit Network Connectio| igb       | 
2688  +----+------+---------++-------------------------------+-----------+-
2689  | 7  | 1    | 85:00.0 || 82599ES 10-Gigabit SFI/SFP+ Ne| ixgbe     | 
2690  +----+------+---------++-------------------------------+-----------+-
2691  | 8  | 1    | 85:00.1 || 82599ES 10-Gigabit SFI/SFP+ Ne| ixgbe     | 
2692  +----+------+---------++-------------------------------+-----------+-
2693  | 9  | 1    | 87:00.0 || MT27700 Family [ConnectX-4]   | mlx5_core |  #<1>
2694  +----+------+---------++-------------------------------+-----------+-
2695  | 10 | 1    | 87:00.1 || MT27700 Family [ConnectX-4]   | mlx5_core |  #<2>
2696  +----+------+---------++---------------------------------------------
2697-----
2698<1>  ConnectX-4 port 0
2699<2>  ConnectX-4 port 1
2700
2701
2702.Config file example
2703[source,bash]
2704-----
2705### Config file generated by dpdk_setup_ports.py ###
2706
2707 - port_limit: 2
2708   version: 2
2709   interfaces: ['87:00.0', '87:00.1']
2710   port_info:
2711      - ip: 1.1.1.1
2712        default_gw: 2.2.2.2
2713      - ip: 2.2.2.2
2714        default_gw: 1.1.1.1
2715
2716   platform:
2717      master_thread_id: 0
2718      latency_thread_id: 1
2719      dual_if:
2720        - socket: 1
2721          threads: [8,9,10,11,12,13,14,15,24,25,26,27,28,29,30,31]
2722-----
2723
2724
2725
2726
2727==== TRex specific implementation details 
2728
2729TRex uses flow director filter to steer specific packets to specific queues. 
2730To support that, we change IPv4.TOS/Ipv6.TC  LSB to *1* for packets we want to handle by software (Other packets will be dropped). So latency packets will have this bit turned on (This is true for all NIC types, not only for ConnectX-4).
2731This means taht if the DUT for some reason clears this bit (change TOS LSB to 0, e.g. change it from 0x3 to 0x2 for example) some TRex features (latency measurement for example) will not work properly.
2732
2733==== Which NIC to buy?
2734
2735NIC with two ports will work better from performance prospective, so it is better to have MCX456A-ECAT(dual 100gb ports) and *not* the  MCX455A-ECAT (single 100gb port).
2736
2737==== Limitation/Issues  
2738
2739* Stateless mode ``per stream statistics'' feature is handled in software (No hardware support like in X710 card).
2740* link:https://trex-tgn.cisco.com/youtrack/issue/trex-261[Latency issue]
2741* link:https://trex-tgn.cisco.com/youtrack/issue/trex-262[Statful RX out of order]
2742* link:https://trex-tgn.cisco.com/youtrack/issue/trex-273[Fedora 21 & OFED 3.4.1]
2743
2744
2745==== Performance Cycles/Packet ConnectX-4 vs Intel XL710
2746
2747For TRex version v2.11, these are the comparison results between XL710 and ConnectX-4 for various scenarios.
2748
2749.Stateless MPPS/Core [Preliminary]
2750image:images/xl710_vs_mlx5_64b.png[title="Stateless 64B"] 
2751
2752.Stateless Gb/Core [Preliminary]
2753image:images/xl710_vs_mlx5_var_size.png[title="Stateless variable size packet"] 
2754
2755*Comments*::
2756 
27571. MLX5 can reach ~50MPPS while XL710 is limited to 35MPPS. (With potential future fix it will be ~65MPPS)
27582. For Stateless/Stateful 256B profiles, ConnectX-4 uses half of the CPU cycles per packet. ConnectX-4 probably can handle in a better way chained mbufs (scatter gather).
27593. In the average stateful scenario, ConnectX-4 is the same as XL710.
27604. For Stateless 64B/IMIX profiles, ConnectX-4 uses 50-90% more CPU cycles per packet (it is actually even more because there is the TRex scheduler overhead) - it means that in worst case scenario, you will need x2 CPU for the same total MPPS.
2761
2762
2763[NOTE]
2764=====================================
2765There is a task to automate the production of thess reports
2766=====================================
2767
2768==== Troubleshooting
2769
2770* Before running TRex make sure the commands    `ibv_devinfo` and  `ibdev2netdev` present the NICS
2771* `ifconfig` should work too, you need to be able to ping from those ports
2772* run TRex server with '-v 7' for example `$sudo ./t-rex-64 -i -v 7`
2773
2774
2775=== Cisco VIC support 
2776
2777anchor:ciscovic_support[]
2778
2779* Supported from TRex version v2.12
2780* Only 1300 series Cisco adapter supported
2781* Must have VIC firmware version 2.0(13) for UCS C-series servers. Will be GA in Febuary 2017.
2782* Must have VIC firmware version 3.1(2) for blade servers (which supports more filtering capabilities).
2783* The feature can be enabled via Cisco CIMC or USCM with the 'advanced filters' radio button.  When enabled, these additional flow director modes are available:
2784        RTE_ETH_FLOW_NONFRAG_IPV4_OTHER
2785        RTE_ETH_FLOW_NONFRAG_IPV4_SCTP
2786        RTE_ETH_FLOW_NONFRAG_IPV6_UDP
2787        RTE_ETH_FLOW_NONFRAG_IPV6_TCP
2788        RTE_ETH_FLOW_NONFRAG_IPV6_SCTP
2789        RTE_ETH_FLOW_NONFRAG_IPV6_OTHER
2790
2791==== vNIC Configuration Parameters
2792
2793*Number of Queues*::
2794  The maximum number of receive queues (RQs), work queues (WQs) and completion queues (CQs) are configurable on a per vNIC basis through the Cisco UCS Manager (CIMC or UCSM).
2795  These values should be configured as follows:
2796  * The number of WQs should be greater or equal to the number of threads (-c value) plus 1
2797  * The number of RQs should be greater than 5
2798  * The number of CQs should set to WQs + RQs 
2799  * Unless there is a lack of resources due to creating many vNICs, it is recommended that the WQ and RQ sizes be set to the *maximum*. 
2800
2801*Advanced filters*::
2802  advanced filter should be enabled 
2803
2804*MTU*::
2805  set the MTU to maximum 9000-9190 (Depends on the FW version)
2806  
2807more information could be found here link:http://www.dpdk.org/doc/guides/nics/enic.html?highlight=enic[enic DPDK]
2808 
2809image:images/UCS-B-adapter_policy_1.jpg[title="vic configuration",align="center",width=800]
2810image:images/UCS-B-adapter_policy_2.jpg[title="vic configuration",align="center",width=800]
2811
2812In case it is not configured correctly, this error will be seen
2813
2814.VIC error in case of wrong RQ/WQ 
2815[source,bash]
2816----
2817Starting  TRex v2.15 please wait  ...
2818no client generator pool configured, using default pool
2819no server generator pool configured, using default pool
2820zmq publisher at: tcp://*:4500
2821Number of ports found: 2
2822set driver name rte_enic_pmd
2823EAL: Error - exiting with code: 1
2824  Cause: Cannot configure device: err=-22, port=0     #<1>
2825----
2826<1>There is not enough queues 
2827
2828
2829running it with verbose mode with CLI  `-v 7`  
2830
2831[source,bash]
2832----
2833$sudo ./t-rex-64 -f cap2/dns.yaml -c 1 -m 1 -d 10  -l 1000 -v 7
2834----
2835
2836will give move info 
2837
2838[source,bash]
2839----
2840EAL:   probe driver: 1137:43 rte_enic_pmd
2841PMD: rte_enic_pmd: Advanced Filters available
2842PMD: rte_enic_pmd: vNIC MAC addr 00:25:b5:99:00:4c wq/rq 256/512 mtu 1500, max mtu:9190
2843PMD: rte_enic_pmd: vNIC csum tx/rx yes/yes rss yes intr mode any type min 
2844PMD: rte_enic_pmd: vNIC resources avail: wq 2 rq 2 cq 4 intr 6                  #<1>
2845EAL: PCI device 0000:0f:00.0 on NUMA socket 0
2846EAL:   probe driver: 1137:43 rte_enic_pmd
2847PMD: rte_enic_pmd: Advanced Filters available
2848PMD: rte_enic_pmd: vNIC MAC addr 00:25:b5:99:00:5c wq/rq 256/512 mtu 1500, max 
2849----
2850<1> rq is 2 which mean 1 input queue which is less than minimum required by trex (rq should be at least 5)
2851
2852
2853==== Limitations/Issues
2854
2855* Stateless mode ``per stream statistics'' feature is handled in software (No hardware support like in X710 card).
2856* link:https://trex-tgn.cisco.com/youtrack/issue/trex-272[QSFP+ issue]
2857
2858
2859=== More active flows   
2860
2861From version v2.13 there is a new Stateful scheduler that works better in the case of high concurrent/active flows.
2862In case of EMIX 70% better performance was observed.
2863In this tutorial there are 14 DP cores & up to 8M flows.  
2864There is a  special config file to enlarge the number of flows. This tutorial  present the difference in performance between the old scheduler and the new. 
2865
2866==== Setup details
2867
2868[cols="1,5"]
2869|=================
2870| Server: | UCSC-C240-M4SX
2871| CPU:    | 2 x Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz
2872| RAM:    | 65536 @ 2133 MHz
2873| NICs:   | 2 x Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ (rev 01)
2874| QSFP:   | Cisco QSFP-H40G-AOC1M
2875| OS:     | Fedora 18
2876| Switch: | Cisco Nexus 3172 Chassis, System version: 6.0(2)U5(2).
2877| TRex:   | v2.13/v2.12 using 7 cores per dual interface.
2878|=================
2879
2880==== Traffic profile 
2881
2882.cap2/cur_flow_single.yaml
2883[source,python]
2884----
2885- duration : 0.1
2886  generator :  
2887          distribution : "seq"
2888          clients_start : "16.0.0.1"
2889          clients_end   : "16.0.0.255"
2890          servers_start : "48.0.0.1"
2891          servers_end   : "48.0.255.255"
2892          clients_per_gb : 201
2893          min_clients    : 101
2894          dual_port_mask : "1.0.0.0" 
2895  cap_info : 
2896     - name: cap2/udp_10_pkts.pcap  <1>
2897       cps : 100
2898       ipg : 200
2899       rtt : 200
2900       w   : 1
2901----
2902<1> One directional UDP flow with 10 packets of 64B
2903
2904
2905==== Config file command 
2906
2907./cfg/trex_08_5mflows.yaml
2908[source,python]
2909----
2910- port_limit: 4
2911  version: 2
2912  interfaces: ['05:00.0', '05:00.1', '84:00.0', '84:00.1']
2913  port_info:
2914      - ip: 1.1.1.1
2915        default_gw: 2.2.2.2
2916      - ip: 3.3.3.3
2917        default_gw: 4.4.4.4
2918
2919      - ip: 4.4.4.4
2920        default_gw: 3.3.3.3
2921      - ip: 2.2.2.2
2922        default_gw: 1.1.1.1
2923
2924  platform:
2925      master_thread_id: 0
2926      latency_thread_id: 15
2927      dual_if:
2928        - socket: 0
2929          threads: [1,2,3,4,5,6,7]
2930
2931        - socket: 1
2932          threads: [8,9,10,11,12,13,14]
2933  memory    :                                          
2934        dp_flows    : 1048576                    <1>
2935----
2936<1> add memory section with more flows 
2937
2938==== Traffic command 
2939
2940.command 
2941[source,bash]
2942----
2943$sudo ./t-rex-64 -f cap2/cur_flow_single.yaml -m 30000 -c 7 -d 40 -l 1000 --active-flows 5000000 -p --cfg cfg/trex_08_5mflows.yaml
2944----
2945
2946The number of active flows can be change using `--active-flows` CLI. in this example it is set to 5M flows
2947
2948
2949==== Script to get performance per active number of flows 
2950
2951[source,python]
2952----
2953
2954def minimal_stateful_test(server,csv_file,a_active_flows):
2955
2956    trex_client = CTRexClient(server)                                   <1>
2957
2958    trex_client.start_trex(                                             <2>
2959            c = 7,
2960            m = 30000,
2961            f = 'cap2/cur_flow_single.yaml',
2962            d = 30,
2963            l = 1000,
2964            p=True,
2965            cfg = "cfg/trex_08_5mflows.yaml",
2966            active_flows=a_active_flows,
2967            nc=True
2968            )
2969
2970    result = trex_client.sample_to_run_finish()                         <3>
2971
2972    active_flows=result.get_value_list('trex-global.data.m_active_flows')
2973    cpu_utl=result.get_value_list('trex-global.data.m_cpu_util')
2974    pps=result.get_value_list('trex-global.data.m_tx_pps')
2975    queue_full=result.get_value_list('trex-global.data.m_total_queue_full')
2976    if queue_full[-1]>10000:
2977        print("WARNING QUEU WAS FULL");
2978    tuple=(active_flows[-5],cpu_utl[-5],pps[-5],queue_full[-1])         <4>
2979    file_writer = csv.writer(test_file)
2980    file_writer.writerow(tuple);
2981    
2982
2983
2984if __name__ == '__main__':
2985    test_file = open('tw_2_layers.csv', 'wb');
2986    parser = argparse.ArgumentParser(description="Example for TRex Stateful, assuming server daemon is running.")
2987
2988    parser.add_argument('-s', '--server',
2989                        dest='server',
2990                        help='Remote trex address',
2991                        default='127.0.0.1',
2992                        type = str)
2993    args = parser.parse_args()
2994
2995    max_flows=8000000;
2996    min_flows=100;
2997    active_flow=min_flows;
2998    num_point=10
2999    factor=math.exp(math.log(max_flows/min_flows,math.e)/num_point);
3000    for i in range(num_point+1):
3001        print("=====================",i,math.floor(active_flow))
3002        minimal_stateful_test(args.server,test_file,math.floor(active_flow))      
3003        active_flow=active_flow*factor
3004
3005    test_file.close();
3006----
3007<1> connect 
3008<2> Start with different active_flows
3009<3> wait for the results 
3010<4> get the results and save to csv file
3011
3012This script iterate between 100 to 8M active flows and save the results to csv file.
3013
3014==== The results v2.12 vs v2.14
3015
3016.MPPS/core 
3017image:images/tw1_0.png[title="results",align="center"]
3018
3019.MPPS/core 
3020image:images/tw0_0_chart.png[title="results",align="center",width=800]
3021
3022* TW0 - v2.14 default configuration 
3023* PQ  - v2.12 default configuration 
3024
3025* To run the same script on v2.12 (that does not support `active_flows` directive) a patch was introduced.
3026
3027*Observation*::
3028  * TW works better (up to 250%) in case of 25-100K flows 
3029  * TW scale better with active-flows
3030
3031==== Tunning 
3032
3033let's add another modes called *TW1*, in this mode the scheduler is tune to have more buckets (more memory)
3034
3035.TW1 cap2/cur_flow_single_tw_8.yaml
3036[source,python]
3037----
3038- duration : 0.1
3039  generator :  
3040          distribution : "seq"
3041          clients_start : "16.0.0.1"
3042          clients_end   : "16.0.0.255"
3043          servers_start : "48.0.0.1"
3044          servers_end   : "48.0.255.255"
3045          clients_per_gb : 201
3046          min_clients    : 101
3047          dual_port_mask : "1.0.0.0" 
3048  tw :                            
3049     buckets : 16384                    <1>
3050     levels  : 2                        <2>
3051     bucket_time_usec : 20.0
3052  cap_info : 
3053     - name: cap2/udp_10_pkts.pcap
3054       cps : 100
3055       ipg : 200
3056       rtt : 200
3057       w   : 1
3058----
3059<1> more buckets  
3060<2> less levels 
3061
3062
3063in *TW2* mode we have the same template, duplicated one with short IPG and another one with high IPG 
306410% of the new flows will be with long IPG
3065
3066.TW2 cap2/cur_flow.yaml
3067[source,python]
3068----
3069- duration : 0.1
3070  generator :  
3071          distribution : "seq"
3072          clients_start : "16.0.0.1"
3073          clients_end   : "16.0.0.255"
3074          servers_start : "48.0.0.1"
3075          servers_end   : "48.0.255.255"
3076          clients_per_gb : 201
3077          min_clients    : 101
3078          dual_port_mask : "1.0.0.0" 
3079          tcp_aging      : 0
3080          udp_aging      : 0
3081  mac        : [0x0,0x0,0x0,0x1,0x0,0x00]
3082  #cap_ipg    : true
3083  cap_info : 
3084     - name: cap2/udp_10_pkts.pcap
3085       cps : 10
3086       ipg : 100000
3087       rtt : 100000
3088       w   : 1
3089     - name: cap2/udp_10_pkts.pcap   
3090       cps : 90
3091       ipg : 2
3092       rtt : 2
3093       w   : 1
3094----
3095
3096==== Full results 
3097
3098
3099* PQ - v2.12 default configuration 
3100* TW0 - v2.14 default configuration 
3101* TW1 - v2.14 more buckets 16K
3102* TW2 - v2.14 two templates 
3103
3104.MPPS/core Comparison
3105image:images/tw1.png[title="results",align="center",width=800]
3106
3107.MPPS/core
3108image:images/tw1_tbl.png[title="results",align="center"]
3109
3110.Factor relative to v2.12 results 
3111image:images/tw2.png[title="results",align="center",width=800]
3112
3113.Extrapolation Total GbE per UCS with average packet size of 600B 
3114image:images/tw3.png[title="results",align="center",width=800]
3115
3116Observation:
3117
3118* TW2 (two flows) almost does not have a performance impact
3119* TW1 (more buckets) improve the performance up to a point
3120* TW is general is better than PQ
3121
3122
3123
3124