// trex_book.asciidoc revision 6fda38dc
TRex 
====
:author: hhaim 
:email: <hhaim@cisco.com> 
:revnumber: 2.1
:quotes.++:
:numbered:
:web_server_url: http://trex-tgn.cisco.com/trex
:local_web_server_url: csi-wiki-01:8181/trex
:toclevels: 4

include::trex_ga.asciidoc[]


== Introduction

=== A word on traffic generators

Traditionally, routers have been tested using commercial traffic generators, while performance
has typically been measured in packets per second (PPS). As router functionality and
services have become more complex, stateful traffic generators are needed to provide more realistic traffic scenarios.

Advantages of realistic traffic generators:

* Accurate performance metrics.
* Discovery of bottlenecks in realistic traffic scenarios.

==== Current challenges

* *Cost*: Commercial stateful traffic generators are very expensive.
* *Scale*: Bandwidth does not scale up well with feature complexity.
* *Standardization*: Lack of standardization of traffic patterns and methodologies.
* *Flexibility*: Commercial tools do not allow agility when flexibility and changes are needed.

==== Implications 

* High capital expenditure (capex) spent by different teams.
* Testing at low scale and extrapolating became common practice. This is non-ideal and fails to indicate bottlenecks that appear in real-world scenarios.
* Teams use different benchmark methodologies, so results are not standardized.
* Delays in development and testing due to dependence on testing tool features.
* Resource and effort investment in developing different ad hoc tools and test methodologies.

=== Overview of TRex

TRex addresses these problems through an innovative and extendable software implementation and by leveraging standard and open software and x86/UCS hardware.

* Generates and analyzes L4-7 traffic. In one package, provides capabilities of commercial L7 tools.
* Stateful traffic generator based on pre-processing and smart replay of real traffic templates.
* Generates and *amplifies* both client and server side traffic.
* Customized functionality can be added.
* Scales to 200Gb/sec for one UCS (using Intel 40Gb/sec NICs).
* Low cost.
* Self-contained package that can be easily installed and deployed.
* Virtual interface support enables TRex to be used in a fully virtual environment without physical NICs. Example use cases:
** Amazon AWS
** Cisco LaaS
// Which LaaS is this? Location as a service? Linux?
** TRex on your laptop



.TRex Hardware 
[options="header",cols="1^,1^"]
|=================
|Cisco UCS Platform | Intel NIC 
| image:images/ucs200_2.png[title="generator"] | image:images/Intel520.png[title="generator"]  
|=================

=== Purpose of this guide

This guide explains TRex internals and the use of TRex together with Cisco ASR1000 Series routers. The examples illustrate novel traffic generation techniques made possible by TRex. 

== Download and installation

=== Hardware recommendations

TRex operates in a Linux application environment, interacting with Linux kernel modules. 
TRex currently works on x86 architecture and runs well on Cisco UCS hardware. The following platforms have been tested and are recommended for operating TRex. 

[NOTE]
=====================================
 A high-end UCS platform is not required for operating TRex in its current version, but may be required for future versions.
=====================================

[NOTE]
=====================================
 Not all interfaces supported by DPDK are supported by TRex.
=====================================
 

.Preferred UCS hardware
[options="header",cols="1,3"]
|=================
| UCS Type | Comments 
| UCS C220 Mx  | *Preferred Low-End*. Supports up to 40Gb/sec with 540-D2. With newer Intel NIC (recommended), supports 80Gb/sec with 1RU. See table below describing components.
| UCS C200| Early UCS model.
| UCS C210 Mx | Supports up to 40Gb/sec PCIe3.0.
| UCS C240 Mx | *Preferred, High-End*. Supports up to 200Gb/sec. 6x XL710 NICs (PCIe x8) or 2x FM10K (PCIe x16). See table below describing components.
| UCS C260M2 | Supports up to 30Gb/sec (limited by V2 PCIe).
|=================

.Low-End UCS C220 Mx - Internal components
[options="header",cols="1,2",width="60%"]
|=================
| Components |  Details 
| CPU  | 2x E5-2620 @ 2.0 GHz. 
| CPU Configuration | 2-Socket CPU configurations (also works with 1 CPU).
| Memory | 2x4 banks for each CPU. Total of 32GB in 8 banks.
| RAID | No RAID.
|=================

.High-End C240 Mx - Internal components 
[options="header",cols="1,2",width="60%"]
|=================
| Components |  Details 
| CPU  | 2x E5-2667 @ 3.20 GHz.
| PCIe | 1x Riser PCI expansion card option A PID UCSC-PCI-1A-240M4 enables 2 PCIe x16 slots.
| CPU Configuration | 2-Socket CPU configurations (also works with 1 CPU).
| Memory | 2x4 banks for each CPU. Total of 32GB in 8 banks. 
| RAID | No RAID.
| Riser 1/2 | Both risers should support x16 PCIe. Riser 1 (right) should be option A x16 and Riser 2 (left) should be x16. Order both.
|=================
 
.Supported NICs
[options="header",cols="1,1,4",width="90%"]
|=================
| Chipset              | Bandwidth  (Gb/sec)  |  Example
| Intel I350           | 1   | Intel 4x1GE 350-T4 NIC
| Intel 82599          | 10  |   Cisco part ID:N2XX-AIPCI01 Intel x520-D2, Intel X520 Dual Port 10Gb SFP+ Adapter
| Intel 82599 VF       | x  | 
| Intel X710           | 10  | Cisco part ID:UCSC-PCIE-IQ10GF link:https://en.wikipedia.org/wiki/Small_form-factor_pluggable_transceiver[SFP+], *Preferred*, supports per-stream stats in hardware link:http://www.silicom-usa.com/PE310G4i71L_Quad_Port_Fiber_SFP+_10_Gigabit_Ethernet_PCI_Express_Server_Adapter_49[Silicom PE310G4i71L]
| Intel XL710          | 40  | Cisco part ID:UCSC-PCIE-ID40GF, link:https://en.wikipedia.org/wiki/QSFP[QSFP+] (copper/optical)
| Intel XL710/X710 VF  | x  | 
| Intel FM10420        | 25/100 | QSFP28,  by Silicom link:http://www.silicom-usa.com/100_Gigabit_Dual_Port_Fiber_Ethernet_PCI_Express_PE3100G2DQiR_96[Silicom PE3100G2DQiR_96] (*in development*) 
| Mellanox ConnectX-4  | 25/40/50/56/100 | QSFP28, link:http://www.mellanox.com/page/products_dyn?product_family=201&[ConnectX-4] link:http://www.mellanox.com/related-docs/prod_adapter_cards/PB_ConnectX-4_VPI_Card.pdf[ConnectX-4-brief] (copper/optical), supported from v2.11; more details: xref:connectx_support[TRex Support]
| Mellanox ConnectX-5  | 25/40/50/56/100 | Supported 
| Cisco 1300 series    | 40              | QSFP+, VIC 1380,  VIC 1385, VIC 1387; see more: xref:ciscovic_support[TRex Support]
| VMXNET / +
VMXNET3 (see notes) | VMware paravirtualized  | Connect using VMware vSwitch
| E1000    | paravirtualized  | VMware/KVM/VirtualBox 
| Virtio | paravirtualized  | KVM
|=================

// in table above, is it correct to list "paravirtualized" as chipset? Also, what is QSFP28? It does not appear on the lined URL. Clarify: is Intel X710 the preferred NIC?

.SFP+ support 
[options="header",cols="2,1,1,1",width="90%"]
|=================
| link:https://en.wikipedia.org/wiki/Small_form-factor_pluggable_transceiver[SFP+]  | Intel Ethernet Converged X710-DAX |  Silicom link:http://www.silicom-usa.com/PE310G4i71L_Quad_Port_Fiber_SFP+_10_Gigabit_Ethernet_PCI_Express_Server_Adapter_49[PE310G4i71L] (Open optic) | 82599EB 10-Gigabit
| link:http://www.cisco.com/c/en/us/products/collateral/interfaces-modules/transceiver-modules/data_sheet_c78-455693.html[Cisco SFP-10G-SR] | Does not work     | [green]*works* | [green]*works*
| link:http://www.cisco.com/c/en/us/products/collateral/interfaces-modules/transceiver-modules/data_sheet_c78-455693.html[Cisco SFP-10G-LR] | Does not work     | [green]*works* | [green]*works*
| link:http://www.cisco.com/c/en/us/products/collateral/interfaces-modules/transceiver-modules/data_sheet_c78-455693.html[Cisco SFP-H10GB-CU1M]| [green]*works* | [green]*works* | [green]*works*
| link:http://www.cisco.com/c/en/us/products/collateral/interfaces-modules/transceiver-modules/data_sheet_c78-455693.html[Cisco SFP-10G-AOC1M] | [green]*works* | [green]*works* | [green]*works*
|=================

[NOTE]
=====================================
 Intel X710 NIC (example: FH X710DA4FHBLK) operates *only* with Intel SFP+. For open optic, use the link:http://www.silicom-usa.com/PE310G4i71L_Quad_Port_Fiber_SFP+_10_Gigabit_Ethernet_PCI_Express_Server_Adapter_49[Silicom PE310G4i71L] NIC.
=====================================

// clarify above table and note

.XL710 NIC base QSFP+ support 
[options="header",cols="1,1,1",width="90%"]
|=================
| link:https://en.wikipedia.org/wiki/QSFP[QSFP+]             |  Intel Ethernet Converged XL710-QDAX | Silicom link:http://www.silicom-usa.com/Dual_Port_Fiber_40_Gigabit_Ethernet_PCI_Express_Server_Adapter_PE340G2Qi71_83[PE340G2Qi71] Open optic 
| QSFP+ SR4 optics  |  APPROVED OPTICS [green]*works*,  Cisco QSFP-40G-SR4-S does *not* work   | Cisco QSFP-40G-SR4-S [green]*works* 
| QSFP+ LR-4 Optics |   APPROVED OPTICS [green]*works*, Cisco QSFP-40G-LR4-S does *not* work   | Cisco QSFP-40G-LR4-S [green]*works* 
| QSFP Active Optical Cables (AoC) | Cisco QSFP-H40G-AOC [green]*works*  | Cisco QSFP-H40G-AOC [green]*works* 
| QSFP+ Intel Ethernet Modular Optics |    N/A            |  N/A
| QSFP+ DA twin-ax cables | N/A  | N/A  
| Active QSFP+ Copper Cables | Cisco QSFP-4SFP10G-CU [green]*works*  | Cisco QSFP-4SFP10G-CU [green]*works*
|=================

[NOTE]
=====================================
 For Intel XL710 NICs, Cisco SR4/LR QSFP+ does not work. Use Silicom with Open Optic.
=====================================


.ConnectX-4 NIC base QSFP28 support (100gb)
[options="header",cols="1,2",width="90%"]
|=================
| link:https://en.wikipedia.org/wiki/QSFP[QSFP28]             |  ConnectX-4
| QSFP28 SR4 optics  |  N/A
| QSFP28 LR-4 Optics |  N/A 
| QSFP28 (AoC)       | Cisco QSFP-100G-AOCxM  [green]*works*  
| QSFP28 DA twin-ax cables | Cisco QSFP-100G-CUxM [green]*works*  
|=================

.Cisco VIC NIC base QSFP+ support 
[options="header",cols="1,2",width="90%"]
|=================
| link:https://en.wikipedia.org/wiki/QSFP[QSFP+]             |  Cisco VIC 1380/1385/1387
| QSFP+ SR4 optics  |  N/A
| QSFP+ LR-4 Optics |   N/A
| QSFP Active Optical Cables (AoC) | Cisco QSFP-H40G-AOC [green]*works*  
| QSFP+ Intel Ethernet Modular Optics |    N/A            
| QSFP+ DA twin-ax cables | N/A
| Active QSFP+ Copper Cables | N/A
|=================


// clarify above table and note. let's discuss.
.FM10K QSFP28 support 
[options="header",cols="1,1",width="70%"]
|=================
| QSFP28             | Example 
| todo  |  todo
|=================

// do we want to show "todo"? maybe "pending"


[IMPORTANT]
=====================================
* Intel SFP+ 10Gb/sec modules are the only ones supported by default with the standard Linux driver. TRex also supports Cisco 10Gb/sec SFP+ modules. 
* For operating high speed throughput (example: several Intel XL710 40Gb/sec), use different link:https://en.wikipedia.org/wiki/Non-uniform_memory_access[NUMA] nodes for different NICs. +
    To verify NUMA and NIC topology: `lstopo (yum install hwloc)` +
    To display CPU info, including NUMA node: `lscpu` +
    NUMA usage xref:numa-example[example]
* For Intel XL710 NICs, verify that the NVM is v5.04. xref:xl710-firmware[Info].
**  `> sudo ./t-rex-64 -f cap2/dns.yaml -d 0 *-v 6* --nc | grep NVM` +
    `PMD:  FW 5.0 API 1.5 NVM 05.00.04  eetrack 800013fc` 
=====================================
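The NUMA node of each NIC can also be read directly from sysfs, without installing extra tools. A minimal sketch; the PCI addresses are examples, substitute the ones reported by `lspci | grep Ethernet`:

[source,bash]
----
# Print the NUMA node of each candidate TRex port.
# Ports used together (e.g. a loopback pair) ideally report the same node.
for dev in 0000:03:00.0 0000:03:00.1; do
    node=$(cat /sys/bus/pci/devices/$dev/numa_node 2>/dev/null || echo "?")
    echo "$dev -> NUMA node $node"
done
----

A value of -1 (or "?") means the kernel has no NUMA information for that device (common on single-socket machines).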

// above, maybe rename the bullet points "NIC usage notes"? should we create a subsection for NICs? Maybe it would be under "2.1 Hardware recommendations" as a subsection.


.Sample order for recommended low-end Cisco UCS C220 M3S with 4x10Gb ports
[options="header",cols="1,1",width="70%"]
|=================
| Component | Quantity 
| UCSC-C220-M3S    |  1
| UCS-CPU-E5-2650  |  2
| UCS-MR-1X041RY-A |  8
| A03-D500GC3      |  1
| N2XX-AIPCI01     |  2
| UCSC-PSU-650W    |  1
| SFS-250V-10A-IS  |  1
| UCSC-CMA1        |  1
| UCSC-HS-C220M3   |  2
| N20-BBLKD        |  7
| UCSC-PSU-BLKP    |  1
| UCSC-RAIL1       |  1
|=================

NOTE: Purchase the 10Gb/sec SFP+ modules separately. Cisco SFP+ modules work with TRex (but not with the plain Linux driver).
// does note above mean "TRex operates with 10Gb/sec SFP+ components, but plain Linux does not provide drivers."? if so, how does purchasing separately solve this? where do they get drivers?

=== Installing OS 

==== Supported versions

Supported Linux versions:

* Fedora 20-23, 64-bit kernel (not 32-bit)
* Ubuntu 14.04.1 LTS, 64-bit kernel (not 32-bit)
* Ubuntu 16.xx LTS, 64-bit kernel (not 32-bit) -- not fully supported
* CentOS/RedHat 7.2 LTS, 64-bit kernel (not 32-bit)   -- The only working option for ConnectX-4

NOTE: Additional OS versions may be supported by compiling the necessary drivers.

To check whether a kernel is 64-bit, verify that the output of the following command is `x86_64`.

[source,bash]
----
$uname -m
x86_64 
----


==== Download Linux

ISO images for supported Linux releases can be downloaded from:

.Supported Linux ISO image links
[options="header",cols="1^,2^",width="50%"]
|======================================
| Distribution | SHA256 Checksum
| link:http://archives.fedoraproject.org/pub/archive/fedora/linux/releases/20/Fedora/x86_64/iso/Fedora-20-x86_64-DVD.iso[Fedora 20]
    | link:http://archives.fedoraproject.org/pub/archive/fedora/linux/releases/20/Fedora/x86_64/iso/Fedora-20-x86_64-CHECKSUM[Fedora 20 CHECKSUM] 
| link:http://fedora-mirror01.rbc.ru/pub/fedora/linux/releases/21/Server/x86_64/iso/Fedora-Server-DVD-x86_64-21.iso[Fedora 21]
    | link:http://fedora-mirror01.rbc.ru/pub/fedora/linux/releases/21/Server/x86_64/iso/Fedora-Server-21-x86_64-CHECKSUM[Fedora 21 CHECKSUM] 
| link:http://old-releases.ubuntu.com/releases/14.04.1/ubuntu-14.04-desktop-amd64.iso[Ubuntu 14.04.1]
    | http://old-releases.ubuntu.com/releases/14.04.1/SHA256SUMS[Ubuntu 14.04* CHECKSUMs]
| link:http://releases.ubuntu.com/16.04.1/ubuntu-16.04.1-server-amd64.iso[Ubuntu 16.04.1]
    | http://releases.ubuntu.com/16.04.1/SHA256SUMS[Ubuntu 16.04* CHECKSUMs]

|======================================

For Fedora downloads...

* Select a mirror close to your location: +
https://admin.fedoraproject.org/mirrormanager/mirrors/Fedora +
Choose: "Fedora Linux http" -> releases -> <version number> -> Server -> x86_64 -> iso -> Fedora-Server-DVD-x86_64-<version number>.iso

* Verify that the checksum of the downloaded file matches one of the linked checksum values, using the `sha256sum` command. Example:

[source,bash]
----
$sha256sum Fedora-18-x86_64-DVD.iso
91c5f0aca391acf76a047e284144f90d66d3d5f5dcd26b01f368a43236832c03 #<1>
----
<1> Should be equal to the link:https://en.wikipedia.org/wiki/SHA-2[SHA-256] values described in the linked checksum files. 
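The comparison can also be automated with `sha256sum -c`, which verifies files against a checksum list. A self-contained demonstration on a throwaway file; for the real ISO, run it against the downloaded CHECKSUM file instead:

[source,bash]
----
# Create a demo file and a checksum list for it, then verify.
echo "demo payload" > /tmp/demo.iso
sha256sum /tmp/demo.iso > /tmp/demo.CHECKSUM
sha256sum -c /tmp/demo.CHECKSUM   # prints "/tmp/demo.iso: OK" on success
----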


==== Install Linux 

Ask your lab admin to install Linux using CIMC, assign an IP address, and set up DNS. Request the sudo or root password so that you can ping and SSH.

xref:fedora21_example[Example of installing Fedora 21 Server]

[NOTE]
=====================================
 * To use TRex, you need sudo privileges or the root password on the machine.
 * Upgrading the Linux kernel using `yum upgrade` requires rebuilding the TRex drivers. 
 * In Ubuntu 16, the auto-updater is enabled by default. It is advised to turn it off, because a kernel update requires recompiling the DPDK .ko file. +
Command to remove it: +
 > sudo apt-get remove unattended-upgrades
=====================================
326
327==== Verify Intel NIC installation
328
329Use `lspci` to verify the NIC installation. 
330
331Example 4x 10Gb/sec TRex configuration (see output below):
332
333* I350 management port
334
335* 4x Intel Ethernet Converged Network Adapter model x520-D2 (82599 chipset)
336
337[source,bash]
338----
339$[root@trex]lspci | grep Ethernet
34001:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)                #<1>
34101:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)                #<2>
34203:00.0 Ethernet controller: Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection (rev 01) #<3>
34303:00.1 Ethernet controller: Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection (rev 01)
34482:00.0 Ethernet controller: Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection (rev 01)
34582:00.1 Ethernet controller: Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection (rev 01)
346----
347<1> Management port 
348<2> CIMC port
349<3> 10Gb/sec traffic ports (Intel 82599EB) 
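To map a PCI address from the `lspci` output back to a kernel interface name (useful later when writing the config file), sysfs can be queried directly. A sketch, using the example address above:

[source,bash]
----
# List the kernel interface name(s) behind a PCI device, if any.
# Devices bound to DPDK, or unbound, have no net/ entry.
ls /sys/bus/pci/devices/0000:03:00.0/net 2>/dev/null \
    || echo "no kernel interface (unbound or DPDK-bound)"
----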

=== Obtaining the TRex package

Connect using `ssh` to the TRex machine and execute the commands described below.

NOTE: Prerequisite: *$WEB_URL* is *{web_server_url}* or *{local_web_server_url}* (Cisco internal)

Latest release:
[source,bash]
----
$mkdir trex
$cd trex
$wget --no-cache $WEB_URL/release/latest 
$tar -xzvf latest 
----


Bleeding edge version:
[source,bash]
----
$wget --no-cache $WEB_URL/release/be_latest 
----

To obtain a specific version, do the following: 
[source,bash]
----
$wget --no-cache $WEB_URL/release/vX.XX.tar.gz #<1>
----

<1> X.XX = Version number

== First time Running

=== Configuring for loopback

Before connecting TRex to your DUT, it is strongly advised to verify that TRex and the NICs work correctly in loopback. +
For best performance, loop back interfaces on the same NUMA node (controlled by the same physical processor). If you do not know how to check this, you can ignore this advice for now. +

[NOTE]
=====================================================================
If you are using a 10Gb/sec NIC based on the Intel 520-D2, and you loop back ports on the same NIC using SFP+, the ports might not sync and the link will fail to come up. +
We checked many types of SFP+ (Intel/Cisco/SR/LR) and they worked for us. +
If you still encounter link issues, either loop back interfaces from different NICs, or use a link:http://www.fiberopticshare.com/tag/cisco-10g-twinax[Cisco twinax copper cable].
=====================================================================

.Loopback example
image:images/loopback_example.png[title="Loopback example"] 

==== Identify the ports 

[source,bash]
----
 $>sudo ./dpdk_setup_ports.py -s

 Network devices using DPDK-compatible driver
 ============================================

 Network devices using kernel driver
 ===================================
 0000:03:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection' drv= unused=ixgb #<1>
 0000:03:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection' drv= unused=ixgb
 0000:13:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection' drv= unused=ixgb
 0000:13:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection' drv= unused=ixgb
 0000:02:00.0 '82545EM Gigabit Ethernet Controller (Copper)' if=eth2 drv=e1000 unused=igb_uio *Active* #<2>

 Other network devices
 =====================
 <none>
----

<1> If you have not run any DPDK application, you will see a list of interfaces bound to the kernel, or not bound at all.
<2> The interface marked as 'active' is the one used by your ssh connection. *Never* put it in the TRex config file.

Choose the ports to use and follow the instructions in the next section to create a configuration file.

==== Creating minimum configuration file

The default configuration file name is: `/etc/trex_cfg.yaml`.

You can copy a basic configuration file from the cfg folder:

[source,bash]
----
$cp  cfg/simple_cfg.yaml /etc/trex_cfg.yaml
----

Then, edit the configuration file and add your interface and IP address details.

Example:

[source,yaml]
----
- port_limit      : 2
  version         : 2
  # List of interfaces. Change to suit your setup. Use ./dpdk_setup_ports.py -s to see available options
  interfaces      : ["03:00.0", "03:00.1"]  #<1>
  port_info       :  # Port IPs. Change to suit your needs. In case of loopback, you can leave as is.
          - ip         : 1.1.1.1
            default_gw : 2.2.2.2
          - ip         : 2.2.2.2
            default_gw : 1.1.1.1
----
<1> You need to edit this line to match the interfaces you are using.
All NICs you are using must be of the same type. You cannot mix different NIC types in one config file. For more info, see link:http://trex-tgn.cisco.com/youtrack/issue/trex-201[trex-201].

You can find the full list of configuration file options xref:trex_config[here].
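For a router (or other L3 device) as DUT, the same file can be MAC-based instead of IP-based. A sketch with hypothetical addresses; `dest_mac` should be the MAC of the DUT port facing each TRex port:

[source,yaml]
----
- port_limit      : 2
  version         : 2
  interfaces      : ["03:00.0", "03:00.1"]
  port_info       :
          - dest_mac : 0c:42:a1:00:00:01  # MAC of the DUT port facing TRex port 0
            src_mac  : 0c:42:a1:00:00:02
          - dest_mac : 0c:42:a1:00:00:03  # MAC of the DUT port facing TRex port 1
            src_mac  : 0c:42:a1:00:00:04
----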

=== Script for creating config file

To help you get started with a basic configuration file that suits your needs, there is a script that can automate this process.
The script helps you get started, and you can then edit the file and add advanced options from xref:trex_config[here]
if needed. +
There are two ways to run the script: interactively (the script will prompt you for parameters), or by providing all parameters
using command line options.

==== Interactive mode

[source,bash]
----
sudo ./dpdk_setup_ports.py -i
----

You will see a list of available interfaces with their related information. +
Just follow the instructions to get a basic config file.

==== Specifying input arguments using command line options

First, run this command to see the list of all interfaces and their related information:

[source,bash]
----
sudo ./dpdk_setup_ports.py -t
----

* In case of *Loopback* and/or only *L1-L2 Switches* on the way, you do not need to provide IPs or destination MACs. +
The script will assume the following interface connections: 0&#8596;1, 2&#8596;3 etc. +
Just run:

[source,bash]
----
sudo ./dpdk_setup_ports.py -c <TRex interface 0> <TRex interface 1> ...
----

* In case of *Router* (or other next hop device, such as *L3 Switch*), you should specify the TRex IPs and default gateways, or
MACs of the router, as described below.

.Additional arguments to creating script (dpdk_setup_ports.py -c)
[options="header",cols="2,5,3",width="100%"]
|=================
| Arg | Description | Example
| -c  | Create a configuration file by specified interfaces (PCI address or Linux names: eth1 etc.) | -c 03:00.1 eth1 eth4 84:00.0
| --dump | Dump created config to screen. |
| -o | Output the config to this file. | -o /etc/trex_cfg.yaml
| --dest-macs | Destination MACs to be used per each interface. Specify this option if you want a MAC based config instead of an IP based one. You must not set it together with --ip and --def-gw | --dest-macs 11:11:11:11:11:11 22:22:22:22:22:22
| --ip | List of IPs to use for each interface. If this option and --dest-macs are not specified, the script assumes loopback connections (0&#8596;1, 2&#8596;3 etc.) | --ip 1.2.3.4 5.6.7.8
|--def-gw | List of default gateways to use for each interface. If --ip is given, you must provide --def-gw as well | --def-gw 3.4.5.6 7.8.9.10
| --ci | Cores include: White list of cores to use. Make sure there are enough for each NUMA. | --ci 0 2 4 5 6
| --ce | Cores exclude: Black list of cores to exclude. Make sure there will be enough for each NUMA. | --ce 10 11 12
| --no-ht | No HyperThreading: Use only one thread of each core in the created config yaml. |
| --prefix | Advanced option: prefix to be used in TRex config in case of parallel instances. | --prefix first_instance
| --zmq-pub-port | Advanced option: ZMQ Publisher port to be used in TRex config in case of parallel instances. | --zmq-pub-port 4000
| --zmq-rpc-port | Advanced option: ZMQ RPC port to be used in TRex config in case of parallel instances. | --zmq-rpc-port
| --ignore-numa | Advanced option: Ignore NUMAs for config creation. Use this option only if you have to, as it might reduce performance. For example, if you have a pair of interfaces at different NUMAs |
|=================

=== Configuring ESXi for running TRex

To get best performance, it is advised to run TRex on bare-metal hardware, and not use any kind of VM.
Bandwidth on a VM might be limited, and IPv6 might not be fully supported.
Having said that, there are sometimes benefits to running on a VM. +
These include: +
    * Virtual NICs can be used to bridge between TRex and NICs not supported by TRex. +
    * If you already have a VM installed, and do not require high performance. +

1. Click the host machine, enter Configuration -> Networking.

a. One of the NICs should be connected to the main vSwitch network to get an "outside" connection, for the TRex client and ssh: +
image:images/vSwitch_main.png[title="vSwitch_main"]

b. Other NICs that are used for TRex traffic should be in a separate vSwitch: +
image:images/vSwitch_loopback.png[title="vSwitch_loopback"]

2. Right-click guest machine -> Edit settings -> Ensure the NICs are set to their networks: +
image:images/vSwitch_networks.png[title="vSwitch_networks"]

[NOTE]
=====================================================================
Before version 2.10, the following command did not function as expected:
[subs="quotes"]
....
sudo ./t-rex-64 -f cap2/dns.yaml *--lm 1 --lo* -l 1000 -d 100
....
The vSwitch did not "know" where to route the packet. This was solved in version 2.10, when TRex started to support ARP.
=====================================================================

* Pass-through allows the VM to use the host machine's NICs directly. It has no limitations except the NIC/hardware itself. The only difference compared to a bare-metal OS is occasional latency spikes (~10ms). Pass-through settings cannot be saved to OVA.

1. Click on the host machine. Enter Configuration -> Advanced settings -> Edit. Mark the desired NICs. Reboot the ESXi to apply. +
image:images/passthrough_marking.png[title="passthrough_marking"]

2. Right-click on the guest machine. Edit settings -> Add -> *PCI device* -> Choose the NICs one by one. +
image:images/passthrough_adding.png[title="passthrough_adding"]

=== Configuring for running with router (or other L3 device) as DUT

You can follow link:trex_config_guide.html[this] presentation for an example of how to configure a router as DUT.

=== Running TRex

When all is set, use the following command to start a basic TRex run for 10 seconds
(it will use the default config file name /etc/trex_cfg.yaml):
[source,bash]
----
$sudo ./t-rex-64 -f cap2/dns.yaml -c 4 -m 1 -d 10  -l 1000
----

If successful, the output will be similar to the following:

[source,python]
----
$ sudo ./t-rex-64 -f cap2/dns.yaml -d 10 -l 1000
Starting  TRex 2.09 please wait  ...
zmq publisher at: tcp://*:4500 
 number of ports found : 4
  port : 0 
  ------------
  link         :  link : Link Up - speed 10000 Mbps - full-duplex      <1>
  promiscuous  : 0 
  port : 1 
  ------------
  link         :  link : Link Up - speed 10000 Mbps - full-duplex
  promiscuous  : 0 
  port : 2 
  ------------
  link         :  link : Link Up - speed 10000 Mbps - full-duplex
  promiscuous  : 0 
  port : 3 
  ------------
  link         :  link : Link Up - speed 10000 Mbps - full-duplex
  promiscuous  : 0 


 -Per port stats table 
      ports |               0 |               1 |               2 |               3 
 -------------------------------------------------------------------------------------
   opackets |            1003 |            1003 |            1002 |            1002 
     obytes |           66213 |           66229 |           66132 |           66132 
   ipackets |            1003 |            1003 |            1002 |            1002 
     ibytes |           66225 |           66209 |           66132 |           66132 
    ierrors |               0 |               0 |               0 |               0 
    oerrors |               0 |               0 |               0 |               0 
      Tx Bw |     217.09 Kbps |     217.14 Kbps |     216.83 Kbps |     216.83 Kbps 

 -Global stats enabled 
 Cpu Utilization : 0.0  % <2>  29.7 Gb/core <3>
 Platform_factor : 1.0    
 Total-Tx        :     867.89 Kbps                                             <4>
 Total-Rx        :     867.86 Kbps                                             <5>
 Total-PPS       :       1.64 Kpps  
 Total-CPS       :       0.50  cps  

 Expected-PPS    :       2.00  pps   <6>
 Expected-CPS    :       1.00  cps   <7>
 Expected-BPS    :       1.36 Kbps   <8>

 Active-flows    :        0 <9> Clients :      510   Socket-util  : 0.0000 %
 Open-flows      :        1 <10> Servers :      254   Socket   :        1  Socket/Clients :  0.0
 drop-rate       :       0.00  bps   <11>
 current time    : 5.3 sec  
 test duration   : 94.7 sec  

 -Latency stats enabled 
 Cpu Utilization : 0.2 %  <12>
 if|   tx_ok , rx_ok  , rx   ,error,    average   ,   max         , Jitter ,  max window 
   |         ,        , check,     , latency(usec),latency (usec) ,(usec)  ,             
 --------------------------------------------------------------------------------------------------
 0 |     1002,    1002,         0,   0,         51  ,      69,       0      |   0  69  67    <13>
 1 |     1002,    1002,         0,   0,         53  ,     196,       0      |   0  196  53
 2 |     1002,    1002,         0,   0,         54  ,      71,       0      |   0  71  69 
 3 |     1002,    1002,         0,   0,         53  ,     193,       0      |   0  193  52 
----
<1> Link must be up for TRex to work.
<2> Average CPU utilization of transmitter threads. For best results it should be lower than 80%.
<3> Gb/sec generated per core of DP. Higher is better.
<4> Total Tx must be the same as Rx at the end of the run.
<5> Total Rx must be the same as Tx at the end of the run.
<6> Expected number of packets per second (calculated without latency packets).
<7> Expected number of connections per second (calculated without latency packets).
<8> Expected number of bits per second (calculated without latency packets).
<9> Number of TRex active "flows". Could be different from the number of router flows, due to aging issues. Usually the TRex number of active flows is much lower than that of the router, because the router ages flows more slowly.
<10> Total number of TRex flows opened since startup (including active ones, and ones already closed).
<11> Drop rate.
<12> Rx and latency thread CPU utilization.
<13> Tx_ok on port 0 should equal Rx_ok on port 1, and vice versa.
645
646More statistics information:
647
648*socket*::  Same as the active flows.  
649
650*Socket/Clients*:: Average of active flows per client, calculated as active_flows/#clients.
651
*Socket-util*:: Estimate of the number of L4 ports (sockets) used per client IP, as a percentage of the 64K available ports. It is calculated as 100 * (active_flows / #clients) / 64K. Utilization of more than 50% means that TRex is generating too many flows per single client, and more clients must be added in the generator config.
653// clarify above, especially the formula
654
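As an illustration of the formula, here is a minimal sketch in Python (the helper name is ours, not TRex code):

[source,python]
----
def socket_util_percent(active_flows, num_clients, ports_per_client=64 * 1024):
    """Approximate % of the 64K L4 source ports in use per client IP."""
    flows_per_client = active_flows / num_clients   # the "Socket/Clients" counter
    return 100.0 * flows_per_client / ports_per_client

# Example: 1,000,000 active flows spread over 100 clients
# -> 10,000 flows per client, ~15.3% of the 64K port space.
print(round(socket_util_percent(1_000_000, 100), 1))
----

At 50% utilization (about 32K flows per client), more clients should be added to the generator config.
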
*Max window*:: Momentary maximum latency over a time window of 500 msec. A few numbers are shown per port.
 The newest number (last 500 msec) is on the right, the oldest on the left. This can help identify spikes of high latency that clear after some time. Maximum latency is the total maximum over the entire test duration. To best understand this,
 run TRex with the latency option (-l) and watch the results with this section in mind.
658
*Platform_factor*:: In some setups the traffic is duplicated using a splitter/switch, and all numbers displayed by TRex should be multiplied by this factor so that the TRex counters match the DUT counters.
660
661WARNING: If you don't see rx packets, revisit your MAC address configuration.
662
663include::trex_book_basic.asciidoc[]
664
665== Advanced features
666
667=== VLAN (dot1q) support
668
To add a VLAN tag to all traffic generated by TRex, add the ``vlan'' keyword in each
port section of the platform config file, as described xref:trex_config[here]. +
You can specify a different VLAN tag for each port, or use VLAN on only some of the ports. +
One useful application is a lab setup with one TRex and many DUTs, where you want to test a different
DUT on each run without changing cable connections. Put each DUT on a VLAN of its own, and use a different TRex
platform config file (with different VLANs) on each run.
675
676
677=== Utilizing maximum port bandwidth in case of asymmetric traffic profile 
678
679anchor:trex_load_bal[]
680
681[NOTE]
682If you want simple VLAN support, this is probably *not* the feature you want. This is used for load balancing.
683If you want VLAN support, please look at ``vlan'' field xref:trex_config[here].
684
685
The VLAN Trunk TRex feature attempts to solve the router port bandwidth limitation when the traffic profile is asymmetric (example: the asymmetric SFR profile).
It converts the asymmetric traffic to symmetric, from the port perspective, using router sub-interfaces.
This requires TRex to send the traffic on two VLANs, as described below.
689
690.YAML format - This goes into traffic yaml file
691[source,python]
692----
693  vlan       : { enable : 1  ,  vlan0 : 100 , vlan1 : 200 }
694----
695
696
697.Example
698[source,python]
699----
700- duration : 0.1
701  vlan       : { enable : 1  ,  vlan0 : 100 , vlan1 : 200 }   <1>
702----
703<1> Enable load balance feature, vlan0==100 , vlan1==200
For a full example, see scripts/cap2/ipv4_load_balance.yaml in the TRex source.
705
706*Problem definition:*::
707
708Scenario: TRex with two ports and an SFR traffic profile.
709
710.Without VLAN/sub interfaces, all client emulated traffic is sent on port 0, and all server emulated traffic (HTTP response for example) on port 1.
711[source,python]
712----
713TRex port 0 ( client) <-> [  DUT ] <-> TRex port 1 ( server)
714----
Without VLAN support the traffic is asymmetric: 10% of the traffic is sent from port 0 (client side) and 90% from port 1 (server side). Port 1 is the bottleneck (10Gb/sec limit).
716
717.With VLAN/sub interfaces 
718[source,python]
719----
720TRex port 0 ( client VLAN0) <->  | DUT  | <-> TRex port 1 ( server-VLAN0)
721TRex port 0 ( server VLAN1) <->  | DUT  | <-> TRex port 1 ( client-VLAN1)
722----
723
In this case, traffic on VLAN0 is sent as before, while for traffic on VLAN1 the order is reversed (client traffic sent on port 1 and server traffic on port 0).
TRex divides the flows evenly between the VLANs, resulting in an equal amount of traffic on each port.
726
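Conceptually, the split can be sketched as follows (illustrative Python only; the exact flow-to-VLAN assignment is internal to TRex):

[source,python]
----
def flow_ports(flow_id, vlan0=100, vlan1=200):
    """Alternate flows between the two VLANs; VLAN1 flows swap the port roles."""
    if flow_id % 2 == 0:
        return {"vlan": vlan0, "client_port": 0, "server_port": 1}
    return {"vlan": vlan1, "client_port": 1, "server_port": 0}
----

With half the flows reversed, the asymmetric profile loads both ports equally.
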
*Router configuration:*::
728[source,python]
729----
730        !
731        interface TenGigabitEthernet1/0/0      <1>
732         mac-address 0000.0001.0000   
733         mtu 4000
734         no ip address
735         load-interval 30
736        !
738        interface TenGigabitEthernet1/0/0.100
739         encapsulation dot1Q 100               <2> 
740         ip address 11.77.11.1 255.255.255.0
741         ip nbar protocol-discovery
742         ip policy route-map vlan_100_p1_to_p2 <3> 
743        !
744        interface TenGigabitEthernet1/0/0.200
745         encapsulation dot1Q 200               <4>
746         ip address 11.88.11.1 255.255.255.0
747         ip nbar protocol-discovery
748         ip policy route-map vlan_200_p1_to_p2 <5> 
749        !
750        interface TenGigabitEthernet1/1/0
751         mac-address 0000.0001.0000
752         mtu 4000
753         no ip address
754         load-interval 30
755        !
756        interface TenGigabitEthernet1/1/0.100
757         encapsulation dot1Q 100
758         ip address 22.77.11.1 255.255.255.0
759         ip nbar protocol-discovery
760         ip policy route-map vlan_100_p2_to_p1
761        !
762        interface TenGigabitEthernet1/1/0.200
763         encapsulation dot1Q 200
764         ip address 22.88.11.1 255.255.255.0
765         ip nbar protocol-discovery
766         ip policy route-map vlan_200_p2_to_p1
767        !
768        
769        arp 11.77.11.12 0000.0001.0000 ARPA      <6>
770        arp 22.77.11.12 0000.0001.0000 ARPA
771        
772        route-map vlan_100_p1_to_p2 permit 10    <7>
773         set ip next-hop 22.77.11.12
774        !
775        route-map vlan_100_p2_to_p1 permit 10
776         set ip next-hop 11.77.11.12
777        !
778        
779        route-map vlan_200_p1_to_p2 permit 10
780         set ip next-hop 22.88.11.12
781        !
782        route-map vlan_200_p2_to_p1 permit 10
783         set ip next-hop 11.88.11.12
784        !
785----
786<1> Main interface must not have IP address.
787<2> Enable VLAN1
788<3> PBR configuration
789<4> Enable VLAN2  
790<5> PBR configuration
791<6> TRex destination port MAC address
792<7> PBR configuration rules
793
794=== Static source MAC address setting  
795
796With this feature, TRex replaces the source MAC address with the client IP address.
797
NOTE: This feature was requested by the Cisco ISG group.
799
800
801*YAML:*::
802[source,python]
803----
804 mac_override_by_ip : true
805----
806
807.Example
808[source,python]
809----
810- duration : 0.1
811 ..
812  mac_override_by_ip : true <1>
813----
814<1> In this case, the client side MAC address looks like this:
815SRC_MAC = IPV4(IP) + 00:00  
816
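As an illustration, the address construction can be sketched in Python (the helper is ours, not part of TRex):

[source,python]
----
import ipaddress

def src_mac_from_ip(ip):
    """The 4 IPv4 bytes become the first 4 MAC bytes, padded with 00:00."""
    mac_bytes = ipaddress.IPv4Address(ip).packed + b"\x00\x00"
    return ":".join(f"{b:02x}" for b in mac_bytes)

print(src_mac_from_ip("16.0.0.1"))  # -> 10:00:00:01:00:00
----
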
817=== IPv6 support 
818
819Support for IPv6 includes:
820
1. Support for pcap files containing IPv6 packets
2. Ability to generate IPv6 traffic from pcap files containing IPv4 packets

The `--ipv6` command line option enables this feature.
The keywords `src_ipv6` and `dst_ipv6` specify the most significant 96 bits of the IPv6 address. For example:
825
826[source,python]
827----
828      src_ipv6 : [0xFE80,0x0232,0x1002,0x0051,0x0000,0x0000]
829      dst_ipv6 : [0x2001,0x0DB8,0x0003,0x0004,0x0000,0x0000]
830----
831      
The IPv6 address is formed by placing what would typically be the IPv4
address into the least significant 32 bits, and copying the value provided
in the src_ipv6/dst_ipv6 keywords into the most significant 96 bits.
If src_ipv6 and dst_ipv6 are not specified, the default
is to form IPv4-compatible addresses (most significant 96 bits are zero).
837
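As an illustration (our own Python helper, not TRex code), the address formation described above can be sketched as:

[source,python]
----
import ipaddress

def form_ipv6(words96, ipv4):
    """Place the six 16-bit words in the high 96 bits, the IPv4 address in the low 32 bits."""
    high = 0
    for word in words96:                 # most significant word first
        high = (high << 16) | word
    low = int(ipaddress.IPv4Address(ipv4))
    return str(ipaddress.IPv6Address((high << 32) | low))

print(form_ipv6([0x2001, 0x0DB8, 0x0003, 0x0004, 0x0000, 0x0000], "48.0.0.1"))
# -> 2001:db8:3:4::3000:1
----
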
838There is support for all plugins.
839  
840*Example:*::
841[source,bash]
842----
843$sudo ./t-rex-64 -f cap2l/sfr_delay_10_1g.yaml -c 4 -p -l 100 -d 100000 -m 30  --ipv6
844----
845
846*Limitations:*::
847
* TRex cannot generate both IPv4 and IPv6 traffic in the same run.
* The `--ipv6` switch must be specified even when using a pcap file containing only IPv6 packets.
850
851
852*Router configuration:*::
853
854[source,python]
855----
856interface TenGigabitEthernet1/0/0
857 mac-address 0000.0001.0000
858 mtu 4000
859 ip address 11.11.11.11 255.255.255.0
860 ip policy route-map p1_to_p2
861 load-interval 30
862 ipv6 enable   ==> IPv6
863 ipv6 address 2001:DB8:1111:2222::1/64                  <1>
864 ipv6 policy route-map ipv6_p1_to_p2                    <2>
865!
866
867
868ipv6 unicast-routing                                    <3>
869
870ipv6 neighbor 3001::2 TenGigabitEthernet0/1/0 0000.0002.0002   <4>
871ipv6 neighbor 2001::2 TenGigabitEthernet0/0/0 0000.0003.0002
872
873route-map ipv6_p1_to_p2 permit 10                              <5>
874 set ipv6 next-hop 2001::2
875!
876route-map ipv6_p2_to_p1 permit 10
877 set ipv6 next-hop 3001::2
878!
879
880
881asr1k(config)#ipv6 route 4000::/64 2001::2                 
882asr1k(config)#ipv6 route 5000::/64 3001::2 
883----
884<1> Enable IPv6 
885<2> Add pbr 
886<3> Enable IPv6 routing 
887<4> MAC address setting. Should be TRex MAC.
<5> PBR configuration
889
890
891=== Client clustering configuration
TRex supports testing complex topologies with more than one DUT, using a feature called ``client clustering''.
This feature allows specifying the distribution of the clients TRex emulates.
894
895Let's look at the following topology:
896
897.Topology Example 
898image:images/topology.png[title="Client Clustering",width=850]
899
We have two clusters of DUTs.
Using a config file, you can partition the TRex emulated clients into groups, and define
how they are spread between the DUT clusters.
903
904Group configuration includes:
905
906* IP start range.
907* IP end range.
* Initiator side configuration - the parameters affecting packets sent from the client side.
* Responder side configuration - the parameters affecting packets sent from the server side.
910
911[NOTE]
It is important to understand that this is *complementary* to the client generator
configured per profile - it only defines how the clients are spread between clusters.
914
915Let's look at an example.
916
917We have a profile defining client generator.
918
919[source,bash]
920----
921$cat cap2/dns.yaml 
922- duration : 10.0
923  generator :  
924          distribution : "seq"           
925          clients_start : "16.0.0.1"
926          clients_end   : "16.0.0.255"   
927          servers_start : "48.0.0.1"
928          servers_end   : "48.0.0.255"   
929          dual_port_mask : "1.0.0.0" 
930  cap_info : 
931     - name: cap2/dns.pcap
932       cps : 1.0          
933       ipg : 10000        
934       rtt : 10000        
935       w   : 1            
936----
937
We want to create two clusters with 4 and 3 devices respectively.
We also want to send *80%* of the traffic to the upper cluster and *20%* to the lower cluster.
The DUT to which a packet is sent can be specified by MAC address or by IP. We present a MAC-based
example first, and then show how to change it to an IP-based one.
942
943We will create the following cluster configuration file.
944
945[source,bash]
946----
947#
948# Client configuration example file
949# The file must contain the following fields
950#
951# 'vlan'   - if the entire configuration uses VLAN,
952#            each client group must include vlan
953#            configuration
954#
955# 'groups' - each client group must contain range of IPs
956#            and initiator and responder section
957#            'count' represents the number of different DUTs
958#            in the group.
959#
960
961# 'true' means each group must contain VLAN configuration. 'false' means no VLAN config allowed.
962vlan: true
963
964groups:
965
966-    ip_start  : 16.0.0.1
967     ip_end    : 16.0.0.204
968     initiator :
969                 vlan    : 100
970                 dst_mac : "00:00:00:01:00:00"
971     responder :
972                 vlan    : 200
973                 dst_mac : "00:00:00:02:00:00"
974
975     count     : 4
976
977-    ip_start  : 16.0.0.205
978     ip_end    : 16.0.0.255
979     initiator :
980                 vlan    : 101
981                 dst_mac : "00:00:01:00:00:00"
982
983     responder:
984                 vlan    : 201
985                 dst_mac : "00:00:02:00:00:00"
986
987     count     : 3
988
989----
990
The above configuration divides the generator range of 255 clients into two clusters. Together, the IP
ranges of all the groups in the client config file must cover the entire range of client IPs
from the traffic profile file.
994
MACs are allocated incrementally, wrapping around after ``count'' addresses.
996
997e.g.
998
999*Initiator side: (packets with source in 16.x.x.x net)*
1000
1001* 16.0.0.1 -> 48.x.x.x - dst_mac: 00:00:00:01:00:00  vlan: 100 
1002* 16.0.0.2 -> 48.x.x.x - dst_mac: 00:00:00:01:00:01  vlan: 100 
1003* 16.0.0.3 -> 48.x.x.x - dst_mac: 00:00:00:01:00:02  vlan: 100 
1004* 16.0.0.4 -> 48.x.x.x - dst_mac: 00:00:00:01:00:03  vlan: 100 
1005* 16.0.0.5 -> 48.x.x.x - dst_mac: 00:00:00:01:00:00  vlan: 100 
1006* 16.0.0.6 -> 48.x.x.x - dst_mac: 00:00:00:01:00:01  vlan: 100 
1007
1008*responder side: (packets with source in 48.x.x.x net)* 
1009
1010* 48.x.x.x -> 16.0.0.1  - dst_mac(from responder) : "00:00:00:02:00:00" , vlan:200
1011* 48.x.x.x -> 16.0.0.2  - dst_mac(from responder) : "00:00:00:02:00:01" , vlan:200
1012
1013and so on. +
1014 +
This means that the MAC addresses of the DUTs must be changed to be sequential. Alternatively, you can
specify an IP address using ``next_hop'' instead of ``dst_mac''. +
For example, the first group in the config file would look like this:
1018
1019[source,bash]
1020----
1021-    ip_start  : 16.0.0.1
1022     ip_end    : 16.0.0.204
1023     initiator :
1024                 vlan     : 100
1025                 next_hop : 1.1.1.1
1026                 src_ip   : 1.1.1.100
1027     responder :
1028                 vlan     : 200
1029                 next_hop : 2.2.2.1
1030                 src_ip   : 2.2.2.100
1031
1032     count     : 4
1033----
1034
In this case, TRex will try to resolve, using ARP requests, the addresses
1.1.1.1, 1.1.1.2, 1.1.1.3, 1.1.1.4 (and the range 2.2.2.1-2.2.2.4). If not all IPs are resolved,
TRex exits with an error message. ``src_ip'' is used for sending gratuitous ARP and
for filling the relevant fields in the ARP requests. If no ``src_ip'' is given, TRex looks for a source
IP in the relevant port section of the platform config file (/etc/trex_cfg.yaml). If none is found, TRex
exits with an error message. +
If a client config file is given, the ``dest_mac'' and ``default_gw'' parameters from the platform config
file are ignored.
1043
1044Now, streams will look like: +
1045*Initiator side: (packets with source in 16.x.x.x net)*
1046
1047* 16.0.0.1 -> 48.x.x.x - dst_mac: MAC of 1.1.1.1  vlan: 100
1048* 16.0.0.2 -> 48.x.x.x - dst_mac: MAC of 1.1.1.2  vlan: 100
1049* 16.0.0.3 -> 48.x.x.x - dst_mac: MAC of 1.1.1.3  vlan: 100
1050* 16.0.0.4 -> 48.x.x.x - dst_mac: MAC of 1.1.1.4  vlan: 100
1051* 16.0.0.5 -> 48.x.x.x - dst_mac: MAC of 1.1.1.1  vlan: 100
1052* 16.0.0.6 -> 48.x.x.x - dst_mac: MAC of 1.1.1.2  vlan: 100
1053
1054*responder side: (packets with source in 48.x.x.x net)*
1055
1056* 48.x.x.x -> 16.0.0.1  - dst_mac: MAC of 2.2.2.1 , vlan:200
1057* 48.x.x.x -> 16.0.0.2  - dst_mac: MAC of 2.2.2.2 , vlan:200
1058
1059
1060[NOTE]
It is important to understand that the IP-to-MAC coupling (with both MAC-based and IP-based configs)
is established at the beginning and never changes. For example, in the MAC-based case, packets
with source IP 16.0.0.2 will always have VLAN 100 and dst MAC 00:00:00:01:00:01, and packets
with destination IP 16.0.0.2 will always have VLAN 200 and dst MAC 00:00:00:02:00:01.
This way, you can predict exactly which packets (and how many) will go to each DUT.
1066
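The coupling can be sketched as follows (illustrative Python; the helper and its names are ours, not TRex code):

[source,python]
----
import ipaddress

def dut_dst_mac(client_ip, ip_start, base_mac, count):
    """MACs are allocated incrementally; the DUT index wraps around after `count` clients."""
    offset = int(ipaddress.IPv4Address(client_ip)) - int(ipaddress.IPv4Address(ip_start))
    mac = int(base_mac.replace(":", ""), 16) + (offset % count)
    return ":".join(f"{(mac >> s) & 0xff:02x}" for s in range(40, -8, -8))

print(dut_dst_mac("16.0.0.5", "16.0.0.1", "00:00:00:01:00:00", 4))
# -> 00:00:00:01:00:00 (wraps back to the first DUT of the 4-DUT cluster)
----
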
1067*Usage:*
1068
1069[source,bash]
1070----
1071sudo ./t-rex-64 -f cap2/dns.yaml --client_cfg my_cfg.yaml
1072----
1073
1074=== NAT support 
1075
TRex can learn dynamic NAT/PAT translation. To enable this feature, add `--learn-mode <mode>` to the command line.
To learn the NAT translation, TRex must embed information describing which flow a packet belongs to in the first
packet of each flow. This can be done by different methods, depending on the chosen <mode>.
1079
1080*mode 1:*::
1081
In case of a TCP flow, flow info is embedded in the ACK of the first TCP SYN. +
In case of a UDP flow, flow info is embedded in the IP identification field of the first packet in the flow. +
This mode was developed for testing NAT with firewalls (which usually do not work with mode 2).
In this mode, TRex also learns and compensates for TCP sequence number randomization that might be done by the DUT,
in both directions of the connection.
1087
1088*mode 2:*::
1089
Flow info is added in a special IPv4 option header (8 bytes long, option ID 0x10). The option is added only to the first packet in the flow.
This mode does not work with DUTs that drop packets with IP options (for example, the Cisco ASA firewall).
1092
1093*mode 3:*::
1094
This is like mode 1, except that TRex does not learn the seq num randomization in the server->client direction.
This mode can give a much better connections-per-second rate than mode 1 (still, for all existing firewalls, the mode 1 cps rate is more than enough).
1097
1098==== Examples
1099
1100*simple HTTP traffic*
1101
1102[source,bash]
1103----
1104$sudo ./t-rex-64 -f cap2/http_simple.yaml -c 4  -l 1000 -d 100000 -m 30  --learn-mode 1
1105----
1106
1107*SFR traffic without bundling/ALG support*
1108
1109[source,bash]
1110----
1111$sudo ./t-rex-64 -f avl/sfr_delay_10_1g_no_bundling.yaml -c 4  -l 1000 -d 100000 -m 10  --learn-mode 2
1112----
1113
1114*NAT terminal counters:*::
1115
1116[source,python]
1117----
1118-Global stats enabled 
1119 Cpu Utilization : 0.6  %  33.4 Gb/core 
1120 Platform_factor : 1.0
1121 Total-Tx        :       3.77 Gbps   NAT time out    :      917 <1> (0 in wait for syn+ack) <5>
1122 Total-Rx        :       3.77 Gbps   NAT aged flow id:        0 <2>
1123 Total-PPS       :     505.72 Kpps   Total NAT active:      163 <3> (12 waiting for syn) <6>
1124 Total-CPS       :      13.43 Kcps   Total NAT opened:    82677 <4>
1125----
<1> Number of connections for which TRex had to send the next packet in the flow, but had not yet learned the NAT translation. Should be 0. A non-zero value is usually seen if the DUT drops the flow (probably because it cannot handle the number of connections).
<2> Number of flows that had already aged out by the time the translation info arrived. A non-zero value here should be very rare, and can occur only when there is huge latency in the DUT input/output queue.
1128<3> Number of flows for which we sent the first packet, but did not learn the NAT translation yet. Value seen depends on the connection per second rate and round trip time.
1129<4> Total number of translations over the lifetime of the TRex instance. May be different from the total number of flows if template is uni-directional (and consequently does not need translation).
1130<5> Out of the timed out flows, how many were timed out while waiting to learn the TCP seq num randomization of the server->client from the SYN+ACK packet (Seen only in --learn-mode 1)
1131<6> Out of the active NAT sessions, how many are waiting to learn the client->server translation from the SYN packet (others are waiting for SYN+ACK from server) (Seen only in --learn-mode 1)
1132
1133*Configuration for Cisco ASR1000 Series:*::
1134
This feature was tested with the following configuration and the sfr_delay_10_1g_no_bundling.yaml traffic profile.
The client address range is 16.0.0.1 to 16.0.0.255.
1137
1138[source,python]
1139----
1140interface TenGigabitEthernet1/0/0            <1>
1141 mac-address 0000.0001.0000
1142 mtu 4000
1143 ip address 11.11.11.11 255.255.255.0
1144 ip policy route-map p1_to_p2
1145 ip nat inside                               <2>
1146 load-interval 30
1147!
1148
1149interface TenGigabitEthernet1/1/0
1150 mac-address 0000.0001.0000
1151 mtu 4000
1152 ip address 11.11.11.11 255.255.255.0
1153 ip policy route-map p1_to_p2
1154 ip nat outside                              <3>
1155 load-interval 30
1156
1157ip  nat pool my 200.0.0.0 200.0.0.255 netmask 255.255.255.0  <4>
1158
1159ip nat inside source list 7 pool my overload 
1160access-list 7 permit 16.0.0.0 0.0.0.255                      <5>
1161
1162ip nat inside source list 8 pool my overload                 <6>
1163access-list 8 permit 17.0.0.0 0.0.0.255                      
1164----
1165<1> Must be connected to TRex Client port (router inside port)
1166<2> NAT inside 
1167<3> NAT outside
1168<4> Pool of outside address with overload
1169<5> Match TRex YAML client range
1170<6> In case of dual port TRex
1171
1172// verify 1 and 5 above; rephrased
1173
1174
1175*Limitations:*::
1176
1177. The IPv6-IPv6 NAT feature does not exist on routers, so this feature can work only with IPv4.
1178. Does not support NAT64. 
1179. Bundling/plugin is not fully supported. Consequently, sfr_delay_10.yaml does not work. Use sfr_delay_10_no_bundling.yaml instead.
1180
1181[NOTE]
1182=====================================================================
1183* `--learn-verify` is a TRex debug mechanism for testing the TRex learn mechanism.
* Run it when the DUT is configured without NAT. It will verify that inside_ip==outside_ip and inside_port==outside_port.
1185=====================================================================
1186
1187=== Flow order/latency verification 
1188
In normal mode (without this feature enabled), received traffic is not checked by software; hardware (Intel NIC) testing for dropped packets occurs at the end of the test. The only exceptions are the latency/jitter packets.
This is one reason that with TRex, you *cannot* check features that terminate traffic (for example, TCP proxy).
To enable this feature, add `--rx-check <sample>` to the command line options, where <sample> is the sample rate.
The fraction of flows sent to software for verification is 1/sample. For 40Gb/sec traffic you can use a sample rate of 1/128. Watch the Rx CPU% utilization.
1193
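As a back-of-the-envelope illustration (our own arithmetic, not TRex code) of how the sample value controls the Rx thread load:

[source,python]
----
def rx_check_flows_per_sec(total_cps, sample):
    """Roughly one of every `sample` flows is diverted to the Rx thread for verification."""
    return total_cps / sample

# At 200K new flows/sec with --rx-check 128, the Rx thread verifies ~1562 flows/sec.
print(int(rx_check_flows_per_sec(200_000, 128)))  # -> 1562
----
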
1194[NOTE]
1195============
This feature changes the TTL of the sampled flows to 255 and expects to receive packets with TTL 254 or 255 (one routing hop). If you have more than one hop in your setup, use `--hops` to change the expected value. More than one hop is possible if there are a number of routers between the TRex client side and the TRex server side.
1197============
1198
1199This feature ensures that:
1200
1201* Packets get out of DUT in order (from each flow perspective).
* There are no packet drops (no need to wait until the end of the test). Without this flag, you must wait until the end of the test to identify packet drops, because there is always a difference between Tx and Rx due to RTT.
1203
1204
1205.Full example 
1206[source,bash]
1207----
1208$sudo ./t-rex-64 -f avl/sfr_delay_10_1g.yaml -c 4 -p -l 100 -d 100000 -m 30  --rx-check 128
1209----
1210
1211[source,python]
1212----
1213Cpu Utilization : 0.1 %                                                                       <1>
1214 if|   tx_ok , rx_ok  , rx   ,error,    average   ,   max         , Jitter ,  max window 
1215   |         ,        , check,     , latency(usec),latency (usec) ,(usec)  ,             
1216 --------------------------------------------------------------------------------
1217 0 |     1002,    1002,      2501,   0,         61  ,      70,       3      |  60
1218 1 |     1002,    1002,      2012,   0,         56  ,      63,       2      |  50
1219 2 |     1002,    1002,      2322,   0,         66  ,      74,       5      |  68
1220 3 |     1002,    1002,      1727,   0,         58  ,      68,       2      |  52
1221
1222 Rx Check stats enabled                                                                       <2>
1223 -------------------------------------------------------------------------------------------
1224 rx check:  avg/max/jitter latency,       94  ,     744,       49      |  252  287  309       <3>
1225 
1226 active flows: <6>      10, fif: <5>     308,  drop:        0, errors:        0                <4>
1227 -------------------------------------------------------------------------------------------
1228----
1229<1> CPU% of the Rx thread. If it is too high, *increase* the sample rate.
1230<2> Rx Check section. For more detailed info, press 'r' during the test or at the end of the test.
<3> Average latency, max latency, and jitter of the template flows, in microseconds. These are usually *higher* than for the latency check packets, because the feature does more processing on these packets.
1232<4> Drop counters and errors counter should be zero. If not, press 'r' to see the full report or view the report at the end of the test.
1233<5> fif - First in flow. Number of new flows handled by the Rx thread.
1234<6> active flows - number of active flows handled by rx thread 
1235
1236.Press R to Display Full Report
1237[source,python]
1238----
1239 m_total_rx                              : 2 
1240 m_lookup                                : 2 
1241 m_found                                 : 1 
1242 m_fif                                   : 1 
1243 m_add                                   : 1 
1244 m_remove                                : 1 
1245 m_active                                : 0 
1246                                                        <1>
1247 0  0  0  0  1041  0  0  0  0  0  0  0  0  min_delta  : 10 usec 
1248 cnt        : 2 
1249 high_cnt   : 2 
1250 max_d_time : 1041 usec
1251 sliding_average    : 1 usec                            <2>
1252 precent    : 100.0 %
1253 histogram 
1254 -----------
1255 h[1000]  :  2 
1256 tempate_id_ 0 , errors:       0,  jitter: 61           <3>
1257 tempate_id_ 1 , errors:       0,  jitter: 0 
1258 tempate_id_ 2 , errors:       0,  jitter: 0 
1259 tempate_id_ 3 , errors:       0,  jitter: 0 
1260 tempate_id_ 4 , errors:       0,  jitter: 0 
1261 tempate_id_ 5 , errors:       0,  jitter: 0 
1262 tempate_id_ 6 , errors:       0,  jitter: 0 
1263 tempate_id_ 7 , errors:       0,  jitter: 0 
1264 tempate_id_ 8 , errors:       0,  jitter: 0 
1265 tempate_id_ 9 , errors:       0,  jitter: 0 
1266 tempate_id_10 , errors:       0,  jitter: 0 
1267 tempate_id_11 , errors:       0,  jitter: 0 
1268 tempate_id_12 , errors:       0,  jitter: 0 
1269 tempate_id_13 , errors:       0,  jitter: 0 
1270 tempate_id_14 , errors:       0,  jitter: 0 
1271 tempate_id_15 , errors:       0,  jitter: 0 
1272 ager :
1273 m_st_alloc                                 : 1 
1274 m_st_free                                  : 0 
1275 m_st_start                                 : 2 
1276 m_st_stop                                  : 1 
1277 m_st_handle                                : 0 
1278----
1279<1> Errors, if any, shown here
1280<2> Low pass filter on the active average of latency events 
1281<3> Error per template info
1282
1283// IGNORE: this line added to help rendition. Without this line, the "Notes and Limitations" section below does not appear.
1284
1285*Notes and Limitations:*::
1286
** To receive the packets, TRex does the following:
*** Changes the TTL to 0xFF and expects 0xFF (loopback) or 0xFE (one routing hop). (Use `--hops` to configure this value.)
1289*** Adds 24 bytes of metadata as ipv4/ipv6 option header.
1290// clarify "ipv4/ipv6 option header" above
1291
1292== Reference
1293
1294=== Traffic YAML (parameter of -f option)
1295
1296==== Global Traffic YAML section 
1297
1298[source,python]
1299----
1300- duration : 10.0                          <1>
1301  generator :                              <2>
1302          distribution : "seq"           
1303          clients_start : "16.0.0.1"     
1304          clients_end   : "16.0.0.255"   
1305          servers_start : "48.0.0.1"     
1306          servers_end   : "48.0.0.255"   
1307          clients_per_gb : 201
1308          min_clients    : 101
1309          dual_port_mask : "1.0.0.0" 
1310          tcp_aging      : 1
1311          udp_aging      : 1
1312  cap_ipg    : true                            <3>
1313  cap_ipg_min    : 30                          <4>
1314  cap_override_ipg    : 200                    <5>
1315  vlan       : { enable : 1  ,  vlan0 : 100 , vlan1 : 200 } <6>
1316  mac_override_by_ip : true  <7>
1317----
1318<1> Test duration (seconds). Can be overridden using the `-d` option.
1319<2> See full explanation on generator section link:trex_manual.html#_clients_servers_ip_allocation_scheme[here].
1320<3> true (default) indicates that the IPG is taken from the cap file (also taking into account cap_ipg_min and cap_override_ipg if they exist). false indicates that IPG is taken from per template section.
<4> cap_ipg_min and cap_override_ipg together set the minimum IPG in microseconds: if (pkt_ipg < cap_ipg_min) { pkt_ipg = cap_override_ipg }
1322<5> Value to override (microseconds), as described in note above.
1323<6> Enable load balance feature. See xref:trex_load_bal[trex load balance section] for info.
1324<7> Enable MAC address replacement by client IP.
1325
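The IPG clamping rule from <3>-<5> can be sketched as (illustrative Python, not TRex code):

[source,python]
----
def effective_ipg(pkt_ipg, cap_ipg_min=30, cap_override_ipg=200):
    """If the gap recorded in the pcap is below cap_ipg_min, use cap_override_ipg instead."""
    return cap_override_ipg if pkt_ipg < cap_ipg_min else pkt_ipg

print(effective_ipg(12))   # -> 200 (too small, overridden)
print(effective_ipg(500))  # -> 500 (kept as captured)
----
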
1326==== Timer Wheel section configuration  
1327
1328(from v2.13)
1329see xref:timer_w[Timer Wheel section] 
1330
1331==== Per template section 
1332// clarify "per template"  
1333
1334[source,python]
1335----
1336     - name: cap2/dns.pcap <1>
1337       cps : 10.0          <2>
1338       ipg : 10000         <3>
1339       rtt : 10000         <4>
1340       w   : 1             <5>
1341       server_addr : "48.0.0.7"    <6>
1342       one_app_server : true       <7>
1343       
1344----
1345<1> The name of the template pcap file. Can be relative path from the t-rex-64 image directory, or an absolute path. The pcap file should include only one flow. (Exception: in case of plug-ins).
<2> Connections per second. This is the value used when specifying -m 1 on the command line (giving -m x multiplies this value by x).
1347<3> If the global section of the YAML file includes `cap_ipg    : false`, this line sets the inter-packet gap in microseconds. 
1348<4> Should be set to the same value as ipg (microseconds). 
<5> Default value: w=1. This indicates to the IP generator how to generate the flows. If w=2, two flows from the same template are generated in a burst (useful for HTTP, which has bursts of flows).
1350<6> If `one_app_server` is set to true, then all templates will use the same server.
1351<7> If the same server address is required, set this value to true.
1352
1353
1354
1355=== Configuration YAML (parameter of --cfg option)
1356
1357anchor:trex_config[]
1358
1359The configuration file, in YAML format, configures TRex behavior, including:
1360 
1361- IP address or MAC address for each port (source and destination).
1362- Masked interfaces, to ensure that TRex does not try to use the management ports as traffic ports.
1363- Changing the zmq/telnet TCP port.
1364
1365You specify which config file to use by adding --cfg <file name> to the command line arguments. +
1366If no --cfg given, the default `/etc/trex_cfg.yaml` is used. +
1367Configuration file examples can be found in the `$TREX_ROOT/scripts/cfg` folder.
1368
1369==== Basic Configurations
1370
1371[source,python]
1372----
1373     - port_limit    : 2    #mandatory <1>
1374       version       : 2    #mandatory <2>
1375       interfaces    : ["03:00.0", "03:00.1"]   #mandatory <3>
1376       #enable_zmq_pub  : true #optional <4>
1377       #zmq_pub_port    : 4500 #optional <5>
1378       #prefix          : setup1 #optional <6>
1379       #limit_memory    : 1024 #optional <7>
1380       c               : 4 #optional <8>
1381       port_bandwidth_gb : 10 #optional <9>
       port_info       :  # set the MAC addresses - mandatory
1383            - default_gw : 1.1.1.1   # port 0 <10>
1384              dest_mac   : '00:00:00:01:00:00' # Either default_gw or dest_mac is mandatory <10>
1385              src_mac    : '00:00:00:02:00:00' # optional <11>
1386              ip         : 2.2.2.2 # optional <12>
1387              vlan       : 15 # optional <13>
1388            - dest_mac   : '00:00:00:03:00:00'  # port 1
1389              src_mac    : '00:00:00:04:00:00'
1390            - dest_mac   : '00:00:00:05:00:00'  # port 2
1391              src_mac    : '00:00:00:06:00:00'
1392            - dest_mac   :   [0x0,0x0,0x0,0x7,0x0,0x01]  # port 3 <14>
1393              src_mac    :   [0x0,0x0,0x0,0x8,0x0,0x02] # <14>
1394----
<1>  Number of ports. Should be equal to the number of interfaces listed in 3. - mandatory
<2>  Must be set to 2. - mandatory
<3>  List of interfaces to use. Run `sudo ./dpdk_setup_ports.py --show` to see the list you can choose from. - mandatory
<4>  Enable the ZMQ publisher for stats data; default is true. 
<5>  ZMQ port number. The default value is good. If running two TRex instances on the same machine, each should be given a distinct number. Otherwise, you can remove this line.
<6>  If running two TRex instances on the same machine, each should be given a distinct name. Otherwise, you can remove this line. (Passed to DPDK as the --file-prefix argument.)
<7>  Limit the amount of packet memory used. (Passed to DPDK as the -m argument.)
<8> Number of threads (cores) TRex will use per interface pair. (Can be overridden by the -c command line option.)
<9> The bandwidth of each interface in Gbps. In this example we have 10Gbps interfaces. For a VM, use 1. Used to tune the amount of memory allocated by TRex.
<10> TRex needs to know the destination MAC address to use on each port. You can specify this in one of two ways: +
Specify dest_mac directly. +
Specify default_gw (since version 2.10). In this case (only if no dest_mac is given), TRex will issue an ARP request to this IP, and will use
the result as the destination MAC. If no dest_mac is given, and no ARP response is received, TRex will exit.

<11> Source MAC to use when sending packets from this interface. If not given (since version 2.10), the MAC address of the port will be used.
<12> If given (since version 2.10), TRex will issue gratuitous ARP for the ip + src MAC pair on the appropriate port. In stateful mode,
gratuitous ARP for each ip will be sent every 120 seconds (can be changed using the --arp-refresh-period argument).
<13> If given (since version 2.18), all traffic on the port will be sent with this VLAN tag.
<14> Old MAC address format. The new format is supported since version v2.09.
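A small sanity check of the structure above can be sketched in Python. The `check_cfg` helper and the rules it enforces are illustrative, not part of TRex, and the config is shown as an already-parsed dict:

```python
# Hypothetical validator for a parsed trex_cfg.yaml entry (illustrative only).
def check_cfg(cfg):
    errors = []
    if cfg.get("version") != 2:
        errors.append("version must be 2")
    ifaces = cfg.get("interfaces", [])
    if cfg.get("port_limit") != len(ifaces):
        errors.append("port_limit should equal the number of interfaces")
    if len(ifaces) % 2 != 0:
        errors.append("interfaces must come in pairs")
    for port in cfg.get("port_info", []):
        # each port needs a resolvable destination MAC
        if "dest_mac" not in port and "default_gw" not in port:
            errors.append("each port needs dest_mac or default_gw")
    return errors

cfg = {
    "port_limit": 2,
    "version": 2,
    "interfaces": ["03:00.0", "03:00.1"],
    "port_info": [
        {"default_gw": "1.1.1.1"},
        {"dest_mac": "00:00:00:03:00:00"},
    ],
}
assert check_cfg(cfg) == []
```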
1414
1415[NOTE]
1416=========================================================================================
If you use a version earlier than 2.10, or choose to omit the ``ip''
and use a MAC-based configuration, be aware that TRex will not send any
gratuitous ARP and will not answer ARP requests. In this case, you must configure static
ARP entries pointing to the TRex port on your DUT. For an example config, see
xref:trex_config[here].
1422=========================================================================================
1423
1424To find out which interfaces (NIC ports) can be used, perform the following:
1425
1426[source,bash]
1427----
1428 $>sudo ./dpdk_setup_ports.py --show
1429
1430 Network devices using DPDK-compatible driver
1431 ============================================
1432
1433 Network devices using kernel driver
1434 ===================================
1435 0000:02:00.0 '82545EM Gigabit Ethernet Controller' if=eth2 drv=e1000 unused=igb_uio *Active* #<1>
1436 0000:03:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection' drv= unused=ixgb #<2>
1437 0000:03:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection' drv= unused=ixgb
1438 0000:13:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection' drv= unused=ixgb
1439 0000:13:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection' drv= unused=ixgb
1440
1441 Other network devices
1442 =====================
1443 <none>
1444----
1445<1> We see that 02:00.0 is active (our management port).
1446<2> All other NIC ports (03:00.0, 03:00.1, 13:00.0, 13:00.1) can be used.
1447
The minimal configuration file is:
1449
1450[source,bash]
1451----
1453- port_limit    : 4         
1454  version       : 2
1455  interfaces    : ["03:00.0","03:00.1","13:00.1","13:00.0"]
1456----
1457
1458==== Memory section configuration  
1459
1460The memory section is optional. It is used when there is a need to tune the amount of memory used by TRex packet manager.
1461Default values (from the TRex source code), are usually good for most users. Unless you have some unusual needs, you can
1462eliminate this section.
1463
1464[source,python]
1465----
1466        - port_limit      : 2                                                                           
1467          version       : 2                                                                             
1468          interfaces    : ["03:00.0","03:00.1"]                                                         
1469          memory    :                                           <1>
1470             mbuf_64     : 16380                                <2>
1471             mbuf_128    : 8190
1472             mbuf_256    : 8190
1473             mbuf_512    : 8190
1474             mbuf_1024   : 8190
1475             mbuf_2048   : 4096
1476             traffic_mbuf_64     : 16380                        <3>
1477             traffic_mbuf_128    : 8190
1478             traffic_mbuf_256    : 8190
1479             traffic_mbuf_512    : 8190
1480             traffic_mbuf_1024   : 8190
1481             traffic_mbuf_2048   : 4096
1482             dp_flows    : 1048576                              <4>
1483             global_flows : 10240                               <5>
1484----
1485<1> Memory section header
<2> Number of memory buffers allocated for packets in transit, per port pair. Numbers are specified per packet size.
<3> Number of memory buffers allocated for holding the part of each packet that remains unchanged per template.
Increase these numbers only if you have a very large number of templates.
<4> Number of TRex flow objects allocated (for best performance they are allocated upfront, not dynamically).
If you expect more concurrent flows than the default (1048576), enlarge this.
<5> Number of objects TRex allocates for holding NAT ``in transit'' connections. In stateful mode, TRex learns the NAT
translation by looking at the address changes done by the DUT to the first packet of each flow. So, this is the
number of flows for which TRex has sent the first flow packet, but has not yet learned the translation. Again, the default
(10240) should be good. Increase only if you use NAT and see issues.
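To get a feel for the scale of these defaults, the raw payload memory implied by the mbuf counts in the example above can be estimated. This is a rough back-of-the-envelope figure; real DPDK mbufs carry additional headroom and metadata:

```python
# mbuf counts from the example above, keyed by buffer size in bytes.
mbufs = {64: 16380, 128: 8190, 256: 8190, 512: 8190, 1024: 8190, 2048: 4096}

# Raw payload bytes per port pair (ignores DPDK per-mbuf overhead).
total_bytes = sum(size * count for size, count in mbufs.items())
print(f"~{total_bytes / 2**20:.1f} MB per port pair")
```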
1495
1496
1497==== Platform section configuration  
1498
The platform section is optional. It is used to tune performance and allocate the cores to the correct NUMA node.
The configuration file has the following structure to support multiple instances:
1501
1502[source,python]
1503----
1504- version       : 2
1505  interfaces    : ["03:00.0","03:00.1"]
1506  port_limit    : 2 
1507....
1508  platform :                                                    <1>
1509        master_thread_id  : 0                                   <2>
1510        latency_thread_id : 5                                   <3>
1511        dual_if   :                                             <4>
1512             - socket   : 0                                     <5>
1513               threads  : [1,2,3,4]                             <6>
1514----
1515<1> Platform section header.
1516<2> Hardware thread_id for control thread.
1517<3> Hardware thread_id for RX thread.
<4> The ``dual_if'' section defines info for interface pairs (according to the order in the ``interfaces'' list).
Each sub-section, starting with ``- socket'', defines info for a different interface pair.
1520<5> The NUMA node from which memory will be allocated for use by the interface pair.
1521<6> Hardware threads to be used for sending packets for the interface pair. Threads are pinned to cores, so specifying threads
1522actually determines the hardware cores.
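A quick consistency check for such a platform section; the check itself is illustrative (TRex does its own validation):

```python
# The master, latency, and per-dual_if threads must all be distinct
# hardware threads (illustrative check, mirroring the example above).
platform = {
    "master_thread_id": 0,
    "latency_thread_id": 5,
    "dual_if": [{"socket": 0, "threads": [1, 2, 3, 4]}],
}

used = [platform["master_thread_id"], platform["latency_thread_id"]]
for pair in platform["dual_if"]:
    used.extend(pair["threads"])
assert len(used) == len(set(used)), "hardware thread assigned twice"
```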
1523
1524*Real example:* anchor:numa-example[]
1525
We connected 2 Intel XL710 NICs close to each other on the motherboard. They shared the same NUMA node:
1527
1528image:images/same_numa.png[title="2_NICSs_same_NUMA"]
1529
CPU utilization was very high (~100%); with c=2 and c=4 the results were the same.
1531
1532Then, we moved the cards to different NUMAs:
1533
1534image:images/different_numa.png[title="2_NICSs_different_NUMAs"]
1535
1537We added configuration to the /etc/trex_cfg.yaml:
1538
1539[source,python]
1540  platform :
1541        master_thread_id  : 0
1542        latency_thread_id : 8
1543        dual_if   :
1544             - socket   : 0
1545               threads  : [1, 2, 3, 4, 5, 6, 7]
1546             - socket   : 1
1547               threads  : [9, 10, 11, 12, 13, 14, 15]
1548
This gave the best results: with *\~98 Gb/s* TX bandwidth and c=7, CPU utilization dropped to *~21%* (compared to 40% with c=4).
1550
1551
==== Timer Wheel section configuration

anchor:timer_w[]

The flow scheduler uses a timer wheel to schedule flows. To tune it for a large number of flows, it is possible to change the default values.
This is an advanced configuration. Don't use it if you don't know what you are doing. It can be configured in the trex_cfg file and in the TRex traffic profile.
1563
1564[source,python]
1565----
1566  tw :                             
1567     buckets : 1024                <1>
1568     levels  : 3                   <2>
1569     bucket_time_usec : 20.0       <3>
1570----
<1> The number of buckets in each level. A higher number improves performance, but reduces the maximum number of levels.
<2> The number of levels.
<3> Bucket time in usec. A higher number will create more bursts.
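As a rough mental model of these parameters (hierarchical timer wheels in general, not a statement about TRex internals), each level extends the schedulable range by a factor of the bucket count, so the values above cover approximately:

```python
# Approximate scheduling horizon of a hierarchical timer wheel
# (a generic model, not TRex's exact implementation).
buckets, levels, bucket_time_usec = 1024, 3, 20.0
horizon_sec = bucket_time_usec * buckets**levels / 1e6
print(f"~{horizon_sec:.0f} seconds")
```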
1574
1575
1576=== Command line options 
1577
1578anchor:cml-line[]
1579
1580*--allow-coredump*::
1581Allow creation of core dump.
1582
1583*--arp-refresh-period <num>*::
1584Period in seconds between sending of gratuitous ARP for our addresses. Value of 0 means ``never send``.
1585
1586*-c <num>*::
Number of hardware threads to use per interface pair. Use at least 4 for TRex at 40Gbps. +
TRex uses 2 threads for its internal needs. The rest of the threads can be used for traffic. The maximum number here is the number of free threads
divided by the number of interface pairs. +
For virtual NICs on a VM, we always use one thread per interface pair.
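The rule above can be written out explicitly (illustrative arithmetic; the thread counts are hypothetical):

```python
# Maximum useful -c value: TRex reserves 2 hardware threads (master and
# latency/RX), and the remainder is split across interface pairs.
def max_c(total_hw_threads, interface_pairs):
    return (total_hw_threads - 2) // interface_pairs

# e.g. a 16-thread machine driving two interface pairs:
assert max_c(16, 2) == 7
```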
1591
1592*--cfg <file name>*::
1593TRex configuration file to use. See relevant manual section for all config file options.
1594
1595*--checksum-offload*::
1596Enable IP, TCP and UDP tx checksum offloading, using DPDK. This requires all used interfaces to support this.
1597
1598*--client_cfg <file>*::
1599YAML file describing clients configuration. Look link:trex_manual.html#_client_clustering_configuration[here] for details.
1600
1601*-d <num>*::
1602Duration of the test in seconds.
1603
1604*-e*::
  Same as `-p`, but change the src/dst IP according to the port. With this option, you get all the packets of the
  same flow from the same port, with the same src/dst IP. +
  It does not work well with NBAR, as NBAR expects all client IPs to be sent from the same direction.
1608
1609*-f <yaml file>*::
1610Specify traffic YAML configuration file to use. Mandatory option for stateful mode.
1611
1612*--hops <num>*::
1613   Provide number of hops in the setup (default is one hop). Relevant only if the Rx check is enabled.
1614   Look link:trex_manual.html#_flow_order_latency_verification[here] for details.
1615
1616*--iom <mode>*::
1617        I/O mode. Possible values: 0 (silent), 1 (normal), 2 (short).
1618
1619*--ipv6*::
1620       Convert templates to IPv6 mode.
1621
1622*-k <num>*::
1623   Run ``warm up'' traffic for num seconds before starting the test. This is needed if TRex is connected to switch running
1624   spanning tree. You want the switch to see traffic from all relevant source MAC addresses before starting to send real
1625   data. Traffic sent is the same used for the latency test (-l option) +
1626   Current limitation (holds for TRex version 1.82): does not work properly on VM.
1627
1628*-l <rate>*::
1629    In parallel to the test, run latency check, sending packets at rate/sec from each interface.
1630
1631*--learn-mode <mode>*::
1632    Learn the dynamic NAT translation. Look link:trex_manual.html#_nat_support[here] for details.
1633
1634*--learn-verify*::
1635   Used for testing the NAT learning mechanism. Do the learning as if DUT is doing NAT, but verify that packets
1636   are not actually changed.
1637
1638*--limit-ports <port num>*::
1639   Limit the number of ports used. Overrides the ``port_limit'' from config file.
1640
1641*--lm <hex bit mask>*::
1642Mask specifying which ports will send traffic. For example, 0x1 - Only port 0 will send. 0x4 - only port 2 will send.
1643This can be used to verify port connectivity. You can send packets from one port, and look at counters on the DUT.
1644
1645*--lo*::
1646   Latency only - Send only latency packets. Do not send packets from the templates/pcap files.
1647
1648*-m <num>*::
1649   Rate multiplier. TRex will multiply the CPS rate of each template by num.
1650
1651*--nc*::
    If set, will terminate exactly at the end of the specified duration.
1653    This provides faster, more accurate TRex termination.
1654    By default (without this option), TRex waits for all flows to terminate gracefully. In case of a very long flow, termination might prolong. 
1655
1656*--no-flow-control-change*::
1657  Prevents TRex from changing flow control. By default (without this option), TRex disables flow control at startup for all cards, except for the Intel XL710 40G card.
1658
1659*--no-key*:: Daemon mode, don't get input from keyboard.
1660
1661*--no-watchdog*:: Disable watchdog.
1662    
1663*-p*::
1664Send all packets of the same flow from the same direction. For each flow, TRex will randomly choose between client port and
1665server port, and send all the packets from this port. src/dst IPs keep their values as if packets are sent from two ports.
1666Meaning, we get on the same port packets from client to server, and from server to client. +
If you are using this with a router, you cannot rely on routing rules to pass traffic to TRex; you must configure policy
based routes to pass all traffic from one DUT port to the other. +
1669
1670*-pm <num>*::
   Platform factor. If the setup includes a splitter, you can multiply all statistic numbers displayed by TRex by this factor, so that they match the DUT counters.
1672  
1673*-pubd*::
1674  Disable ZMQ monitor's publishers.
1675
1676*--rx-check <sample rate>*::
1677        Enable Rx check module. Using this, each thread randomly samples 1/sample_rate of the flows and checks packet order, latency, and additional statistics for the sampled flows.
1678        Note: This feature works on the RX thread.
1679
1680*-v <verbosity level>*::
   Show debug info. A value of 1 shows debug info on startup. A value of 3 shows debug info during the run in some cases. Might slow down operation.
1682
1683*--vlan*:: Relevant only for stateless mode with Intel 82599 10G NIC.
   When configuring flow stat and latency per-stream rules, assume all streams use VLAN.
1685
1686*-w <num seconds>*::
1687   Wait additional time between NICs initialization and sending traffic. Can be useful if DUT needs extra setup time. Default is 1 second.
1688
1689*--active-flows*::            
    An experimental switch to scale the number of active flows up or down.
    It is not accurate, due to the quantization of the flow scheduler, and in some cases does not work.
    Example: --active-flows 500000 will set the ballpark number of active flows to ~0.5M.
1693
1694
1695ifndef::backend-docbook[]
1696
1697
1698endif::backend-docbook[]
1699
1700== Appendix
1701
1702=== Simulator 
1703 
The TRex simulator is a Linux application (no DPDK needed) that can run on any Linux machine (including the TRex machine itself).
It creates an output pcap file from a traffic YAML input.
1706
1707====  Simulator 
1708
1709
1710[source,bash]
1711----
1712
1713$./bp-sim-64-debug -f avl/sfr_delay_10_1g.yaml -v 1
1714
1715 -- loading cap file avl/delay_10_http_get_0.pcap 
1716 -- loading cap file avl/delay_10_http_post_0.pcap 
1717 -- loading cap file avl/delay_10_https_0.pcap 
1718 -- loading cap file avl/delay_10_http_browsing_0.pcap 
1719 -- loading cap file avl/delay_10_exchange_0.pcap 
1720 -- loading cap file avl/delay_10_mail_pop_0.pcap 
1721 -- loading cap file avl/delay_10_mail_pop_1.pcap 
1722 -- loading cap file avl/delay_10_mail_pop_2.pcap 
1723 -- loading cap file avl/delay_10_oracle_0.pcap 
1724 -- loading cap file avl/delay_10_rtp_160k_full.pcap 
1725 -- loading cap file avl/delay_10_rtp_250k_full.pcap 
1726 -- loading cap file avl/delay_10_smtp_0.pcap 
1727 -- loading cap file avl/delay_10_smtp_1.pcap 
1728 -- loading cap file avl/delay_10_smtp_2.pcap 
1729 -- loading cap file avl/delay_10_video_call_0.pcap 
1730 -- loading cap file avl/delay_10_sip_video_call_full.pcap 
1731 -- loading cap file avl/delay_10_citrix_0.pcap 
1732 -- loading cap file avl/delay_10_dns_0.pcap 
1733 id,name                                    , tps, cps,f-pkts,f-bytes, duration,   Mb/sec,   MB/sec,   c-flows,  PPS,total-Mbytes-duration,errors,flows    #<2>
1734 00, avl/delay_10_http_get_0.pcap             ,404.52,404.52,    44 ,   37830 ,   0.17 , 122.42 ,   15.30 ,         67 , 17799 ,       2 , 0 , 1 
1735 01, avl/delay_10_http_post_0.pcap            ,404.52,404.52,    54 ,   48468 ,   0.21 , 156.85 ,   19.61 ,         85 , 21844 ,       2 , 0 , 1 
1736 02, avl/delay_10_https_0.pcap                ,130.87,130.87,    96 ,   91619 ,   0.22 ,  95.92 ,   11.99 ,         29 , 12564 ,       1 , 0 , 1 
1737 03, avl/delay_10_http_browsing_0.pcap        ,709.89,709.89,    37 ,   34425 ,   0.13 , 195.50 ,   24.44 ,         94 , 26266 ,       2 , 0 , 1 
1738 04, avl/delay_10_exchange_0.pcap             ,253.81,253.81,    43 ,    9848 ,   1.57 ,  20.00 ,    2.50 ,        400 , 10914 ,       0 , 0 , 1 
1739 05, avl/delay_10_mail_pop_0.pcap             ,4.76,4.76,    20 ,    5603 ,   0.17 ,   0.21 ,    0.03 ,          1 ,    95 ,       0 , 0 , 1 
1740 06, avl/delay_10_mail_pop_1.pcap             ,4.76,4.76,   114 ,  101517 ,   0.25 ,   3.86 ,    0.48 ,          1 ,   543 ,       0 , 0 , 1 
1741 07, avl/delay_10_mail_pop_2.pcap             ,4.76,4.76,    30 ,   15630 ,   0.19 ,   0.60 ,    0.07 ,          1 ,   143 ,       0 , 0 , 1 
1742 08, avl/delay_10_oracle_0.pcap               ,79.32,79.32,   302 ,   56131 ,   6.86 ,  35.62 ,    4.45 ,        544 , 23954 ,       0 , 0 , 1 
1743 09, avl/delay_10_rtp_160k_full.pcap          ,2.78,8.33,  1354 , 1232757 ,  61.24 ,  27.38 ,    3.42 ,        170 ,  3759 ,       0 , 0 , 3 
1744 10, avl/delay_10_rtp_250k_full.pcap          ,1.98,5.95,  2069 , 1922000 ,  61.38 ,  30.48 ,    3.81 ,        122 ,  4101 ,       0 , 0 , 3 
1745 11, avl/delay_10_smtp_0.pcap                 ,7.34,7.34,    22 ,    5618 ,   0.19 ,   0.33 ,    0.04 ,          1 ,   161 ,       0 , 0 , 1 
1746 12, avl/delay_10_smtp_1.pcap                 ,7.34,7.34,    35 ,   18344 ,   0.21 ,   1.08 ,    0.13 ,          2 ,   257 ,       0 , 0 , 1 
1747 13, avl/delay_10_smtp_2.pcap                 ,7.34,7.34,   110 ,   96544 ,   0.27 ,   5.67 ,    0.71 ,          2 ,   807 ,       0 , 0 , 1 
1748 14, avl/delay_10_video_call_0.pcap           ,11.90,11.90,  2325 , 2532577 ,  36.56 , 241.05 ,   30.13 ,        435 , 27662 ,       3 , 0 , 1 
1749 15, avl/delay_10_sip_video_call_full.pcap    ,29.35,58.69,  1651 ,  120315 ,  24.56 ,  28.25 ,    3.53 ,        721 , 48452 ,       0 , 0 , 2 
1750 16, avl/delay_10_citrix_0.pcap               ,43.62,43.62,   272 ,   84553 ,   6.23 ,  29.51 ,    3.69 ,        272 , 11866 ,       0 , 0 , 1 
1751 17, avl/delay_10_dns_0.pcap                  ,1975.02,1975.02,     2 ,     162 ,   0.01 ,   2.56 ,    0.32 ,         22 ,  3950 ,       0 , 0 , 1 
1752
1753 00, sum                                      ,4083.86,93928.84,  8580 , 6413941 ,   0.00 , 997.28 ,  124.66 ,       2966 , 215136 ,      12 , 0 , 23 
1754 Memory usage 
1755 size_64        : 1687 
1756 size_128       : 222 
1757 size_256       : 798 
1758 size_512       : 1028 
1759 size_1024      : 86 
1760 size_2048      : 4086 
1761 Total    :       8.89 Mbytes  159% util #<1>
1762
1763----
<1> The memory usage of the templates.
<2> CSV output for all the templates.
1766
1767
=== Firmware update for XL710/X710
1769anchor:xl710-firmware[]
1770 
To upgrade the firmware, follow these steps:
1772
1773==== Download the driver 
1774
* Download the i40e driver from link:https://downloadcenter.intel.com/download/24411/Network-Adapter-Driver-for-PCI-E-40-Gigabit-Network-Connections-under-Linux-[here]
* Build the kernel module
1777
1778[source,bash]
1779----
1780$tar -xvzf i40e-1.3.47
1781$cd i40e-1.3.47/src
1782$make 
1783$sudo insmod  i40e.ko
1784----
1785
1786
1787====  Bind the NIC to Linux
1788
In this stage, we bind the NIC to Linux (taking it back from DPDK).
1790
1791[source,bash]
1792----
1793$sudo ./dpdk_nic_bind.py --status # show the ports 
1794
1795Network devices using DPDK-compatible driver
1796============================================
17970000:02:00.0 'Device 1583' drv=igb_uio unused=      #<1>
17980000:02:00.1 'Device 1583' drv=igb_uio unused=      #<2>
17990000:87:00.0 'Device 1583' drv=igb_uio unused=
18000000:87:00.1 'Device 1583' drv=igb_uio unused=
1801
1802$sudo dpdk_nic_bind.py -u 02:00.0  02:00.1          #<3> 
1803
1804$sudo dpdk_nic_bind.py -b i40e 02:00.0 02:00.1      #<4>
1805
1806$ethtool -i p1p2                                    #<5>  
1807
1808driver: i40e
1809version: 1.3.47
1810firmware-version: 4.24 0x800013fc 0.0.0             #<6>
1811bus-info: 0000:02:00.1
1812supports-statistics: yes
1813supports-test: yes
1814supports-eeprom-access: yes
1815supports-register-dump: yes
1816supports-priv-flags: yes
1817
1818   
1819$ethtool -S p1p2 
1820$lspci -s 02:00.0 -vvv                              #<7>
1821
1822
1823----
<1> XL710 port that needs to be unbound from DPDK
<2> XL710 port that needs to be unbound from DPDK
<3> Unbind from DPDK using this command
<4> Bind to the Linux i40e driver
<5> Show the firmware version through the Linux driver
<6> Firmware version
<7> More info
1831
1832
1833====  Upgrade 
1834
Download NVMUpdatePackage.zip from the Intel site link:http://downloadcenter.intel.com/download/24769/NVM-Update-Utility-for-Intel-Ethernet-Converged-Network-Adapter-XL710-X710-Series[here].
It includes the utility `nvmupdate64e`.
1837
1838Run this:
1839
1840[source,bash]
1841----
1842$sudo ./nvmupdate64e  
1843----
1844
You might need a power cycle, and you might need to run this command a few times, to get the latest firmware.
1846
1847====  QSFP+ support for XL710
1848
See link:https://www.google.co.il/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&uact=8&ved=0ahUKEwjJhPSH3b3LAhUp7nIKHSkACUYQFggaMAA&url=http%3A%2F%2Fwww.intel.co.id%2Fcontent%2Fdam%2Fwww%2Fpublic%2Fus%2Fen%2Fdocuments%2Frelease-notes%2Fxl710-ethernet-controller-feature-matrix.pdf&usg=AFQjCNFhwozfz-XuKGMOy9_MJDbetw15Og&sig2=ce7YU9F9Et6xf6KvqSFBxg&bvm=bv.116636494,d.bGs[QSFP+ support] for QSFP+ support and firmware requirements for the XL710.
1850
1851
1852=== TRex with ASA 5585
1853
When running TRex against an ASA 5585, note the following:

* The ASA can't forward IPv4 options, so in the case of NAT you need to use --learn-mode 1 (or 3). In this mode, bidirectional UDP flows are not supported.
--learn-mode 1 supports TCP sequence number randomization on both sides of the connection (client to server and server to client). For this to work, TRex must learn
the translation of packets from both sides, so this mode reduces the number of connections per second TRex can generate (the number is still high enough to test
any existing firewall). If you need a higher CPS rate, you can use --learn-mode 3. This mode handles sequence number randomization on the client->server side only.
1860* Latency should be tested using ICMP with `--l-pkt-mode 2`
1861
1862====  ASA 5585 sample configuration
1863
1864[source,bash]
1865----
1866ciscoasa# show running-config 
1867: Saved
1868
1869: 
1870: Serial Number: JAD194801KX
1871: Hardware:   ASA5585-SSP-10, 6144 MB RAM, CPU Xeon 5500 series 2000 MHz, 1 CPU (4 cores)
1872:
1873ASA Version 9.5(2) 
1874!
1875hostname ciscoasa
1876enable password 8Ry2YjIyt7RRXU24 encrypted
1877passwd 2KFQnbNIdI.2KYOU encrypted
1878names
1879!
1880interface Management0/0
1881 management-only
1882 nameif management
1883 security-level 100
1884 ip address 10.56.216.106 255.255.255.0 
1885!             
1886interface TenGigabitEthernet0/8
1887 nameif inside
1888 security-level 100
1889 ip address 15.0.0.1 255.255.255.0 
1890!             
1891interface TenGigabitEthernet0/9
1892 nameif outside
1893 security-level 0
1894 ip address 40.0.0.1 255.255.255.0 
1895!             
1896boot system disk0:/asa952-smp-k8.bin
1897ftp mode passive
1898pager lines 24
1899logging asdm informational
1900mtu management 1500
1901mtu inside 9000
1902mtu outside 9000
1903no failover   
1904no monitor-interface service-module 
1905icmp unreachable rate-limit 1 burst-size 1
1906no asdm history enable
1907arp outside 40.0.0.2 90e2.baae.87d1 
1908arp inside 15.0.0.2 90e2.baae.87d0 
1909arp timeout 14400
1910no arp permit-nonconnected
1911route management 0.0.0.0 0.0.0.0 10.56.216.1 1
1912route inside 16.0.0.0 255.0.0.0 15.0.0.2 1
1913route outside 48.0.0.0 255.0.0.0 40.0.0.2 1
1914timeout xlate 3:00:00
1915timeout pat-xlate 0:00:30
1916timeout conn 1:00:00 half-closed 0:10:00 udp 0:02:00 sctp 0:02:00 icmp 0:00:02
1917timeout sunrpc 0:10:00 h323 0:05:00 h225 1:00:00 mgcp 0:05:00 mgcp-pat 0:05:00
1918timeout sip 0:30:00 sip_media 0:02:00 sip-invite 0:03:00 sip-disconnect 0:02:00
1919timeout sip-provisional-media 0:02:00 uauth 0:05:00 absolute
1920timeout tcp-proxy-reassembly 0:01:00
1921timeout floating-conn 0:00:00
1922user-identity default-domain LOCAL
1923http server enable
1924http 192.168.1.0 255.255.255.0 management
1925no snmp-server location
1926no snmp-server contact
1927crypto ipsec security-association pmtu-aging infinite
1928crypto ca trustpool policy
1929telnet 0.0.0.0 0.0.0.0 management
1930telnet timeout 5
1931ssh stricthostkeycheck
1932ssh timeout 5 
1933ssh key-exchange group dh-group1-sha1
1934console timeout 0
1935!             
1936tls-proxy maximum-session 1000
1937!             
1938threat-detection basic-threat
1939threat-detection statistics access-list
1940no threat-detection statistics tcp-intercept
1941dynamic-access-policy-record DfltAccessPolicy
1942!             
1943class-map icmp-class
1944 match default-inspection-traffic
1945class-map inspection_default
1946 match default-inspection-traffic
1947!             
1948!             
1949policy-map type inspect dns preset_dns_map
1950 parameters   
1951  message-length maximum client auto
1952  message-length maximum 512
1953policy-map icmp_policy
1954 class icmp-class
1955  inspect icmp 
1956policy-map global_policy
1957 class inspection_default
1958  inspect dns preset_dns_map 
1959  inspect ftp 
1960  inspect h323 h225 
1961  inspect h323 ras 
1962  inspect rsh 
1963  inspect rtsp 
1964  inspect esmtp 
1965  inspect sqlnet 
1966  inspect skinny  
1967  inspect sunrpc 
1968  inspect xdmcp 
1969  inspect sip  
1970  inspect netbios 
1971  inspect tftp 
1972  inspect ip-options 
1973!             
1974service-policy global_policy global
1975service-policy icmp_policy interface outside
1976prompt hostname context 
1977!             
1978jumbo-frame reservation
1979!             
1980no call-home reporting anonymous
1981: end         
1982ciscoasa# 
1983----
1984
1985====  TRex commands example 
1986
With these commands, the configuration is:
1988
19891. NAT learn mode (TCP-ACK)
2. Delay of 1 second at startup (-k 1). This was added because the ASA drops the first packets.
19913. Latency is configured to ICMP reply mode (--l-pkt-mode 2).
1992
1993
1994*Simple HTTP:*::
1995[source,bash]
1996----
1997$sudo ./t-rex-64 -f cap2/http_simple.yaml -d 1000 -l 1000 --l-pkt-mode 2 -m 1000  --learn-mode 1 -k 1
1998----
1999
The following is more realistic enterprise traffic (we removed from the SFR file the bidirectional UDP traffic templates, which, as described above, are not supported in this mode).
2001
2002*Enterprise profile:*::
2003[source,bash]
2004----
2005$sudo ./t-rex-64 -f avl/sfr_delay_10_1g_asa_nat.yaml -d 1000 -l 1000 --l-pkt-mode 2 -m 4 --learn-mode 1 -k 1
2006----
2007
The TRex output:
2009
2010[source,bash]
2011----
2012-Per port stats table 
2013      ports |               0 |               1 
2014 -----------------------------------------------------------------------------------------
2015   opackets |       106347896 |       118369678 
2016     obytes |     33508291818 |    118433748567 
2017   ipackets |       118378757 |       106338782 
2018     ibytes |    118434305375 |     33507698915 
2019    ierrors |               0 |               0 
2020    oerrors |               0 |               0 
2021      Tx Bw |     656.26 Mbps |       2.27 Gbps 
2022
2023-Global stats enabled 
2024 Cpu Utilization : 18.4  %  31.7 Gb/core 
2025 Platform_factor : 1.0  
2026 Total-Tx        :       2.92 Gbps   NAT time out    :        0 #<1> (0 in wait for syn+ack) #<1>
2027 Total-Rx        :       2.92 Gbps   NAT aged flow id:        0 #<1>
2028 Total-PPS       :     542.29 Kpps   Total NAT active:      163  (12 waiting for syn)
2029 Total-CPS       :       8.30 Kcps   Nat_learn_errors:        0
2030
2031 Expected-PPS    :     539.85 Kpps
2032 Expected-CPS    :       8.29 Kcps  
2033 Expected-BPS    :       2.90 Gbps  
2034
2035 Active-flows    :     7860  Clients :      255   Socket-util : 0.0489 %    
2036 Open-flows      :  3481234  Servers :     5375   Socket :     7860 Socket/Clients :  30.8 
2037 drop-rate       :       0.00  bps   #<1>
2038 current time    : 425.1 sec  
2039 test duration   : 574.9 sec  
2040
2041-Latency stats enabled 
2042 Cpu Utilization : 0.3 %  
2043 if|   tx_ok , rx_ok  , rx   ,error,    average   ,   max         , Jitter ,  max window 
2044   |         ,        , check,     , latency(usec),latency (usec) ,(usec)  ,             
2045 ---------------------------------------------------------------------------------------------------------------- 
2046 0 |   420510,  420495,         0,   1,         58  ,    1555,      14      |  240  257  258  258  219  930  732  896  830  472  190  207  729 
2047 1 |   420496,  420509,         0,   1,         51  ,    1551,      13      |  234  253  257  258  214  926  727  893  826  468  187  204  724
2048----
<1>  These counters should be zero.
2050
2051anchor:fedora21_example[]
2052
2053=== Fedora 21 Server installation
2054
Download the .iso file from the link above, and boot with it using a hypervisor or the CIMC console. +
Troubleshooting -> install in basic graphics mode
2057
2058* In packages selection, choose:
2059
2060** C Development Tools and Libraries
2061
2062** Development Tools
2063
2064** System Tools
2065
2066* Set Ethernet configuration if needed
2067
2068* Use default hard-drive partitions, reclaim space if needed
2069
2070* After installation, edit file /etc/selinux/config +
2071set: +
2072SELINUX=disabled
2073
2074* Run: +
2075systemctl disable firewalld 
2076
2077* Edit file /etc/yum.repos.d/fedora-updates.repo +
2078set everywhere: +
2079enabled=0
2080
2081* Reboot
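The SELinux and yum-repo edits above can be scripted with `sed`. A minimal sketch, shown here against temporary stand-in files rather than the real `/etc` paths (on a real system you would edit `/etc/selinux/config` and `/etc/yum.repos.d/fedora-updates.repo` as root):

```shell
# Sketch of the two config edits above, applied with sed on temp copies.
tmp=$(mktemp -d)

# stand-in for /etc/selinux/config
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$tmp/config"
sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$tmp/config"

# stand-in for /etc/yum.repos.d/fedora-updates.repo: set enabled=0 everywhere
printf '[updates]\nenabled=1\n\n[updates-testing]\nenabled=1\n' > "$tmp/fedora-updates.repo"
sed -i 's/^enabled=1/enabled=0/' "$tmp/fedora-updates.repo"

grep '^SELINUX=' "$tmp/config"
grep -c '^enabled=0' "$tmp/fedora-updates.repo"
```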
2082
2083=== Configure Linux host as network emulator
2084
There are many Linux tutorials on the web, so this is not a full tutorial; it only highlights some key points. The commands
were verified on an Ubuntu system.
2087
2088For this example:
2089
20901. TRex Client side network is 16.0.0.x 
20912. TRex Server side network is 48.0.0.x 
20923. Linux Client side network eth0 is configured with IPv4 as 172.168.0.1 
20934. Linux Server side network eth1 is configured with IPv4 as 10.0.0.1 
2094
2095[source,bash]
2096----
2097
2098  TRex-0 (16.0.0.1->48.0.0.1 )   <-->
2099
2100                ( 172.168.0.1/255.255.0.0)-eth0 [linux] -( 10.0.0.1/255.255.0.0)-eth1 
2101
2102                <--> TRex-1 (16.0.0.1<-48.0.0.1)
2103  
2104----
2105
2106
2107==== Enable forwarding
2108One time (will be discarded after reboot): +
2109
2110[source,bash]
2111----
2112echo 1 > /proc/sys/net/ipv4/ip_forward
2113----
2114To make this permanent, add the following line to the file /etc/sysctl.conf: +
2115----
2116net.ipv4.ip_forward=1
2117----
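To confirm the current state without rebooting, the flag can be read back from /proc (readable without root); 1 means forwarding is on:

```shell
# Read back the forwarding flag; prints 0 (off) or 1 (on)
v=$(cat /proc/sys/net/ipv4/ip_forward)
echo "ip_forward=$v"
```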
2118
2119==== Add static routes
This example is for the default TRex networks, 48.0.0.0 and 16.0.0.0.
2121
Route all traffic destined to 48.0.0.0 through the gateway 10.0.0.100:
2123[source,bash]
2124----
2125route add -net 48.0.0.0 netmask 255.255.0.0 gw 10.0.0.100
2126----
2127
Route all traffic destined to 16.0.0.0 through the gateway 172.168.0.100:
2129[source,bash]
2130----
2131route add -net 16.0.0.0 netmask 255.255.0.0 gw 172.168.0.100
2132----
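The same routes can be written in the newer iproute2 syntax, where the netmask appears as a CIDR prefix length. A small sketch (the `ip route` lines need root, so they are shown as comments) converting 255.255.0.0 to its prefix length:

```shell
# iproute2 equivalents of the two routes above (root required):
#   ip route add 48.0.0.0/16 via 10.0.0.100
#   ip route add 16.0.0.0/16 via 172.168.0.100

# Convert a dotted netmask to a CIDR prefix length
mask=255.255.0.0
bits=0
oldIFS=$IFS; IFS=.
for o in $mask; do
  case $o in
    255) bits=$((bits+8));; 254) bits=$((bits+7));; 252) bits=$((bits+6));;
    248) bits=$((bits+5));; 240) bits=$((bits+4));; 224) bits=$((bits+3));;
    192) bits=$((bits+2));; 128) bits=$((bits+1));; 0) ;;
  esac
done
IFS=$oldIFS
echo "255.255.0.0 -> /$bits"
```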
If you use stateless mode and decide to add a route in only one direction, remember to disable the reverse path check. +
For example, to disable it on all interfaces:
2135[source,bash]
2136----
2137for i in /proc/sys/net/ipv4/conf/*/rp_filter ; do
2138  echo 0 > $i 
2139done
2140----
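To make the rp_filter change persist across reboots, equivalent sysctl.conf entries can be used. A sketch, written here to a temp file for illustration (on a real system, append the lines to /etc/sysctl.conf as root):

```shell
# Persistent form of the rp_filter change: equivalent sysctl.conf entries
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
net.ipv4.conf.all.rp_filter=0
net.ipv4.conf.default.rp_filter=0
EOF
cat "$tmp"
```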
2141
Alternatively, you can edit /etc/network/interfaces and add something like this for both ports connected to TRex.
This takes effect only after restarting networking (rebooting the machine is an alternative).
2144----
2145auto eth1
2146iface eth1 inet static
2147address 16.0.0.100
2148netmask 255.0.0.0
2149network 16.0.0.0
2150broadcast 16.255.255.255
2151... same for 48.0.0.0
2152----
2153
2154==== Add static ARP entries
2155[source,bash]
2156----
2157sudo arp -s 10.0.0.100 <Second TRex port MAC>
sudo arp -s 172.168.0.100 <First TRex port MAC>
2159----
2160
2161=== Configure Linux to use VF on Intel X710 and 82599 NICs
2162
TRex supports paravirtualized interfaces such as VMXNET3/virtio/E1000; however, when connected to a vSwitch, the vSwitch limits the performance. VPP or OVS-DPDK can improve the performance but require more software resources to handle the rate.
SR-IOV can accelerate performance and reduce CPU usage and latency by utilizing the NIC hardware switch capability (the switching is done by hardware).
TRex version 2.15 and above includes SR-IOV support for XL710 and X710.
2166The following diagram compares between vSwitch and SR-IOV.
2167
2168image:images/sr_iov_vswitch.png[title="vSwitch_main",width=850]
2169
One use case that shows the performance gain achievable with SR-IOV is creating a pool of TRex VMs that tests a pool of virtual DUTs (e.g. ASAv, CSR).
With SR-IOV, compute, storage and networking resources can be controlled dynamically (e.g. by using OpenStack).
2172 
2173image:images/sr_iov_trex.png[title="vSwitch_main",width=850]
2174
The above diagram is an example of one server with two NICs. TRex VMs can be allocated on one NIC while the DUTs can be allocated on another.
2176
2177
Following are some links we used and lessons we learned while setting up an environment for testing TRex with VF interfaces (using SR-IOV).
This is by no means a full tutorial of VF usage, and different Linux distributions might need slightly different handling.
2180
2181==== Links and resources
link:http://www.intel.com/content/dam/www/public/us/en/documents/technology-briefs/xl710-sr-iov-config-guide-gbe-linux-brief.pdf[This]
is a good Intel tutorial on SR-IOV and how to configure it. +
2184
2185link:http://dpdk.org/doc/guides/nics/intel_vf.html[This] is a tutorial from DPDK documentation. +
2186
2187==== Linux configuration
First, verify BIOS support for the feature. You can consult link:http://kpanic.de/node/8[this link] for directions. +
Second, make sure you have the correct kernel options. +
We added the following options to the kernel boot command in GRUB: ``iommu=pt intel_iommu=on pci_pt_e820_access=on''. This
was needed on Fedora and Ubuntu; on CentOS, adding these options was not needed. +
To load the kernel module with the correct VF parameters after reboot, add the line ``options i40e max_vfs=1,1'' to a file in ``/etc/modprobe.d/''. +
On CentOS, we also needed to add the following file (example for X710): +
2194
2195[source,bash]
2196----
2197cat /etc/sysconfig/modules/i40e.modules
2198#!/bin/sh
2199rmmod i40e >/dev/null 2>&1
2200exec /sbin/modprobe i40e >/dev/null 2>&1
2201----
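The GRUB options mentioned above are typically added to GRUB_CMDLINE_LINUX in /etc/default/grub (followed by regenerating grub.cfg as root). A sketch of the edit, shown on a temporary copy of the file:

```shell
# Append the IOMMU options to GRUB_CMDLINE_LINUX (temp copy of /etc/default/grub;
# on a real system, edit the real file and re-run grub2-mkconfig as root)
tmp=$(mktemp)
echo 'GRUB_CMDLINE_LINUX="rhgb quiet"' > "$tmp"
sed -i 's/^GRUB_CMDLINE_LINUX="\(.*\)"/GRUB_CMDLINE_LINUX="\1 iommu=pt intel_iommu=on pci_pt_e820_access=on"/' "$tmp"
cat "$tmp"
```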
2202
2203==== x710 specific instructions
For X710 (i40e driver), we needed to download the latest kernel driver; on all distributions we were using, the existing driver was not new enough. +
To make the system use your newly compiled driver with the correct parameters: +
Copy the .ko file to /lib/modules/$(uname -r)/kernel/drivers/net/ethernet/intel/i40e/i40e.ko +
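A sketch of the install steps (the privileged commands are shown as comments; `i40e.ko` is assumed to be your freshly built module in the current directory):

```shell
# Compute the destination path for the running kernel
dst="/lib/modules/$(uname -r)/kernel/drivers/net/ethernet/intel/i40e/i40e.ko"
echo "install target: $dst"
# sudo cp i40e.ko "$dst"
# sudo depmod -a
# sudo modprobe -r i40e && sudo modprobe i40e max_vfs=1,1
```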
2207
2208==== 82599 specific instructions
In order to make VF interfaces work correctly, we had to increase the MTU on the related PF interfaces. +
2210For example, if you run with max_vfs=1,1 (one VF per PF), you will have something like this:
2211
2212[source,bash]
2213----
2214sudo ./dpdk_nic_bind.py -s
2215Network devices using DPDK-compatible driver
2216============================================
22170000:03:10.0 '82599 Ethernet Controller Virtual Function' drv=igb_uio unused=
22180000:03:10.1 '82599 Ethernet Controller Virtual Function' drv=igb_uio unused=
2219
2220Network devices using kernel driver
2221===================================
22220000:01:00.0 'I350 Gigabit Network Connection' if=eth0 drv=igb unused=igb_uio *Active*
22230000:03:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection' if=eth2 drv=ixgbe unused=igb_uio
22240000:03:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection' if=eth3 drv=ixgbe unused=igb_uio
2225----
2226
2227In order to work with 0000:03:10.0 and 0000:03:10.1, you will have to run the following +
2228[source,bash]
2229----
2230sudo ifconfig eth3 up mtu 9000
2231sudo ifconfig eth2 up mtu 9000
2232----
2233
2234
2235TRex stateful performance::
2236
Using the following command, running on an X710 card with the VF driver, we can see that TRex reaches 30 Gbps using only one core. We can also see that the average latency is around 20 usec, which is pretty much the same value we get on loopback ports with the X710 physical function (without VF).
2239
2240[source,python]
2241----
2242
2243$sudo ./t-rex-64 -f cap2/http_simple.yaml -m 40000 -l 100 -c 1 -p
2244
2245  -Per port stats table
2246      ports |               0 |               1
2247  -----------------------------------------------------------------------------------------
2248   opackets |       106573954 |       107433792
2249     obytes |     99570878833 |    100374254956
2250   ipackets |       107413075 |       106594490
2251     ibytes | 100354899813    |     99590070585
2252    ierrors |            1038 |            1027
2253    oerrors |               0 |               0
2254      Tx Bw |      15.33 Gbps |      15.45 Gbps
2255 
2256-Global stats enabled
2257Cpu Utilization : 91.5  %  67.3 Gb/core
2258Platform_factor : 1.0
2259Total-Tx :      30.79 Gbps
2260Total-Rx :      30.79 Gbps
2261Total-PPS :       4.12 Mpps
2262Total-CPS :     111.32 Kcps
2263 
2264Expected-PPS :       4.11 Mpps
2265Expected-CPS :     111.04 Kcps
2266Expected-BPS :      30.71 Gbps
2267 
2268Active-flows :    14651  Clients : 255   Socket-util : 0.0912 %
2269Open-flows :  5795073  Servers : 65535   Socket :    14652 Socket/Clients :  57.5
2270drop-rate :       0.00 bps
2271current time : 53.9 sec
2272test duration : 3546.1 sec
2273 
2274 -Latency stats enabled
2275Cpu Utilization : 23.4 %
2276if| tx_ok , rx_ok  , rx check ,error,       latency (usec) ,    Jitter         
2277   | ,        ,          , ,   average   , max  ,    (usec)
2278  -------------------------------------------------------------------------------
22790 | 5233,    5233,         0, 0,         19  , 580,       5      | 37  37  37 4
22801 | 5233,    5233,         0, 0,         22  , 577,       5      | 38  40  39 3
2281----
2282
2283TRex stateless performance::
2284
2285[source,python]
2286----
2287
2288$sudo ./t-rex-64 -i -c 1
2289
2290trex>portattr
2291Port Status
2292 
2293     port |          0           |          1
2294  -------------------------------------------------------------
2295  driver          | net_i40e_vf      |     net_i40e_vf
2296  description     | XL710/X710 Virtual  |  XL710/X710 Virtual
----

With the console command `start -f stl/imix.py -m 8mpps --force --port 0`
we can see that we reach 8M packets per second, which in this case is around 24.28 Gbit/sec.

[source,python]
----
2302Global Statistics
2303 
2304connection   : localhost, Port 4501                  total_tx_L2  : 24.28 Gb/sec
2305version      : v2.15 total_tx_L1  : 25.55 Gb/sec
2306cpu_util.    : 80.6% @ 1 cores (1 per port)          total_rx     : 24.28 Gb/sec
2307rx_cpu_util. : 66.8%                                 total_pps    : 7.99 Mpkt/sec
2308async_util.  : 0.18% / 1.84 KB/sec                   drop_rate    : 0.00 b/sec
2309queue_full   : 3,467 pkts
2310 
2311Port Statistics
2312 
2313   port    |         0         |         1         | total
2314  ----------------------------------------------------------------------
2315  owner      |           ibarnea |           ibarnea |
2316  link       |                UP |                UP |
2317  state      | TRANSMITTING      |              IDLE |
2318  speed      |           40 Gb/s |           40 Gb/s |
2319  CPU util.  | 80.6%             |              0.0% |
2320  --         |                   |                   |
2321  Tx bps L2  | 24.28 Gbps        |          0.00 bps |        24.28 Gbps
2322  Tx bps L1  | 25.55 Gbps        |             0 bps |        25.55 Gbps
2323  Tx pps     | 7.99 Mpps         |          0.00 pps |         7.99 Mpps
2324  Line Util. |           63.89 % |            0.00 % |
2325  ---        |                   |                   |
2326  Rx bps     | 0.00 bps          |        24.28 Gbps |        24.28 Gbps
2327  Rx pps     | 0.00 pps          |         7.99 Mpps |         7.99 Mpps
2328  ----       |                   |                   |
2329  opackets   | 658532501         |                 0 |         658532501
2330  ipackets   |                 0 |         658612569 |         658612569
2331  obytes     | 250039721918      |                 0 |      250039721918
2332  ibytes     |                 0 |      250070124150 |      250070124150
2333  tx-bytes   | 250.04 GB         |               0 B |         250.04 GB
2334  rx-bytes   |               0 B |         250.07 GB |         250.07 GB
2335  tx-pkts    | 658.53 Mpkts      |            0 pkts |      658.53 Mpkts
2336  rx-pkts    | 0 pkts            |      658.61 Mpkts |      658.61 Mpkts
2337  -----      |                   |                   |
2338  oerrors    |                 0 |                 0 |                 0
2339  ierrors    |                 0 |            15,539 |            15,539
2340----
2341
2342
2343==== Performance
See the performance tests we did link:trex_vm_bench.html[here].
2345
2346=== Mellanox ConnectX-4/5 support 
2347
2348anchor:connectx_support[]
2349
2350Mellanox ConnectX-4/5 adapter family supports 100/56/40/25/10 Gb/s Ethernet speeds. 
Its DPDK support is a bit different from the Intel DPDK support; more information can be found link:http://dpdk.org/doc/guides/nics/mlx5.html[here].
Intel NICs do not require additional kernel drivers (except for igb_uio, which is already supported in most distributions). ConnectX-4 works on top of the Infiniband API (verbs) and requires special kernel modules/user space libs.
This means that you must install the OFED package to be able to work with this NIC.
Installing the full OFED package is the simplest way to make it work (installing only part of the package might work too, but did not work for us).
2355The advantage of this model is that you can control it using standard Linux tools (ethtool and ifconfig will work).
2356The disadvantage is the OFED dependency.
2357
2358==== Installation  
2359
2360==== Install Linux 
2361
2362We tested the following distro with TRex and OFED. Others might work too.
2363
2364* CentOS 7.2 
2365
The following distros were tested and did *not* work for us.
2367
2368* Fedora 21 (3.17.4-301.fc21.x86_64)  
2369* Ubuntu 14.04.3 LTS (GNU/Linux 3.19.0-25-generic x86_64)  -- crash when RSS was enabled link:https://trex-tgn.cisco.com/youtrack/issue/trex-294[MLX RSS issue]
2370
2371==== Install OFED 
2372
2373Information was taken from  link:http://www.mellanox.com/page/products_dyn?product_family=26&mtag=linux_sw_drivers[Install OFED]
2374
2375* Download 4.0 OFED tar for your distro  
2376
2377[IMPORTANT]
2378=====================================
2379The version must be *MLNX_OFED_LINUX-4.0* or higher (4.0.x)
2380=====================================
2381
2382[IMPORTANT]
2383=====================================
2384Make sure you have an internet connection without firewalls for HTTPS/HTTP - required by yum/apt-get
2385=====================================
2386
2387.Verify md5
2388[source,bash]
2389----
$md5sum MLNX_OFED_LINUX-4.0.0.0.0-rhel7.2-x86_64.tgz
239158b9fb369d7c62cedbc855661a89a9fd  MLNX_OFED_LINUX-4.0-xx.0.0.0-rhel7.2-x86_64.tgz
2392----
2393
2394.Open the tar
2395[source,bash]
2396----
2397$tar -xvzf MLNX_OFED_LINUX-4.0-x.0.0.0-rhel7.2-x86_64.tgz
2398$cd MLNX_OFED_LINUX-4.0-x.0.0.0-rhel7.2-x86_64
2399----
2400
2401.Run Install script
2402[source,bash]
2403----
2404$sudo ./mlnxofedinstall  
2405
2406
2407Log: /tmp/ofed.build.log
2408Logs dir: /tmp/MLNX_OFED_LINUX.10406.logs
2409
2410Below is the list of MLNX_OFED_LINUX packages that you have chosen
2411(some may have been added by the installer due to package dependencies):
2412
2413ofed-scripts
2414mlnx-ofed-kernel-utils
2415mlnx-ofed-kernel-dkms
2416iser-dkms
2417srp-dkms
2418mlnx-sdp-dkms
2419mlnx-rds-dkms
2420mlnx-nfsrdma-dkms
2421libibverbs1
2422ibverbs-utils
2423libibverbs-dev
2424libibverbs1-dbg
2425libmlx4-1
2426libmlx4-dev
2427libmlx4-1-dbg
2428libmlx5-1
2429libmlx5-dev
2430libmlx5-1-dbg
2431libibumad
2432libibumad-static
2433libibumad-devel
2434ibacm
2435ibacm-dev
2436librdmacm1
2437librdmacm-utils
2438librdmacm-dev
2439mstflint
2440ibdump
2441libibmad
2442libibmad-static
2443libibmad-devel
2444libopensm
2445opensm
2446opensm-doc
2447libopensm-devel
2448infiniband-diags
2449infiniband-diags-compat
2450mft
2451kernel-mft-dkms
2452libibcm1
2453libibcm-dev
2454perftest
2455ibutils2
2456libibdm1
2457ibutils
2458cc-mgr
2459ar-mgr
2460dump-pr
2461ibsim
2462ibsim-doc
2463knem-dkms
2464mxm
2465fca
2466sharp
2467hcoll
2468openmpi
2469mpitests
2470knem
2471rds-tools
2472libdapl2
2473dapl2-utils
2474libdapl-dev
2475srptools
2476mlnx-ethtool
2477libsdp1
2478libsdp-dev
2479sdpnetstat
2480
2481This program will install the MLNX_OFED_LINUX package on your machine.
2482Note that all other Mellanox, OEM, OFED, or Distribution IB packages will be removed.
2483Do you want to continue?[y/N]:y
2484
2485Checking SW Requirements...
2486
2487One or more required packages for installing MLNX_OFED_LINUX are missing.
2488Attempting to install the following missing packages:
autotools-dev tcl debhelper dkms tk8.4 libgfortran3 graphviz chrpath automake dpatch flex bison autoconf quilt m4 tcl8.4 libltdl-dev pkg-config python-libxml2 tk swig gfortran libnl1
2491
2492..
2493
2494Removing old packages...
2495Installing new packages
2496Installing ofed-scripts-3.4...
2497Installing mlnx-ofed-kernel-utils-3.4...
2498Installing mlnx-ofed-kernel-dkms-3.4...
2499
2500Removing old packages...
2501Installing new packages
2502Installing ofed-scripts-3.4...
2503Installing mlnx-ofed-kernel-utils-3.4...
2504Installing mlnx-ofed-kernel-dkms-3.4...
2505Installing iser-dkms-1.8.1...
2506Installing srp-dkms-1.6.1...
2507Installing mlnx-sdp-dkms-3.4...
2508Installing mlnx-rds-dkms-3.4...
2509Installing mlnx-nfsrdma-dkms-3.4...
2510Installing libibverbs1-1.2.1mlnx1...
2511Installing ibverbs-utils-1.2.1mlnx1...
2512Installing libibverbs-dev-1.2.1mlnx1...
2513Installing libibverbs1-dbg-1.2.1mlnx1...
2514Installing libmlx4-1-1.2.1mlnx1...
2515Installing libmlx4-dev-1.2.1mlnx1...
2516Installing libmlx4-1-dbg-1.2.1mlnx1...
2517Installing libmlx5-1-1.2.1mlnx1...
2518Installing libmlx5-dev-1.2.1mlnx1...
2519Installing libmlx5-1-dbg-1.2.1mlnx1...
2520Installing libibumad-1.3.10.2.MLNX20150406.966500d...
2521Installing libibumad-static-1.3.10.2.MLNX20150406.966500d...
2522Installing libibumad-devel-1.3.10.2.MLNX20150406.966500d...
2523Installing ibacm-1.2.1mlnx1...
2524Installing ibacm-dev-1.2.1mlnx1...
2525Installing librdmacm1-1.1.0mlnx...
2526Installing librdmacm-utils-1.1.0mlnx...
2527Installing librdmacm-dev-1.1.0mlnx...
2528Installing mstflint-4.5.0...
2529Installing ibdump-4.0.0...
2530Installing libibmad-1.3.12.MLNX20160814.4f078cc...
2531Installing libibmad-static-1.3.12.MLNX20160814.4f078cc...
2532Installing libibmad-devel-1.3.12.MLNX20160814.4f078cc...
2533Installing libopensm-4.8.0.MLNX20160906.32a95b6...
2534Installing opensm-4.8.0.MLNX20160906.32a95b6...
2535Installing opensm-doc-4.8.0.MLNX20160906.32a95b6...
2536Installing libopensm-devel-4.8.0.MLNX20160906.32a95b6...
2537Installing infiniband-diags-1.6.6.MLNX20160814.999c7b2...
2538Installing infiniband-diags-compat-1.6.6.MLNX20160814.999c7b2...
2539Installing mft-4.5.0...
2540Installing kernel-mft-dkms-4.5.0...
2541Installing libibcm1-1.0.5mlnx2...
2542Installing libibcm-dev-1.0.5mlnx2...
2543Installing perftest-3.0...
2544Installing ibutils2-2.1.1...
2545Installing libibdm1-1.5.7.1...
2546Installing ibutils-1.5.7.1...
2547Installing cc-mgr-1.0...
2548Installing ar-mgr-1.0...
2549Installing dump-pr-1.0...
2550Installing ibsim-0.6...
2551Installing ibsim-doc-0.6...
2552Installing knem-dkms-1.1.2.90mlnx1...
2553Installing mxm-3.5.220c57f...
2554Installing fca-2.5.2431...
2555Installing sharp-1.1.1.MLNX20160915.8763a35...
2556Installing hcoll-3.6.1228...
2557Installing openmpi-1.10.5a1...
2558Installing mpitests-3.2.18...
2559Installing knem-1.1.2.90mlnx1...
2560Installing rds-tools-2.0.7...
2561Installing libdapl2-2.1.9mlnx...
2562Installing dapl2-utils-2.1.9mlnx...
2563Installing libdapl-dev-2.1.9mlnx...
2564Installing srptools-1.0.3...
2565Installing mlnx-ethtool-4.2...
2566Installing libsdp1-1.1.108...
2567Installing libsdp-dev-1.1.108...
2568Installing sdpnetstat-1.60...
2569Selecting previously unselected package mlnx-fw-updater.
2570(Reading database ... 70592 files and directories currently installed.)
2571Preparing to unpack .../mlnx-fw-updater_3.4-1.0.0.0_amd64.deb ...
2572Unpacking mlnx-fw-updater (3.4-1.0.0.0) ...
2573Setting up mlnx-fw-updater (3.4-1.0.0.0) ...
2574
2575Added RUN_FW_UPDATER_ONBOOT=no to /etc/infiniband/openib.conf
2576
2577Attempting to perform Firmware update...
2578Querying Mellanox devices firmware ...
2579
2580Device #1:
2581
2582  Device Type:      ConnectX4
2583  Part Number:      MCX416A-CCA_Ax
2584  Description:      ConnectX-4 EN network interface card; 100GbE dual-port QSFP28; PCIe3.0 x16; ROHS R6
2585  PSID:             MT_2150110033
2586  PCI Device Name:  03:00.0
2587  Base GUID:        248a07030014fc60
2588  Base MAC:         0000248a0714fc60
2589  Versions:         Current        Available     
2590     FW             12.16.1006     12.17.1010    
2591     PXE            3.4.0812       3.4.0903      
2592
2593  Status:           Update required
2594
2595
2596Found 1 device(s) requiring firmware update...
2597
2598Device #1: Updating FW ... Done
2599
2600Restart needed for updates to take effect.
2601Log File: /tmp/MLNX_OFED_LINUX.16084.logs/fw_update.log
2602Please reboot your system for the changes to take effect.
2603Device (03:00.0):
2604        03:00.0 Ethernet controller: Mellanox Technologies MT27620 Family
2605        Link Width: x16
2606        PCI Link Speed: 8GT/s
2607
2608Device (03:00.1):
2609        03:00.1 Ethernet controller: Mellanox Technologies MT27620 Family
2610        Link Width: x16
2611        PCI Link Speed: 8GT/s
2612
2613Installation passed successfully
2614To load the new driver, run:
2615/etc/init.d/openibd restart
----
2617
2618
2619.Reboot 
2620[source,bash]
2621----
2622$sudo  reboot 
2623----
2624
2625
2626.After reboot 
2627[source,bash]
2628----
2629$ibv_devinfo 
2630hca_id: mlx5_1
2631        transport:                      InfiniBand (0)
2632        fw_ver:                         12.17.1010             << 12.17.00
2633        node_guid:                      248a:0703:0014:fc61
2634        sys_image_guid:                 248a:0703:0014:fc60
2635        vendor_id:                      0x02c9
2636        vendor_part_id:                 4115
2637        hw_ver:                         0x0
2638        board_id:                       MT_2150110033
2639        phys_port_cnt:                  1
2640        Device ports:
2641                port:   1
2642                        state:                  PORT_DOWN (1)
2643                        max_mtu:                4096 (5)
2644                        active_mtu:             1024 (3)
2645                        sm_lid:                 0
2646                        port_lid:               0
2647                        port_lmc:               0x00
2648                        link_layer:             Ethernet
2649
2650hca_id: mlx5_0
2651        transport:                      InfiniBand (0)
2652        fw_ver:                         12.17.1010
2653        node_guid:                      248a:0703:0014:fc60
2654        sys_image_guid:                 248a:0703:0014:fc60
2655        vendor_id:                      0x02c9
2656        vendor_part_id:                 4115
2657        hw_ver:                         0x0
2658        board_id:                       MT_2150110033
2659        phys_port_cnt:                  1
2660        Device ports:
2661                port:   1
2662                        state:                  PORT_DOWN (1)
2663                        max_mtu:                4096 (5)
2664                        active_mtu:             1024 (3)
2665                        sm_lid:                 0
2666                        port_lid:               0
2667                        port_lmc:               0x00
2668                        link_layer:             Ethernet
2669
2670----
2671
2672.ibdev2netdev
2673[source,bash]
2674-----
2675$ibdev2netdev
2676mlx5_0 port 1 ==> eth6 (Down)
2677mlx5_1 port 1 ==> eth7 (Down)
2678-----
2679
2680.find the ports 
2681[source,bash]
2682-----
2683
2684        $sudo ./dpdk_setup_ports.py -t
2685  +----+------+---------++---------------------------------------------
2686  | ID | NUMA |   PCI   ||                      Name     |  Driver   | 
2687  +====+======+=========++===============================+===========+=
2688  | 0  | 0    | 06:00.0 || VIC Ethernet NIC              | enic      | 
2689  +----+------+---------++-------------------------------+-----------+-
2690  | 1  | 0    | 07:00.0 || VIC Ethernet NIC              | enic      | 
2691  +----+------+---------++-------------------------------+-----------+-
2692  | 2  | 0    | 0a:00.0 || 82599ES 10-Gigabit SFI/SFP+ Ne| ixgbe     | 
2693  +----+------+---------++-------------------------------+-----------+-
2694  | 3  | 0    | 0a:00.1 || 82599ES 10-Gigabit SFI/SFP+ Ne| ixgbe     | 
2695  +----+------+---------++-------------------------------+-----------+-
2696  | 4  | 0    | 0d:00.0 || Device 15d0                   |           | 
2697  +----+------+---------++-------------------------------+-----------+-
2698  | 5  | 0    | 10:00.0 || I350 Gigabit Network Connectio| igb       | 
2699  +----+------+---------++-------------------------------+-----------+-
2700  | 6  | 0    | 10:00.1 || I350 Gigabit Network Connectio| igb       | 
2701  +----+------+---------++-------------------------------+-----------+-
2702  | 7  | 1    | 85:00.0 || 82599ES 10-Gigabit SFI/SFP+ Ne| ixgbe     | 
2703  +----+------+---------++-------------------------------+-----------+-
2704  | 8  | 1    | 85:00.1 || 82599ES 10-Gigabit SFI/SFP+ Ne| ixgbe     | 
2705  +----+------+---------++-------------------------------+-----------+-
2706  | 9  | 1    | 87:00.0 || MT27700 Family [ConnectX-4]   | mlx5_core |  #<1>
2707  +----+------+---------++-------------------------------+-----------+-
2708  | 10 | 1    | 87:00.1 || MT27700 Family [ConnectX-4]   | mlx5_core |  #<2>
2709  +----+------+---------++---------------------------------------------
2710-----
2711<1>  ConnectX-4 port 0
2712<2>  ConnectX-4 port 1
2713
2714
2715.Config file example
2716[source,bash]
2717-----
2718### Config file generated by dpdk_setup_ports.py ###
2719
2720 - port_limit: 2
2721   version: 2
2722   interfaces: ['87:00.0', '87:00.1']
2723   port_info:
2724      - ip: 1.1.1.1
2725        default_gw: 2.2.2.2
2726      - ip: 2.2.2.2
2727        default_gw: 1.1.1.1
2728
2729   platform:
2730      master_thread_id: 0
2731      latency_thread_id: 1
2732      dual_if:
2733        - socket: 1
2734          threads: [8,9,10,11,12,13,14,15,24,25,26,27,28,29,30,31]
2735-----
2736
2737
2738
2739
2740==== TRex specific implementation details 
2741
TRex uses the flow director filter to steer specific packets to specific queues.
To support that, we change the IPv4.TOS/IPv6.TC LSB to *1* for packets we want handled by software (other packets will be dropped). So latency packets will have this bit turned on (this is true for all NIC types, not only ConnectX-4).
This means that if the DUT for some reason clears this bit (changes the TOS LSB to 0, e.g. from 0x3 to 0x2), some TRex features (for example latency measurement) will not work properly.
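The bit test itself is trivial; a sketch showing which TOS values would be steered to software (LSB set) versus dropped:

```shell
# TOS/TC LSB decides steering: LSB=1 -> handled by software (e.g. latency packets)
for tos in 0 1 2 3; do
  if [ $((tos & 1)) -eq 1 ]; then verdict="software"; else verdict="dropped"; fi
  echo "TOS=0x$tos -> $verdict"
done
```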
2745
2746==== Which NIC to buy?
2747
A NIC with two ports will work better from a performance perspective, so it is better to have the MCX456A-ECAT (dual 100Gb ports) and *not* the MCX455A-ECAT (single 100Gb port).
2749
2750==== Limitation/Issues  
2751
* The stateless mode ``per stream statistics'' feature is handled in software (no hardware support, unlike the X710 card).
* link:https://trex-tgn.cisco.com/youtrack/issue/trex-261[Latency issue]
* link:https://trex-tgn.cisco.com/youtrack/issue/trex-262[Stateful RX out of order]
2755* link:https://trex-tgn.cisco.com/youtrack/issue/trex-273[Fedora 21 & OFED 3.4.1]
2756
2757
2758==== Performance Cycles/Packet ConnectX-4 vs Intel XL710
2759
2760For TRex version v2.11, these are the comparison results between XL710 and ConnectX-4 for various scenarios.
2761
2762.Stateless MPPS/Core [Preliminary]
2763image:images/xl710_vs_mlx5_64b.png[title="Stateless 64B"] 
2764
2765.Stateless Gb/Core [Preliminary]
2766image:images/xl710_vs_mlx5_var_size.png[title="Stateless variable size packet"] 
2767
2768*Comments*::
2769 
1. MLX5 can reach ~50 MPPS while XL710 is limited to 35 MPPS (with a potential future fix it will be ~65 MPPS).
2. For Stateless/Stateful 256B profiles, ConnectX-4 uses half the CPU cycles per packet. ConnectX-4 probably handles chained mbufs (scatter gather) better.
3. In the average stateful scenario, ConnectX-4 is the same as XL710.
4. For Stateless 64B/IMIX profiles, ConnectX-4 uses 50-90% more CPU cycles per packet (actually even more, because of the TRex scheduler overhead) - in the worst case you will need 2x the CPU for the same total MPPS.
2774
2775
2776[NOTE]
2777=====================================
There is a task to automate the production of these reports.
2779=====================================
2780
2781==== Troubleshooting
2782
* Before running TRex, make sure the commands `ibv_devinfo` and `ibdev2netdev` show the NICs.
* `ifconfig` should work too, and you should be able to ping from those ports.
* Run the TRex server with '-v 7', for example: `$sudo ./t-rex-64 -i -v 7`
2786
2787
2788==== Limitations/Issues
2789
* The mlx5 PCI addresses in /etc/trex_cfg.yaml should be listed in the same order reported by the `./dpdk_setup_ports.py` tool (see link:https://groups.google.com/forum/#!searchin/trex-tgn/unable$20to$20run$20rx-check$20with$204$20port%7Csort:relevance/trex-tgn/DsORbw3AbaU/IT-KLcZbDgAJ[issue_thread] and link:https://trex-tgn.cisco.com/youtrack/issue/trex-295[trex-295]); otherwise the error below will be reported.
2791
2792.Will work
2793[source,bash]
2794----
2795 - version         : 2
2796   interfaces      : ["03:00.0","03:00.1"]
2797   port_limit      : 2 
2798----
2799
2800.Will not work
2801[source,bash]
2802----
2803 - version         : 2
2804   interfaces      : ["03:00.1","03:00.0"]
2805   port_limit      : 2 
2806----
2807
2808.The error 
2809[source,bash]
2810----
2811 PMD: net_mlx5: 0x7ff2dcfcb2c0: flow director mode 0 not supported
2812 EAL: Error - exiting with code: 1
2813   Cause: rte_eth_dev_filter_ctrl: err=-22, port=2
2814----
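Since `dpdk_setup_ports.py` appears to list ports in ascending PCI-address order, a plain `sort` over the addresses gives the order the config file expects. A trivial sketch:

```shell
# Sort PCI addresses into ascending order, matching dpdk_setup_ports.py output
printf '03:00.1\n03:00.0\n' | sort
```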
2815
2816==== Build with native OFED 
2817
In some cases there is a need to build dpdk-mlx5 with a different OFED version (not just 4.0; possibly newer).
To do so, run this on the native machine:
2820
2821[source,bash]
2822---- 
2823[csi-trex-07]> ./b configure
2824Setting top to                           : /auto/srg-sce-swinfra-usr/emb/users/hhaim/work/depot/asr1k/emb/private/hhaim/bp_sim_git/trex-core 
2825Setting out to                           : /auto/srg-sce-swinfra-usr/emb/users/hhaim/work/depot/asr1k/emb/private/hhaim/bp_sim_git/trex-core/linux_dpdk/build_dpdk 
2826Checking for program 'g++, c++'          : /bin/g++ 
2827Checking for program 'ar'                : /bin/ar 
2828Checking for program 'gcc, cc'           : /bin/gcc 
2829Checking for program 'ar'                : /bin/ar 
2830Checking for program 'ldd'               : /bin/ldd 
2831Checking for library z                   : yes 
2832Checking for OFED                        : Found needed version 4.0   #<1> 
2833Checking for library ibverbs             : yes 
2834'configure' finished successfully (1.826s)
2835---- 
<1> make sure the OFED version was identified 
2837
2838
.Code change needed for new OFED 
2840[source,python]
2841---- 
2842        index fba7540..a55fe6b 100755
2843        --- a/linux_dpdk/ws_main.py
2844        +++ b/linux_dpdk/ws_main.py
2845        @@ -143,8 +143,11 @@ def missing_pkg_msg(fedora, ubuntu):
2846         def check_ofed(ctx):
2847             ctx.start_msg('Checking for OFED')
2848             ofed_info='/usr/bin/ofed_info'
2849        -    ofed_ver= '-3.4-'
2850        -    ofed_ver_show= 'v3.4'
2851        +
2852        +    ofed_ver_re = re.compile('.*[-](\d)[.](\d)[-].*')
2853        +
2854        +    ofed_ver= 40                                     <1>
2855        +    ofed_ver_show= '4.0'
2856        
2857        
2858        --- a/scripts/dpdk_setup_ports.py
2859        +++ b/scripts/dpdk_setup_ports.py
2860        @@ -366,8 +366,8 @@ Other network devices
2861         
2862                 ofed_ver_re = re.compile('.*[-](\d)[.](\d)[-].*')
2863         
2864        -        ofed_ver= 34
2865        -        ofed_ver_show= '3.4-1'
2866        +        ofed_ver= 40                                 <2>
2867        +        ofed_ver_show= '4.0'
2868---- 
<1> Change to the new version (40 corresponds to 4.0).
<2> Change to the new version.
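As a sketch of what the patched check does, the regular expression from the diff above can be used to turn the first line of `ofed_info` output into a comparable number. The banner string below is illustrative; the exact format may differ between OFED releases.

```python
import re

# Regex taken from the patch above; matches "-<major>.<minor>-" in the
# first line of `ofed_info` output.
ofed_ver_re = re.compile(r'.*[-](\d)[.](\d)[-].*')

def parse_ofed_version(first_line):
    """Return major*10 + minor (e.g. 40 for OFED 4.0), or None if no match."""
    m = ofed_ver_re.match(first_line)
    if m is None:
        return None
    return int(m.group(1)) * 10 + int(m.group(2))

# Illustrative banner string (assumed format):
print(parse_ofed_version('MLNX_OFED_LINUX-4.0-2.0.0.1 (OFED-4.0-2.0.0):'))  # 40
```

With this encoding, the installed version can simply be compared against `ofed_ver = 40` as the patch does.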
2871 
2872
2873
2874=== Cisco VIC support 
2875
2876anchor:ciscovic_support[]
2877
2878* Supported from TRex version v2.12
2879* Only 1300 series Cisco adapter supported
* Must have VIC firmware version 2.0(13) for UCS C-series servers. Will be GA in February 2017.
2881* Must have VIC firmware version 3.1(2) for blade servers (which supports more filtering capabilities).
* The feature can be enabled via Cisco CIMC or UCSM with the 'advanced filters' radio button. When enabled, these additional flow director modes are available:
2883        RTE_ETH_FLOW_NONFRAG_IPV4_OTHER
2884        RTE_ETH_FLOW_NONFRAG_IPV4_SCTP
2885        RTE_ETH_FLOW_NONFRAG_IPV6_UDP
2886        RTE_ETH_FLOW_NONFRAG_IPV6_TCP
2887        RTE_ETH_FLOW_NONFRAG_IPV6_SCTP
2888        RTE_ETH_FLOW_NONFRAG_IPV6_OTHER
2889
2890==== vNIC Configuration Parameters
2891
*Number of Queues*::
  The maximum number of receive queues (RQs), work queues (WQs) and completion queues (CQs) is configurable on a per-vNIC basis through the Cisco UCS Manager (CIMC or UCSM).
  These values should be configured as follows:
  * The number of WQs should be greater than or equal to the number of threads (-c value) plus 1.
  * The number of RQs should be greater than 5.
  * The number of CQs should be set to WQs + RQs.
  * Unless there is a lack of resources due to creating many vNICs, it is recommended to set the WQ and RQ sizes to the *maximum*.
2899
*Advanced filters*::
  Advanced filters should be enabled.
2902
*MTU*::
  Set the MTU to the maximum, 9000-9190 (depending on the firmware version).
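The queue-count rules above can be sketched as a small helper; this is illustrative only and not part of TRex.

```python
def vnic_queue_config(num_threads):
    """Derive minimal vNIC queue counts from the TRex -c value,
    per the rules above (illustrative helper)."""
    wq = num_threads + 1   # WQs >= number of threads + 1
    rq = 6                 # RQs must be greater than 5
    cq = wq + rq           # CQs = WQs + RQs
    return {'WQ': wq, 'RQ': rq, 'CQ': cq}

# For example, with `t-rex-64 ... -c 7`:
print(vnic_queue_config(7))  # {'WQ': 8, 'RQ': 6, 'CQ': 14}
```

These are minimums; as noted above, sizing WQ and RQ to the maximum is preferred when resources allow.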
2905  
More information can be found in the link:http://www.dpdk.org/doc/guides/nics/enic.html?highlight=enic[enic DPDK] documentation.
2907 
2908image:images/UCS-B-adapter_policy_1.jpg[title="vic configuration",align="center",width=800]
2909image:images/UCS-B-adapter_policy_2.jpg[title="vic configuration",align="center",width=800]
2910
If the vNIC is not configured correctly, the following error will be seen:
2912
2913.VIC error in case of wrong RQ/WQ 
2914[source,bash]
2915----
2916Starting  TRex v2.15 please wait  ...
2917no client generator pool configured, using default pool
2918no server generator pool configured, using default pool
2919zmq publisher at: tcp://*:4500
2920Number of ports found: 2
2921set driver name rte_enic_pmd
2922EAL: Error - exiting with code: 1
2923  Cause: Cannot configure device: err=-22, port=0     #<1>
2924----
<1> There are not enough queues.
2926
2927
Running in verbose mode (CLI option `-v 7`):
2929
2930[source,bash]
2931----
2932$sudo ./t-rex-64 -f cap2/dns.yaml -c 1 -m 1 -d 10  -l 1000 -v 7
2933----
2934
will give more information:
2936
2937[source,bash]
2938----
2939EAL:   probe driver: 1137:43 rte_enic_pmd
2940PMD: rte_enic_pmd: Advanced Filters available
2941PMD: rte_enic_pmd: vNIC MAC addr 00:25:b5:99:00:4c wq/rq 256/512 mtu 1500, max mtu:9190
2942PMD: rte_enic_pmd: vNIC csum tx/rx yes/yes rss yes intr mode any type min 
2943PMD: rte_enic_pmd: vNIC resources avail: wq 2 rq 2 cq 4 intr 6                  #<1>
2944EAL: PCI device 0000:0f:00.0 on NUMA socket 0
2945EAL:   probe driver: 1137:43 rte_enic_pmd
2946PMD: rte_enic_pmd: Advanced Filters available
2947PMD: rte_enic_pmd: vNIC MAC addr 00:25:b5:99:00:5c wq/rq 256/512 mtu 1500, max 
2948----
<1> rq is 2, which means 1 input queue; this is less than the minimum required by TRex (rq should be at least 5).
2950
2951
2952==== Limitations/Issues
2953
* The stateless mode ``per stream statistics'' feature is handled in software (no hardware support, unlike the X710 card).
2955* link:https://trex-tgn.cisco.com/youtrack/issue/trex-272[QSFP+ issue]
* VLAN 0 priority tagging:
If a vNIC is configured in TRUNK mode by the UCS manager, the adapter will priority-tag egress packets according to 802.1Q if they were not already VLAN tagged by software. If the adapter is connected to a properly configured switch, there will be no unexpected behavior.
In test setups where an Ethernet port of a Cisco adapter in TRUNK mode is connected point-to-point to another adapter port, or connected through a router instead of a switch, all ingress packets will be VLAN tagged. TRex can work with this; see link:http://www.cisco.com/c/en/us/support/docs/servers-unified-computing/ucs-c-series-rack-servers/117637-technote-UCS-00.html[upstream VIC] for more details.
Upstream, the VIC always tags packets with an 802.1p header. Downstream, it is possible to remove the tag (not yet supported by TRex).
2960
2961
2962=== More active flows   
2963
From version v2.13 there is a new Stateful scheduler that works better in the case of a high number of concurrent/active flows.
With EMIX traffic, 70% better performance was observed.
This tutorial uses 14 DP cores and up to 8M flows.
A special config file is used to enlarge the number of flows. The tutorial presents the difference in performance between the old scheduler and the new one.
2968
2969==== Setup details
2970
2971[cols="1,5"]
2972|=================
2973| Server: | UCSC-C240-M4SX
2974| CPU:    | 2 x Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz
2975| RAM:    | 65536 @ 2133 MHz
2976| NICs:   | 2 x Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ (rev 01)
2977| QSFP:   | Cisco QSFP-H40G-AOC1M
2978| OS:     | Fedora 18
2979| Switch: | Cisco Nexus 3172 Chassis, System version: 6.0(2)U5(2).
2980| TRex:   | v2.13/v2.12 using 7 cores per dual interface.
2981|=================
2982
2983==== Traffic profile 
2984
2985.cap2/cur_flow_single.yaml
2986[source,python]
2987----
2988- duration : 0.1
2989  generator :  
2990          distribution : "seq"
2991          clients_start : "16.0.0.1"
2992          clients_end   : "16.0.0.255"
2993          servers_start : "48.0.0.1"
2994          servers_end   : "48.0.255.255"
2995          clients_per_gb : 201
2996          min_clients    : 101
2997          dual_port_mask : "1.0.0.0" 
2998  cap_info : 
2999     - name: cap2/udp_10_pkts.pcap  <1>
3000       cps : 100
3001       ipg : 200
3002       rtt : 200
3003       w   : 1
3004----
<1> Unidirectional UDP flow with 10 packets of 64B.
3006
3007
3008==== Config file command 
3009
.cfg/trex_08_5mflows.yaml
3011[source,python]
3012----
3013- port_limit: 4
3014  version: 2
3015  interfaces: ['05:00.0', '05:00.1', '84:00.0', '84:00.1']
3016  port_info:
3017      - ip: 1.1.1.1
3018        default_gw: 2.2.2.2
3019      - ip: 3.3.3.3
3020        default_gw: 4.4.4.4
3021
3022      - ip: 4.4.4.4
3023        default_gw: 3.3.3.3
3024      - ip: 2.2.2.2
3025        default_gw: 1.1.1.1
3026
3027  platform:
3028      master_thread_id: 0
3029      latency_thread_id: 15
3030      dual_if:
3031        - socket: 0
3032          threads: [1,2,3,4,5,6,7]
3033
3034        - socket: 1
3035          threads: [8,9,10,11,12,13,14]
3036  memory    :                                          
3037        dp_flows    : 1048576                    <1>
3038----
<1> Add a memory section with more flows.
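As a rough sizing check (assuming `dp_flows` is allocated per DP core), the memory section above provides enough flow objects for the 8M-flow target of this tutorial:

```python
# Rough sizing check; assumes dp_flows is allocated per DP core.
dp_flows_per_core = 1048576              # memory section above (2^20)
dp_cores = 14                            # threads [1..7] and [8..14]
total_flow_objects = dp_flows_per_core * dp_cores
print(total_flow_objects)                # 14680064, above the 8M target
```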
3040
3041==== Traffic command 
3042
3043.command 
3044[source,bash]
3045----
3046$sudo ./t-rex-64 -f cap2/cur_flow_single.yaml -m 30000 -c 7 -d 40 -l 1000 --active-flows 5000000 -p --cfg cfg/trex_08_5mflows.yaml
3047----
3048
The number of active flows can be changed using the `--active-flows` CLI option. In this example it is set to 5M flows.
3050
3051
3052==== Script to get performance per active number of flows 
3053
3054[source,python]
3055----
3056
import argparse
import csv
import math
# CTRexClient comes with the TRex client API package (import path depends on the installation)

def minimal_stateful_test(server,csv_file,a_active_flows):
3058
3059    trex_client = CTRexClient(server)                                   <1>
3060
3061    trex_client.start_trex(                                             <2>
3062            c = 7,
3063            m = 30000,
3064            f = 'cap2/cur_flow_single.yaml',
3065            d = 30,
3066            l = 1000,
3067            p=True,
3068            cfg = "cfg/trex_08_5mflows.yaml",
3069            active_flows=a_active_flows,
3070            nc=True
3071            )
3072
3073    result = trex_client.sample_to_run_finish()                         <3>
3074
3075    active_flows=result.get_value_list('trex-global.data.m_active_flows')
3076    cpu_utl=result.get_value_list('trex-global.data.m_cpu_util')
3077    pps=result.get_value_list('trex-global.data.m_tx_pps')
3078    queue_full=result.get_value_list('trex-global.data.m_total_queue_full')
    if queue_full[-1]>10000:
        print("WARNING QUEUE WAS FULL");
    tuple=(active_flows[-5],cpu_utl[-5],pps[-5],queue_full[-1])         <4>
    file_writer = csv.writer(csv_file)
    file_writer.writerow(tuple);
3084    
3085
3086
3087if __name__ == '__main__':
3088    test_file = open('tw_2_layers.csv', 'wb');
3089    parser = argparse.ArgumentParser(description="Example for TRex Stateful, assuming server daemon is running.")
3090
3091    parser.add_argument('-s', '--server',
3092                        dest='server',
3093                        help='Remote trex address',
3094                        default='127.0.0.1',
3095                        type = str)
3096    args = parser.parse_args()
3097
3098    max_flows=8000000;
3099    min_flows=100;
3100    active_flow=min_flows;
3101    num_point=10
3102    factor=math.exp(math.log(max_flows/min_flows,math.e)/num_point);
3103    for i in range(num_point+1):
3104        print("=====================",i,math.floor(active_flow))
3105        minimal_stateful_test(args.server,test_file,math.floor(active_flow))      
3106        active_flow=active_flow*factor
3107
3108    test_file.close();
3109----
3110<1> connect 
3111<2> Start with different active_flows
3112<3> wait for the results 
3113<4> get the results and save to csv file
3114
This script iterates from 100 to 8M active flows and saves the results to a csv file.
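The loop above steps geometrically from the minimum to the maximum number of flows; the sweep points it visits can be reproduced directly:

```python
import math

# Reproduce the geometric sweep from the script above:
# num_point + 1 points stepping from min_flows to max_flows by a constant factor.
max_flows = 8000000
min_flows = 100
num_point = 10
factor = math.exp(math.log(max_flows / min_flows) / num_point)
points = [math.floor(min_flows * factor ** i) for i in range(num_point + 1)]
print(points[0], points[-1])
```

Each step multiplies the flow count by roughly 3.1x, giving evenly log-spaced measurement points.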
3116
3117==== The results v2.12 vs v2.14
3118
3119.MPPS/core 
3120image:images/tw1_0.png[title="results",align="center"]
3121
3122.MPPS/core 
3123image:images/tw0_0_chart.png[title="results",align="center",width=800]
3124
3125* TW0 - v2.14 default configuration 
3126* PQ  - v2.12 default configuration 
3127
* To run the same script on v2.12 (which does not support the `active_flows` directive), a patch was applied.
3129
*Observation*::
  * TW works better (up to 250%) in the case of 25-100K flows.
  * TW scales better with the number of active flows.
3133
==== Tuning
3135
Let's add another mode called *TW1*. In this mode the scheduler is tuned to have more buckets (more memory).
3137
3138.TW1 cap2/cur_flow_single_tw_8.yaml
3139[source,python]
3140----
3141- duration : 0.1
3142  generator :  
3143          distribution : "seq"
3144          clients_start : "16.0.0.1"
3145          clients_end   : "16.0.0.255"
3146          servers_start : "48.0.0.1"
3147          servers_end   : "48.0.255.255"
3148          clients_per_gb : 201
3149          min_clients    : 101
3150          dual_port_mask : "1.0.0.0" 
3151  tw :                            
3152     buckets : 16384                    <1>
3153     levels  : 2                        <2>
3154     bucket_time_usec : 20.0
3155  cap_info : 
3156     - name: cap2/udp_10_pkts.pcap
3157       cps : 100
3158       ipg : 200
3159       rtt : 200
3160       w   : 1
3161----
<1> More buckets.
<2> Fewer levels.
3164
3165
In *TW2* mode we have the same template duplicated: one with a short IPG and another with a long IPG.
10% of the new flows will have the long IPG.
3168
3169.TW2 cap2/cur_flow.yaml
3170[source,python]
3171----
3172- duration : 0.1
3173  generator :  
3174          distribution : "seq"
3175          clients_start : "16.0.0.1"
3176          clients_end   : "16.0.0.255"
3177          servers_start : "48.0.0.1"
3178          servers_end   : "48.0.255.255"
3179          clients_per_gb : 201
3180          min_clients    : 101
3181          dual_port_mask : "1.0.0.0" 
3182          tcp_aging      : 0
3183          udp_aging      : 0
3184  mac        : [0x0,0x0,0x0,0x1,0x0,0x00]
3185  #cap_ipg    : true
3186  cap_info : 
3187     - name: cap2/udp_10_pkts.pcap
3188       cps : 10
3189       ipg : 100000
3190       rtt : 100000
3191       w   : 1
3192     - name: cap2/udp_10_pkts.pcap   
3193       cps : 90
3194       ipg : 2
3195       rtt : 2
3196       w   : 1
3197----
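The 10% figure follows from the `cps` values in the profile above: each template's share of new flows is proportional to its `cps` (the template names below are illustrative labels, not part of the YAML):

```python
# Share of new flows per template is proportional to its cps value.
templates = {'long_ipg': 10, 'short_ipg': 90}   # cps values from the YAML above
total_cps = sum(templates.values())
shares = {name: cps / total_cps for name, cps in templates.items()}
print(shares)  # {'long_ipg': 0.1, 'short_ipg': 0.9}
```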
3198
3199==== Full results 
3200
3201
3202* PQ - v2.12 default configuration 
3203* TW0 - v2.14 default configuration 
3204* TW1 - v2.14 more buckets 16K
3205* TW2 - v2.14 two templates 
3206
3207.MPPS/core Comparison
3208image:images/tw1.png[title="results",align="center",width=800]
3209
3210.MPPS/core
3211image:images/tw1_tbl.png[title="results",align="center"]
3212
3213.Factor relative to v2.12 results 
3214image:images/tw2.png[title="results",align="center",width=800]
3215
3216.Extrapolation Total GbE per UCS with average packet size of 600B 
3217image:images/tw3.png[title="results",align="center",width=800]
3218
3219Observation:
3220
* TW2 (two templates) has almost no performance impact.
* TW1 (more buckets) improves the performance up to a point.
* TW in general is better than PQ.
3224
3225
3226
3227