..  BSD LICENSE
    Copyright(c) 2016 Intel Corporation. All rights reserved.
    All rights reserved.

    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
    are met:

    * Redistributions of source code must retain the above copyright
    notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright
    notice, this list of conditions and the following disclaimer in
    the documentation and/or other materials provided with the
    distribution.
    * Neither the name of Intel Corporation nor the names of its
    contributors may be used to endorse or promote products derived
    from this software without specific prior written permission.

    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


Flow Bifurcation How-to Guide
=============================

Flow Bifurcation is a mechanism that uses hardware-capable Ethernet devices
to split traffic between Linux user space and kernel space. Because it is a
hardware-assisted feature, this approach can provide line-rate processing
capability. Unlike :ref:`KNI <kni>`, the software only needs to configure the
device; it does not need to move packets during the traffic split. This can
yield better performance with less CPU overhead.

Flow Bifurcation splits incoming data traffic between user space
applications (such as DPDK applications) and kernel space programs (such as
the Linux kernel stack). It can direct some traffic, for example data plane
traffic, to DPDK, while directing other traffic, for example control
plane traffic, to the traditional Linux networking stack.

There are several technical options to achieve this. A typical approach
combines SR-IOV with packet classification filtering.

SR-IOV is a PCI standard that allows one physical adapter to be split into
multiple virtual functions. Each virtual function (VF) has its own queues,
separate from those of the physical function (PF). The network adapter directs
traffic to the virtual function with a matching destination MAC address. In
this sense, SR-IOV provides queue division.

Packet classification filtering is a hardware capability available on most
network adapters. Filters can be configured so that the hardware directs
specific flows to a given receive queue. Different NICs may have different
filter types to direct flows to a Virtual Function or to a queue that belongs
to it.

In this way the Linux networking stack can receive specific traffic through
the kernel driver, while a DPDK application can receive specific traffic that
bypasses the Linux kernel by using drivers like VFIO or the DPDK ``igb_uio``
module.

.. _figure_flow_bifurcation_overview:

.. figure:: img/flow_bifurcation_overview.*

   Flow Bifurcation Overview


Using Flow Bifurcation on IXGBE in Linux
----------------------------------------

On Intel 82599 10 Gigabit Ethernet Controller series NICs, Flow Bifurcation
can be achieved with SR-IOV and Intel Flow Director technologies. Traffic can
be directed to queues by the Flow Director capability, typically by matching
the 5-tuple of UDP/TCP packets.

The typical procedure to achieve this is as follows:

#. Boot the system without iommu, or with ``iommu=pt``.

#. Create Virtual Functions:

   .. code-block:: console

       echo 2 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs

#. Enable and set flow filters:

   .. code-block:: console

       ethtool -K eth1 ntuple on
       ethtool -N eth1 flow-type udp4 src-ip 192.0.2.2 dst-ip 198.51.100.2 \
               action $queue_index_in_VF0
       ethtool -N eth1 flow-type udp4 src-ip 198.51.100.2 dst-ip 192.0.2.2 \
               action $queue_index_in_VF1

   Where:

   * ``$queue_index_in_VFn``: Bits 39:32 of the variable define the VF id + 1;
     the lower 32 bits indicate the queue index of the VF. Thus:

     * ``$queue_index_in_VF0`` = ``((0x1 & 0xFF) << 32) + [queue index]``.

     * ``$queue_index_in_VF1`` = ``((0x2 & 0xFF) << 32) + [queue index]``.

   .. _figure_ixgbe_bifu_queue_idx:

   .. figure:: img/ixgbe_bifu_queue_idx.*

#. Compile the DPDK application and insert the ``igb_uio`` kernel module or
   probe the ``vfio-pci`` kernel module as normal.

#. Bind the virtual functions:

   .. code-block:: console

       modprobe vfio-pci
       dpdk-devbind.py -b vfio-pci 01:10.0
       dpdk-devbind.py -b vfio-pci 01:10.1

#. Run a DPDK application on the VFs. The ``-w`` whitelist option is an EAL
   option, so it is placed before the ``--`` separator:

   .. code-block:: console

       testpmd -c 0xff -n 4 -w 01:10.0 -w 01:10.1 -- -i --forward-mode=mac

In this example, traffic that matches the filter rules goes to the VFs. All
other traffic goes to the default queue, or is scaled across the queues, in
the PF. That is, UDP packets with the specified source and destination IP
addresses go to the DPDK application, while all other traffic, from different
hosts or with different protocols, goes through the Linux networking stack.
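
The ``action`` values used in the ``ethtool -N`` commands above can be
computed in the shell. The following is a minimal sketch, assuming a
POSIX-like shell such as bash; the helper name ``ixgbe_vf_action`` is
hypothetical and is not part of ethtool or DPDK, and queue index 0 is chosen
only for illustration. It places the VF id + 1 in bits 39:32 and the queue
index in the lower 32 bits, as described in the procedure.

.. code-block:: shell

    # Hypothetical helper (not part of ethtool or DPDK).
    # $1 = VF id (0-based), $2 = queue index within that VF.
    # Prints the value as hex: (VF id + 1) in bits 39:32, followed by the
    # queue index as the lower 32 bits (8 hex digits).
    ixgbe_vf_action() {
        printf '0x%x%08x\n' "$(( ($1 + 1) & 0xFF ))" "$(( $2 & 0xFFFFFFFF ))"
    }

    # Assumed queue index 0 in each VF, for illustration only.
    queue_index_in_VF0=$(ixgbe_vf_action 0 0)   # 0x100000000
    queue_index_in_VF1=$(ixgbe_vf_action 1 0)   # 0x200000000

The two variables can then be used as the ``action`` argument in the filter
commands shown in the procedure.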

.. note::

    * The above steps work on the Linux kernel v4.2.

    * Flow Bifurcation is implemented in the Linux kernel and the ixgbe kernel
      driver by the following patches:

        * `ethtool: Add helper routines to pass vf to rx_flow_spec <https://patchwork.ozlabs.org/patch/476511/>`_

        * `ixgbe: Allow flow director to use entire queue space <https://patchwork.ozlabs.org/patch/476516/>`_

    * The Ethtool version used in this example is 3.18.


Using Flow Bifurcation on I40E in Linux
---------------------------------------

On Intel X710/XL710 series Ethernet Controllers, Flow Bifurcation can be
achieved with SR-IOV, Cloud Filter and L3 VEB switch. Traffic can be
directed to queues by the matching rules of the Cloud Filter and L3 VEB
switch.

* L3 VEB filters work for non-tunneled packets. They can direct a packet to a
  queue in a VF based on the destination IP address alone.

* Cloud filters work for the following types of tunneled packets:

    * Inner MAC.

    * Inner MAC + VNI.

    * Outer MAC + Inner MAC + VNI.

    * Inner MAC + Inner VLAN + VNI.

    * Inner MAC + Inner VLAN.

The typical procedure to achieve this is as follows:

#. Boot the system without iommu, or with ``iommu=pt``.

#. Build and insert the ``i40e.ko`` module.

#. Create Virtual Functions:

   .. code-block:: console

       echo 2 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs

#. Add UDP port offload to the NIC if using a cloud filter:

   .. code-block:: console

       ip li add vxlan0 type vxlan id 42 group 239.1.1.1 local 10.16.43.214 dev <name>
       ifconfig vxlan0 up
       ip -d li show vxlan0

   .. note::

       Output such as ``add vxlan port 8472, index 0 success`` should be
       found in the system log.

#. Examples of enabling and setting flow filters:

   * L3 VEB filter: direct packets whose destination IP is 192.168.50.108 to
     queue 2 of VF 0.

     .. code-block:: console

       ethtool -N <dev_name> flow-type ip4 dst-ip 192.168.50.108 \
               user-def 0xffffffff00000000 action 2 loc 8

   * Inner MAC: direct packets whose inner destination MAC is
     00:00:00:00:09:00 to queue 6 of the PF.

     .. code-block:: console

       ethtool -N <dev_name> flow-type ether dst 00:00:00:00:00:00 \
               m ff:ff:ff:ff:ff:ff src 00:00:00:00:09:00 m 00:00:00:00:00:00 \
               user-def 0xffffffff00000003 action 6 loc 1

   * Inner MAC + VNI: direct packets whose inner destination MAC is
     00:00:00:00:09:00 and VNI is 8 to queue 4 of the PF.

     .. code-block:: console

       ethtool -N <dev_name> flow-type ether dst 00:00:00:00:00:00 \
               m ff:ff:ff:ff:ff:ff src 00:00:00:00:09:00 m 00:00:00:00:00:00 \
               user-def 0x800000003 action 4 loc 4

   * Outer MAC + Inner MAC + VNI: direct packets whose outer MAC is
     68:05:ca:24:03:8b, inner destination MAC is c2:1a:e1:53:bc:57, and VNI
     is 8 to queue 2 of the PF.

     .. code-block:: console

       ethtool -N <dev_name> flow-type ether dst 68:05:ca:24:03:8b \
               m 00:00:00:00:00:00 src c2:1a:e1:53:bc:57 m 00:00:00:00:00:00 \
               user-def 0x800000003 action 2 loc 2

   * Inner MAC + Inner VLAN + VNI: direct packets whose inner destination MAC
     is 00:00:00:00:20:00, inner VLAN is 10, and VNI is 8 to queue 1 of VF 0.

     .. code-block:: console

       ethtool -N <dev_name> flow-type ether dst 00:00:00:00:01:00 \
               m ff:ff:ff:ff:ff:ff src 00:00:00:00:20:00 m 00:00:00:00:00:00 \
               vlan 10 user-def 0x800000000 action 1 loc 5

   * Inner MAC + Inner VLAN: direct packets whose inner destination MAC is
     00:00:00:00:20:00 and inner VLAN is 10 to queue 1 of VF 0.

     .. code-block:: console

       ethtool -N <dev_name> flow-type ether dst 00:00:00:00:01:00 \
               m ff:ff:ff:ff:ff:ff src 00:00:00:00:20:00 m 00:00:00:00:00:00 \
               vlan 10 user-def 0xffffffff00000000 action 1 loc 5

   .. note::

       * If the upper 32 bits of 'user-def' are ``0xffffffff``, then the
         filter can be used for programming an L3 VEB filter; otherwise the
         upper 32 bits of 'user-def' can carry the tenant ID/VNI if
         specified/required.

       * Cloud filters can be defined with inner MAC, outer MAC, inner IP,
         inner VLAN and VNI as part of the cloud tuple. These filters always
         use the destination (not source) MAC/IP. In all of these examples
         the ``dst`` and ``src`` MAC address fields are overloaded: ``dst``
         is the outer MAC and ``src`` is the inner MAC.

       * The filter directs a packet matching the rule to the VF whose id is
         specified in the lower 32 bits of 'user-def', and to the queue
         specified by 'action'.

       * If the VF id specified by the lower 32 bits of 'user-def' is greater
         than or equal to ``max_vfs``, then the filter is for the PF queues.

#. Compile the DPDK application and insert the ``igb_uio`` kernel module or
   probe the ``vfio-pci`` kernel module as normal.

#. Bind the virtual functions:

   .. code-block:: console

       modprobe vfio-pci
       dpdk-devbind.py -b vfio-pci 01:10.0
       dpdk-devbind.py -b vfio-pci 01:10.1

#. Run a DPDK application on the VFs. The ``-w`` whitelist option is an EAL
   option, so it is placed before the ``--`` separator:

   .. code-block:: console

       testpmd -c 0xff -n 4 -w 01:10.0 -w 01:10.1 -- -i --forward-mode=mac
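
The ``user-def`` values in the filter examples above combine two 32-bit
fields: the upper 32 bits carry either ``0xffffffff`` or the tenant ID/VNI,
and the lower 32 bits carry the VF id. The following is a minimal sketch for
composing such a value, assuming a POSIX-like shell such as bash; the helper
name ``i40e_user_def`` is hypothetical and is not part of ethtool or the i40e
driver.

.. code-block:: shell

    # Hypothetical helper (not part of ethtool or the i40e driver).
    # $1 = upper 32 bits: 0xffffffff, or the tenant ID/VNI.
    # $2 = lower 32 bits: the VF id (a value >= max_vfs selects the PF queues).
    # Prints the upper field as hex followed by the lower field as 8 hex digits.
    i40e_user_def() {
        printf '0x%x%08x\n' "$(( $1 & 0xFFFFFFFF ))" "$(( $2 & 0xFFFFFFFF ))"
    }

    i40e_user_def 0xffffffff 0   # prints 0xffffffff00000000
    i40e_user_def 8 3            # prints 0x800000003

The two calls reproduce the ``user-def`` values used in the L3 VEB and the
Inner MAC + VNI examples respectively.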

.. note::

   * The above steps work on the i40e Linux kernel driver v1.5.16.

   * The Ethtool version used in this example is 3.18. The mask ``ff`` means
     'not involved', while ``00`` or no mask means 'involved'.

   * For more details of the configuration, refer to the
     `cloud filter test plan <http://git.dpdk.org/tools/dts/tree/test_plans/cloud_filter_test_plan.rst>`_.