..  BSD LICENSE
    Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
    All rights reserved.

    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
    are met:

    * Redistributions of source code must retain the above copyright
    notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright
    notice, this list of conditions and the following disclaimer in
    the documentation and/or other materials provided with the
    distribution.
    * Neither the name of Intel Corporation nor the names of its
    contributors may be used to endorse or promote products derived
    from this software without specific prior written permission.

    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

I40E/IXGBE/IGB Virtual Function Driver
======================================

Supported Intel® Ethernet Controllers (see the *DPDK Release Notes* for details)
support the following modes of operation in a virtualized environment:

*   **SR-IOV mode**: Involves direct assignment of part of the port resources to different guest operating systems
    using the PCI-SIG Single Root I/O Virtualization (SR-IOV) standard,
    also known as "native mode" or "pass-through" mode.
    In this chapter, this mode is referred to as IOV mode.

*   **VMDq mode**: Involves central management of the networking resources by an IO Virtual Machine (IOVM) or
    a Virtual Machine Monitor (VMM), also known as software switch acceleration mode.
    In this chapter, this mode is referred to as the Next Generation VMDq mode.

SR-IOV Mode Utilization in a DPDK Environment
---------------------------------------------

The DPDK uses the SR-IOV feature for hardware-based I/O sharing in IOV mode.
With SR-IOV, the resources of a capable Ethernet controller NIC can be partitioned logically and
exposed to a virtual machine as a separate PCI function called a "Virtual Function".
Refer to :numref:`figure_single_port_nic`.

A NIC is therefore logically distributed among multiple virtual machines (as shown in :numref:`figure_single_port_nic`),
while still having global data in common to share with the Physical Function and other Virtual Functions.
The DPDK fm10kvf, i40evf, igbvf and ixgbevf Poll Mode Drivers (PMDs) serve the virtual PCI functions of the
Intel® 82576 Gigabit Ethernet Controller, Intel® Ethernet Controller I350 family,
Intel® 82599 10 Gigabit Ethernet Controller NIC and Intel® Fortville 10/40 Gigabit Ethernet Controller NIC,
as well as the PCIe host-interface of the Intel Ethernet Switch FM10000 Series.
The DPDK Poll Mode Driver (PMD) also supports the "Physical Function" of such NICs on the host.

The DPDK PF/VF Poll Mode Driver (PMD) supports the Layer 2 switch on the Intel® 82576 Gigabit Ethernet Controller,
Intel® Ethernet Controller I350 family, Intel® 82599 10 Gigabit Ethernet Controller,
and Intel® Fortville 10/40 Gigabit Ethernet Controller NICs so that a guest can use it for inter-virtual-machine traffic in SR-IOV mode.

For more detail on SR-IOV, please refer to the following documents:

*   `SR-IOV provides hardware based I/O sharing <http://www.intel.com/network/connectivity/solutions/vmdc.htm>`_

*   `PCI-SIG-Single Root I/O Virtualization Support on IA
    <http://www.intel.com/content/www/us/en/pci-express/pci-sig-single-root-io-virtualization-support-in-virtualization-technology-for-connectivity-paper.html>`_

*   `Scalable I/O Virtualized Servers <http://www.intel.com/content/www/us/en/virtualization/server-virtualization/scalable-i-o-virtualized-servers-paper.html>`_

.. _figure_single_port_nic:

.. figure:: img/single_port_nic.*

   Virtualization for a Single Port NIC in SR-IOV Mode


Physical and Virtual Function Infrastructure
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The following describes the Physical Function and Virtual Functions infrastructure for the supported Ethernet Controller NICs.

Virtual Functions operate under the respective Physical Function on the same NIC Port and therefore have no access
to the global NIC resources that are shared between other functions for the same NIC port.

A Virtual Function has basic access to the queue resources and control structures of the queues assigned to it.
For global resource access, a Virtual Function has to send a request to the Physical Function for that port,
and the Physical Function operates on the global resources on behalf of the Virtual Function.
For this out-of-band communication, an SR-IOV enabled NIC provides a memory buffer for each Virtual Function,
which is called a "Mailbox".

The PCIe host-interface of Intel Ethernet Switch FM10000 Series VF Infrastructure
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In a virtualized environment, the programmer can enable a maximum of *64 Virtual Functions (VF)*
globally per PCIe host-interface of the Intel Ethernet Switch FM10000 Series device.
Each VF can have a maximum of 16 queue pairs.
The Physical Function on the host can only be configured by the Linux* fm10k driver
(in the case of the Linux Kernel-based Virtual Machine [KVM]); the DPDK PMD PF driver does not support it yet.

For example,

*   Using Linux* fm10k driver:

    .. code-block:: console

        rmmod fm10k                    # remove the fm10k module
        insmod fm10k.ko max_vfs=2,2    # enable two Virtual Functions per port

Virtual Function enumeration is performed in the following sequence by the Linux* pci driver for a dual-port NIC.
When you enable the four Virtual Functions with the above command, the four enabled functions have a Function#
represented by (Bus#, Device#, Function#) in sequence starting from 0 to 3.
However:

*   Virtual Functions 0 and 2 belong to Physical Function 0

*   Virtual Functions 1 and 3 belong to Physical Function 1

.. note::

    The above is an important consideration to take into account when targeting specific packets to a selected port.
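
The interleaved assignment listed above can be written down as a small shell helper.
This is only a sketch of the pattern this section describes; the function name ``vf_to_pf``
and its two arguments are illustrative and are not part of any driver or DPDK tool.

.. code-block:: shell

    # vf_to_pf VF_FUNCTION NUM_PORTS
    # Print the Physical Function that owns a Virtual Function, assuming the
    # interleaved enumeration described above (VF n belongs to PF n mod ports).
    vf_to_pf() {
        local vf="$1" ports="$2"
        echo $(( vf % ports ))
    }

    # Dual-port NIC with two VFs per port: Virtual Functions 0 to 3.
    for vf in 0 1 2 3; do
        echo "VF ${vf} -> PF $(vf_to_pf "${vf}" 2)"
    done

Passing a different port count covers NICs with more ports that enumerate their Virtual Functions in the same interleaved way.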

Intel® Fortville 10/40 Gigabit Ethernet Controller VF Infrastructure
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In a virtualized environment, the programmer can enable a maximum of *128 Virtual Functions (VF)*
globally per Intel® Fortville 10/40 Gigabit Ethernet Controller NIC device.
Each VF can have a maximum of 16 queue pairs.
The Physical Function on the host can be configured either by the Linux* i40e driver
(in the case of the Linux Kernel-based Virtual Machine [KVM]) or by the DPDK PMD PF driver.
When both the DPDK PMD PF and VF drivers are used, the whole NIC is taken over by the DPDK-based application.

For example,

*   Using Linux* i40e driver:

    .. code-block:: console

        rmmod i40e                    # remove the i40e module
        insmod i40e.ko max_vfs=2,2    # enable two Virtual Functions per port

*   Using the DPDK PMD PF i40e driver:

    Kernel Params: iommu=pt, intel_iommu=on

    .. code-block:: console

        modprobe uio
        insmod igb_uio
        ./dpdk-devbind.py -b igb_uio bb:ss.f
        echo 2 > /sys/bus/pci/devices/0000\:bb\:ss.f/max_vfs    # enable two VFs on a specific PCI device

    Launch the DPDK testpmd/example or your own host daemon application using the DPDK PMD library.

*   Using the DPDK PMD PF ixgbe driver to enable VF RSS:

    Follow the same steps as above to install the uio and igb_uio modules, specify max_vfs for the PCI device, and
    launch the DPDK testpmd/example or your own host daemon application using the DPDK PMD library.

    The number of queues available per VF (at most 4) depends on the total number of pools, which is
    determined by the maximum number of VFs at the PF initialization stage and the number of queues specified
    in the configuration:

    *   If the maximum number of VFs is set in the range of 1 to 32:

        If the number of rxq is specified as 4 (for example, ``--rxq=4`` in testpmd), then there are 32
        pools in total (ETH_32_POOLS), and each VF can have 4 or fewer (for example, 2) queues;

        If the number of rxq is specified as 2 (for example, ``--rxq=2`` in testpmd), then there are 32
        pools in total (ETH_32_POOLS), and each VF can have 2 queues;

    *   If the maximum number of VFs is in the range of 33 to 64:

        If the number of rxq is 4 (``--rxq=4`` in testpmd), then an error message is expected because this rxq value
        is not valid in this case;

        If the number of rxq is 2 (``--rxq=2`` in testpmd), then there are 64 pools in total (ETH_64_POOLS),
        and each VF has 2 queues;

    On the host, to enable VF RSS functionality, the rx mq mode should be set to ETH_MQ_RX_VMDQ_RSS
    or ETH_MQ_RX_RSS, and SR-IOV mode should be activated (max_vfs >= 1).
    VF RSS information such as the hash function, RSS key and RSS key length also needs to be configured.

    .. code-block:: console

        testpmd -c 0xffff -n 4 -- --coremask=<core-mask> --rxq=4 --txq=4 -i

    The limitation for VF RSS on the Intel® 82599 10 Gigabit Ethernet Controller is as follows:
    the hash and key are shared among the PF and all VFs, and the RETA table with 128 entries is also shared
    among the PF and all VFs. Therefore, there is no method to query the hash and RETA content per
    VF on the guest; if needed, query them on the host (PF) for the shared RETA information.
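
The pool selection rules for VF RSS listed above can be summarized as a lookup.
The helper below is a sketch that only encodes the combinations this section spells out;
``vf_rss_pools`` is a hypothetical name, it does not call into the driver, and any
input the text does not cover is reported as such rather than guessed.

.. code-block:: shell

    # vf_rss_pools MAX_VFS RXQ
    # Print "<pools> <queues per VF>" for the combinations listed above.
    # Returns 1 for the combination the text describes as an error and
    # 2 for inputs the text does not cover.
    vf_rss_pools() {
        local max_vfs="$1" rxq="$2"
        if [ "$rxq" -ne 2 ] && [ "$rxq" -ne 4 ]; then
            echo "rxq=${rxq}: not covered by the rules above" >&2
            return 2
        fi
        if [ "$max_vfs" -ge 1 ] && [ "$max_vfs" -le 32 ]; then
            echo "32 ${rxq}"
        elif [ "$max_vfs" -ge 33 ] && [ "$max_vfs" -le 64 ]; then
            if [ "$rxq" -eq 4 ]; then
                echo "rxq=4 is not valid with more than 32 VFs" >&2
                return 1
            fi
            echo "64 2"
        else
            echo "max_vfs=${max_vfs}: not covered by the rules above" >&2
            return 2
        fi
    }

    vf_rss_pools 16 4    # ETH_32_POOLS, up to 4 queues per VF
    vf_rss_pools 40 2    # ETH_64_POOLS, 2 queues per VF

For the 1 to 32 range with ``--rxq=4``, the second value is an upper bound, since a VF may use fewer queues.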

Virtual Function enumeration is performed in the following sequence by the Linux* pci driver for a dual-port NIC.
When you enable the four Virtual Functions with the above command, the four enabled functions have a Function#
represented by (Bus#, Device#, Function#) in sequence starting from 0 to 3.
However:

*   Virtual Functions 0 and 2 belong to Physical Function 0

*   Virtual Functions 1 and 3 belong to Physical Function 1

.. note::

    The above is an important consideration to take into account when targeting specific packets to a selected port.

Intel® 82599 10 Gigabit Ethernet Controller VF Infrastructure
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The programmer can enable a maximum of *63 Virtual Functions* and there must be *one Physical Function* per Intel® 82599
10 Gigabit Ethernet Controller NIC port.
The reason for this is that the device allows for a maximum of 128 queues per port and a virtual/physical function has to
have at least one queue pair (RX/TX).
The current implementation of the DPDK ixgbevf driver supports a single queue pair (RX/TX) per Virtual Function.
The Physical Function on the host can be configured either by the Linux* ixgbe driver
(in the case of the Linux Kernel-based Virtual Machine [KVM]) or by the DPDK PMD PF driver.
When both the DPDK PMD PF and VF drivers are used, the whole NIC is taken over by the DPDK-based application.

For example,

*   Using Linux* ixgbe driver:

    .. code-block:: console

        rmmod ixgbe                   # remove the ixgbe module
        insmod ixgbe max_vfs=2,2      # enable two Virtual Functions per port

*   Using the DPDK PMD PF ixgbe driver:

    Kernel Params: iommu=pt, intel_iommu=on

    .. code-block:: console

        modprobe uio
        insmod igb_uio
        ./dpdk-devbind.py -b igb_uio bb:ss.f
        echo 2 > /sys/bus/pci/devices/0000\:bb\:ss.f/max_vfs    # enable two VFs on a specific PCI device

    Launch the DPDK testpmd/example or your own host daemon application using the DPDK PMD library.
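
Before loading igb_uio, it can be useful to confirm that the kernel parameters named above are actually in effect.
The following is a minimal sketch, assuming a Linux host that exposes ``/proc/cmdline``;
the helper name ``has_kernel_arg`` is illustrative only.

.. code-block:: shell

    # has_kernel_arg CMDLINE ARG
    # Succeed if ARG appears as a whole word in the kernel command line string.
    has_kernel_arg() {
        case " $1 " in
            *" $2 "*) return 0 ;;
            *)        return 1 ;;
        esac
    }

    # Read the running kernel's command line (empty if /proc is unavailable).
    cmdline="$(cat /proc/cmdline 2>/dev/null || true)"
    for arg in intel_iommu=on iommu=pt; do
        if has_kernel_arg "$cmdline" "$arg"; then
            echo "${arg}: present"
        else
            echo "${arg}: missing"
        fi
    done

Matching whole words keeps ``intel_iommu=on`` from being mistaken for a plain ``iommu=`` setting.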

Virtual Function enumeration is performed in the following sequence by the Linux* pci driver for a dual-port NIC.
When you enable the four Virtual Functions with the above command, the four enabled functions have a Function#
represented by (Bus#, Device#, Function#) in sequence starting from 0 to 3.
However:

*   Virtual Functions 0 and 2 belong to Physical Function 0

*   Virtual Functions 1 and 3 belong to Physical Function 1

.. note::

    The above is an important consideration to take into account when targeting specific packets to a selected port.

Intel® 82576 Gigabit Ethernet Controller and Intel® Ethernet Controller I350 Family VF Infrastructure
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In a virtualized environment, an Intel® 82576 Gigabit Ethernet Controller serves up to eight virtual machines (VMs).
The controller has 16 TX and 16 RX queues.
They are generally referred to (or thought of) as queue pairs (one TX and one RX queue).
This gives the controller 16 queue pairs.

A pool is a group of queue pairs for assignment to the same VF, used for transmit and receive operations.
The controller has eight pools, with each pool containing two queue pairs, that is, two TX and two RX queues assigned to each VF.

In a virtualized environment, an Intel® Ethernet Controller I350 family device serves up to eight virtual machines (VMs) per port.
The eight queues can be accessed by eight different VMs if configured correctly (the I350 has 4x1GbE ports, each with 8 TX and 8 RX queues),
that is, one transmit and one receive queue are assigned to each VF.
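
The queue arithmetic in the two paragraphs above can be checked with a one-line helper.
This is a worked example only; ``queue_pairs_per_vf`` is a hypothetical name and the
numbers are the ones quoted in this section.

.. code-block:: shell

    # queue_pairs_per_vf TOTAL_QUEUE_PAIRS NUM_POOLS
    # Divide the available queue pairs evenly across the pools (one pool per VF).
    queue_pairs_per_vf() {
        echo $(( $1 / $2 ))
    }

    # 82576: 16 queue pairs shared by eight pools.
    echo "82576: $(queue_pairs_per_vf 16 8) queue pairs per VF"
    # I350: 8 queue pairs per port shared by eight VMs.
    echo "I350 (per port): $(queue_pairs_per_vf 8 8) queue pair per VF"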

For example,

*   Using Linux* igb driver:

    .. code-block:: console

        rmmod igb                   # remove the igb module
        insmod igb max_vfs=2,2      # enable two Virtual Functions per port

*   Using the DPDK PMD PF igb driver:

    Kernel Params: iommu=pt, intel_iommu=on

    .. code-block:: console

        modprobe uio
        insmod igb_uio
        ./dpdk-devbind.py -b igb_uio bb:ss.f
        echo 2 > /sys/bus/pci/devices/0000\:bb\:ss.f/max_vfs    # enable two VFs on a specific PCI device

    Launch the DPDK testpmd/example or your own host daemon application using the DPDK PMD library.
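
The ``max_vfs`` write used in the examples above can be wrapped so that the sysfs path is built from the PCI address.
This is a sketch under two assumptions: the device sits in PCI domain ``0000`` as in the examples,
and the ``max_vfs`` attribute is present because the device is bound to igb_uio.
The names ``max_vfs_path``, ``set_max_vfs`` and ``DRY_RUN`` are illustrative.

.. code-block:: shell

    # max_vfs_path BDF
    # Print the sysfs max_vfs path for a PCI device given as bb:ss.f.
    max_vfs_path() {
        echo "/sys/bus/pci/devices/0000:$1/max_vfs"
    }

    # set_max_vfs BDF COUNT
    # Print the command by default; write to sysfs only when DRY_RUN=0.
    set_max_vfs() {
        local path
        path="$(max_vfs_path "$1")"
        if [ "${DRY_RUN:-1}" = "0" ]; then
            echo "$2" > "$path"
        else
            echo "echo $2 > $path"
        fi
    }

    # Dry run: show what would be written for two VFs on 02:00.0.
    set_max_vfs 02:00.0 2

Defaulting to a dry run keeps the sketch safe to try on a machine without the hardware; set ``DRY_RUN=0`` as root to perform the write.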

Virtual Function enumeration is performed in the following sequence by the Linux* pci driver for a four-port NIC.
When you enable the four Virtual Functions with the above command, the four enabled functions have a Function#
represented by (Bus#, Device#, Function#) in sequence, starting from 0 to 7.
However:

*   Virtual Functions 0 and 4 belong to Physical Function 0

*   Virtual Functions 1 and 5 belong to Physical Function 1

*   Virtual Functions 2 and 6 belong to Physical Function 2

*   Virtual Functions 3 and 7 belong to Physical Function 3

.. note::

    The above is an important consideration to take into account when targeting specific packets to a selected port.

Validated Hypervisors
~~~~~~~~~~~~~~~~~~~~~

The validated hypervisor is:

*   KVM (Kernel Virtual Machine) with Qemu, version 0.14.0

However, because the hypervisor is bypassed when the Virtual Function devices are configured through the Mailbox interface,
the solution is hypervisor-agnostic.
Xen* and VMware* (when SR-IOV is supported) will also be able to support the DPDK with Virtual Function driver support.

Expected Guest Operating System in Virtual Machine
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The expected guest operating systems in a virtualized environment are:

*   Fedora* 14 (64-bit)

*   Ubuntu* 10.04 (64-bit)

For supported kernel versions, refer to the *DPDK Release Notes*.

Setting Up a KVM Virtual Machine Monitor
----------------------------------------

The following describes a target environment:

*   Host Operating System: Fedora 14

*   Hypervisor: KVM (Kernel Virtual Machine) with Qemu version 0.14.0

*   Guest Operating System: Fedora 14

*   Linux Kernel Version: Refer to the *DPDK Getting Started Guide*

*   Target Applications: l2fwd, l3fwd-vf

The setup procedure is as follows:

#.  Before booting the Host OS, open **BIOS setup** and enable **Intel® VT features**.

#.  While booting the Host OS kernel, pass the intel_iommu=on kernel command line argument using GRUB.
    When using DPDK PF driver on host, pass the iommu=pt kernel command line argument in GRUB.

#.  Download qemu-kvm-0.14.0 from
    `http://sourceforge.net/projects/kvm/files/qemu-kvm/ <http://sourceforge.net/projects/kvm/files/qemu-kvm/>`_
    and install it in the Host OS using the following steps:

    When using a recent kernel (2.6.25+) with kvm modules included:

    .. code-block:: console

        tar xzf qemu-kvm-release.tar.gz
        cd qemu-kvm-release
        ./configure --prefix=/usr/local/kvm
        make
        sudo make install
        sudo /sbin/modprobe kvm-intel

    When using an older kernel, or a kernel from a distribution without the kvm modules,
    you must download (from the same link), compile and install the modules yourself:

    .. code-block:: console

        tar xjf kvm-kmod-release.tar.bz2
        cd kvm-kmod-release
        ./configure
        make
        sudo make install
        sudo /sbin/modprobe kvm-intel

    With the ``--prefix`` shown above, qemu-kvm installs in the /usr/local/kvm/bin directory.

    For more details about KVM configuration and usage, please refer to:

    `http://www.linux-kvm.org/page/HOWTO1 <http://www.linux-kvm.org/page/HOWTO1>`_.

#.  Create a Virtual Machine and install Fedora 14 on the Virtual Machine.
    This is referred to as the Guest Operating System (Guest OS).

#.  Download and install the latest ixgbe driver from:

    `http://downloadcenter.intel.com/Detail_Desc.aspx?agr=Y&DwnldID=14687 <http://downloadcenter.intel.com/Detail_Desc.aspx?agr=Y&DwnldID=14687>`_

#.  In the Host OS

    When using the Linux kernel ixgbe driver, unload the Linux ixgbe driver and reload it with the max_vfs=2,2 argument:

    .. code-block:: console

        rmmod ixgbe
        modprobe ixgbe max_vfs=2,2

    When using the DPDK PMD PF driver, insert the DPDK kernel module igb_uio and set the number of VFs through the sysfs max_vfs attribute:

    .. code-block:: console

        modprobe uio
        insmod igb_uio
        ./dpdk-devbind.py -b igb_uio 02:00.0 02:00.1 0e:00.0 0e:00.1
        echo 2 > /sys/bus/pci/devices/0000\:02\:00.0/max_vfs
        echo 2 > /sys/bus/pci/devices/0000\:02\:00.1/max_vfs
        echo 2 > /sys/bus/pci/devices/0000\:0e\:00.0/max_vfs
        echo 2 > /sys/bus/pci/devices/0000\:0e\:00.1/max_vfs

    .. note::

        You need to explicitly specify the number of VFs for each port. For example,
        the command above creates two VFs for each of the first two ixgbe ports.

    Let's say we have a machine with four physical ixgbe ports:


        0000:02:00.0

        0000:02:00.1

        0000:0e:00.0

        0000:0e:00.1

    The command above creates two VFs for device 0000:02:00.0:

    .. code-block:: console

        ls -alrt /sys/bus/pci/devices/0000\:02\:00.0/virt*
        lrwxrwxrwx. 1 root root 0 Apr 13 05:40 /sys/bus/pci/devices/0000:02:00.0/virtfn1 -> ../0000:02:10.2
        lrwxrwxrwx. 1 root root 0 Apr 13 05:40 /sys/bus/pci/devices/0000:02:00.0/virtfn0 -> ../0000:02:10.0

    It also creates two VFs for device 0000:02:00.1:

    .. code-block:: console

        ls -alrt /sys/bus/pci/devices/0000\:02\:00.1/virt*
        lrwxrwxrwx. 1 root root 0 Apr 13 05:51 /sys/bus/pci/devices/0000:02:00.1/virtfn1 -> ../0000:02:10.3
        lrwxrwxrwx. 1 root root 0 Apr 13 05:51 /sys/bus/pci/devices/0000:02:00.1/virtfn0 -> ../0000:02:10.1

#.  List the PCI devices connected and notice that the Host OS shows two Physical Functions (traditional ports)
    and four Virtual Functions (two for each port).
    This is the result of the previous step.

#.  Insert the pci_stub module to hold the PCI devices that are freed from the default driver using the following command
    (see http://www.linux-kvm.org/page/How_to_assign_devices_with_VT-d_in_KVM Section 4 for more information):

    .. code-block:: console

        sudo /sbin/modprobe pci-stub

    Unbind the default driver from the PCI devices representing the Virtual Functions.
    A script to perform this action is as follows:

    .. code-block:: console

        echo "8086 10ed" > /sys/bus/pci/drivers/pci-stub/new_id
        echo 0000:08:10.0 > /sys/bus/pci/devices/0000:08:10.0/driver/unbind
        echo 0000:08:10.0 > /sys/bus/pci/drivers/pci-stub/bind

    where 0000:08:10.0 is the address of the Virtual Function visible in the Host OS.

#.  Now, start the Virtual Machine by running the following command:

    .. code-block:: console

        /usr/local/kvm/bin/qemu-system-x86_64 -m 4096 -smp 4 -boot c -hda lucid.qcow2 -device pci-assign,host=08:10.0

    where:

    *   ``-m`` = memory to assign

    *   ``-smp`` = number of SMP cores

    *   ``-boot`` = boot option

    *   ``-hda`` = virtual disk image

    *   ``-device`` = device to attach

    .. note::

        *   The pci-assign,host=08:10.0 value indicates that you want to attach a PCI device
            to a Virtual Machine and the respective (Bus:Device.Function)
            numbers should be passed for the Virtual Function to be attached.

        *   qemu-kvm-0.14.0 allows a maximum of four PCI devices assigned to a VM,
            but this is qemu-kvm version dependent since qemu-kvm-0.14.1 allows a maximum of five PCI devices.

        *   qemu-system-x86_64 also has a -cpu command line option that is used to select the cpu_model
            to emulate in a Virtual Machine. Therefore, it can be used as:

            .. code-block:: console

                # List all available cpu_models
                /usr/local/kvm/bin/qemu-system-x86_64 -cpu ?

                # Use a cpu_model equivalent to the host CPU
                /usr/local/kvm/bin/qemu-system-x86_64 -m 4096 -cpu host -smp 4 -boot c -hda lucid.qcow2 -device pci-assign,host=08:10.0

            For more information, please refer to: `http://wiki.qemu.org/Features/CPUModels <http://wiki.qemu.org/Features/CPUModels>`_.

#.  Install and run the DPDK host application to take over the Physical Function. For example:

    .. code-block:: console

        make install T=x86_64-native-linuxapp-gcc
        ./x86_64-native-linuxapp-gcc/app/testpmd -c f -n 4 -- -i

#.  Access the Guest OS using vncviewer on the localhost:5900 port and check the lspci command output in the Guest OS.
    The Virtual Functions will be listed as available for use.

#.  Configure and install the DPDK with an x86_64-native-linuxapp-gcc configuration on the Guest OS as normal,
    that is, there is no change to the normal installation procedure.

    .. code-block:: console

        make config T=x86_64-native-linuxapp-gcc O=x86_64-native-linuxapp-gcc
        cd x86_64-native-linuxapp-gcc
        make

.. note::

    If you are unable to compile the DPDK and you are getting "error: CPU you selected does not support x86-64 instruction set",
    power off the Guest OS and start the virtual machine with the correct -cpu option in the qemu-system-x86_64 command as shown in step 9.
    Select the best x86_64 cpu_model to emulate, or select the host option if it is available.

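The rebinding script in the procedure above handles a single Virtual Function.
As a minimal sketch, assuming the sysfs layout shown in the earlier listings,
the same writes can be wrapped in a loop over every ``virtfn*`` link of a Physical Function.
The function names and the ``SYSFS_ROOT`` override are hypothetical and exist only for this example.

.. code-block:: sh

    # SYSFS_ROOT is a hypothetical override so the functions can be tried
    # against a scratch directory; on a real host it stays at /sys.
    SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

    # Print the PCI address of every VF that belongs to the PF given as $1,
    # by resolving the virtfnN symlinks in the PF's sysfs directory.
    list_vfs() {
        for link in "$SYSFS_ROOT/bus/pci/devices/$1"/virtfn*; do
            [ -L "$link" ] || continue
            basename "$(readlink "$link")"
        done
    }

    # Unbind the VF given as $1 from its current driver (if it has one) and
    # bind it to pci-stub, mirroring the echo commands in the procedure.
    rebind_vf_to_stub() {
        dev="$SYSFS_ROOT/bus/pci/devices/$1"
        if [ -e "$dev/driver" ]; then
            echo "$1" > "$dev/driver/unbind"
        fi
        echo "$1" > "$SYSFS_ROOT/bus/pci/drivers/pci-stub/bind"
    }

    # Rebind all VFs of the PF given as $1.
    rebind_all_vfs() {
        for vf in $(list_vfs "$1"); do
            rebind_vf_to_stub "$vf"
        done
    }

For the Physical Function in the earlier listing this would be invoked as ``rebind_all_vfs 0000:02:00.1``,
as root, after the device ID has been written to ``new_id`` as shown in the procedure.
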
.. note::

    Run the DPDK l2fwd sample application in the Guest OS with Hugepages enabled.
    For the expected benchmark performance, you must pin the cores from the Guest OS to the Host OS (taskset can be used to do this) and
    you must also look at the PCI Bus layout on the board to ensure you are not running the traffic over the QPI Interface.

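To make the note above concrete, the following is a minimal sketch of one way to find host cores
that sit on the same socket as the NIC, so that traffic does not have to cross the QPI Interface.
It relies on the ``numa_node`` and ``cpulist`` attributes that Linux exposes in sysfs;
the function names and the ``SYSFS_ROOT`` override are hypothetical.

.. code-block:: sh

    # SYSFS_ROOT is a hypothetical override for trying the functions against
    # a scratch directory; on a real host it stays at /sys.
    SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

    # Print the NUMA node of the PCI device given as $1 (e.g. 0000:08:10.0).
    # The kernel reports -1 when it has no locality information.
    pci_numa_node() {
        cat "$SYSFS_ROOT/bus/pci/devices/$1/numa_node"
    }

    # Print the host CPU list local to the PCI device given as $1.
    # Returns a non-zero status and prints nothing if the node is unknown.
    pci_local_cpus() {
        node="$(pci_numa_node "$1")" || return 1
        [ "$node" -ge 0 ] || return 1
        cat "$SYSFS_ROOT/devices/system/node/node$node/cpulist"
    }

The result can be passed to taskset when launching the virtual machine,
for example by prefixing the QEMU command from the procedure with ``taskset -c "$(pci_local_cpus 0000:08:10.0)"``.
This restricts the whole QEMU process to those cores; pinning individual virtual CPUs would need additional steps.
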
.. note::

    *   The Virtual Machine Manager (the Fedora package name is virt-manager) is a utility for virtual machine management
        that can also be used to create, start, stop and delete virtual machines.
        If this option is used, steps 2 and 6 in the instructions provided will be different.

    *   virsh, a command line utility for virtual machine management,
        can also be used to bind and unbind devices to a virtual machine in Ubuntu.
        If this option is used, step 6 in the instructions provided will be different.

    *   The Virtual Machine Monitor (see :numref:`figure_perf_benchmark`) is equivalent to a Host OS with KVM installed as described in the instructions.

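As a minimal sketch of the virsh alternative mentioned above:
libvirt names host PCI devices in the form ``pci_<domain>_<bus>_<slot>_<function>``,
so the address used in the procedure has to be converted before it can be passed to ``virsh nodedev-detach``.
The helper names below are hypothetical, and which host driver the device ends up bound to depends on the libvirt version installed.

.. code-block:: sh

    # Convert a PCI address such as 0000:08:10.0 into the libvirt node
    # device name form by replacing ':' and '.' with '_'.
    pci_to_nodedev() {
        printf 'pci_%s\n' "$(printf '%s' "$1" | tr ':.' '__')"
    }

    # Detach the VF given as $1 from its host driver through libvirt instead
    # of writing to sysfs by hand (assumes virsh is installed and that the
    # caller has root privileges).
    detach_vf_with_virsh() {
        virsh nodedev-detach "$(pci_to_nodedev "$1")"
    }

For the Virtual Function used in the procedure, ``pci_to_nodedev 0000:08:10.0`` prints ``pci_0000_08_10_0``.
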
.. _figure_perf_benchmark:

.. figure:: img/perf_benchmark.*

   Performance Benchmark Setup


DPDK SR-IOV PMD PF/VF Driver Usage Model
----------------------------------------

Fast Host-based Packet Processing
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Software Defined Network (SDN) trends are demanding fast host-based packet handling.
In a virtualization environment,
the DPDK VF PMD driver delivers the same throughput as in a non-VT native environment.

With such fast packet processing on the host, many services such as filtering, QoS
and DPI can be offloaded to the host fast path.

:numref:`figure_fast_pkt_proc` shows the scenario where some VMs directly communicate externally via VFs,
while others connect to a virtual switch and share the same uplink bandwidth.

.. _figure_fast_pkt_proc:

.. figure:: img/fast_pkt_proc.*

   Fast Host-based Packet Processing


SR-IOV (PF/VF) Approach for Inter-VM Communication
--------------------------------------------------

Inter-VM data communication is one of the traffic bottlenecks in virtualization platforms.
SR-IOV device assignment lets a VM attach to the real device, taking advantage of the bridge in the NIC.
As a result, VF-to-VF traffic within the same physical port (VM0<->VM1) is hardware accelerated.
However, when VF-to-VF traffic crosses physical ports (VM0<->VM2), there is no such hardware bridge.
In this case, the DPDK PMD PF driver provides host forwarding between such VMs.

:numref:`figure_inter_vm_comms` shows an example.
In this case an update of the MAC address lookup tables in both the NIC and host DPDK application is required.

In the NIC, the MAC address of a VM that sits behind the other device is written to the PF-specific pool.
When a packet with that destination MAC address arrives, it matches this entry and is forwarded to the host DPDK PMD application.

In the host DPDK application, the behavior is similar to L2 forwarding,
that is, the packet is forwarded to the correct PF pool.
The SR-IOV NIC switch then forwards the packet to a specific VM according to the destination MAC address,
which belongs to the destination VF on the VM.

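The lookup sequence described above can be sketched as a toy model.
Everything in it is an assumption made for illustration: the MAC addresses,
the placement of VM0 and VM1 on port 0 and VM2 on port 1, the ``uplink`` fallback, and the function names.
It models only the table lookups, not the DPDK API or the NIC registers.

.. code-block:: sh

    # NIC MAC table: print the pool that port $1 selects for destination MAC $2.
    # MACs of local VFs map to their VF pool; the MAC of a VM behind the other
    # port has been written to the PF pool so the host application receives it.
    nic_lookup() {
        case "$1/$2" in
            0/02:00:00:00:00:00) echo vf0 ;;   # VM0 on port 0
            0/02:00:00:00:00:01) echo vf1 ;;   # VM1 on port 0
            0/02:00:00:00:00:02) echo pf ;;    # VM2 sits behind port 1
            1/02:00:00:00:00:02) echo vf0 ;;   # VM2 on port 1
            1/02:00:00:00:00:00 | 1/02:00:00:00:00:01) echo pf ;;
            *) echo uplink ;;                  # no entry in this toy table
        esac
    }

    # Host application table: print the port whose VF owns destination MAC $1.
    host_lookup() {
        case "$1" in
            02:00:00:00:00:00 | 02:00:00:00:00:01) echo 0 ;;
            02:00:00:00:00:02) echo 1 ;;
            *) return 1 ;;
        esac
    }

    # Print the hops taken by a frame sent from a VF on port $1 to MAC $2.
    trace_frame() {
        pool="$(nic_lookup "$1" "$2")"
        if [ "$pool" != pf ]; then
            echo "port$1:$pool"
            return
        fi
        out="$(host_lookup "$2")" || return 1
        echo "port$1:pf host port$out:$(nic_lookup "$out" "$2")"
    }

In this model, ``trace_frame 0 02:00:00:00:00:01`` prints ``port0:vf1`` (bridged inside the NIC),
while ``trace_frame 0 02:00:00:00:00:02`` prints ``port0:pf host port1:vf0`` (forwarded through the host application).
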
.. _figure_inter_vm_comms:

.. figure:: img/inter_vm_comms.*

   Inter-VM Communication