..  BSD LICENSE
    Copyright(c) 2015-2016 Intel Corporation. All rights reserved.
    All rights reserved.

    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
    are met:

    * Redistributions of source code must retain the above copyright
    notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright
    notice, this list of conditions and the following disclaimer in
    the documentation and/or other materials provided with the
    distribution.
    * Neither the name of Intel Corporation nor the names of its
    contributors may be used to endorse or promote products derived
    from this software without specific prior written permission.

    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

FM10K Poll Mode Driver
======================

The FM10K poll mode driver library provides support for the Intel FM10000
(FM10K) family of 40GbE/100GbE adapters.

FTAG Based Forwarding of FM10K
------------------------------

FTAG Based Forwarding is a unique feature of FM10K. The FM10K family of NICs
supports the addition of a Fabric Tag (FTAG) to carry special information.
The FTAG is placed at the beginning of the frame and contains information
such as the packet's source and destination and the VLAN tag. In FTAG based
forwarding mode, the switch logic forwards packets according to glort (global
resource tag) information rather than the MAC and VLAN table. Currently this
feature works only on the PF.

To enable this feature, pass a devargs parameter to the EAL, for example
``-w 84:00.0,enable_ftag=1``. The application must then make sure that an
appropriate FTAG is inserted into every frame on the TX side.

Vector PMD for FM10K
--------------------

Vector PMD (vPMD) uses Intel® SIMD instructions to optimize packet I/O.
It improves the load/store bandwidth efficiency of the L1 data cache by using
a wider SSE/AVX register (1).
The wider register gives space to hold multiple packet buffers, which reduces
the number of instructions needed when processing packets in bulk.

There is no change to the PMD API. The RX/TX handlers are the only two entry
points for vPMD packet I/O. They are registered transparently at runtime if
all required conditions are met.

1.  To date, only an SSE version of FM10K vPMD is available.
    To ensure that vPMD is in the binary code, set
    ``CONFIG_RTE_LIBRTE_FM10K_INC_VECTOR=y`` in the configuration file.

Some constraints apply as pre-conditions for specific optimizations on bulk
packet transfers. The following sections explain RX and TX constraints in the
vPMD.


RX Constraints
~~~~~~~~~~~~~~


Prerequisites and Pre-conditions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For Vector RX it is assumed that the number of descriptors in each ring is a
power of 2. With this pre-condition, the ring index can wrap from the end of
the ring back to the start without a conditional check: Vector RX simply
applies a bit mask of ``ring_size - 1`` to the index.

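The masking step can be sketched as follows. This is a minimal illustration
of the general technique, not code taken from the fm10k driver; the helper
name ``ring_advance`` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the fm10k driver): advance a ring index
 * by n slots and wrap it with a bit mask instead of a conditional branch.
 */
static inline uint16_t
ring_advance(uint16_t idx, uint16_t n, uint16_t ring_size)
{
	/* The mask trick is only valid when ring_size is a power of 2. */
	assert(ring_size != 0 && (ring_size & (ring_size - 1)) == 0);

	/* For a power-of-2 size, (x & (size - 1)) equals (x % size). */
	return (uint16_t)((idx + n) & (ring_size - 1));
}
```

With a 512-entry ring, advancing index 510 by 4 slots gives index 2, with no
comparison against the ring size on the fast path.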

Features not Supported by Vector RX PMD
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To increase throughput, vPMD does not support the following features:

*   IEEE1588

*   Flow director

*   Header split

*   RX checksum offload

Other features are supported using optional macro configuration. They include:

*   HW VLAN strip

*   L3/L4 packet type

To enable them via ``RX_OLFLAGS``, set ``RTE_LIBRTE_FM10K_RX_OLFLAGS_ENABLE=y``.

To guarantee these constraints, the following configuration flags in
``dev_conf.rxmode`` are checked:

*   ``hw_vlan_extend``

*   ``hw_ip_checksum``

*   ``header_split``

*   ``fdir_conf->mode``


RX Burst Size
^^^^^^^^^^^^^

As vPMD is focused on high throughput, it processes 4 packets at a time, so it
assumes an RX burst of at least 4 packets. The receive handler returns zero if
it is called with ``nb_pkt`` < 4. If ``nb_pkt`` is not a multiple of 4, it is
rounded down to the nearest multiple of 4 (floor alignment).

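The floor alignment described above can be sketched as follows. The macro
``VPMD_DESCS_PER_LOOP`` and the helper ``vpmd_usable_burst`` are hypothetical
names chosen for illustration; they are not taken from the driver source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name: number of packets handled per vector loop iteration. */
#define VPMD_DESCS_PER_LOOP 4

/*
 * Hypothetical helper: round a requested burst size down to a multiple
 * of 4. A request smaller than 4 therefore yields 0.
 */
static inline uint16_t
vpmd_usable_burst(uint16_t nb_pkts)
{
	/* Clearing the two low bits rounds down to a multiple of 4. */
	uint16_t aligned = (uint16_t)(nb_pkts & ~(VPMD_DESCS_PER_LOOP - 1));

	assert(aligned % VPMD_DESCS_PER_LOOP == 0);
	return aligned;
}
```

Under this sketch, a request for 35 packets is treated as a request for 32,
and a request for 3 packets is treated as a request for none, which matches
the zero return value described above.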

TX Constraint
~~~~~~~~~~~~~

Features not Supported by TX Vector PMD
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

TX vPMD only works when ``txq_flags`` is set to ``FM10K_SIMPLE_TX_FLAG``.
This means that it does not support TX multi-segment, VLAN offload or TX
checksum offload. The following macros are used for these three features:

*   ``ETH_TXQ_FLAGS_NOMULTSEGS``

*   ``ETH_TXQ_FLAGS_NOVLANOFFL``

*   ``ETH_TXQ_FLAGS_NOXSUMSCTP``

*   ``ETH_TXQ_FLAGS_NOXSUMUDP``

*   ``ETH_TXQ_FLAGS_NOXSUMTCP``

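The flag combination can be sketched as follows. This is an illustration
under stated assumptions: the numeric values are stand-ins so that the sketch
builds without DPDK headers (a real application takes them from
``rte_ethdev.h``), ``FM10K_SIMPLE_TX_FLAG`` is assumed to be the OR of the
five flags listed above, and the check is assumed to require only that all
five bits are set. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in values for illustration only; use rte_ethdev.h in real code. */
#ifndef ETH_TXQ_FLAGS_NOMULTSEGS
#define ETH_TXQ_FLAGS_NOMULTSEGS 0x0001u
#define ETH_TXQ_FLAGS_NOVLANOFFL 0x0100u
#define ETH_TXQ_FLAGS_NOXSUMSCTP 0x0200u
#define ETH_TXQ_FLAGS_NOXSUMUDP  0x0400u
#define ETH_TXQ_FLAGS_NOXSUMTCP  0x0800u
#endif

/* Assumption: the simple TX flag is the OR of the five flags above. */
#ifndef FM10K_SIMPLE_TX_FLAG
#define FM10K_SIMPLE_TX_FLAG \
	(ETH_TXQ_FLAGS_NOMULTSEGS | ETH_TXQ_FLAGS_NOVLANOFFL | \
	 ETH_TXQ_FLAGS_NOXSUMSCTP | ETH_TXQ_FLAGS_NOXSUMUDP | \
	 ETH_TXQ_FLAGS_NOXSUMTCP)
#endif

/* Hypothetical check: true when every bit of the simple TX flag is set. */
static inline bool
txq_allows_vector_tx(uint32_t txq_flags)
{
	return (txq_flags & FM10K_SIMPLE_TX_FLAG) == FM10K_SIMPLE_TX_FLAG;
}

/* Hypothetical helper: build the txq_flags value to request for vector TX. */
static inline uint32_t
vector_tx_txq_flags(void)
{
	uint32_t flags = ETH_TXQ_FLAGS_NOMULTSEGS | ETH_TXQ_FLAGS_NOVLANOFFL |
			 ETH_TXQ_FLAGS_NOXSUMSCTP | ETH_TXQ_FLAGS_NOXSUMUDP |
			 ETH_TXQ_FLAGS_NOXSUMTCP;

	assert(txq_allows_vector_tx(flags));
	return flags;
}
```

An application would place such a value in the ``txq_flags`` field of its TX
queue configuration before setting up the queue. Leaving out any one of the
five flags makes the check in this sketch fail.
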
Limitations
-----------


Switch manager
~~~~~~~~~~~~~~

The Intel FM10000 family of NICs integrates a hardware switch and multiple host
interfaces. The FM10000 PMD only manages the host interfaces. For the switch
component, a separate switch driver has to be loaded prior to the FM10000 PMD.
The switch driver can be acquired from Intel support.
Only Testpoint is validated with DPDK. The latest version that has been
validated with DPDK is 4.1.6.

CRC stripping
~~~~~~~~~~~~~

The FM10000 family of NICs strips the CRC from every packet coming into the
host interface. As a result, the CRC is stripped even when the
``rxmode.hw_strip_crc`` member is set to 0 in ``struct rte_eth_conf``.


Maximum packet length
~~~~~~~~~~~~~~~~~~~~~

The FM10000 family of NICs supports jumbo frames of up to 15K. The value
is fixed and cannot be changed. As a result, even when the
``rxmode.max_rx_pkt_len`` member of ``struct rte_eth_conf`` is set to a value
lower than 15364, frames up to 15364 bytes can still reach the host interface.

Statistic Polling Frequency
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The FM10000 NICs expose a set of statistics via the PCI BARs. These statistics
are read from the hardware registers when ``rte_eth_stats_get()`` or
``rte_eth_xstats_get()`` is called. The packet counting registers are 32 bits
wide, while the byte counting registers are 48 bits wide. As a result, the
statistics must be polled regularly to keep the returned values consistent.

A PCIe Gen3 x8 link can carry about 50Gbps of traffic. With 64 byte packets
this gives almost 100 million packets/second, which overflows a 32 bit counter
after approximately 40 seconds. To ensure these overflows are detected and
accounted for in the statistics, it is necessary to read the statistics
regularly. Reading the stats every 20 seconds is suggested, as this ensures
the statistics are accurate.

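The arithmetic and the wrap handling can be sketched as follows. This
illustrates the general technique of extending a 32 bit hardware counter in
software; it is not the driver's own code, and ``struct sw_counter``,
``sw_counter_update`` and ``wrap_seconds`` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software-extended counter (not a DPDK structure). */
struct sw_counter {
	uint32_t last_raw; /* last raw 32 bit register value that was read */
	uint64_t total;    /* accumulated 64 bit count */
};

/*
 * Fold a new raw register reading into the 64 bit total. Unsigned
 * subtraction is modulo 2^32, so a single wrap between two reads is
 * absorbed. Two or more wraps between reads cannot be detected, which is
 * why the counter has to be polled more often than it can wrap.
 */
static inline void
sw_counter_update(struct sw_counter *c, uint32_t raw)
{
	c->total += (uint32_t)(raw - c->last_raw);
	c->last_raw = raw;
}

/* Seconds until a 32 bit packet counter wraps at the given packet rate. */
static inline double
wrap_seconds(double pkts_per_sec)
{
	assert(pkts_per_sec > 0.0);
	return 4294967296.0 / pkts_per_sec; /* 2^32 packets */
}
```

At 100 million packets/second, 2^32 packets take about 42.9 seconds, so a
20 second polling interval leaves roughly a factor of two of margin before a
second wrap could go unnoticed.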

Interrupt mode
~~~~~~~~~~~~~~

The FM10000 family of NICs needs one separate interrupt for the mailbox.
Therefore, fm10k interrupt mode only works with drivers that support multiple
interrupt vectors, such as vfio-pci.