..  BSD LICENSE
    Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
    All rights reserved.

    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
    are met:

    * Redistributions of source code must retain the above copyright
    notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright
    notice, this list of conditions and the following disclaimer in
    the documentation and/or other materials provided with the
    distribution.
    * Neither the name of Intel Corporation nor the names of its
    contributors may be used to endorse or promote products derived
    from this software without specific prior written permission.

    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


What does "EAL: map_all_hugepages(): open failed: Permission denied Cannot init memory" mean?
---------------------------------------------------------------------------------------------

This is most likely because the test application was not run with ``sudo`` to promote the user to a superuser.
Alternatively, applications can also be run as a regular user.
For more information, please refer to :ref:`DPDK Getting Started Guide <linux_gsg>`.


If I want to change the number of TLB Hugepages allocated, how do I remove the original pages allocated?
--------------------------------------------------------------------------------------------------------

The number of pages allocated can be seen by executing the following command::

   grep Huge /proc/meminfo

Once all the pages are mmapped by an application, they stay that way.
If you start a test application with fewer pages than the maximum, then you have free pages.
When you stop and restart the test application, it checks whether the pages are available in the ``/dev/huge`` directory and mmaps them.
If you look in the directory, you will see ``n`` 2 MB page files. If you specified 1024, you will see 1024 page files.
These are then placed in memory segments to get contiguous memory.

If you need to change the number of pages, it is easier to first remove the pages. The ``tools/setup.sh`` script provides an option to do this.
See the "Quick Start Setup Script" section in the :ref:`DPDK Getting Started Guide <linux_gsg>` for more information.

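As a rough illustration of reading the counters that the ``grep`` command above prints, the following shell sketch extracts individual fields.
The ``hugepage_field`` helper is a hypothetical name introduced here and is not part of the DPDK; only the ``HugePages_Total`` and ``HugePages_Free`` field names come from ``/proc/meminfo``.

.. code-block:: shell

    # Print one field from a meminfo-style file (defaults to /proc/meminfo).
    # Usage: hugepage_field <field-name> [file]
    hugepage_field() {
        awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
    }

    total=$(hugepage_field HugePages_Total)
    free=$(hugepage_field HugePages_Free)
    echo "hugepages: total=${total:-0} free=${free:-0}"

Comparing the two values before and after stopping an application is one way to check whether the pages it mapped are still held.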

If I execute "l2fwd -c f -m 64 -n 3 -- -p 3", I get the following output, indicating that there are no socket 0 hugepages to allocate the mbuf and ring structures to?
----------------------------------------------------------------------------------------------------------------------------------------------------------------------

I have set up a total of 1024 hugepages (that is, allocated 512 2 MB pages to each NUMA node).

The ``-m`` command-line parameter does not guarantee that hugepages will be reserved on specific sockets. Therefore, the allocated hugepages may not be on socket 0.
To request memory to be reserved on a specific socket, use the ``--socket-mem`` command-line parameter instead of ``-m``.

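For instance, assuming the same two-socket system and that 64 MB is wanted on socket 0 only, the command from the question might be rewritten as sketched below.
The per-socket values (in megabytes, listed in socket order) and the path to the ``l2fwd`` binary are illustrative assumptions:

.. code-block:: console

    ./l2fwd -c f -n 3 --socket-mem 64,0 -- -p 3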

I am running a 32-bit DPDK application on a NUMA system, and sometimes the application initializes fine but cannot allocate memory. Why is that happening?
----------------------------------------------------------------------------------------------------------------------------------------------------------

32-bit applications have limitations in terms of how much virtual memory is available, hence the number of hugepages they are able to allocate is also limited (1 GB per page size).
If your system has a lot (>1 GB per page size) of hugepage memory, not all of it will be allocated.
Because hugepages are typically allocated on the local NUMA node, the hugepage allocation the application gets during initialization depends on which
NUMA node it is running on (the EAL does not affinitize cores until much later in the initialization process).
Sometimes, the Linux OS runs the DPDK application on a core that is located on a different NUMA node from the DPDK master core and
therefore all the hugepages are allocated on the wrong socket.

To avoid this scenario, either lower the amount of hugepage memory available to 1 GB per page size (or less), or run the application with ``taskset``,
affinitizing the application to the would-be master core.

For example, if your EAL coremask is 0xff0, the master core will usually be the first core in the coremask (0x10); this is what you have to supply to taskset::

   taskset 0x10 ./l2fwd -c 0xff0 -n 2

In this way, the hugepages have a greater chance of being allocated to the correct socket.
Additionally, the ``--socket-mem`` option can be used to ensure the availability of memory for each socket, so that if hugepages were allocated on
the wrong socket, the application simply will not start.

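The taskset mask in the example above is the lowest set bit of the coremask. A minimal shell sketch of that calculation follows;
``master_core_mask`` is a hypothetical helper name, and it assumes the master core really is the lowest-numbered core in the coremask, which the text above describes only as the usual case.

.. code-block:: shell

    # Print the mask of the lowest-numbered core in an EAL coremask.
    # (mask & -mask) isolates the lowest set bit of the value.
    master_core_mask() {
        printf '0x%x\n' $(( $1 & -$1 ))
    }

    master_core_mask 0xff0

    # Possible use (binary path and options as in the example above):
    # taskset "$(master_core_mask 0xff0)" ./l2fwd -c 0xff0 -n 2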

On application startup, there is a lot of EAL information printed. Is there any way to reduce this?
---------------------------------------------------------------------------------------------------

Yes. Each EAL has a configuration file that is located in the ``/config`` directory. Within each configuration file, you will find ``CONFIG_RTE_LOG_LEVEL=8``.
You can change this to a lower value, such as 6, to reduce the amount of debug information printed.
You must remove, then rebuild, the EAL directory for the change to take effect, because the configuration file is used to create the ``rte_config.h`` file in the EAL directory.
The following is a list of log levels that can be found in the ``rte_log.h`` file.

.. code-block:: c

    #define RTE_LOG_EMERG   1U  /* System is unusable. */
    #define RTE_LOG_ALERT   2U  /* Action must be taken immediately. */
    #define RTE_LOG_CRIT    3U  /* Critical conditions. */
    #define RTE_LOG_ERR     4U  /* Error conditions. */
    #define RTE_LOG_WARNING 5U  /* Warning conditions. */
    #define RTE_LOG_NOTICE  6U  /* Normal but significant condition. */
    #define RTE_LOG_INFO    7U  /* Informational. */
    #define RTE_LOG_DEBUG   8U  /* Debug-level messages. */

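The configuration change described above could be scripted roughly as follows. This is a sketch only: ``set_log_level`` is a hypothetical helper,
and the ``config/common_linuxapp`` file name in the usage comment is an assumption that may not match your target or release.

.. code-block:: shell

    # Rewrite the CONFIG_RTE_LOG_LEVEL line in a build configuration file.
    # A copy of the original file is kept with a .bak suffix.
    # Usage: set_log_level <config-file> <level 1-8>
    set_log_level() {
        sed -i.bak "s/^CONFIG_RTE_LOG_LEVEL=.*/CONFIG_RTE_LOG_LEVEL=$2/" "$1"
    }

    # Example (file name is an assumption), followed by a full rebuild:
    # set_log_level config/common_linuxapp 6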

How can I tune my network application to achieve lower latency?
---------------------------------------------------------------

Traditionally, there is a trade-off between throughput and latency. An application can be tuned to achieve a high throughput,
but the end-to-end latency of an average packet typically increases as a result.
Similarly, the application can be tuned to have, on average, a low end-to-end latency at the cost of lower throughput.

To achieve higher throughput, the DPDK attempts to amortize the cost of processing each packet individually by processing packets in bursts.
Using the testpmd application as an example, the "burst" size can be set on the command line to a value of 16 (also the default value).
This allows the application to request 16 packets at a time from the PMD.
The testpmd application then immediately attempts to transmit all the packets that were received, in this case, all 16 packets.
The packets are not transmitted until the tail pointer is updated on the corresponding TX queue of the network port.
This behavior is desirable when tuning for high throughput because the cost of tail pointer updates to both the RX and TX queues
can be spread across 16 packets, effectively hiding the relatively slow MMIO cost of writing to the PCIe* device.

However, this is not very desirable when tuning for low latency, because the first packet that was received must also wait for the other 15 packets to be received.
It cannot be transmitted until the other 15 packets have also been processed, because the NIC will not know to transmit the packets until the TX tail pointer has been updated,
which is not done until all 16 packets have been processed for transmission.

To consistently achieve low latency even under heavy system load, the application developer should avoid processing packets in bursts.
The testpmd application can be configured from the command line to use a burst value of 1.
This allows a single packet to be processed at a time, providing lower latency, but with the added cost of lower throughput.

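As an illustration of the burst setting discussed above, testpmd might be started as sketched below.
The coremask, the number of memory channels and the path to the binary are placeholder assumptions; ``--burst`` is the testpmd option that sets the number of packets per burst.

.. code-block:: console

    ./testpmd -c f -n 4 -- --burst=1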

Without NUMA enabled, my network throughput is low, why?
--------------------------------------------------------

I have a system with dual Intel® Xeon® E5645 processors at 2.40 GHz and four Intel® 82599 10 Gigabit Ethernet NICs.
I am using eight logical cores on each processor, with RSS set to distribute network load from two 10 GbE interfaces to the cores on each processor.

Without NUMA enabled, memory is allocated from both sockets, since memory is interleaved.
Therefore, each 64B chunk is interleaved across both memory domains.

The first 64B chunk is mapped to node 0, the second 64B chunk is mapped to node 1, the third to node 0, the fourth to node 1.
If you allocated 256B, you would get memory that looks like this:

.. code-block:: console

    256B buffer
    Offset 0x00 - Node 0
    Offset 0x40 - Node 1
    Offset 0x80 - Node 0
    Offset 0xc0 - Node 1

Therefore, packet buffers and descriptor rings are allocated from both memory domains, which consumes QPI bandwidth when accessing the remote memory and results in much higher latency.
For best performance with NUMA disabled, only one socket should be populated.

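The interleaving pattern above can be expressed as a small calculation. The sketch below assumes exactly two nodes and 64-byte chunks, as in the description;
``interleaved_node`` is a hypothetical helper name used only for illustration.

.. code-block:: shell

    # Return the node that holds a byte offset when consecutive 64-byte
    # chunks alternate between node 0 and node 1.
    interleaved_node() {
        echo $(( ($1 / 64) % 2 ))
    }

    # Walk the four 64-byte chunks of a 256-byte buffer.
    for offset in 0x00 0x40 0x80 0xc0; do
        printf 'Offset 0x%02x - Node %d\n' "$offset" "$(interleaved_node "$offset")"
    done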

I am getting errors about not being able to open files. Why?
------------------------------------------------------------

As the DPDK operates, it opens a lot of files, which can result in reaching the open files limit. This limit is set using the ``ulimit`` command or in the ``limits.conf`` file.
This is especially true when using a large number (>512) of 2 MB huge pages. Please increase the open file limit if your application is not able to open files.
This can be done either by issuing a ``ulimit`` command or by editing the ``limits.conf`` file. Please consult the Linux manpages for usage information.

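A minimal sketch of inspecting and raising the limit from a shell is shown below. It only raises the soft limit up to the existing hard limit, which an unprivileged user is allowed to do;
whether that value is high enough for a given application is an assumption to verify.

.. code-block:: shell

    # Show the current soft and hard limits on open file descriptors.
    echo "soft limit: $(ulimit -Sn)"
    echo "hard limit: $(ulimit -Hn)"

    # Raise the soft limit to the hard limit for this shell and the
    # processes started from it.
    ulimit -Sn "$(ulimit -Hn)"

Raising the hard limit itself requires superuser privileges or a change to the ``limits.conf`` file.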

VF driver for IXGBE devices cannot be initialized
-------------------------------------------------

Some versions of the Linux IXGBE driver do not assign a random MAC address to VF devices at initialization.
In this case, this has to be done manually on the VM host, using the following command:

.. code-block:: console

    ip link set <interface> vf <VF function> mac <MAC address>

where ``<interface>`` is the interface providing the virtual functions (for example, eth0), ``<VF function>`` is the virtual function number (for example, 0),
and ``<MAC address>`` is the desired MAC address.

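If no particular address is required, one option is to generate a random locally administered unicast address and pass it to the command above.
The sketch below is an assumption-laden illustration: ``random_vf_mac`` is a hypothetical helper, and ``eth0`` and VF ``0`` are simply the example values used above.
It prints the resulting command for review instead of running it.

.. code-block:: shell

    # Generate a random MAC address whose first octet is 02, which sets the
    # locally administered bit and leaves the multicast bit clear.
    random_vf_mac() {
        suffix=$(od -An -N5 -tx1 /dev/urandom | tr -d '[:space:]' | sed 's/../:&/g')
        echo "02${suffix}"
    }

    mac=$(random_vf_mac)
    echo "ip link set eth0 vf 0 mac ${mac}"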

Is it safe to add an entry to the hash table while running?
------------------------------------------------------------

Currently, the hash table implementation is not thread safe; it assumes that locking between threads and processes is handled by the user's application.
Thread safety is likely to be supported in future releases.


What is the purpose of setting iommu=pt?
----------------------------------------

DPDK uses a 1:1 mapping and does not support IOMMU. An IOMMU allows for simpler VM physical address translation.
The second role of an IOMMU is to protect against unwanted memory access by an unsafe device that has DMA privileges.
Unfortunately, the protection comes with an extremely high performance cost for high speed NICs.

Setting ``iommu=pt`` disables IOMMU support for the hypervisor.


When trying to send packets from an application to itself, meaning smac==dmac, using Intel(R) 82599 VF packets are lost.
------------------------------------------------------------------------------------------------------------------------

Check the ``LLE(PFVMTXSSW[n])`` register, which allows an individual pool to send traffic and have it looped back to itself.


Can I split packet RX to use DPDK and have an application's higher order functions continue using Linux pthread?
----------------------------------------------------------------------------------------------------------------

The DPDK's lcore threads are Linux pthreads bound onto specific cores. Configure the DPDK to do its work on one set of
cores and run the application's other work on other cores, using the DPDK's "coremask" setting to specify which
cores the DPDK should launch itself on.


Is it possible to exchange data between DPDK processes and regular userspace processes via some shared memory or IPC mechanism?
-------------------------------------------------------------------------------------------------------------------------------

Yes. DPDK processes are regular Linux/BSD processes and can use all OS-provided IPC mechanisms.


Can the multiple queues in Intel(R) I350 be used with DPDK?
-----------------------------------------------------------

The I350 has RSS support, and 8 queue pairs can be used in RSS mode. It should work with multi-queue DPDK applications using RSS.


How can hugepage-backed memory be shared among multiple processes?
------------------------------------------------------------------

See the Primary and Secondary examples in the :ref:`multi-process sample application <multi_process_app>`.