Buffer Metadata
===============

Each vlib_buffer_t (packet buffer) carries buffer metadata which
describes the current packet-processing state. The underlying
techniques have been used for decades, across multiple packet
processing environments.

We will examine vpp buffer metadata in some detail, but folks who need
to manipulate and/or extend the scheme should expect to do a certain
level of code inspection.

Vlib (Vector library) primary buffer metadata
---------------------------------------------

The first 64 octets of each vlib_buffer_t carry the primary buffer
metadata. See .../src/vlib/buffer.h for full details.

Important fields:

* i16 current_data: the signed offset in data[], pre_data[] that we
  are currently processing. If negative, the current header points into
  the pre-data (rewrite space) area.
* u16 current_length: number of bytes between current_data and the end of this buffer.
* u32 flags: Buffer flag bits. Heavily used, not many bits left
  * src/vlib/buffer.h flag bits
    * VLIB_BUFFER_IS_TRACED: buffer is traced
    * VLIB_BUFFER_NEXT_PRESENT: buffer has multiple chunks
    * VLIB_BUFFER_TOTAL_LENGTH_VALID: total_length_not_including_first_buffer is valid (see below)
  * src/vnet/buffer.h flag bits
    * VNET_BUFFER_F_L4_CHECKSUM_COMPUTED: tcp/udp checksum has been computed
    * VNET_BUFFER_F_L4_CHECKSUM_CORRECT: tcp/udp checksum is correct
    * VNET_BUFFER_F_VLAN_2_DEEP: two vlan tags present
    * VNET_BUFFER_F_VLAN_1_DEEP: one vlan tag present
    * VNET_BUFFER_F_SPAN_CLONE: packet has already been cloned (span feature)
    * VNET_BUFFER_F_LOOP_COUNTER_VALID: packet look-up loop count valid
    * VNET_BUFFER_F_LOCALLY_ORIGINATED: packet built by vpp
    * VNET_BUFFER_F_IS_IP4: packet is ipv4, for checksum offload
    * VNET_BUFFER_F_IS_IP6: packet is ipv6, for checksum offload
    * VNET_BUFFER_F_OFFLOAD_IP_CKSUM: hardware ip checksum offload requested
    * VNET_BUFFER_F_OFFLOAD_TCP_CKSUM: hardware tcp checksum offload requested
    * VNET_BUFFER_F_OFFLOAD_UDP_CKSUM: hardware udp checksum offload requested
    * VNET_BUFFER_F_IS_NATED: natted packet, skip input checks
    * VNET_BUFFER_F_L2_HDR_OFFSET_VALID: L2 header offset valid
    * VNET_BUFFER_F_L3_HDR_OFFSET_VALID: L3 header offset valid
    * VNET_BUFFER_F_L4_HDR_OFFSET_VALID: L4 header offset valid
    * VNET_BUFFER_F_FLOW_REPORT: packet is an ipfix packet
    * VNET_BUFFER_F_IS_DVR: packet to be reinjected into the l2 output path
    * VNET_BUFFER_F_QOS_DATA_VALID: QoS data valid in vnet_buffer_opaque2
    * VNET_BUFFER_F_GSO: generic segmentation offload requested
    * VNET_BUFFER_F_AVAIL1: available bit
    * VNET_BUFFER_F_AVAIL2: available bit
    * VNET_BUFFER_F_AVAIL3: available bit
    * VNET_BUFFER_F_AVAIL4: available bit
    * VNET_BUFFER_F_AVAIL5: available bit
    * VNET_BUFFER_F_AVAIL6: available bit
    * VNET_BUFFER_F_AVAIL7: available bit
* u32 flow_id: generic flow identifier
* u8 ref_count: buffer reference / clone count (e.g. for span replication)
* u8 buffer_pool_index: buffer pool index which owns this buffer
* vlib_error_t (u16) error: error code for buffers enqueued to error handler
* u32 next_buffer: buffer index of next buffer in chain. Only valid if VLIB_BUFFER_NEXT_PRESENT is set
* union
  * u32 current_config_index: current index on feature arc
  * u32 punt_reason: reason code once packet punted. Mutually exclusive with current_config_index
* u32 opaque[10]: primary vnet-layer opaque data (see below)
* END of first cache line / data initialized by the buffer allocator
* u32 trace_index: buffer's index in the packet trace subsystem
* u32 total_length_not_including_first_buffer: see VLIB_BUFFER_TOTAL_LENGTH_VALID above
* u32 opaque2[14]: secondary vnet-layer opaque data (see below)
* u8 pre_data[VLIB_BUFFER_PRE_DATA_SIZE]: rewrite space, often used to prepend tunnel encapsulations
* u8 data[0]: buffer data received from the wire. Ordinarily, hardware devices use b->data[0] as the DMA target but there are exceptions. Do not write code which blindly assumes that packet data starts in b->data[0]. Use vlib_buffer_get_current(...).
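
The interplay between current_data, current_length and the pre-data area can be illustrated with a small, self-contained model. This is only a sketch: demo_buffer_t, demo_data, demo_get_current, demo_advance and demo_in_pre_data are hypothetical stand-ins, and the sizes are made up. The helpers mimic what vlib_buffer_get_current(...) and vlib_buffer_advance(...) do; the real definitions live in src/vlib/buffer.h.

```c
#include <stdint.h>

/* Hypothetical sizes, chosen only for this sketch. */
#define DEMO_PRE_DATA_SIZE 128
#define DEMO_DATA_SIZE 256

/* Simplified stand-in for vlib_buffer_t: only the two offset fields,
   plus one contiguous array holding pre_data[] followed by data[]. */
typedef struct
{
  int16_t current_data;    /* signed offset relative to the start of data[] */
  uint16_t current_length; /* bytes from current_data to end of this buffer */
  uint8_t space[DEMO_PRE_DATA_SIZE + DEMO_DATA_SIZE];
} demo_buffer_t;

/* Start of the data[] area, i.e. where current_data == 0 points. */
static inline uint8_t *
demo_data (demo_buffer_t *b)
{
  return b->space + DEMO_PRE_DATA_SIZE;
}

/* Mimics vlib_buffer_get_current(): pointer to the header being processed. */
static inline uint8_t *
demo_get_current (demo_buffer_t *b)
{
  return demo_data (b) + b->current_data;
}

/* Mimics vlib_buffer_advance(): positive n skips bytes at the front,
   negative n exposes or prepends bytes in front of the current header. */
static inline void
demo_advance (demo_buffer_t *b, int n)
{
  b->current_data = (int16_t) (b->current_data + n);
  b->current_length = (uint16_t) (b->current_length - n);
}

/* True when the current header lies in the pre-data (rewrite) area. */
static inline int
demo_in_pre_data (const demo_buffer_t *b)
{
  return b->current_data < 0;
}
```

In this model, advancing by 14 skips an Ethernet-sized header, and a later negative advance that drives current_data below zero places the current header in the rewrite space. That is why code should ask for the current pointer rather than assume the packet starts at data[0].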

Vnet (network stack) primary buffer metadata
--------------------------------------------

Vnet primary buffer metadata occupies space reserved in the vlib
opaque field shown above, and has the type name
vnet_buffer_opaque_t. It is ordinarily accessed using the vnet_buffer(b)
macro. See ../src/vnet/buffer.h for full details.

Important fields:

* u32 sw_if_index[2]: RX and TX interface handles. At the ip lookup
  stage, vnet_buffer(b)->sw_if_index[VLIB_TX] is interpreted as a FIB
  index.
* i16 l2_hdr_offset: offset from b->data[0] of the packet L2 header.
  Valid only if b->flags & VNET_BUFFER_F_L2_HDR_OFFSET_VALID is set
* i16 l3_hdr_offset: offset from b->data[0] of the packet L3 header.
  Valid only if b->flags & VNET_BUFFER_F_L3_HDR_OFFSET_VALID is set
* i16 l4_hdr_offset: offset from b->data[0] of the packet L4 header.
  Valid only if b->flags & VNET_BUFFER_F_L4_HDR_OFFSET_VALID is set
* u8 feature_arc_index: feature arc that the packet is currently traversing
* union
  * ip
    * u32 adj_index[2]: adjacency from dest IP lookup in [VLIB_TX], adjacency
      from source ip lookup in [VLIB_RX], set to ~0 until source lookup done
    * union
      * generic fields
      * ICMP fields
      * reassembly fields
  * mpls fields
  * l2 bridging fields, only valid in the L2 path
  * l2tpv3 fields
  * l2 classify fields
  * vnet policer fields
  * MAP fields
  * MAP-T fields
  * ip fragmentation fields
  * COP (whitelist/blacklist filter) fields
  * LISP fields
  * TCP fields
    * connection index
    * sequence numbers
    * header and data offsets
    * data length
    * flags
  * SCTP fields
  * NAT fields
  * u32 unused[6]
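
The way a typed view is laid over the raw opaque words, and the way a header offset is guarded by its validity flag, can be sketched as follows. Everything prefixed demo_ or DEMO_ is a hypothetical, cut-down stand-in: the flag's bit position, the padding byte and the 64-byte data area are assumptions made for this illustration, not the real layout in src/vnet/buffer.h. The cast mirrors the style of the vnet_buffer(b) accessor.

```c
#include <stddef.h>
#include <stdint.h>

/* Made-up bit position; the real flag is VNET_BUFFER_F_L3_HDR_OFFSET_VALID. */
#define DEMO_F_L3_HDR_OFFSET_VALID (1u << 0)

/* Index convention in the style of VLIB_RX / VLIB_TX. */
enum { DEMO_RX = 0, DEMO_TX = 1 };

/* Cut-down buffer: flags, the raw opaque words, and some packet data. */
typedef struct
{
  uint32_t flags;
  uint32_t opaque[10]; /* raw space reserved for the vnet layer */
  uint8_t data[64];
} demo_buffer_t;

/* Typed view of the opaque words (subset of the fields listed above). */
typedef struct
{
  uint32_t sw_if_index[2];
  int16_t l2_hdr_offset;
  int16_t l3_hdr_offset;
  int16_t l4_hdr_offset;
  uint8_t feature_arc_index;
  uint8_t pad; /* assumption: keeps the following words aligned */
  uint32_t unused[6];
} demo_vnet_opaque_t;

/* The typed view must never be larger than the space it overlays. */
_Static_assert (sizeof (demo_vnet_opaque_t)
                  <= sizeof (((demo_buffer_t *) 0)->opaque),
                "typed view must fit inside opaque[]");

/* Accessor in the style of vnet_buffer(b). */
#define demo_vnet_buffer(b) ((demo_vnet_opaque_t *) (b)->opaque)

/* Return the L3 header, or NULL when the offset has not been marked valid.
   This sketch assumes a non-negative offset. */
static inline uint8_t *
demo_l3_header (demo_buffer_t *b)
{
  if (!(b->flags & DEMO_F_L3_HDR_OFFSET_VALID))
    return NULL;
  return b->data + demo_vnet_buffer (b)->l3_hdr_offset;
}
```

A node that records the offset would set both the field and the flag; a consumer that finds the flag clear has to locate the header some other way.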

Vnet (network stack) secondary buffer metadata
----------------------------------------------

Vnet secondary buffer metadata occupies space reserved in the vlib
opaque2 field shown above, and has the type name
vnet_buffer_opaque2_t. It is ordinarily accessed using the vnet_buffer2(b)
macro. See ../src/vnet/buffer.h for full details.

Important fields:

* qos fields
  * u8 bits
  * u8 source
* u8 loop_counter: used to detect and report internal forwarding loops
* group-based policy fields
  * u8 flags
  * u16 sclass: the packet's source class
* u16 gso_size: L4 payload size, persists all the way to
  interface-output in case GSO is not enabled
* u16 gso_l4_hdr_sz: size of the L4 protocol header
* union
  * packet trajectory tracer (largely deprecated)
    * u16 *trajectory_trace; only #if VLIB_BUFFER_TRACE_TRAJECTORY > 0
  * packet generator
    * u64 pg_replay_timestamp: timestamp for replayed pcap trace packets
  * u32 unused[8]
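
Only the first cache line is described above as initialized by the buffer allocator, so fields in opaque2 may hold stale values. That is presumably why loop_counter is paired with VNET_BUFFER_F_LOOP_COUNTER_VALID. The sketch below shows one plausible way such a counter could be maintained; it is not taken from the vpp sources. The types, the flag's bit position, the limit DEMO_MAX_LOOKUPS and the helper demo_note_lookup are all hypothetical.

```c
#include <stdint.h>

#define DEMO_F_LOOP_COUNTER_VALID (1u << 0) /* made-up bit position */
#define DEMO_MAX_LOOKUPS 4                  /* hypothetical loop limit */

/* Cut-down buffer: flags plus the raw secondary opaque words. */
typedef struct
{
  uint32_t flags;
  uint32_t opaque2[14];
} demo_buffer_t;

/* Typed view of the start of opaque2 (subset of the fields listed above). */
typedef struct
{
  struct
  {
    uint8_t bits;
    uint8_t source;
  } qos;
  uint8_t loop_counter;
} demo_vnet_opaque2_t;

/* Accessor in the style of vnet_buffer2(b). */
#define demo_vnet_buffer2(b) ((demo_vnet_opaque2_t *) (b)->opaque2)

/* Record one more lookup for this packet.
   Returns 1 once the packet has exceeded the lookup limit, else 0. */
static inline int
demo_note_lookup (demo_buffer_t *b)
{
  demo_vnet_opaque2_t *o2 = demo_vnet_buffer2 (b);

  /* First visit: the field may contain junk, so reset it and mark it valid. */
  if (!(b->flags & DEMO_F_LOOP_COUNTER_VALID))
    {
      o2->loop_counter = 0;
      b->flags |= DEMO_F_LOOP_COUNTER_VALID;
    }

  o2->loop_counter++;
  return o2->loop_counter > DEMO_MAX_LOOKUPS;
}
```

The flag check is the important part: without it, a stale counter left over from the buffer's previous use could make a fresh packet look like it is looping.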

Buffer Metadata Extensions
==========================

Plugin developers may wish to extend either the primary or secondary
vnet buffer opaque unions. Please perform a manual live variable
analysis first: the union members share the same bytes, so a node
which writes one member can silently corrupt data that a later node
expects to read from another.
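
The hazard is easy to reproduce in miniature. In the sketch below, demo_opaque_union_t is a hypothetical, much-reduced union, and the three "nodes" are plain functions invented for illustration. A plugin member shares bytes with ip.adj_index, so a plugin write between the lookup and the rewrite changes what the rewrite step reads.

```c
#include <stdint.h>

/* Index convention in the style of VLIB_RX / VLIB_TX. */
enum { DEMO_RX = 0, DEMO_TX = 1 };

/* Reduced stand-in for a vnet opaque union: all members overlap. */
typedef union
{
  struct
  {
    uint32_t adj_index[2];
  } ip;
  struct
  {
    uint32_t scratch[2]; /* hypothetical plugin member */
  } my_plugin;
  uint32_t unused[6];
} demo_opaque_union_t;

/* "Lookup node": stores the adjacency found for the destination. */
static inline void
demo_lookup_node (demo_opaque_union_t *o, uint32_t adj)
{
  o->ip.adj_index[DEMO_TX] = adj;
}

/* "Plugin node": writes its own member, unaware that ip.adj_index is live. */
static inline void
demo_plugin_node (demo_opaque_union_t *o, uint32_t value)
{
  o->my_plugin.scratch[1] = value;
}

/* "Rewrite node": consumes the adjacency stored by the lookup node. */
static inline uint32_t
demo_rewrite_node (const demo_opaque_union_t *o)
{
  return o->ip.adj_index[DEMO_TX];
}
```

Running the lookup with adjacency 7, then the plugin with value 99, makes the rewrite step see 99 instead of 7. A live variable analysis amounts to checking, for every path through the graph, that no member is overwritten between the node that sets it and the last node that reads it.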

It's not OK to add plugin or proprietary metadata to the core vpp
engine header files named above. Instead, proceed as follows. The
example concerns the vnet primary buffer opaque union
vnet_buffer_opaque_t. It's a very simple variation to use the vnet
secondary buffer opaque union vnet_buffer_opaque2_t.

In a plugin header file:

```
    /* Add arbitrary buffer metadata */
    #include <vnet/buffer.h>

    /* Plugin-private metadata, overlaid on the unused tail of the union */
    typedef struct
    {
      u32 my_stuff[6];
    } my_buffer_opaque_t;

    /* Fail the build if the custom metadata outgrows the unused space */
    STATIC_ASSERT (sizeof (my_buffer_opaque_t) <=
                   STRUCT_SIZE_OF (vnet_buffer_opaque_t, unused),
                   "Custom meta-data too large for vnet_buffer_opaque_t");

    /* Typed accessor: points at the unused area inside b->opaque */
    #define my_buffer_opaque(b)  \
      ((my_buffer_opaque_t *)((u8 *)((b)->opaque) + STRUCT_OFFSET_OF (vnet_buffer_opaque_t, unused)))
```

To set data in the custom buffer opaque type given a vlib_buffer_t *b:

```
    my_buffer_opaque (b)->my_stuff[2] = 123;
```

To read data from the custom buffer opaque type:

```
    stuff0 = my_buffer_opaque (b)->my_stuff[2];
```
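
The secondary-union variation mentioned earlier follows the same pattern. Because the real headers are not available outside a vpp build, the sketch below uses stand-in types so that it compiles on its own: demo_buffer_t and demo_vnet_buffer_opaque2_t are hypothetical reductions of vlib_buffer_t and vnet_buffer_opaque2_t, and my_buffer_opaque2_t / my_buffer_opaque2 are illustrative names. In a real plugin one would presumably keep STATIC_ASSERT, STRUCT_SIZE_OF and STRUCT_OFFSET_OF as in the primary example, in place of the standard C equivalents used here.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the part of vlib_buffer_t that matters here. */
typedef struct
{
  uint32_t opaque2[14];
} demo_buffer_t;

/* Stand-in for vnet_buffer_opaque2_t. In the real header, unused[8] is
   one arm of a union; a plain member is enough for this sketch. */
typedef struct
{
  uint8_t qos_bits;
  uint8_t qos_source;
  uint8_t loop_counter;
  uint8_t gbp_flags;
  uint16_t gbp_sclass;
  uint16_t gso_size;
  uint16_t gso_l4_hdr_sz;
  uint32_t unused[8];
} demo_vnet_buffer_opaque2_t;

/* Plugin-private metadata, overlaid on the unused area (hypothetical). */
typedef struct
{
  uint32_t my_stuff[8];
} my_buffer_opaque2_t;

/* The typed view must fit in opaque2, and the plugin data in unused[]. */
_Static_assert (sizeof (demo_vnet_buffer_opaque2_t)
                  <= sizeof (((demo_buffer_t *) 0)->opaque2),
                "opaque2 view too large for opaque2[]");
_Static_assert (sizeof (my_buffer_opaque2_t)
                  <= sizeof (((demo_vnet_buffer_opaque2_t *) 0)->unused),
                "Custom meta-data too large for the opaque2 unused area");

/* Typed accessor: points at the unused area inside b->opaque2. */
#define my_buffer_opaque2(b)                                                  \
  ((my_buffer_opaque2_t *) ((uint8_t *) ((b)->opaque2)                        \
                            + offsetof (demo_vnet_buffer_opaque2_t, unused)))
```

Setting and reading data then looks the same as in the primary case, for example my_buffer_opaque2 (b)->my_stuff[7] = 123; and the fields in front of unused[] are left untouched.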