xsk: Introduce AF_XDP buffer allocation API

In order to simplify AF_XDP zero-copy enablement for NIC driver
developers, a new AF_XDP buffer allocation API is added. The
implementation is based on a single core (single producer/consumer)
buffer pool for the AF_XDP UMEM.

A buffer is allocated using the xsk_buff_alloc() function, and
returned using xsk_buff_free(). If a buffer is disassociated from the
pool, e.g. when a buffer is passed to an AF_XDP socket, it is said to
be released. Currently, the release function is only used by the
AF_XDP internals and is not visible to the driver.
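
As a rough illustration (not part of this patch), a driver-side fill
loop could look like the sketch below. The example_ring layout and
error handling are made up; only xsk_buff_alloc() and xsk_buff_free()
are the new API.

	/* Hypothetical driver sketch: refill an RX ring from the
	 * buffer pool. A buffer that is never handed off, e.g. on
	 * XDP_DROP or ring teardown, goes back via xsk_buff_free(). */
	static bool example_refill_rx(struct example_ring *rx, u16 count)
	{
		struct xdp_buff *xdp;

		while (count--) {
			xdp = xsk_buff_alloc(rx->umem); /* NULL: fill ring empty */
			if (!xdp)
				return false;
			rx->bufs[rx->next_to_use++ & rx->mask] = xdp;
		}
		return true;
	}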

Drivers using this API should register the XDP memory model with the
new MEM_TYPE_XSK_BUFF_POOL type.
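
For reference, registration could look like this minimal sketch
(rx->xdp_rxq is assumed to be an already registered struct
xdp_rxq_info; the surrounding names are placeholders):

	err = xdp_rxq_info_reg_mem_model(&rx->xdp_rxq,
					 MEM_TYPE_XSK_BUFF_POOL, NULL);
	if (err)
		return err;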

The API is defined in net/xdp_sock_drv.h.

The buffer type is struct xdp_buff, and it follows the lifetime of
regular xdp_buffs, i.e. the lifetime of an xdp_buff is restricted to
a NAPI context. In other words, the API is not replacing xdp_frames.
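
As a hedged sketch of what this means in a driver's NAPI poll (the
verdict handling below is hypothetical and simplified): every buffer
must either be handed off, e.g. redirected, or freed before the poll
returns.

	switch (bpf_prog_run_xdp(xdp_prog, xdp)) {
	case XDP_REDIRECT:
		/* Hands the buffer off; on success it is no longer ours. */
		err = xdp_do_redirect(netdev, xdp, xdp_prog);
		break;
	case XDP_DROP:
	default:
		xsk_buff_free(xdp); /* back to the pool, still in NAPI */
		break;
	}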

In addition to introducing the API and its implementation, the
AF_XDP core is migrated to use the new API.

rfc->v1: Fixed build errors/warnings for m68k and riscv. (kbuild test
         robot)
         Added headroom/chunk size getter. (Maxim/Björn)

v1->v2: Swapped SoBs. (Maxim)

v2->v3: Initialize struct xdp_buff member frame_sz. (Björn)
        Add API to query the DMA address of a frame. (Maxim)
        Do DMA sync for CPU till the end of the frame to handle
        possible growth (frame_sz). (Maxim)

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200520192103.355233-6-bjorn.topel@gmail.com

@@ -9,6 +9,7 @@
 #include <linux/types.h>
 #include <linux/if_xdp.h>
 #include <net/xdp_sock.h>
+#include <net/xsk_buff_pool.h>
 
 #include "xsk.h"
@@ -172,31 +173,45 @@ out:
 	return false;
 }
 
+static inline bool xskq_cons_read_addr_aligned(struct xsk_queue *q, u64 *addr)
+{
+	struct xdp_umem_ring *ring = (struct xdp_umem_ring *)q->ring;
+
+	while (q->cached_cons != q->cached_prod) {
+		u32 idx = q->cached_cons & q->ring_mask;
+
+		*addr = ring->desc[idx];
+		if (xskq_cons_is_valid_addr(q, *addr))
+			return true;
+
+		q->cached_cons++;
+	}
+
+	return false;
+}
+
+static inline bool xskq_cons_read_addr_unchecked(struct xsk_queue *q, u64 *addr)
+{
+	struct xdp_umem_ring *ring = (struct xdp_umem_ring *)q->ring;
+
+	if (q->cached_cons != q->cached_prod) {
+		u32 idx = q->cached_cons & q->ring_mask;
+
+		*addr = ring->desc[idx];
+		return true;
+	}
+
+	return false;
+}
+
 static inline bool xskq_cons_is_valid_desc(struct xsk_queue *q,
 					   struct xdp_desc *d,
 					   struct xdp_umem *umem)
 {
-	if (umem->flags & XDP_UMEM_UNALIGNED_CHUNK_FLAG) {
-		if (!xskq_cons_is_valid_unaligned(q, d->addr, d->len, umem))
-			return false;
-
-		if (d->len > umem->chunk_size_nohr || d->options) {
-			q->invalid_descs++;
-			return false;
-		}
-
-		return true;
-	}
-
-	if (!xskq_cons_is_valid_addr(q, d->addr))
-		return false;
-
-	if (((d->addr + d->len) & q->chunk_mask) != (d->addr & q->chunk_mask) ||
-	    d->options) {
+	if (!xp_validate_desc(umem->pool, d)) {
 		q->invalid_descs++;
 		return false;
 	}
 
 	return true;
 }
@@ -260,6 +275,20 @@ static inline bool xskq_cons_peek_addr(struct xsk_queue *q, u64 *addr,
 	return xskq_cons_read_addr(q, addr, umem);
 }
 
+static inline bool xskq_cons_peek_addr_aligned(struct xsk_queue *q, u64 *addr)
+{
+	if (q->cached_prod == q->cached_cons)
+		xskq_cons_get_entries(q);
+
+	return xskq_cons_read_addr_aligned(q, addr);
+}
+
+static inline bool xskq_cons_peek_addr_unchecked(struct xsk_queue *q, u64 *addr)
+{
+	if (q->cached_prod == q->cached_cons)
+		xskq_cons_get_entries(q);
+
+	return xskq_cons_read_addr_unchecked(q, addr);
+}
+
 static inline bool xskq_cons_peek_desc(struct xsk_queue *q,
 				       struct xdp_desc *desc,
 				       struct xdp_umem *umem)