- =============================
- BPF Kernel Functions (kfuncs)
- =============================
- 1. Introduction
- ===============
- BPF Kernel Functions, more commonly known as kfuncs, are functions in the Linux
- kernel that are exposed for use by BPF programs. Unlike normal BPF helpers,
- kfuncs do not have a stable interface and can change from one kernel release to
- another. Hence, BPF programs that call them may need to be updated in response
- to changes in the kernel.
- 2. Defining a kfunc
- ===================
- There are two ways to expose a kernel function to BPF programs: either make an
- existing function in the kernel visible, or add a new wrapper for BPF. In both
- cases, care must be taken that BPF programs can only call such a function in a
- valid context. To enforce this, the visibility of a kfunc can be restricted per
- program type.
- If you are not creating a BPF wrapper for an existing kernel function, skip ahead
- to :ref:`BPF_kfunc_nodef`.
- 2.1 Creating a wrapper kfunc
- ----------------------------
- When defining a wrapper kfunc, the wrapper function should have external linkage.
- This prevents the compiler from optimizing away dead code, as this wrapper kfunc
- is not invoked anywhere in the kernel itself. It is not necessary to provide a
- prototype in a header for the wrapper kfunc.
- An example is given below::
-     /* Disables missing prototype warnings */
-     __diag_push();
-     __diag_ignore_all("-Wmissing-prototypes",
-                       "Global kfuncs as their definitions will be in BTF");
-     struct task_struct *bpf_find_get_task_by_vpid(pid_t nr)
-     {
-             return find_get_task_by_vpid(nr);
-     }
-     __diag_pop();
- A wrapper kfunc is often needed when we need to annotate parameters of the
- kfunc. Otherwise one may directly make the kfunc visible to the BPF program by
- registering it with the BPF subsystem. See :ref:`BPF_kfunc_nodef`.
- 2.2 Annotating kfunc parameters
- -------------------------------
- As with BPF helpers, the verifier sometimes needs additional context to make
- the use of kernel functions safer and more useful. To provide it, a parameter
- can be annotated by suffixing the name of the kfunc argument with __tag, where
- tag is one of the supported annotations.
- 2.2.1 __sz Annotation
- ---------------------
- This annotation is used to indicate a memory and size pair in the argument list.
- An example is given below::
-     void bpf_memzero(void *mem, int mem__sz)
-     {
-             ...
-     }
- Here, the verifier will treat the first argument as a PTR_TO_MEM, and the second
- argument as its size. By default, without the __sz annotation, the size of the
- type of the pointer is used. Without the __sz annotation, a kfunc cannot accept a
- void pointer.
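The contract implied by the ``__sz`` suffix can be modelled in plain userspace C. This is only a sketch under stated assumptions: ``model_memzero`` and its ``capacity`` parameter are hypothetical and are not the kernel's ``bpf_memzero``. The run-time bounds check stands in for what the verifier establishes when the program is loaded.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical userspace model of a (mem, mem__sz) argument pair.
 * The verifier checks at load time that mem__sz bytes behind mem are
 * accessible. Here the caller passes the real capacity of the buffer
 * so that the same condition can be checked at run time instead.
 */
static int model_memzero(void *mem, int mem__sz, size_t capacity)
{
	if (mem == NULL || mem__sz < 0 || (size_t)mem__sz > capacity)
		return -1;	/* a call the verifier would reject */
	memset(mem, 0, (size_t)mem__sz);
	return 0;
}
```

Calling ``model_memzero(buf, 4, sizeof(buf))`` on an 8-byte buffer clears only the first four bytes, while a size larger than the buffer is refused.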
- .. _BPF_kfunc_nodef:
- 2.3 Using an existing kernel function
- -------------------------------------
- When an existing function in the kernel is fit for consumption by BPF programs,
- it can be directly registered with the BPF subsystem. However, care must still
- be taken to review the context in which it will be invoked by the BPF program
- and whether it is safe to do so.
- 2.4 Annotating kfuncs
- ---------------------
- In addition to the arguments of kfuncs, the verifier may need more information
- about the type of kfunc(s) being registered with the BPF subsystem. To do so, we
- define flags on a set of kfuncs as follows::
-     BTF_SET8_START(bpf_task_set)
-     BTF_ID_FLAGS(func, bpf_get_task_pid, KF_ACQUIRE | KF_RET_NULL)
-     BTF_ID_FLAGS(func, bpf_put_pid, KF_RELEASE)
-     BTF_SET8_END(bpf_task_set)
- This set encodes the BTF ID of each kfunc listed above, and encodes the flags
- along with it. Specifying no flags is also allowed.
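One way to picture such a set is as a table of (ID, flags) pairs that can be searched by ID. The sketch below is a userspace model only: the ``MODEL_*`` names, the numeric IDs and the bit positions are illustrative assumptions, and the real definitions live in the kernel headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag bits, assumed for this model only. */
#define MODEL_KF_ACQUIRE	(1u << 0)
#define MODEL_KF_RELEASE	(1u << 1)
#define MODEL_KF_RET_NULL	(1u << 2)

struct model_id_flags {
	uint32_t id;	/* stands in for the BTF ID of a kfunc */
	uint32_t flags;	/* OR-ed MODEL_KF_* bits, 0 if none */
};

/* Made-up IDs; in the kernel the IDs are resolved at build time. */
static const struct model_id_flags model_task_set[] = {
	{ 101, MODEL_KF_ACQUIRE | MODEL_KF_RET_NULL },	/* bpf_get_task_pid */
	{ 102, MODEL_KF_RELEASE },			/* bpf_put_pid */
	{ 103, 0 },					/* a kfunc without flags */
};

/* Return a pointer to the flags stored for id, or NULL if id is absent. */
static const uint32_t *model_set_lookup(const struct model_id_flags *set,
					size_t n, uint32_t id)
{
	for (size_t i = 0; i < n; i++)
		if (set[i].id == id)
			return &set[i].flags;
	return NULL;
}
```

Returning a pointer rather than the flags value lets the caller tell "not in the set" (NULL) apart from "in the set with no flags" (a pointer to 0).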
- 2.4.1 KF_ACQUIRE flag
- ---------------------
- The KF_ACQUIRE flag is used to indicate that the kfunc returns a pointer to a
- refcounted object. The verifier will then ensure that the pointer to the object
- is eventually released using a release kfunc, or transferred to a map using a
- referenced kptr (by invoking bpf_kptr_xchg). If not, the verifier rejects the
- BPF program at load time. Loading succeeds only if no lingering references
- remain in any explored state of the program.
- 2.4.2 KF_RET_NULL flag
- ----------------------
- The KF_RET_NULL flag is used to indicate that the pointer returned by the kfunc
- may be NULL. Hence, it forces the user to do a NULL check on the pointer
- returned from the kfunc before making use of it (dereferencing or passing to
- another helper). This flag is often paired with the KF_ACQUIRE flag, but the
- two are orthogonal to each other.
- 2.4.3 KF_RELEASE flag
- ---------------------
- The KF_RELEASE flag is used to indicate that the kfunc releases the pointer
- passed in to it. Only one referenced pointer can be passed in. All copies of
- the pointer being released are invalidated as a result of invoking a kfunc
- with this flag.
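The calling discipline that the three flags above impose on a BPF program can be modelled in userspace C. This is a minimal sketch, not kernel code: ``model_obj``, ``model_acquire``, ``model_release`` and ``model_use`` are hypothetical names, and the checks the verifier performs statically are simply written out as ordinary control flow.

```c
#include <stddef.h>

struct model_obj {
	int refcount;	/* 0 means the object is already gone */
};

/* Models a KF_ACQUIRE | KF_RET_NULL kfunc: takes a reference or returns NULL. */
static struct model_obj *model_acquire(struct model_obj *obj)
{
	if (obj == NULL || obj->refcount == 0)
		return NULL;
	obj->refcount++;
	return obj;
}

/* Models a KF_RELEASE kfunc: drops the reference taken by model_acquire(). */
static void model_release(struct model_obj *obj)
{
	obj->refcount--;
}

/* The pattern the verifier insists on. Returns 0 if the object was used. */
static int model_use(struct model_obj *candidate)
{
	struct model_obj *ref = model_acquire(candidate);

	if (ref == NULL)	/* mandatory NULL check before any use */
		return -1;
	/* ... ref may be dereferenced or passed on here ... */
	model_release(ref);	/* no reference may outlive the program */
	return 0;
}
```

After ``model_use`` returns, the reference count is back at its starting value on every path, which is the property the verifier checks for each explored state.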
- 2.4.4 KF_KPTR_GET flag
- ----------------------
- The KF_KPTR_GET flag is used to indicate that the kfunc takes the first argument
- as a pointer to kptr, safely increments the refcount of the object it points to,
- and returns a reference to the user. The rest of the arguments may be normal
- arguments of a kfunc. The KF_KPTR_GET flag should be used in conjunction with
- KF_ACQUIRE and KF_RET_NULL flags.
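The shape of such a kfunc can be sketched in userspace C as follows. The names ``model_obj`` and ``model_kptr_get`` are hypothetical, and the model is single-threaded, so it leaves out the synchronization a real implementation needs in order to increment the refcount safely.

```c
#include <stddef.h>

struct model_obj {
	int refcount;	/* 0 means the object is being torn down */
};

/*
 * Hypothetical model of a KF_KPTR_GET kfunc. The first argument points
 * at the slot that holds the kptr, not at the object itself.
 */
static struct model_obj *model_kptr_get(struct model_obj **slot)
{
	struct model_obj *obj = *slot;

	if (obj == NULL || obj->refcount == 0)
		return NULL;	/* KF_RET_NULL: nothing to hand out */
	obj->refcount++;	/* KF_ACQUIRE: the caller now owns a reference */
	return obj;
}
```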
- 2.4.5 KF_TRUSTED_ARGS flag
- --------------------------
- The KF_TRUSTED_ARGS flag is used for kfuncs taking pointer arguments. It
- indicates that all pointer arguments will always have a guaranteed lifetime,
- and that pointers to kernel objects are always passed to helpers in their
- unmodified form (as obtained from acquire kfuncs).
- It can be used to enforce that a pointer to a refcounted object acquired from a
- kfunc or BPF helper is passed as an argument to this kfunc without any
- modifications (e.g. pointer arithmetic) such that it is trusted and points to
- the original object.
- Meanwhile, it is also allowed to pass pointers to normal memory to such kfuncs,
- and those can have a non-zero offset.
- This flag is often used for kfuncs that operate (change some property, perform
- some operation) on an object that was obtained using an acquire kfunc. Such
- kfuncs need an unchanged pointer to ensure the integrity of the operation being
- performed on the expected object.
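The rule described above can be condensed into a small predicate. The sketch below is a userspace model under stated assumptions: ``model_arg`` and ``model_trusted_arg_ok`` are hypothetical and only mirror the wording of this section, not the verifier's actual data structures.

```c
#include <stdbool.h>

enum model_ptr_kind {
	MODEL_PTR_TO_OBJECT,	/* pointer to a kernel object */
	MODEL_PTR_TO_MEM,	/* pointer to normal memory */
};

struct model_arg {
	enum model_ptr_kind kind;
	bool referenced;	/* obtained from an acquire kfunc or helper */
	int offset;		/* accumulated pointer arithmetic, in bytes */
};

/* Would this argument be acceptable to a KF_TRUSTED_ARGS kfunc? */
static bool model_trusted_arg_ok(const struct model_arg *arg)
{
	if (arg->kind == MODEL_PTR_TO_MEM)
		return true;	/* normal memory may carry an offset */
	/* object pointers must be referenced and unmodified */
	return arg->referenced && arg->offset == 0;
}
```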
- 2.4.6 KF_SLEEPABLE flag
- -----------------------
- The KF_SLEEPABLE flag is used for kfuncs that may sleep. Such kfuncs can only
- be called by sleepable BPF programs (BPF_F_SLEEPABLE).
- 2.4.7 KF_DESTRUCTIVE flag
- --------------------------
- The KF_DESTRUCTIVE flag is used to indicate kfuncs whose invocation is
- destructive to the system. For example, such a call can result in the system
- rebooting or panicking. Because of this, additional restrictions apply to these
- calls. At the moment they only require the CAP_SYS_BOOT capability, but more can
- be added later.
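The restrictions attached to KF_SLEEPABLE and KF_DESTRUCTIVE amount to two gate checks on the calling program. A userspace sketch, assuming hypothetical names (``model_prog``, ``model_call_allowed``) and illustrative flag bits, might look like this:

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag bits, assumed for this model only. */
#define MODEL_KF_SLEEPABLE	(1u << 5)
#define MODEL_KF_DESTRUCTIVE	(1u << 6)

struct model_prog {
	bool sleepable;		/* program was loaded with BPF_F_SLEEPABLE */
	bool cap_sys_boot;	/* loader holds CAP_SYS_BOOT */
};

/* May a program with these properties call a kfunc with these flags? */
static bool model_call_allowed(uint32_t kfunc_flags,
			       const struct model_prog *prog)
{
	if ((kfunc_flags & MODEL_KF_SLEEPABLE) && !prog->sleepable)
		return false;
	if ((kfunc_flags & MODEL_KF_DESTRUCTIVE) && !prog->cap_sys_boot)
		return false;
	return true;
}
```

The two checks are independent, so a kfunc carrying both flags is callable only from a sleepable program whose loader also holds the capability.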
- 2.5 Registering the kfuncs
- --------------------------
- Once the kfunc is prepared for use, the final step to making it visible is
- registering it with the BPF subsystem. Registration is done per BPF program
- type. An example is shown below::
-     BTF_SET8_START(bpf_task_set)
-     BTF_ID_FLAGS(func, bpf_get_task_pid, KF_ACQUIRE | KF_RET_NULL)
-     BTF_ID_FLAGS(func, bpf_put_pid, KF_RELEASE)
-     BTF_SET8_END(bpf_task_set)
-     static const struct btf_kfunc_id_set bpf_task_kfunc_set = {
-             .owner = THIS_MODULE,
-             .set   = &bpf_task_set,
-     };
-     static int init_subsystem(void)
-     {
-             return register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING, &bpf_task_kfunc_set);
-     }
-     late_initcall(init_subsystem);
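The effect of registering per program type can be illustrated with a small userspace model. Everything in it is an assumption made for illustration: the ``model_*`` names are hypothetical, kfuncs are looked up by name rather than by BTF ID, and the model keeps a single set per program type for brevity.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum model_prog_type {
	MODEL_PROG_TRACING,
	MODEL_PROG_XDP,
	MODEL_PROG_TYPE_MAX,
};

struct model_kfunc_set {
	const char *const *names;	/* kfunc names in this set */
	size_t count;
};

/* One slot per program type; NULL means nothing is registered. */
static const struct model_kfunc_set *model_registry[MODEL_PROG_TYPE_MAX];

/* Make every kfunc in set visible to programs of the given type. */
static int model_register_set(enum model_prog_type type,
			      const struct model_kfunc_set *set)
{
	if ((unsigned int)type >= MODEL_PROG_TYPE_MAX)
		return -1;
	model_registry[type] = set;
	return 0;
}

/* Can a program of this type call the named kfunc? */
static bool model_kfunc_visible(enum model_prog_type type, const char *name)
{
	const struct model_kfunc_set *set;

	if ((unsigned int)type >= MODEL_PROG_TYPE_MAX)
		return false;
	set = model_registry[type];
	for (size_t i = 0; set != NULL && i < set->count; i++)
		if (strcmp(set->names[i], name) == 0)
			return true;
	return false;
}
```

A set registered for the tracing type in this model stays invisible to the XDP type, which mirrors the per-type visibility described in this section.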