123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495496497498499500501502503504505506507508509510511512513514515516517518519520521522523524525526527528529530531532533534535536537538539540541542543544545546547548549550551552553554555556557558559560561562563564565566567568569570571572573574575576577578579580581582583584585586587588589590591592593594595596597598599600601602603604605606607608609610611612613614615616617618619620621622623624625626627628629630631632633634635636637638639640641642643644645646647648649650651652653654655656657658659660661662663664665666667668669670671672673674675676677678679680681 |
- .. SPDX-License-Identifier: BSD-3-Clause
- =======================
- Introduction to Netlink
- =======================
- Netlink is often described as an ioctl() replacement.
- It aims to replace fixed-format C structures as supplied
- to ioctl() with a format which allows an easy way to add
- or extended the arguments.
- To achieve this Netlink uses a minimal fixed-format metadata header
- followed by multiple attributes in the TLV (type, length, value) format.
- Unfortunately the protocol has evolved over the years, in an organic
- and undocumented fashion, making it hard to coherently explain.
- To make the most practical sense this document starts by describing
- netlink as it is used today and dives into more "historical" uses
- in later sections.
- Opening a socket
- ================
- Netlink communication happens over sockets, a socket needs to be
- opened first:
- .. code-block:: c
- fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC);
- The use of sockets allows for a natural way of exchanging information
- in both directions (to and from the kernel). The operations are still
- performed synchronously when applications send() the request but
- a separate recv() system call is needed to read the reply.
- A very simplified flow of a Netlink "call" will therefore look
- something like:
- .. code-block:: c
- fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC);
- /* format the request */
- send(fd, &request, sizeof(request));
- n = recv(fd, &response, RSP_BUFFER_SIZE);
- /* interpret the response */
- Netlink also provides natural support for "dumping", i.e. communicating
- to user space all objects of a certain type (e.g. dumping all network
- interfaces).
- .. code-block:: c
- fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC);
- /* format the dump request */
- send(fd, &request, sizeof(request));
- while (1) {
- n = recv(fd, &buffer, RSP_BUFFER_SIZE);
- /* one recv() call can read multiple messages, hence the loop below */
- for (nl_msg in buffer) {
- if (nl_msg.nlmsg_type == NLMSG_DONE)
- goto dump_finished;
- /* process the object */
- }
- }
- dump_finished:
- The first two arguments of the socket() call require little explanation -
- it is opening a Netlink socket, with all headers provided by the user
- (hence NETLINK, RAW). The last argument is the protocol within Netlink.
- This field used to identify the subsystem with which the socket will
- communicate.
- Classic vs Generic Netlink
- --------------------------
- Initial implementation of Netlink depended on a static allocation
- of IDs to subsystems and provided little supporting infrastructure.
- Let us refer to those protocols collectively as **Classic Netlink**.
- The list of them is defined on top of the ``include/uapi/linux/netlink.h``
- file, they include among others - general networking (NETLINK_ROUTE),
- iSCSI (NETLINK_ISCSI), and audit (NETLINK_AUDIT).
- **Generic Netlink** (introduced in 2005) allows for dynamic registration of
- subsystems (and subsystem ID allocation), introspection and simplifies
- implementing the kernel side of the interface.
- The following section describes how to use Generic Netlink, as the
- number of subsystems using Generic Netlink outnumbers the older
- protocols by an order of magnitude. There are also no plans for adding
- more Classic Netlink protocols to the kernel.
- Basic information on how communicating with core networking parts of
- the Linux kernel (or another of the 20 subsystems using Classic
- Netlink) differs from Generic Netlink is provided later in this document.
- Generic Netlink
- ===============
- In addition to the Netlink fixed metadata header each Netlink protocol
- defines its own fixed metadata header. (Similarly to how network
- headers stack - Ethernet > IP > TCP we have Netlink > Generic N. > Family.)
- A Netlink message always starts with struct nlmsghdr, which is followed
- by a protocol-specific header. In case of Generic Netlink the protocol
- header is struct genlmsghdr.
- The practical meaning of the fields in case of Generic Netlink is as follows:
- .. code-block:: c
- struct nlmsghdr {
- __u32 nlmsg_len; /* Length of message including headers */
- __u16 nlmsg_type; /* Generic Netlink Family (subsystem) ID */
- __u16 nlmsg_flags; /* Flags - request or dump */
- __u32 nlmsg_seq; /* Sequence number */
- __u32 nlmsg_pid; /* Port ID, set to 0 */
- };
- struct genlmsghdr {
- __u8 cmd; /* Command, as defined by the Family */
- __u8 version; /* Irrelevant, set to 1 */
- __u16 reserved; /* Reserved, set to 0 */
- };
- /* TLV attributes follow... */
- In Classic Netlink :c:member:`nlmsghdr.nlmsg_type` used to identify
- which operation within the subsystem the message was referring to
- (e.g. get information about a netdev). Generic Netlink needs to mux
- multiple subsystems in a single protocol so it uses this field to
- identify the subsystem, and :c:member:`genlmsghdr.cmd` identifies
- the operation instead. (See :ref:`res_fam` for
- information on how to find the Family ID of the subsystem of interest.)
- Note that the first 16 values (0 - 15) of this field are reserved for
- control messages both in Classic Netlink and Generic Netlink.
- See :ref:`nl_msg_type` for more details.
- There are 3 usual types of message exchanges on a Netlink socket:
- - performing a single action (``do``);
- - dumping information (``dump``);
- - getting asynchronous notifications (``multicast``).
- Classic Netlink is very flexible and presumably allows other types
- of exchanges to happen, but in practice those are the three that get
- used.
- Asynchronous notifications are sent by the kernel and received by
- the user sockets which subscribed to them. ``do`` and ``dump`` requests
- are initiated by the user. :c:member:`nlmsghdr.nlmsg_flags` should
- be set as follows:
- - for ``do``: ``NLM_F_REQUEST | NLM_F_ACK``
- - for ``dump``: ``NLM_F_REQUEST | NLM_F_ACK | NLM_F_DUMP``
- :c:member:`nlmsghdr.nlmsg_seq` should be a set to a monotonically
- increasing value. The value gets echoed back in responses and doesn't
- matter in practice, but setting it to an increasing value for each
- message sent is considered good hygiene. The purpose of the field is
- matching responses to requests. Asynchronous notifications will have
- :c:member:`nlmsghdr.nlmsg_seq` of ``0``.
- :c:member:`nlmsghdr.nlmsg_pid` is the Netlink equivalent of an address.
- This field can be set to ``0`` when talking to the kernel.
- See :ref:`nlmsg_pid` for the (uncommon) uses of the field.
- The expected use for :c:member:`genlmsghdr.version` was to allow
- versioning of the APIs provided by the subsystems. No subsystem to
- date made significant use of this field, so setting it to ``1`` seems
- like a safe bet.
- .. _nl_msg_type:
- Netlink message types
- ---------------------
- As previously mentioned :c:member:`nlmsghdr.nlmsg_type` carries
- protocol specific values but the first 16 identifiers are reserved
- (first subsystem specific message type should be equal to
- ``NLMSG_MIN_TYPE`` which is ``0x10``).
- There are only 4 Netlink control messages defined:
- - ``NLMSG_NOOP`` - ignore the message, not used in practice;
- - ``NLMSG_ERROR`` - carries the return code of an operation;
- - ``NLMSG_DONE`` - marks the end of a dump;
- - ``NLMSG_OVERRUN`` - socket buffer has overflown, not used to date.
- ``NLMSG_ERROR`` and ``NLMSG_DONE`` are of practical importance.
- They carry return codes for operations. Note that unless
- the ``NLM_F_ACK`` flag is set on the request Netlink will not respond
- with ``NLMSG_ERROR`` if there is no error. To avoid having to special-case
- this quirk it is recommended to always set ``NLM_F_ACK``.
- The format of ``NLMSG_ERROR`` is described by struct nlmsgerr::
- ----------------------------------------------
- | struct nlmsghdr - response header |
- ----------------------------------------------
- | int error |
- ----------------------------------------------
- | struct nlmsghdr - original request header |
- ----------------------------------------------
- | ** optionally (1) payload of the request |
- ----------------------------------------------
- | ** optionally (2) extended ACK |
- ----------------------------------------------
- There are two instances of struct nlmsghdr here, first of the response
- and second of the request. ``NLMSG_ERROR`` carries the information about
- the request which led to the error. This could be useful when trying
- to match requests to responses or re-parse the request to dump it into
- logs.
- The payload of the request is not echoed in messages reporting success
- (``error == 0``) or if ``NETLINK_CAP_ACK`` setsockopt() was set.
- The latter is common
- and perhaps recommended as having to read a copy of every request back
- from the kernel is rather wasteful. The absence of request payload
- is indicated by ``NLM_F_CAPPED`` in :c:member:`nlmsghdr.nlmsg_flags`.
- The second optional element of ``NLMSG_ERROR`` are the extended ACK
- attributes. See :ref:`ext_ack` for more details. The presence
- of extended ACK is indicated by ``NLM_F_ACK_TLVS`` in
- :c:member:`nlmsghdr.nlmsg_flags`.
- ``NLMSG_DONE`` is simpler, the request is never echoed but the extended
- ACK attributes may be present::
- ----------------------------------------------
- | struct nlmsghdr - response header |
- ----------------------------------------------
- | int error |
- ----------------------------------------------
- | ** optionally extended ACK |
- ----------------------------------------------
- .. _res_fam:
- Resolving the Family ID
- -----------------------
- This section explains how to find the Family ID of a subsystem.
- It also serves as an example of Generic Netlink communication.
- Generic Netlink is itself a subsystem exposed via the Generic Netlink API.
- To avoid a circular dependency Generic Netlink has a statically allocated
- Family ID (``GENL_ID_CTRL`` which is equal to ``NLMSG_MIN_TYPE``).
- The Generic Netlink family implements a command used to find out information
- about other families (``CTRL_CMD_GETFAMILY``).
- To get information about the Generic Netlink family named for example
- ``"test1"`` we need to send a message on the previously opened Generic Netlink
- socket. The message should target the Generic Netlink Family (1), be a
- ``do`` (2) call to ``CTRL_CMD_GETFAMILY`` (3). A ``dump`` version of this
- call would make the kernel respond with information about *all* the families
- it knows about. Last but not least the name of the family in question has
- to be specified (4) as an attribute with the appropriate type::
- struct nlmsghdr:
- __u32 nlmsg_len: 32
- __u16 nlmsg_type: GENL_ID_CTRL // (1)
- __u16 nlmsg_flags: NLM_F_REQUEST | NLM_F_ACK // (2)
- __u32 nlmsg_seq: 1
- __u32 nlmsg_pid: 0
- struct genlmsghdr:
- __u8 cmd: CTRL_CMD_GETFAMILY // (3)
- __u8 version: 2 /* or 1, doesn't matter */
- __u16 reserved: 0
- struct nlattr: // (4)
- __u16 nla_len: 10
- __u16 nla_type: CTRL_ATTR_FAMILY_NAME
- char data: test1\0
- (padding:)
- char data: \0\0
- The length fields in Netlink (:c:member:`nlmsghdr.nlmsg_len`
- and :c:member:`nlattr.nla_len`) always *include* the header.
- Attribute headers in netlink must be aligned to 4 bytes from the start
- of the message, hence the extra ``\0\0`` after ``CTRL_ATTR_FAMILY_NAME``.
- The attribute lengths *exclude* the padding.
- If the family is found kernel will reply with two messages, the response
- with all the information about the family::
- /* Message #1 - reply */
- struct nlmsghdr:
- __u32 nlmsg_len: 136
- __u16 nlmsg_type: GENL_ID_CTRL
- __u16 nlmsg_flags: 0
- __u32 nlmsg_seq: 1 /* echoed from our request */
- __u32 nlmsg_pid: 5831 /* The PID of our user space process */
- struct genlmsghdr:
- __u8 cmd: CTRL_CMD_GETFAMILY
- __u8 version: 2
- __u16 reserved: 0
- struct nlattr:
- __u16 nla_len: 10
- __u16 nla_type: CTRL_ATTR_FAMILY_NAME
- char data: test1\0
- (padding:)
- data: \0\0
- struct nlattr:
- __u16 nla_len: 6
- __u16 nla_type: CTRL_ATTR_FAMILY_ID
- __u16: 123 /* The Family ID we are after */
- (padding:)
- char data: \0\0
- struct nlattr:
- __u16 nla_len: 9
- __u16 nla_type: CTRL_ATTR_FAMILY_VERSION
- __u16: 1
- /* ... etc, more attributes will follow. */
- And the error code (success) since ``NLM_F_ACK`` had been set on the request::
- /* Message #2 - the ACK */
- struct nlmsghdr:
- __u32 nlmsg_len: 36
- __u16 nlmsg_type: NLMSG_ERROR
- __u16 nlmsg_flags: NLM_F_CAPPED /* There won't be a payload */
- __u32 nlmsg_seq: 1 /* echoed from our request */
- __u32 nlmsg_pid: 5831 /* The PID of our user space process */
- int error: 0
- struct nlmsghdr: /* Copy of the request header as we sent it */
- __u32 nlmsg_len: 32
- __u16 nlmsg_type: GENL_ID_CTRL
- __u16 nlmsg_flags: NLM_F_REQUEST | NLM_F_ACK
- __u32 nlmsg_seq: 1
- __u32 nlmsg_pid: 0
- The order of attributes (struct nlattr) is not guaranteed so the user
- has to walk the attributes and parse them.
- Note that Generic Netlink sockets are not associated or bound to a single
- family. A socket can be used to exchange messages with many different
- families, selecting the recipient family on message-by-message basis using
- the :c:member:`nlmsghdr.nlmsg_type` field.
- .. _ext_ack:
- Extended ACK
- ------------
- Extended ACK controls reporting of additional error/warning TLVs
- in ``NLMSG_ERROR`` and ``NLMSG_DONE`` messages. To maintain backward
- compatibility this feature has to be explicitly enabled by setting
- the ``NETLINK_EXT_ACK`` setsockopt() to ``1``.
- Types of extended ack attributes are defined in enum nlmsgerr_attrs.
- The most commonly used attributes are ``NLMSGERR_ATTR_MSG``,
- ``NLMSGERR_ATTR_OFFS`` and ``NLMSGERR_ATTR_MISS_*``.
- ``NLMSGERR_ATTR_MSG`` carries a message in English describing
- the encountered problem. These messages are far more detailed
- than what can be expressed thru standard UNIX error codes.
- ``NLMSGERR_ATTR_OFFS`` points to the attribute which caused the problem.
- ``NLMSGERR_ATTR_MISS_TYPE`` and ``NLMSGERR_ATTR_MISS_NEST``
- inform about a missing attribute.
- Extended ACKs can be reported on errors as well as in case of success.
- The latter should be treated as a warning.
- Extended ACKs greatly improve the usability of Netlink and should
- always be enabled, appropriately parsed and reported to the user.
- Advanced topics
- ===============
- Dump consistency
- ----------------
- Some of the data structures kernel uses for storing objects make
- it hard to provide an atomic snapshot of all the objects in a dump
- (without impacting the fast-paths updating them).
- Kernel may set the ``NLM_F_DUMP_INTR`` flag on any message in a dump
- (including the ``NLMSG_DONE`` message) if the dump was interrupted and
- may be inconsistent (e.g. missing objects). User space should retry
- the dump if it sees the flag set.
- Introspection
- -------------
- The basic introspection abilities are enabled by access to the Family
- object as reported in :ref:`res_fam`. User can query information about
- the Generic Netlink family, including which operations are supported
- by the kernel and what attributes the kernel understands.
- Family information includes the highest ID of an attribute kernel can parse,
- a separate command (``CTRL_CMD_GETPOLICY``) provides detailed information
- about supported attributes, including ranges of values the kernel accepts.
- Querying family information is useful in cases when user space needs
- to make sure that the kernel has support for a feature before issuing
- a request.
- .. _nlmsg_pid:
- nlmsg_pid
- ---------
- :c:member:`nlmsghdr.nlmsg_pid` is the Netlink equivalent of an address.
- It is referred to as Port ID, sometimes Process ID because for historical
- reasons if the application does not select (bind() to) an explicit Port ID
- kernel will automatically assign it the ID equal to its Process ID
- (as reported by the getpid() system call).
- Similarly to the bind() semantics of the TCP/IP network protocols the value
- of zero means "assign automatically", hence it is common for applications
- to leave the :c:member:`nlmsghdr.nlmsg_pid` field initialized to ``0``.
- The field is still used today in rare cases when kernel needs to send
- a unicast notification. User space application can use bind() to associate
- its socket with a specific PID, it then communicates its PID to the kernel.
- This way the kernel can reach the specific user space process.
- This sort of communication is utilized in UMH (User Mode Helper)-like
- scenarios when kernel needs to trigger user space processing or ask user
- space for a policy decision.
- Multicast notifications
- -----------------------
- One of the strengths of Netlink is the ability to send event notifications
- to user space. This is a unidirectional form of communication (kernel ->
- user) and does not involve any control messages like ``NLMSG_ERROR`` or
- ``NLMSG_DONE``.
- For example the Generic Netlink family itself defines a set of multicast
- notifications about registered families. When a new family is added the
- sockets subscribed to the notifications will get the following message::
- struct nlmsghdr:
- __u32 nlmsg_len: 136
- __u16 nlmsg_type: GENL_ID_CTRL
- __u16 nlmsg_flags: 0
- __u32 nlmsg_seq: 0
- __u32 nlmsg_pid: 0
- struct genlmsghdr:
- __u8 cmd: CTRL_CMD_NEWFAMILY
- __u8 version: 2
- __u16 reserved: 0
- struct nlattr:
- __u16 nla_len: 10
- __u16 nla_type: CTRL_ATTR_FAMILY_NAME
- char data: test1\0
- (padding:)
- data: \0\0
- struct nlattr:
- __u16 nla_len: 6
- __u16 nla_type: CTRL_ATTR_FAMILY_ID
- __u16: 123 /* The Family ID we are after */
- (padding:)
- char data: \0\0
- struct nlattr:
- __u16 nla_len: 9
- __u16 nla_type: CTRL_ATTR_FAMILY_VERSION
- __u16: 1
- /* ... etc, more attributes will follow. */
- The notification contains the same information as the response
- to the ``CTRL_CMD_GETFAMILY`` request.
- The Netlink headers of the notification are mostly 0 and irrelevant.
- The :c:member:`nlmsghdr.nlmsg_seq` may be either zero or a monotonically
- increasing notification sequence number maintained by the family.
- To receive notifications the user socket must subscribe to the relevant
- notification group. Much like the Family ID, the Group ID for a given
- multicast group is dynamic and can be found inside the Family information.
- The ``CTRL_ATTR_MCAST_GROUPS`` attribute contains nests with names
- (``CTRL_ATTR_MCAST_GRP_NAME``) and IDs (``CTRL_ATTR_MCAST_GRP_ID``) of
- the groups family.
- Once the Group ID is known a setsockopt() call adds the socket to the group:
- .. code-block:: c
- unsigned int group_id;
- /* .. find the group ID... */
- setsockopt(fd, SOL_NETLINK, NETLINK_ADD_MEMBERSHIP,
- &group_id, sizeof(group_id));
- The socket will now receive notifications.
- It is recommended to use separate sockets for receiving notifications
- and sending requests to the kernel. The asynchronous nature of notifications
- means that they may get mixed in with the responses making the message
- handling much harder.
- Buffer sizing
- -------------
- Netlink sockets are datagram sockets rather than stream sockets,
- meaning that each message must be received in its entirety by a single
- recv()/recvmsg() system call. If the buffer provided by the user is too
- short, the message will be truncated and the ``MSG_TRUNC`` flag set
- in struct msghdr (struct msghdr is the second argument
- of the recvmsg() system call, *not* a Netlink header).
- Upon truncation the remaining part of the message is discarded.
- Netlink expects that the user buffer will be at least 8kB or a page
- size of the CPU architecture, whichever is bigger. Particular Netlink
- families may, however, require a larger buffer. 32kB buffer is recommended
- for most efficient handling of dumps (larger buffer fits more dumped
- objects and therefore fewer recvmsg() calls are needed).
- Classic Netlink
- ===============
- The main differences between Classic and Generic Netlink are the dynamic
- allocation of subsystem identifiers and availability of introspection.
- In theory the protocol does not differ significantly, however, in practice
- Classic Netlink experimented with concepts which were abandoned in Generic
- Netlink (really, they usually only found use in a small corner of a single
- subsystem). This section is meant as an explainer of a few of such concepts,
- with the explicit goal of giving the Generic Netlink
- users the confidence to ignore them when reading the uAPI headers.
- Most of the concepts and examples here refer to the ``NETLINK_ROUTE`` family,
- which covers much of the configuration of the Linux networking stack.
- Real documentation of that family, deserves a chapter (or a book) of its own.
- Families
- --------
- Netlink refers to subsystems as families. This is a remnant of using
- sockets and the concept of protocol families, which are part of message
- demultiplexing in ``NETLINK_ROUTE``.
- Sadly every layer of encapsulation likes to refer to whatever it's carrying
- as "families" making the term very confusing:
- 1. AF_NETLINK is a bona fide socket protocol family
- 2. AF_NETLINK's documentation refers to what comes after its own
- header (struct nlmsghdr) in a message as a "Family Header"
- 3. Generic Netlink is a family for AF_NETLINK (struct genlmsghdr follows
- struct nlmsghdr), yet it also calls its users "Families".
- Note that the Generic Netlink Family IDs are in a different "ID space"
- and overlap with Classic Netlink protocol numbers (e.g. ``NETLINK_CRYPTO``
- has the Classic Netlink protocol ID of 21 which Generic Netlink will
- happily allocate to one of its families as well).
- Strict checking
- ---------------
- The ``NETLINK_GET_STRICT_CHK`` socket option enables strict input checking
- in ``NETLINK_ROUTE``. It was needed because historically kernel did not
- validate the fields of structures it didn't process. This made it impossible
- to start using those fields later without risking regressions in applications
- which initialized them incorrectly or not at all.
- ``NETLINK_GET_STRICT_CHK`` declares that the application is initializing
- all fields correctly. It also opts into validating that message does not
- contain trailing data and requests that kernel rejects attributes with
- type higher than largest attribute type known to the kernel.
- ``NETLINK_GET_STRICT_CHK`` is not used outside of ``NETLINK_ROUTE``.
- Unknown attributes
- ------------------
- Historically Netlink ignored all unknown attributes. The thinking was that
- it would free the application from having to probe what kernel supports.
- The application could make a request to change the state and check which
- parts of the request "stuck".
- This is no longer the case for new Generic Netlink families and those opting
- in to strict checking. See enum netlink_validation for validation types
- performed.
- Fixed metadata and structures
- -----------------------------
- Classic Netlink made liberal use of fixed-format structures within
- the messages. Messages would commonly have a structure with
- a considerable number of fields after struct nlmsghdr. It was also
- common to put structures with multiple members inside attributes,
- without breaking each member into an attribute of its own.
- This has caused problems with validation and extensibility and
- therefore using binary structures is actively discouraged for new
- attributes.
- Request types
- -------------
- ``NETLINK_ROUTE`` categorized requests into 4 types ``NEW``, ``DEL``, ``GET``,
- and ``SET``. Each object can handle all or some of those requests
- (objects being netdevs, routes, addresses, qdiscs etc.) Request type
- is defined by the 2 lowest bits of the message type, so commands for
- new objects would always be allocated with a stride of 4.
- Each object would also have it's own fixed metadata shared by all request
- types (e.g. struct ifinfomsg for netdev requests, struct ifaddrmsg for address
- requests, struct tcmsg for qdisc requests).
- Even though other protocols and Generic Netlink commands often use
- the same verbs in their message names (``GET``, ``SET``) the concept
- of request types did not find wider adoption.
- Notification echo
- -----------------
- ``NLM_F_ECHO`` requests for notifications resulting from the request
- to be queued onto the requesting socket. This is useful to discover
- the impact of the request.
- Note that this feature is not universally implemented.
- Other request-type-specific flags
- ---------------------------------
- Classic Netlink defined various flags for its ``GET``, ``NEW``
- and ``DEL`` requests in the upper byte of nlmsg_flags in struct nlmsghdr.
- Since request types have not been generalized the request type specific
- flags are rarely used (and considered deprecated for new families).
- For ``GET`` - ``NLM_F_ROOT`` and ``NLM_F_MATCH`` are combined into
- ``NLM_F_DUMP``, and not used separately. ``NLM_F_ATOMIC`` is never used.
- For ``DEL`` - ``NLM_F_NONREC`` is only used by nftables and ``NLM_F_BULK``
- only by FDB some operations.
- The flags for ``NEW`` are used most commonly in classic Netlink. Unfortunately,
- the meaning is not crystal clear. The following description is based on the
- best guess of the intention of the authors, and in practice all families
- stray from it in one way or another. ``NLM_F_REPLACE`` asks to replace
- an existing object, if no matching object exists the operation should fail.
- ``NLM_F_EXCL`` has the opposite semantics and only succeeds if object already
- existed.
- ``NLM_F_CREATE`` asks for the object to be created if it does not
- exist, it can be combined with ``NLM_F_REPLACE`` and ``NLM_F_EXCL``.
- A comment in the main Netlink uAPI header states::
- 4.4BSD ADD NLM_F_CREATE|NLM_F_EXCL
- 4.4BSD CHANGE NLM_F_REPLACE
- True CHANGE NLM_F_CREATE|NLM_F_REPLACE
- Append NLM_F_CREATE
- Check NLM_F_EXCL
- which seems to indicate that those flags predate request types.
- ``NLM_F_REPLACE`` without ``NLM_F_CREATE`` was initially used instead
- of ``SET`` commands.
- ``NLM_F_EXCL`` without ``NLM_F_CREATE`` was used to check if object exists
- without creating it, presumably predating ``GET`` commands.
- ``NLM_F_APPEND`` indicates that if one key can have multiple objects associated
- with it (e.g. multiple next-hop objects for a route) the new object should be
- added to the list rather than replacing the entire list.
- uAPI reference
- ==============
- .. kernel-doc:: include/uapi/linux/netlink.h
|