Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Minor comment merge conflict in mlx5.

Staging driver has a fixup due to the skb->xmit_more changes in 'net-next',
but was removed in 'net'.

Signed-off-by: David S. Miller <davem@davemloft.net>
@@ -148,16 +148,16 @@ The ``btf_type.size * 8`` must be equal to or greater than ``BTF_INT_BITS()``
|
||||
for the type. The maximum value of ``BTF_INT_BITS()`` is 128.
|
||||
|
||||
The ``BTF_INT_OFFSET()`` specifies the starting bit offset to calculate values
|
||||
for this int. For example, a bitfield struct member has: * btf member bit
|
||||
offset 100 from the start of the structure, * btf member pointing to an int
|
||||
type, * the int type has ``BTF_INT_OFFSET() = 2`` and ``BTF_INT_BITS() = 4``
|
||||
for this int. For example, a bitfield struct member has:
|
||||
* btf member bit offset 100 from the start of the structure,
|
||||
* btf member pointing to an int type,
|
||||
* the int type has ``BTF_INT_OFFSET() = 2`` and ``BTF_INT_BITS() = 4``
|
||||
|
||||
Then in the struct memory layout, this member will occupy ``4`` bits starting
|
||||
from bits ``100 + 2 = 102``.
|
||||
|
||||
Alternatively, the bitfield struct member can be the following to access the
|
||||
same bits as the above:
|
||||
|
||||
* btf member bit offset 102,
|
||||
* btf member pointing to an int type,
|
||||
* the int type has ``BTF_INT_OFFSET() = 0`` and ``BTF_INT_BITS() = 4``
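As a quick check of the arithmetic above, the following userspace sketch computes the first bit a bitfield member occupies for both encodings. It is illustrative only and not kernel code; the struct `bitfield_member` and the helpers `int_encoding_ok()` and `first_bit()` are hypothetical names introduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a bitfield member: the member's bit offset within
 * the struct plus the BTF_INT_OFFSET()/BTF_INT_BITS() of its int type. */
struct bitfield_member {
	uint32_t member_bit_offset; /* btf member bit offset */
	uint32_t int_offset;        /* BTF_INT_OFFSET() */
	uint32_t int_bits;          /* BTF_INT_BITS() */
	uint32_t type_size;         /* btf_type.size, in bytes */
};

/* btf_type.size * 8 must be >= BTF_INT_BITS(), which is at most 128. */
static bool int_encoding_ok(const struct bitfield_member *m)
{
	return m->int_bits <= 128 && m->type_size * 8 >= m->int_bits;
}

/* First bit occupied by the member, counted from the start of the struct. */
static uint32_t first_bit(const struct bitfield_member *m)
{
	assert(int_encoding_ok(m));
	return m->member_bit_offset + m->int_offset;
}
```

With `{100, 2, 4, 4}` and `{102, 0, 4, 4}` as inputs, both encodings resolve to the same starting bit, which is the equivalence the text describes.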
|
||||
|
@@ -26,7 +26,7 @@ Required node properties:
|
||||
|
||||
Optional node properties:
|
||||
|
||||
- ti,mode: Operation mode (see above).
|
||||
- ti,mode: Operation mode (u8) (see above).
|
||||
|
||||
|
||||
Example (operation mode 2):
|
||||
@@ -34,5 +34,5 @@ Example (operation mode 2):
|
||||
adc128d818@1d {
|
||||
compatible = "ti,adc128d818";
|
||||
reg = <0x1d>;
|
||||
ti,mode = <2>;
|
||||
ti,mode = /bits/ 8 <2>;
|
||||
};
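The `/bits/ 8` prefix matters because a plain `<2>` is stored as a 32-bit big-endian cell, so a one-byte read sees the leading zero byte rather than the value. The sketch below models that in userspace C; `read_prop_u8()` and the two byte arrays are hypothetical and only assume that the driver reads the property as a single byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a property as a u8 by taking its first byte,
 * which is how a one-byte read sees the raw property data. */
static int read_prop_u8(const uint8_t *prop, size_t len, uint8_t *out)
{
	assert(prop != NULL && out != NULL);
	if (len < 1)
		return -1;
	*out = prop[0];
	return 0;
}

/* Raw property bytes as they would appear in the flattened tree
 * (cells are 32-bit big-endian). */
static const uint8_t mode_as_cell[] = { 0x00, 0x00, 0x00, 0x02 }; /* ti,mode = <2>; */
static const uint8_t mode_as_u8[]   = { 0x02 };                   /* ti,mode = /bits/ 8 <2>; */
```

Reading `mode_as_cell` through this helper yields 0, while `mode_as_u8` yields 2, which is why the example in the binding was changed.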
|
||||
|
@@ -16,6 +16,7 @@ Required properties:
|
||||
* "mediatek,mt8127-uart" for MT8127 compatible UARTS
|
||||
* "mediatek,mt8135-uart" for MT8135 compatible UARTS
|
||||
* "mediatek,mt8173-uart" for MT8173 compatible UARTS
|
||||
* "mediatek,mt8183-uart", "mediatek,mt6577-uart" for MT8183 compatible UARTS
|
||||
* "mediatek,mt6577-uart" for MT6577 and all of the above
|
||||
|
||||
- reg: The base address of the UART register bank.
|
||||
|
@@ -12,11 +12,13 @@ CONTENTS
|
||||
|
||||
(4) Filesystem context security.
|
||||
|
||||
(5) VFS filesystem context operations.
|
||||
(5) VFS filesystem context API.
|
||||
|
||||
(6) Parameter description.
|
||||
(6) Superblock creation helpers.
|
||||
|
||||
(7) Parameter helper functions.
|
||||
(7) Parameter description.
|
||||
|
||||
(8) Parameter helper functions.
|
||||
|
||||
|
||||
========
|
||||
@@ -41,12 +43,15 @@ The creation of new mounts is now to be done in a multistep process:
|
||||
|
||||
(7) Destroy the context.
|
||||
|
||||
To support this, the file_system_type struct gains a new field:
|
||||
To support this, the file_system_type struct gains two new fields:
|
||||
|
||||
int (*init_fs_context)(struct fs_context *fc);
|
||||
const struct fs_parameter_description *parameters;
|
||||
|
||||
which is invoked to set up the filesystem-specific parts of a filesystem
|
||||
context, including the additional space.
|
||||
The first is invoked to set up the filesystem-specific parts of a filesystem
|
||||
context, including the additional space, and the second points to the
|
||||
parameter description for validation at registration time and querying by a
|
||||
future system call.
|
||||
|
||||
Note that security initialisation is done *after* the filesystem is called so
|
||||
that the namespaces may be adjusted first.
|
||||
@@ -73,9 +78,9 @@ context. This is represented by the fs_context structure:
|
||||
void *s_fs_info;
|
||||
unsigned int sb_flags;
|
||||
unsigned int sb_flags_mask;
|
||||
unsigned int s_iflags;
|
||||
unsigned int lsm_flags;
|
||||
enum fs_context_purpose purpose:8;
|
||||
bool sloppy:1;
|
||||
bool silent:1;
|
||||
...
|
||||
};
|
||||
|
||||
@@ -141,6 +146,10 @@ The fs_context fields are as follows:
|
||||
|
||||
Which SB_* flag bits are to be set/cleared in super_block::s_flags.
|
||||
|
||||
(*) unsigned int s_iflags
|
||||
|
||||
These will be bitwise-OR'd with s->s_iflags when a superblock is created.
|
||||
|
||||
(*) enum fs_context_purpose
|
||||
|
||||
This indicates the purpose for which the context is intended. The
|
||||
@@ -150,17 +159,6 @@ The fs_context fields are as follows:
|
||||
FS_CONTEXT_FOR_SUBMOUNT -- New automatic submount of extant mount
|
||||
FS_CONTEXT_FOR_RECONFIGURE -- Change an existing mount
|
||||
|
||||
(*) bool sloppy
|
||||
(*) bool silent
|
||||
|
||||
These are set if the sloppy or silent mount options are given.
|
||||
|
||||
[NOTE] sloppy is probably unnecessary when userspace passes over one
|
||||
option at a time since the error can just be ignored if userspace deems it
|
||||
to be unimportant.
|
||||
|
||||
[NOTE] silent is probably redundant with sb_flags & SB_SILENT.
|
||||
|
||||
The mount context is created by calling vfs_new_fs_context() or
|
||||
vfs_dup_fs_context() and is destroyed with put_fs_context(). Note that the
|
||||
structure is not refcounted.
|
||||
@@ -342,28 +340,47 @@ number of operations used by the new mount code for this purpose:
|
||||
It should return 0 on success or a negative error code on failure.
|
||||
|
||||
|
||||
=================================
|
||||
VFS FILESYSTEM CONTEXT OPERATIONS
|
||||
=================================
|
||||
==========================
|
||||
VFS FILESYSTEM CONTEXT API
|
||||
==========================
|
||||
|
||||
There are four operations for creating a filesystem context and
|
||||
one for destroying a context:
|
||||
There are four operations for creating a filesystem context and one for
|
||||
destroying a context:
|
||||
|
||||
(*) struct fs_context *vfs_new_fs_context(struct file_system_type *fs_type,
|
||||
struct dentry *reference,
|
||||
unsigned int sb_flags,
|
||||
unsigned int sb_flags_mask,
|
||||
enum fs_context_purpose purpose);
|
||||
(*) struct fs_context *fs_context_for_mount(
|
||||
struct file_system_type *fs_type,
|
||||
unsigned int sb_flags);
|
||||
|
||||
Create a filesystem context for a given filesystem type and purpose. This
|
||||
allocates the filesystem context, sets the superblock flags, initialises
|
||||
the security and calls fs_type->init_fs_context() to initialise the
|
||||
filesystem private data.
|
||||
Allocate a filesystem context for the purpose of setting up a new mount,
|
||||
whether that be with a new superblock or sharing an existing one. This
|
||||
sets the superblock flags, initialises the security and calls
|
||||
fs_type->init_fs_context() to initialise the filesystem private data.
|
||||
|
||||
reference can be NULL or it may indicate the root dentry of a superblock
|
||||
that is going to be reconfigured (FS_CONTEXT_FOR_RECONFIGURE) or
|
||||
the automount point that triggered a submount (FS_CONTEXT_FOR_SUBMOUNT).
|
||||
This is provided as a source of namespace information.
|
||||
fs_type specifies the filesystem type that will manage the context and
|
||||
sb_flags presets the superblock flags stored therein.
|
||||
|
||||
(*) struct fs_context *fs_context_for_reconfigure(
|
||||
struct dentry *dentry,
|
||||
unsigned int sb_flags,
|
||||
unsigned int sb_flags_mask);
|
||||
|
||||
Allocate a filesystem context for the purpose of reconfiguring an
|
||||
existing superblock. dentry provides a reference to the superblock to be
|
||||
configured. sb_flags and sb_flags_mask indicate which superblock flags
|
||||
need changing and to what.
|
||||
|
||||
(*) struct fs_context *fs_context_for_submount(
|
||||
struct file_system_type *fs_type,
|
||||
struct dentry *reference);
|
||||
|
||||
Allocate a filesystem context for the purpose of creating a new mount for
|
||||
an automount point or other derived superblock. fs_type specifies the
|
||||
filesystem type that will manage the context and the reference dentry
|
||||
supplies the parameters. Namespaces are propagated from the reference
|
||||
dentry's superblock also.
|
||||
|
||||
Note that it's not a requirement that the reference dentry be of the same
|
||||
filesystem type as fs_type.
|
||||
|
||||
(*) struct fs_context *vfs_dup_fs_context(struct fs_context *src_fc);
|
||||
|
||||
@@ -390,20 +407,6 @@ context pointer or a negative error code.
|
||||
For the remaining operations, if an error occurs, a negative error code will be
|
||||
returned.
|
||||
|
||||
(*) int vfs_get_tree(struct fs_context *fc);
|
||||
|
||||
Get or create the mountable root and superblock, using the parameters in
|
||||
the filesystem context to select/configure the superblock. This invokes
|
||||
the ->validate() op and then the ->get_tree() op.
|
||||
|
||||
[NOTE] ->validate() could perhaps be rolled into ->get_tree() and
|
||||
->reconfigure().
|
||||
|
||||
(*) struct vfsmount *vfs_create_mount(struct fs_context *fc);
|
||||
|
||||
Create a mount given the parameters in the specified filesystem context.
|
||||
Note that this does not attach the mount to anything.
|
||||
|
||||
(*) int vfs_parse_fs_param(struct fs_context *fc,
|
||||
struct fs_parameter *param);
|
||||
|
||||
@@ -432,17 +435,80 @@ returned.
|
||||
clear the pointer, but then becomes responsible for disposing of the
|
||||
object.
|
||||
|
||||
(*) int vfs_parse_fs_string(struct fs_context *fc, char *key,
|
||||
(*) int vfs_parse_fs_string(struct fs_context *fc, const char *key,
|
||||
const char *value, size_t v_size);
|
||||
|
||||
A wrapper around vfs_parse_fs_param() that just passes a constant string.
|
||||
A wrapper around vfs_parse_fs_param() that copies the value string it is
|
||||
passed.
|
||||
|
||||
(*) int generic_parse_monolithic(struct fs_context *fc, void *data);
|
||||
|
||||
Parse a sys_mount() data page, assuming the form to be a text list
|
||||
consisting of key[=val] options separated by commas. Each item in the
|
||||
list is passed to vfs_mount_option(). This is the default when the
|
||||
->parse_monolithic() operation is NULL.
|
||||
->parse_monolithic() method is NULL.
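The comma-separated key[=val] splitting described for generic_parse_monolithic() can be sketched in userspace C as follows. This is a simplified model, not the kernel implementation; `parse_monolithic()`, `option_cb`, `option_count` and `count_option()` are hypothetical names, and the callback stands in for the per-option parse call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: receives one key and its optional value. */
typedef int (*option_cb)(const char *key, const char *value, void *ctx);

/* Split "key[=val],key[=val],..." in place and hand each item to cb.
 * Returns 0 on success or the first negative value returned by cb. */
static int parse_monolithic(char *data, option_cb cb, void *ctx)
{
	char *item = data;

	assert(cb != NULL);
	while (item && *item) {
		char *next = strchr(item, ',');
		char *value;
		int ret;

		if (next)
			*next++ = '\0';
		value = strchr(item, '=');
		if (value)
			*value++ = '\0';
		if (*item) {	/* skip empty items such as ",," */
			ret = cb(item, value, ctx);
			if (ret < 0)
				return ret;
		}
		item = next;
	}
	return 0;
}

/* Example callback: counts the options and those that carry a value. */
struct option_count {
	int total;
	int with_value;
};

static int count_option(const char *key, const char *value, void *ctx)
{
	struct option_count *c = ctx;

	(void)key;
	c->total++;
	if (value)
		c->with_value++;
	return 0;
}
```

The buffer is modified in place, so the caller must pass a writable copy of the option string.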
|
||||
|
||||
(*) int vfs_get_tree(struct fs_context *fc);
|
||||
|
||||
Get or create the mountable root and superblock, using the parameters in
|
||||
the filesystem context to select/configure the superblock. This invokes
|
||||
the ->get_tree() method.
|
||||
|
||||
(*) struct vfsmount *vfs_create_mount(struct fs_context *fc);
|
||||
|
||||
Create a mount given the parameters in the specified filesystem context.
|
||||
Note that this does not attach the mount to anything.
|
||||
|
||||
|
||||
===========================
|
||||
SUPERBLOCK CREATION HELPERS
|
||||
===========================
|
||||
|
||||
A number of VFS helpers are available for use by filesystems for the creation
|
||||
or looking up of superblocks.
|
||||
|
||||
(*) struct super_block *
|
||||
sget_fc(struct fs_context *fc,
|
||||
int (*test)(struct super_block *sb, struct fs_context *fc),
|
||||
int (*set)(struct super_block *sb, struct fs_context *fc));
|
||||
|
||||
This is the core routine. If test is non-NULL, it searches for an
|
||||
existing superblock matching the criteria held in the fs_context, using
|
||||
the test function to match them. If no match is found, a new superblock
|
||||
is created and the set function is called to set it up.
|
||||
|
||||
Prior to the set function being called, fc->s_fs_info will be transferred
|
||||
to sb->s_fs_info - and fc->s_fs_info will be cleared if set returns
|
||||
success (ie. 0).
|
||||
|
||||
The following helpers all wrap sget_fc():
|
||||
|
||||
(*) int vfs_get_super(struct fs_context *fc,
|
||||
enum vfs_get_super_keying keying,
|
||||
int (*fill_super)(struct super_block *sb,
|
||||
struct fs_context *fc))
|
||||
|
||||
This creates/looks up a deviceless superblock. The keying indicates how
|
||||
many superblocks of this type may exist and in what manner they may be
|
||||
shared:
|
||||
|
||||
(1) vfs_get_single_super
|
||||
|
||||
Only one such superblock may exist in the system. Any further
|
||||
attempt to get a new superblock gets this one (and any parameter
|
||||
differences are ignored).
|
||||
|
||||
(2) vfs_get_keyed_super
|
||||
|
||||
Multiple superblocks of this type may exist and they're keyed on
|
||||
their s_fs_info pointer (for example this may refer to a
|
||||
namespace).
|
||||
|
||||
(3) vfs_get_independent_super
|
||||
|
||||
Multiple independent superblocks of this type may exist. This
|
||||
function never matches an existing one and always creates a new
|
||||
one.
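The three keying modes can be modelled with a small userspace table of superblocks for one filesystem type. This is only a sketch of the sharing rules described above; `sb_model`, `sb_table`, `get_super()` and the enum names are hypothetical and do not correspond to kernel symbols.

```c
#include <assert.h>
#include <stddef.h>

enum keying { SINGLE_SUPER, KEYED_SUPER, INDEPENDENT_SUPER };

struct sb_model {
	void *s_fs_info;	/* the key used by KEYED_SUPER */
};

#define MAX_SB 8

/* All superblocks of one filesystem type in this model. */
struct sb_table {
	struct sb_model sb[MAX_SB];
	size_t count;
};

/* Return an existing superblock if the keying allows sharing, otherwise
 * create a new one.  Returns NULL if the model table is full. */
static struct sb_model *get_super(struct sb_table *t, enum keying keying,
				  void *fs_info)
{
	size_t i;

	assert(t != NULL);
	for (i = 0; i < t->count; i++) {
		if (keying == SINGLE_SUPER)
			return &t->sb[i];	/* parameter differences ignored */
		if (keying == KEYED_SUPER && t->sb[i].s_fs_info == fs_info)
			return &t->sb[i];	/* same key, share it */
		/* INDEPENDENT_SUPER never matches an existing superblock */
	}
	if (t->count == MAX_SB)
		return NULL;
	t->sb[t->count].s_fs_info = fs_info;
	return &t->sb[t->count++];
}
```

In this model a second request with the same key returns the same entry only for the single and keyed modes; the independent mode always allocates a new entry.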
|
||||
|
||||
|
||||
=====================
|
||||
@@ -454,35 +520,22 @@ There's a core description struct that links everything together:
|
||||
|
||||
struct fs_parameter_description {
|
||||
const char name[16];
|
||||
u8 nr_params;
|
||||
u8 nr_alt_keys;
|
||||
u8 nr_enums;
|
||||
bool ignore_unknown;
|
||||
bool no_source;
|
||||
const char *const *keys;
|
||||
const struct constant_table *alt_keys;
|
||||
const struct fs_parameter_spec *specs;
|
||||
const struct fs_parameter_enum *enums;
|
||||
};
|
||||
|
||||
For example:
|
||||
|
||||
enum afs_param {
|
||||
enum {
|
||||
Opt_autocell,
|
||||
Opt_bar,
|
||||
Opt_dyn,
|
||||
Opt_foo,
|
||||
Opt_source,
|
||||
nr__afs_params
|
||||
};
|
||||
|
||||
static const struct fs_parameter_description afs_fs_parameters = {
|
||||
.name = "kAFS",
|
||||
.nr_params = nr__afs_params,
|
||||
.nr_alt_keys = ARRAY_SIZE(afs_param_alt_keys),
|
||||
.nr_enums = ARRAY_SIZE(afs_param_enums),
|
||||
.keys = afs_param_keys,
|
||||
.alt_keys = afs_param_alt_keys,
|
||||
.specs = afs_param_specs,
|
||||
.enums = afs_param_enums,
|
||||
};
|
||||
@@ -494,28 +547,24 @@ The members are as follows:
|
||||
The name to be used in error messages generated by the parse helper
|
||||
functions.
|
||||
|
||||
(2) u8 nr_params;
|
||||
(2) const struct fs_parameter_specification *specs;
|
||||
|
||||
The number of discrete parameter identifiers. This indicates the number
|
||||
of elements in the ->types[] array and also limits the values that may be
|
||||
used in the values that the ->keys[] array maps to.
|
||||
Table of parameter specifications, terminated with a null entry, where the
|
||||
entries are of type:
|
||||
|
||||
It is expected that, for example, two parameters that are related, say
|
||||
"acl" and "noacl" will have the same ID, but will be flagged to indicate
|
||||
that one is the inverse of the other. The value can then be picked out
|
||||
from the parse result.
|
||||
|
||||
(3) const struct fs_parameter_specification *specs;
|
||||
|
||||
Table of parameter specifications, where the entries are of type:
|
||||
|
||||
struct fs_parameter_type {
|
||||
enum fs_parameter_spec type:8;
|
||||
u8 flags;
|
||||
struct fs_parameter_spec {
|
||||
const char *name;
|
||||
u8 opt;
|
||||
enum fs_parameter_type type:8;
|
||||
unsigned short flags;
|
||||
};
|
||||
|
||||
and the parameter identifier is the index to the array. 'type' indicates
|
||||
the desired value type and must be one of:
|
||||
The 'name' field is a string to match exactly to the parameter key (no
|
||||
wildcards, patterns and no case-independence) and 'opt' is the value that
|
||||
will be returned by the fs_parser() function in the case of a successful
|
||||
match.
|
||||
|
||||
The 'type' field indicates the desired value type and must be one of:
|
||||
|
||||
TYPE NAME EXPECTED VALUE RESULT IN
|
||||
======================= ======================= =====================
|
||||
@@ -525,85 +574,65 @@ The members are as follows:
|
||||
fs_param_is_u32_octal 32-bit octal int result->uint_32
|
||||
fs_param_is_u32_hex 32-bit hex int result->uint_32
|
||||
fs_param_is_s32 32-bit signed int result->int_32
|
||||
fs_param_is_u64 64-bit unsigned int result->uint_64
|
||||
fs_param_is_enum Enum value name result->uint_32
|
||||
fs_param_is_string Arbitrary string param->string
|
||||
fs_param_is_blob Binary blob param->blob
|
||||
fs_param_is_blockdev Blockdev path * Needs lookup
|
||||
fs_param_is_path Path * Needs lookup
|
||||
fs_param_is_fd File descriptor param->file
|
||||
|
||||
And each parameter can be qualified with 'flags':
|
||||
|
||||
fs_param_v_optional The value is optional
|
||||
fs_param_neg_with_no If key name is prefixed with "no", it is false
|
||||
fs_param_neg_with_empty If value is "", it is false
|
||||
fs_param_deprecated The parameter is deprecated.
|
||||
|
||||
For example:
|
||||
|
||||
static const struct fs_parameter_spec afs_param_specs[nr__afs_params] = {
|
||||
[Opt_autocell] = { fs_param_is_flag },
|
||||
[Opt_bar] = { fs_param_is_enum },
|
||||
[Opt_dyn] = { fs_param_is_flag },
|
||||
[Opt_foo] = { fs_param_is_bool, fs_param_neg_with_no },
|
||||
[Opt_source] = { fs_param_is_string },
|
||||
};
|
||||
fs_param_is_fd File descriptor result->int_32
|
||||
|
||||
Note that if the value is of fs_param_is_bool type, fs_parse() will try
|
||||
to match any string value against "0", "1", "no", "yes", "false", "true".
|
||||
|
||||
[!] NOTE that the table must be sorted according to primary key name so
|
||||
that ->keys[] is also sorted.
|
||||
Each parameter can also be qualified with 'flags':
|
||||
|
||||
(4) const char *const *keys;
|
||||
fs_param_v_optional The value is optional
|
||||
fs_param_neg_with_no result->negated set if key is prefixed with "no"
|
||||
fs_param_neg_with_empty result->negated set if value is ""
|
||||
fs_param_deprecated The parameter is deprecated.
|
||||
|
||||
Table of primary key names for the parameters. There must be one entry
|
||||
per defined parameter. The table is optional if ->nr_params is 0. The
|
||||
table is just an array of names e.g.:
|
||||
These are wrapped with a number of convenience wrappers:
|
||||
|
||||
static const char *const afs_param_keys[nr__afs_params] = {
|
||||
[Opt_autocell] = "autocell",
|
||||
[Opt_bar] = "bar",
|
||||
[Opt_dyn] = "dyn",
|
||||
[Opt_foo] = "foo",
|
||||
[Opt_source] = "source",
|
||||
MACRO SPECIFIES
|
||||
======================= ===============================================
|
||||
fsparam_flag() fs_param_is_flag
|
||||
fsparam_flag_no() fs_param_is_flag, fs_param_neg_with_no
|
||||
fsparam_bool() fs_param_is_bool
|
||||
fsparam_u32() fs_param_is_u32
|
||||
fsparam_u32oct() fs_param_is_u32_octal
|
||||
fsparam_u32hex() fs_param_is_u32_hex
|
||||
fsparam_s32() fs_param_is_s32
|
||||
fsparam_u64() fs_param_is_u64
|
||||
fsparam_enum() fs_param_is_enum
|
||||
fsparam_string() fs_param_is_string
|
||||
fsparam_blob() fs_param_is_blob
|
||||
fsparam_bdev() fs_param_is_blockdev
|
||||
fsparam_path() fs_param_is_path
|
||||
fsparam_fd() fs_param_is_fd
|
||||
|
||||
all of which take two arguments, name string and option number - for
|
||||
example:
|
||||
|
||||
static const struct fs_parameter_spec afs_param_specs[] = {
|
||||
fsparam_flag ("autocell", Opt_autocell),
|
||||
fsparam_flag ("dyn", Opt_dyn),
|
||||
fsparam_string ("source", Opt_source),
|
||||
fsparam_flag_no ("foo", Opt_foo),
|
||||
{}
|
||||
};
|
||||
|
||||
[!] NOTE that the table must be sorted such that the table can be searched
|
||||
with bsearch() using strcmp(). This means that the Opt_* values must
|
||||
correspond to the entries in this table.
|
||||
|
||||
(5) const struct constant_table *alt_keys;
|
||||
u8 nr_alt_keys;
|
||||
|
||||
Table of additional key names and their mappings to parameter ID plus the
|
||||
number of elements in the table. This is optional. The table is just an
|
||||
array of { name, integer } pairs, e.g.:
|
||||
|
||||
static const struct constant_table afs_param_keys[] = {
|
||||
{ "baz", Opt_bar },
|
||||
{ "dynamic", Opt_dyn },
|
||||
};
|
||||
|
||||
[!] NOTE that the table must be sorted such that strcmp() can be used with
|
||||
bsearch() to search the entries.
|
||||
|
||||
The parameter ID can also be fs_param_key_removed to indicate that a
|
||||
deprecated parameter has been removed and that an error will be given.
|
||||
This differs from fs_param_deprecated where the parameter may still have
|
||||
an effect.
|
||||
|
||||
Further, the behaviour of the parameter may differ when an alternate name
|
||||
is used (for instance with NFS, "v3", "v4.2", etc. are alternate names).
|
||||
An additional macro, __fsparam(), is provided that takes an additional pair
|
||||
of arguments to specify the type and the flags for anything that doesn't
|
||||
match one of the above macros.
|
||||
|
||||
(6) const struct fs_parameter_enum *enums;
|
||||
u8 nr_enums;
|
||||
|
||||
Table of enum value names to integer mappings and the number of elements
|
||||
stored therein. This is of type:
|
||||
Table of enum value names to integer mappings, terminated with a null
|
||||
entry. This is of type:
|
||||
|
||||
struct fs_parameter_enum {
|
||||
u8 param_id;
|
||||
u8 opt;
|
||||
char name[14];
|
||||
u8 value;
|
||||
};
|
||||
@@ -621,11 +650,6 @@ The members are as follows:
|
||||
try to look the value up in the enum table and the result will be stored
|
||||
in the parse result.
|
||||
|
||||
(7) bool no_source;
|
||||
|
||||
If this is set, fs_parse() will ignore any "source" parameter and not
|
||||
pass it to the filesystem.
|
||||
|
||||
The parser should be pointed to by the parser pointer in the file_system_type
|
||||
struct as this will provide validation on registration (if
|
||||
CONFIG_VALIDATE_FS_PARSER=y) and will allow the description to be queried from
|
||||
@@ -650,9 +674,8 @@ process the parameters it is given.
|
||||
int value;
|
||||
};
|
||||
|
||||
and it must be sorted such that it can be searched using bsearch() using
|
||||
strcmp(). If a match is found, the corresponding value is returned. If a
|
||||
match isn't found, the not_found value is returned instead.
|
||||
If a match is found, the corresponding value is returned. If a match
|
||||
isn't found, the not_found value is returned instead.
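The lookup behaviour described here (return the mapped value on a match, otherwise not_found) can be sketched as a plain linear search. This is a userspace model; the `name` field layout, the function name `lookup_constant_model()` and the sample table are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed layout: a name paired with the integer it maps to. */
struct constant_table {
	const char *name;
	int value;
};

/* Return the value mapped to name, or not_found if name is absent. */
static int lookup_constant_model(const struct constant_table *tbl,
				 size_t tbl_size, const char *name,
				 int not_found)
{
	size_t i;

	assert(tbl != NULL || tbl_size == 0);
	assert(name != NULL);
	for (i = 0; i < tbl_size; i++)
		if (strcmp(tbl[i].name, name) == 0)
			return tbl[i].value;
	return not_found;
}

/* Sample table used only for this illustration. */
static const struct constant_table bool_names[] = {
	{ "false", 0 },
	{ "no",    0 },
	{ "true",  1 },
	{ "yes",   1 },
};
```

A caller would typically pick a not_found value that cannot be confused with any valid entry, for example -1 for the table above.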
|
||||
|
||||
(*) bool validate_constant_table(const struct constant_table *tbl,
|
||||
size_t tbl_size,
|
||||
@@ -665,36 +688,36 @@ process the parameters it is given.
|
||||
should just be set to lie inside the low-to-high range.
|
||||
|
||||
If all is good, true is returned. If the table is invalid, errors are
|
||||
logged to dmesg, the stack is dumped and false is returned.
|
||||
logged to dmesg and false is returned.
|
||||
|
||||
(*) bool fs_validate_description(const struct fs_parameter_description *desc);
|
||||
|
||||
This performs some validation checks on a parameter description. It
|
||||
returns true if the description is good and false if it is not. It will
|
||||
log errors to dmesg if validation fails.
|
||||
|
||||
(*) int fs_parse(struct fs_context *fc,
|
||||
const struct fs_param_parser *parser,
|
||||
const struct fs_parameter_description *desc,
|
||||
struct fs_parameter *param,
|
||||
struct fs_param_parse_result *result);
|
||||
struct fs_parse_result *result);
|
||||
|
||||
This is the main interpreter of parameters. It uses the parameter
|
||||
description (parser) to look up the name of the parameter to use and to
|
||||
convert that to a parameter ID (stored in result->key).
|
||||
description to look up a parameter by key name and to convert that to an
|
||||
option number (which it returns).
|
||||
|
||||
If successful, and if the parameter type indicates the result is a
|
||||
boolean, integer or enum type, the value is converted by this function and
|
||||
the result stored in result->{boolean,int_32,uint_32}.
|
||||
the result stored in result->{boolean,int_32,uint_32,uint_64}.
|
||||
|
||||
If a match isn't initially made, the key is prefixed with "no" and no
|
||||
value is present then an attempt will be made to look up the key with the
|
||||
prefix removed. If this matches a parameter for which the type has flag
|
||||
fs_param_neg_with_no set, then a match will be made and the value will be
|
||||
set to false/0/NULL.
|
||||
fs_param_neg_with_no set, then a match will be made and result->negated
|
||||
will be set to true.
|
||||
|
||||
If the parameter is successfully matched and, optionally, parsed
|
||||
correctly, 1 is returned. If the parameter isn't matched and
|
||||
parser->ignore_unknown is set, then 0 is returned. Otherwise -EINVAL is
|
||||
returned.
|
||||
|
||||
(*) bool fs_validate_description(const struct fs_parameter_description *desc);
|
||||
|
||||
This validates the parameter description. It returns true if the
|
||||
description is good and false if it is not.
|
||||
If the parameter isn't matched, -ENOPARAM will be returned; if the
|
||||
parameter is matched, but the value is erroneous, -EINVAL will be
|
||||
returned; otherwise the parameter's option number will be returned.
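The return convention and the "no" prefix handling described for fs_parse() can be modelled in userspace C as below. This is a sketch under stated assumptions, not the kernel code: `parse_model()`, `model_spec`, `model_result`, `NEG_WITH_NO` and `MODEL_ENOPARAM` are hypothetical stand-ins, and the rule that a flag rejects a value while a string requires one is a simplification chosen for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_ENOPARAM 519	/* stand-in for the kernel-internal ENOPARAM */
#define NEG_WITH_NO    0x1	/* stand-in for fs_param_neg_with_no */

enum { Opt_autocell, Opt_dyn, Opt_foo, Opt_source };
enum model_type { is_flag, is_string };

struct model_spec {
	const char *name;
	int opt;
	enum model_type type;
	unsigned short flags;
};

/* Null-terminated table, mirroring the example specification table. */
static const struct model_spec specs[] = {
	{ "autocell", Opt_autocell, is_flag,   0 },
	{ "dyn",      Opt_dyn,      is_flag,   0 },
	{ "foo",      Opt_foo,      is_flag,   NEG_WITH_NO },
	{ "source",   Opt_source,   is_string, 0 },
	{ NULL, 0, is_flag, 0 }
};

struct model_result {
	bool negated;
};

static const struct model_spec *find_spec(const char *key)
{
	const struct model_spec *s;

	for (s = specs; s->name; s++)
		if (strcmp(s->name, key) == 0)
			return s;
	return NULL;
}

/* Returns the option number, -MODEL_ENOPARAM if the key is unknown, or
 * -EINVAL if the key matched but the value is unacceptable. */
static int parse_model(const char *key, const char *value,
		       struct model_result *res)
{
	const struct model_spec *s;

	assert(key != NULL && res != NULL);
	res->negated = false;
	s = find_spec(key);
	if (!s && !value && strncmp(key, "no", 2) == 0) {
		/* Retry without the prefix; only negatable keys match. */
		s = find_spec(key + 2);
		if (s && !(s->flags & NEG_WITH_NO))
			s = NULL;
		if (s)
			res->negated = true;
	}
	if (!s)
		return -MODEL_ENOPARAM;
	if (s->type == is_flag && value)
		return -EINVAL;		/* model: a flag takes no value */
	if (s->type == is_string && !value)
		return -EINVAL;		/* model: a string needs a value */
	return s->opt;
}
```

A filesystem's parameter handler would then switch on the non-negative return value and pass negative values back to its caller.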
|
||||
|
||||
(*) int fs_lookup_param(struct fs_context *fc,
|
||||
struct fs_parameter *value,
|
||||
|
@@ -36,6 +36,7 @@ Supported adapters:
|
||||
* Intel Cannon Lake (PCH)
|
||||
* Intel Cedar Fork (PCH)
|
||||
* Intel Ice Lake (PCH)
|
||||
* Intel Comet Lake (PCH)
|
||||
Datasheets: Publicly available at the Intel website
|
||||
|
||||
On Intel Patsburg and later chipsets, both the normal host SMBus controller
|
||||
|
126
Documentation/networking/bpf_flow_dissector.rst
Normal file
@@ -0,0 +1,126 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
==================
|
||||
BPF Flow Dissector
|
||||
==================
|
||||
|
||||
Overview
|
||||
========
|
||||
|
||||
Flow dissector is a routine that parses metadata out of the packets. It's
|
||||
used in various places in the networking subsystem (RFS, flow hash, etc.).
|
||||
|
||||
BPF flow dissector is an attempt to reimplement C-based flow dissector logic
|
||||
in BPF to gain all the benefits of BPF verifier (namely, limits on the
|
||||
number of instructions and tail calls).
|
||||
|
||||
API
|
||||
===
|
||||
|
||||
BPF flow dissector programs operate on an ``__sk_buff``. However, only a
|
||||
limited set of fields is allowed: ``data``, ``data_end`` and ``flow_keys``.
|
||||
``flow_keys`` is ``struct bpf_flow_keys`` and contains flow dissector input
|
||||
and output arguments.
|
||||
|
||||
The inputs are:
|
||||
* ``nhoff`` - initial offset of the networking header
|
||||
* ``thoff`` - initial offset of the transport header, initialized to nhoff
|
||||
* ``n_proto`` - L3 protocol type, parsed out of L2 header
|
||||
|
||||
Flow dissector BPF program should fill out the rest of the ``struct
|
||||
bpf_flow_keys`` fields. Input arguments ``nhoff/thoff/n_proto`` should be
|
||||
also adjusted accordingly.
|
||||
|
||||
The return code of the BPF program is either BPF_OK to indicate successful
|
||||
dissection, or BPF_DROP to indicate parsing error.
|
||||
|
||||
__sk_buff->data
|
||||
===============
|
||||
|
||||
In the VLAN-less case, this is what the initial state of the BPF flow
|
||||
dissector looks like::
|
||||
|
||||
+------+------+------------+-----------+
|
||||
| DMAC | SMAC | ETHER_TYPE | L3_HEADER |
|
||||
+------+------+------------+-----------+
|
||||
^
|
||||
|
|
||||
+-- flow dissector starts here
|
||||
|
||||
|
||||
.. code:: c
|
||||
|
||||
skb->data + flow_keys->nhoff point to the first byte of L3_HEADER
|
||||
flow_keys->thoff = nhoff
|
||||
flow_keys->n_proto = ETHER_TYPE
|
||||
|
||||
In the VLAN case, the flow dissector can be called in two different states.
|
||||
|
||||
Pre-VLAN parsing::
|
||||
|
||||
+------+------+------+-----+-----------+-----------+
|
||||
| DMAC | SMAC | TPID | TCI |ETHER_TYPE | L3_HEADER |
|
||||
+------+------+------+-----+-----------+-----------+
|
||||
^
|
||||
|
|
||||
+-- flow dissector starts here
|
||||
|
||||
.. code:: c
|
||||
|
||||
skb->data + flow_keys->nhoff point to the first byte of TCI
|
||||
flow_keys->thoff = nhoff
|
||||
flow_keys->n_proto = TPID
|
||||
|
||||
Please note that TPID can be 802.1AD and, hence, BPF program would
|
||||
have to parse VLAN information twice for double tagged packets.
|
||||
|
||||
|
||||
Post-VLAN parsing::
|
||||
|
||||
+------+------+------+-----+-----------+-----------+
|
||||
| DMAC | SMAC | TPID | TCI |ETHER_TYPE | L3_HEADER |
|
||||
+------+------+------+-----+-----------+-----------+
|
||||
^
|
||||
|
|
||||
+-- flow dissector starts here
|
||||
|
||||
.. code:: c
|
||||
|
||||
skb->data + flow_keys->nhoff point to the first byte of L3_HEADER
|
||||
flow_keys->thoff = nhoff
|
||||
flow_keys->n_proto = ETHER_TYPE
|
||||
|
||||
In this case VLAN information has been processed before the flow dissector
|
||||
and BPF flow dissector is not required to handle it.
|
||||
|
||||
|
||||
The takeaway here is as follows: BPF flow dissector program can be called with
|
||||
the optional VLAN header and should gracefully handle both cases: when single
|
||||
or double VLAN is present and when it is not present. The same program
|
||||
can be called for both cases and would have to be written carefully to
|
||||
handle both cases.
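The VLAN handling described above can be sketched as a small loop that works for the VLAN-less, single-tag and double-tag cases alike. This is plain userspace C, not a BPF program; `keys_model`, `skip_vlan()` and the `MODEL_*` constants are hypothetical, values are kept in host byte order for readability, and a non-zero return stands in for BPF_DROP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_ETH_P_8021Q  0x8100
#define MODEL_ETH_P_8021AD 0x88A8
#define MODEL_VLAN_REST    4	/* TCI (2 bytes) + next EtherType (2 bytes) */
#define MODEL_MAX_TAGS     2

/* Simplified stand-in for the input/output fields of bpf_flow_keys. */
struct keys_model {
	uint16_t nhoff;
	uint16_t thoff;
	uint16_t n_proto;	/* host byte order in this model */
};

/* Skip up to two VLAN tags.  On entry nhoff points at the TCI when
 * n_proto is a VLAN TPID, or at the L3 header otherwise.
 * Returns 0 on success and -1 on a parsing error. */
static int skip_vlan(const uint8_t *data, size_t len, struct keys_model *k)
{
	int tags = 0;

	assert(data != NULL && k != NULL);
	while (k->n_proto == MODEL_ETH_P_8021Q ||
	       k->n_proto == MODEL_ETH_P_8021AD) {
		if (tags++ == MODEL_MAX_TAGS)
			return -1;	/* more tags than supported */
		if ((size_t)k->nhoff + MODEL_VLAN_REST > len)
			return -1;	/* truncated packet */
		k->n_proto = (uint16_t)(data[k->nhoff + 2] << 8 |
					data[k->nhoff + 3]);
		k->nhoff += MODEL_VLAN_REST;
	}
	k->thoff = k->nhoff;
	return 0;
}
```

When called in the post-VLAN or VLAN-less state the loop body never runs, so the same routine serves both entry states.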
|
||||
|
||||
|
||||
Reference Implementation
|
||||
========================
|
||||
|
||||
See ``tools/testing/selftests/bpf/progs/bpf_flow.c`` for the reference
|
||||
implementation and ``tools/testing/selftests/bpf/flow_dissector_load.[hc]``
|
||||
for the loader. bpftool can be used to load a BPF flow dissector program as well.
|
||||
|
||||
The reference implementation is organized as follows:
|
||||
* ``jmp_table`` map that contains sub-programs for each supported L3 protocol
|
||||
* ``_dissect`` routine - entry point; it does input ``n_proto`` parsing and
|
||||
does ``bpf_tail_call`` to the appropriate L3 handler
|
||||
|
||||
Since BPF at this point doesn't support looping (or any jumping back),
|
||||
jmp_table is used instead to handle multiple levels of encapsulation (and
|
||||
IPv6 options).
|
||||
|
||||
|
||||
Current Limitations
|
||||
===================
|
||||
The BPF flow dissector doesn't support exporting all the metadata that the in-kernel
|
||||
C-based implementation can export. Notable examples are single VLAN (802.1Q)
|
||||
and double VLAN (802.1AD) tags. Please refer to the ``struct bpf_flow_keys``
|
||||
for the set of information that can currently be exported from the BPF context.
|
@@ -9,6 +9,7 @@ Contents:
|
||||
netdev-FAQ
|
||||
af_xdp
|
||||
batman-adv
|
||||
bpf_flow_dissector
|
||||
can
|
||||
can_ucan_protocol
|
||||
device_drivers/freescale/dpaa2/index
|
||||
|
@@ -5,25 +5,32 @@ The Definitive KVM (Kernel-based Virtual Machine) API Documentation
|
||||
----------------------
|
||||
|
||||
The kvm API is a set of ioctls that are issued to control various aspects
|
||||
of a virtual machine. The ioctls belong to three classes
|
||||
of a virtual machine. The ioctls belong to three classes:
|
||||
|
||||
- System ioctls: These query and set global attributes which affect the
|
||||
whole kvm subsystem. In addition a system ioctl is used to create
|
||||
virtual machines
|
||||
virtual machines.
|
||||
|
||||
- VM ioctls: These query and set attributes that affect an entire virtual
|
||||
machine, for example memory layout. In addition a VM ioctl is used to
|
||||
create virtual cpus (vcpus).
|
||||
create virtual cpus (vcpus) and devices.
|
||||
|
||||
Only run VM ioctls from the same process (address space) that was used
|
||||
to create the VM.
|
||||
VM ioctls must be issued from the same process (address space) that was
|
||||
used to create the VM.
|
||||
|
||||
- vcpu ioctls: These query and set attributes that control the operation
|
||||
of a single virtual cpu.
|
||||
|
||||
Only run vcpu ioctls from the same thread that was used to create the
|
||||
vcpu.
|
||||
vcpu ioctls should be issued from the same thread that was used to create
|
||||
the vcpu, except for asynchronous vcpu ioctls that are marked as such in
|
||||
the documentation. Otherwise, the first ioctl after switching threads
|
||||
could see a performance impact.
|
||||
|
||||
- device ioctls: These query and set attributes that control the operation
|
||||
of a single device.
|
||||
|
||||
device ioctls must be issued from the same process (address space) that
|
||||
was used to create the VM.
|
||||
|
||||
2. File descriptors
|
||||
-------------------
|
||||
@@ -32,17 +39,34 @@ The kvm API is centered around file descriptors. An initial
|
||||
open("/dev/kvm") obtains a handle to the kvm subsystem; this handle
|
||||
can be used to issue system ioctls. A KVM_CREATE_VM ioctl on this
|
||||
handle will create a VM file descriptor which can be used to issue VM
|
||||
ioctls. A KVM_CREATE_VCPU ioctl on a VM fd will create a virtual cpu
|
||||
and return a file descriptor pointing to it. Finally, ioctls on a vcpu
|
||||
fd can be used to control the vcpu, including the important task of
|
||||
actually running guest code.
|
||||
ioctls. A KVM_CREATE_VCPU or KVM_CREATE_DEVICE ioctl on a VM fd will
|
||||
create a virtual cpu or device and return a file descriptor pointing to
|
||||
the new resource. Finally, ioctls on a vcpu or device fd can be used
|
||||
to control the vcpu or device. For vcpus, this includes the important
|
||||
task of actually running guest code.
|
||||
|
||||
In general file descriptors can be migrated among processes by means
|
||||
of fork() and the SCM_RIGHTS facility of unix domain sockets. These
|
||||
kinds of tricks are explicitly not supported by kvm. While they will
|
||||
not cause harm to the host, their actual behavior is not guaranteed by
|
||||
the API. The only supported use is one virtual machine per process,
|
||||
and one vcpu per thread.
|
||||
the API. See "General description" for details on the ioctl usage
|
||||
model that is supported by KVM.
|
||||
|
||||
It is important to note that although VM ioctls may only be issued from
|
||||
the process that created the VM, a VM's lifecycle is associated with its
|
||||
file descriptor, not its creator (process). In other words, the VM and
|
||||
its resources, *including the associated address space*, are not freed
|
||||
until the last reference to the VM's file descriptor has been released.
|
||||
For example, if fork() is issued after ioctl(KVM_CREATE_VM), the VM will
|
||||
not be freed until both the parent (original) process and its child have
|
||||
put their references to the VM's file descriptor.
|
||||
|
||||
Because a VM's resources are not freed until the last reference to its
|
||||
file descriptor is released, creating additional references to a VM via
|
||||
fork(), dup(), etc. without careful consideration is strongly
|
||||
discouraged and may have unwanted side effects, e.g. memory allocated
|
||||
by and on behalf of the VM's process may not be freed/unaccounted when
|
||||
the VM is shut down.
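
The "last reference" rule described above is ordinary file descriptor
behaviour and can be observed without KVM. The sketch below uses a pipe
as a stand-in for a VM file descriptor (an assumption made so that the
example runs on any POSIX system): the write side of the pipe is only
torn down, and the reader only sees end-of-file, once every duplicate
of the write descriptor has been closed. The function name is
hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Returns 1 if the read end reports EOF only after the last reference
 * to the write end is closed, 0 if not, -1 on setup failure.
 * A pipe stands in for the VM fd; this is an analogy, not KVM code.
 */
int eof_only_after_last_close(void)
{
	int p[2];
	char c;

	if (pipe(p) != 0)
		return -1;

	/* Non-blocking reads let us probe the pipe without hanging. */
	int flags = fcntl(p[0], F_GETFL);
	if (flags < 0 || fcntl(p[0], F_SETFL, flags | O_NONBLOCK) != 0) {
		close(p[0]);
		close(p[1]);
		return -1;
	}

	/* Second reference to the write end, as dup() or fork() would create. */
	int extra = dup(p[1]);
	if (extra < 0) {
		close(p[0]);
		close(p[1]);
		return -1;
	}

	/* Drop the original reference; the duplicate keeps the object alive. */
	close(p[1]);
	ssize_t n1 = read(p[0], &c, 1);
	int still_alive = (n1 == -1 && errno == EAGAIN);

	/* Drop the last reference; only now does the reader see EOF. */
	close(extra);
	ssize_t n2 = read(p[0], &c, 1);
	int released = (n2 == 0);

	close(p[0]);
	return still_alive && released;
}
```

A stray duplicate of a VM file descriptor behaves like the "extra"
descriptor here: closing the original does not release the VM.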
|
||||
|
||||
|
||||
It is important to note that although VM ioctls may only be issued from
|
||||
@@ -515,11 +539,15 @@ c) KVM_INTERRUPT_SET_LEVEL
|
||||
Note that any value for 'irq' other than the ones stated above is invalid
|
||||
and incurs unexpected behavior.
|
||||
|
||||
This is an asynchronous vcpu ioctl and can be invoked from any thread.
|
||||
|
||||
MIPS:
|
||||
|
||||
Queues an external interrupt to be injected into the virtual CPU. A negative
|
||||
interrupt number dequeues the interrupt.
|
||||
|
||||
This is an asynchronous vcpu ioctl and can be invoked from any thread.
|
||||
|
||||
|
||||
4.17 KVM_DEBUG_GUEST
|
||||
|
||||
@@ -1086,14 +1114,12 @@ struct kvm_userspace_memory_region {
|
||||
#define KVM_MEM_LOG_DIRTY_PAGES (1UL << 0)
|
||||
#define KVM_MEM_READONLY (1UL << 1)
|
||||
|
||||
This ioctl allows the user to create or modify a guest physical memory
|
||||
slot. When changing an existing slot, it may be moved in the guest
|
||||
physical memory space, or its flags may be modified. It may not be
|
||||
resized. Slots may not overlap in guest physical address space.
|
||||
Bits 0-15 of "slot" specifies the slot id and this value should be
|
||||
less than the maximum number of user memory slots supported per VM.
|
||||
The maximum allowed slots can be queried using KVM_CAP_NR_MEMSLOTS,
|
||||
if this capability is supported by the architecture.
|
||||
This ioctl allows the user to create, modify or delete a guest physical
|
||||
memory slot. Bits 0-15 of "slot" specify the slot id and this value
|
||||
should be less than the maximum number of user memory slots supported per
|
||||
VM. The maximum allowed slots can be queried using KVM_CAP_NR_MEMSLOTS,
|
||||
if this capability is supported by the architecture. Slots may not
|
||||
overlap in guest physical address space.
|
||||
|
||||
If KVM_CAP_MULTI_ADDRESS_SPACE is available, bits 16-31 of "slot"
|
||||
specify the address space which is being modified. They must be
|
||||
@@ -1102,6 +1128,10 @@ KVM_CAP_MULTI_ADDRESS_SPACE capability. Slots in separate address spaces
|
||||
are unrelated; the restriction on overlapping slots only applies within
|
||||
each address space.
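
The layout of the "slot" field described above (slot id in bits 0-15,
address space in bits 16-31) can be sketched with a few helpers. The
helper names are made up for this example and are not part of the KVM
headers.

```c
#include <stdint.h>

/* Hypothetical helpers; only the bit layout comes from the text above. */

/* Pack an address space id (bits 16-31) and a slot id (bits 0-15). */
static inline uint32_t kvm_slot_field(uint16_t as_id, uint16_t slot_id)
{
	return ((uint32_t)as_id << 16) | slot_id;
}

/* Extract the slot id from bits 0-15. */
static inline uint16_t kvm_slot_id(uint32_t slot)
{
	return (uint16_t)(slot & 0xffff);
}

/* Extract the address space id from bits 16-31. */
static inline uint16_t kvm_slot_as_id(uint32_t slot)
{
	return (uint16_t)(slot >> 16);
}
```

With these helpers, slot 3 in address space 1 is encoded as 0x00010003.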
|
||||
|
||||
Deleting a slot is done by passing zero for memory_size. When changing
|
||||
an existing slot, it may be moved in the guest physical memory space,
|
||||
or its flags may be modified, but it may not be resized.
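
The rules above can be summarised as a small classifier that a
userspace program might run before issuing the ioctl. This is only a
sketch: struct slot_desc, the enum and the function name are invented
for illustration, a memory_size of zero in the old descriptor is
assumed to mean "slot does not exist yet", and deleting a slot that
does not exist is treated as invalid here. It does not reproduce the
kernel's own checks.

```c
#include <stdint.h>

/* Hypothetical, reduced view of a memory slot for this sketch. */
struct slot_desc {
	uint64_t guest_phys_addr;
	uint64_t memory_size;	/* 0: slot absent (old) or delete request (new) */
	uint32_t flags;
};

enum slot_change {
	SLOT_CREATE,
	SLOT_DELETE,
	SLOT_MOVE,		/* guest physical address changed */
	SLOT_FLAGS_ONLY,	/* only the flags changed */
	SLOT_UNCHANGED,
	SLOT_INVALID,		/* resize, or delete of an absent slot */
};

enum slot_change classify_slot_change(const struct slot_desc *old_slot,
				      const struct slot_desc *new_slot)
{
	/* Passing zero for memory_size requests deletion. */
	if (new_slot->memory_size == 0)
		return old_slot->memory_size ? SLOT_DELETE : SLOT_INVALID;

	if (old_slot->memory_size == 0)
		return SLOT_CREATE;

	/* An existing slot may not be resized. */
	if (new_slot->memory_size != old_slot->memory_size)
		return SLOT_INVALID;

	/* A move is reported even if the flags change at the same time. */
	if (new_slot->guest_phys_addr != old_slot->guest_phys_addr)
		return SLOT_MOVE;

	if (new_slot->flags != old_slot->flags)
		return SLOT_FLAGS_ONLY;

	return SLOT_UNCHANGED;
}
```

A caller that needs a different size would, under these rules, delete
the slot first and then create a new one.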
|
||||
|
||||
Memory for the region is taken starting at the address denoted by the
|
||||
field userspace_addr, which must point at user addressable memory for
|
||||
the entire memory slot size. Any object may back this memory, including
|
||||
@@ -2493,7 +2523,7 @@ KVM_S390_MCHK (vm, vcpu) - machine check interrupt; cr 14 bits in parm,
|
||||
machine checks needing further payload are not
|
||||
supported by this ioctl)
|
||||
|
||||
Note that the vcpu ioctl is asynchronous to vcpu execution.
|
||||
This is an asynchronous vcpu ioctl and can be invoked from any thread.
|
||||
|
||||
4.78 KVM_PPC_GET_HTAB_FD
|
||||
|
||||
@@ -3042,8 +3072,7 @@ KVM_S390_INT_EMERGENCY - sigp emergency; parameters in .emerg
|
||||
KVM_S390_INT_EXTERNAL_CALL - sigp external call; parameters in .extcall
|
||||
KVM_S390_MCHK - machine check interrupt; parameters in .mchk
|
||||
|
||||
|
||||
Note that the vcpu ioctl is asynchronous to vcpu execution.
|
||||
This is an asynchronous vcpu ioctl and can be invoked from any thread.
|
||||
|
||||
4.94 KVM_S390_GET_IRQ_STATE
|
||||
|
||||
|
@@ -142,7 +142,7 @@ Shadow pages contain the following information:
|
||||
If clear, this page corresponds to a guest page table denoted by the gfn
|
||||
field.
|
||||
role.quadrant:
|
||||
When role.cr4_pae=0, the guest uses 32-bit gptes while the host uses 64-bit
|
||||
When role.gpte_is_8_bytes=0, the guest uses 32-bit gptes while the host uses 64-bit
|
||||
sptes. That means a guest page table contains more ptes than the host,
|
||||
so multiple shadow pages are needed to shadow one guest page.
|
||||
For first-level shadow pages, role.quadrant can be 0 or 1 and denotes the
|
||||
@@ -158,9 +158,9 @@ Shadow pages contain the following information:
|
||||
The page is invalid and should not be used. It is a root page that is
|
||||
currently pinned (by a cpu hardware register pointing to it); once it is
|
||||
unpinned it will be destroyed.
|
||||
role.cr4_pae:
|
||||
Contains the value of cr4.pae for which the page is valid (e.g. whether
|
||||
32-bit or 64-bit gptes are in use).
|
||||
role.gpte_is_8_bytes:
|
||||
Reflects the size of the guest PTE for which the page is valid, i.e. '1'
|
||||
if 64-bit gptes are in use, '0' if 32-bit gptes are in use.
|
||||
role.nxe:
|
||||
Contains the value of efer.nxe for which the page is valid.
|
||||
role.cr0_wp:
|
||||
@@ -173,6 +173,9 @@ Shadow pages contain the following information:
|
||||
Contains the value of cr4.smap && !cr0.wp for which the page is valid
|
||||
(pages for which this is true are different from other pages; see the
|
||||
treatment of cr0.wp=0 below).
|
||||
role.ept_sp:
|
||||
This is a virtual flag to denote a shadowed nested EPT page. ept_sp
|
||||
is true if "cr0_wp && smap_andnot_wp", an otherwise invalid combination.
|
||||
role.smm:
|
||||
Is 1 if the page is valid in system management mode. This field
|
||||
determines which of the kvm_memslots array was used to build this
|
||||
|