Merge 922a763ae1 ("Merge tag 'zonefs-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs") into android-mainline

Steps on the way to 5.10-rc1

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I520719ae5e0d992c3756e393cb299d77d650622e
Greg Kroah-Hartman
2020-10-26 10:08:43 +01:00
174 changed files with 4635 additions and 1162 deletions


@@ -963,7 +963,7 @@ exit and perhaps also vice versa. Therefore, whenever the
``->dynticks_nesting`` field is incremented up from zero, the
``->dynticks_nmi_nesting`` field is set to a large positive number, and
whenever the ``->dynticks_nesting`` field is decremented down to zero,
the the ``->dynticks_nmi_nesting`` field is set to zero. Assuming that
the ``->dynticks_nmi_nesting`` field is set to zero. Assuming that
the number of misnested interrupts is not sufficient to overflow the
counter, this approach corrects the ``->dynticks_nmi_nesting`` field
every time the corresponding CPU enters the idle loop from process


@@ -2162,7 +2162,7 @@ scheduling-clock interrupt be enabled when RCU needs it to be:
this sort of thing.
#. If a CPU is in a portion of the kernel that is absolutely positively
no-joking guaranteed to never execute any RCU read-side critical
sections, and RCU believes this CPU to to be idle, no problem. This
sections, and RCU believes this CPU to be idle, no problem. This
sort of thing is used by some architectures for light-weight
exception handlers, which can then avoid the overhead of
``rcu_irq_enter()`` and ``rcu_irq_exit()`` at exception entry and
@@ -2431,7 +2431,7 @@ However, there are legitimate preemptible-RCU implementations that do
not have this property, given that any point in the code outside of an
RCU read-side critical section can be a quiescent state. Therefore,
*RCU-sched* was created, which follows “classic” RCU in that an
RCU-sched grace period waits for for pre-existing interrupt and NMI
RCU-sched grace period waits for pre-existing interrupt and NMI
handlers. In kernels built with ``CONFIG_PREEMPT=n``, the RCU and
RCU-sched APIs have identical implementations, while kernels built with
``CONFIG_PREEMPT=y`` provide a separate implementation for each.


@@ -360,7 +360,7 @@ order to amortize their overhead over many uses of the corresponding APIs.
There are at least three flavors of RCU usage in the Linux kernel. The diagram
above shows the most common one. On the updater side, the rcu_assign_pointer(),
sychronize_rcu() and call_rcu() primitives used are the same for all three
synchronize_rcu() and call_rcu() primitives used are the same for all three
flavors. However for protection (on the reader side), the primitives used vary
depending on the flavor:
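As a minimal sketch (not part of this patch; ``gp``, ``struct foo`` and ``do_something()`` are invented names), the common update-side calls pair with the vanilla-RCU read-side markers roughly as follows::

	struct foo *new_fp, *old_fp, *p;

	/* Updater: publish a new version, then reclaim the old one. */
	new_fp = kmalloc(sizeof(*new_fp), GFP_KERNEL);
	*new_fp = *old_fp;
	new_fp->a = 1;
	rcu_assign_pointer(gp, new_fp);
	synchronize_rcu();	/* or call_rcu() for asynchronous reclamation */
	kfree(old_fp);

	/* Reader, vanilla RCU flavor: */
	rcu_read_lock();
	p = rcu_dereference(gp);
	if (p)
		do_something(p->a);
	rcu_read_unlock();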


@@ -3099,6 +3099,10 @@
and gids from such clients. This is intended to ease
migration from NFSv2/v3.
nmi_backtrace.backtrace_idle [KNL]
Dump stacks even of idle CPUs in response to an
NMI stack-backtrace request.
nmi_debug= [KNL,SH] Specify one or more actions to take
when a NMI is triggered.
Format: [state][,regs][,debounce][,die]
@@ -4178,46 +4182,55 @@
This wake_up() will be accompanied by a
WARN_ONCE() splat and an ftrace_dump().
rcutree.rcu_unlock_delay= [KNL]
In CONFIG_RCU_STRICT_GRACE_PERIOD=y kernels,
this specifies an rcu_read_unlock()-time delay
in microseconds. This defaults to zero.
Larger delays increase the probability of
catching RCU pointer leaks, that is, buggy use
of RCU-protected pointers after the relevant
rcu_read_unlock() has completed.
rcutree.sysrq_rcu= [KNL]
Commandeer a sysrq key to dump out Tree RCU's
rcu_node tree with an eye towards determining
why a new grace period has not yet started.
rcuperf.gp_async= [KNL]
rcuscale.gp_async= [KNL]
Measure performance of asynchronous
grace-period primitives such as call_rcu().
rcuperf.gp_async_max= [KNL]
rcuscale.gp_async_max= [KNL]
Specify the maximum number of outstanding
callbacks per writer thread. When a writer
thread exceeds this limit, it invokes the
corresponding flavor of rcu_barrier() to allow
previously posted callbacks to drain.
rcuperf.gp_exp= [KNL]
rcuscale.gp_exp= [KNL]
Measure performance of expedited synchronous
grace-period primitives.
rcuperf.holdoff= [KNL]
rcuscale.holdoff= [KNL]
Set test-start holdoff period. The purpose of
this parameter is to delay the start of the
test until boot completes in order to avoid
interference.
rcuperf.kfree_rcu_test= [KNL]
rcuscale.kfree_rcu_test= [KNL]
Set to measure performance of kfree_rcu() flooding.
rcuperf.kfree_nthreads= [KNL]
rcuscale.kfree_nthreads= [KNL]
The number of threads running loops of kfree_rcu().
rcuperf.kfree_alloc_num= [KNL]
rcuscale.kfree_alloc_num= [KNL]
Number of allocations and frees done in an iteration.
rcuperf.kfree_loops= [KNL]
Number of loops doing rcuperf.kfree_alloc_num number
rcuscale.kfree_loops= [KNL]
Number of loops doing rcuscale.kfree_alloc_num number
of allocations and frees.
rcuperf.nreaders= [KNL]
rcuscale.nreaders= [KNL]
Set number of RCU readers. The value -1 selects
N, where N is the number of CPUs. A value
"n" less than -1 selects N-n+1, where N is again
@@ -4226,23 +4239,23 @@
A value of "n" less than or equal to -N selects
a single reader.
rcuperf.nwriters= [KNL]
rcuscale.nwriters= [KNL]
Set number of RCU writers. The values operate
the same as for rcuperf.nreaders.
the same as for rcuscale.nreaders.
N, where N is the number of CPUs
rcuperf.perf_type= [KNL]
rcuscale.perf_type= [KNL]
Specify the RCU implementation to test.
rcuperf.shutdown= [KNL]
rcuscale.shutdown= [KNL]
Shut the system down after performance tests
complete. This is useful for hands-off automated
testing.
rcuperf.verbose= [KNL]
rcuscale.verbose= [KNL]
Enable additional printk() statements.
rcuperf.writer_holdoff= [KNL]
rcuscale.writer_holdoff= [KNL]
Write-side holdoff between grace periods,
in microseconds. The default of zero says
no holdoff.
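As a purely illustrative combination of the rcuscale options above (not a recommended configuration), a kfree_rcu() flooding run could be requested with boot parameters such as:
	rcuscale.kfree_rcu_test=1 rcuscale.kfree_nthreads=8 rcuscale.kfree_loops=100 rcuscale.shutdown=1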
@@ -4295,6 +4308,18 @@
are zero, rcutorture acts as if they are all non-zero.
rcutorture.irqreader= [KNL]
Run RCU readers from irq handlers, or, more
accurately, from a timer handler. Not all RCU
flavors take kindly to this sort of thing.
rcutorture.leakpointer= [KNL]
Leak an RCU-protected pointer out of the reader.
This can of course result in splats, and is
intended to test the ability of things like
CONFIG_RCU_STRICT_GRACE_PERIOD=y to detect
such leaks.
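For reference, the kind of leak that rcutorture.leakpointer and rcutree.rcu_unlock_delay target looks roughly like the following hypothetical reader (``gp``, ``struct foo`` and ``use()`` are invented names):
	struct foo *p;

	rcu_read_lock();
	p = rcu_dereference(gp);
	rcu_read_unlock();
	use(p->a);	/* BUG: the grace period may already have ended */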
rcutorture.n_barrier_cbs= [KNL]
Set callbacks/threads for rcu_barrier() testing.
@@ -4516,8 +4541,8 @@
refscale.shutdown= [KNL]
Shut down the system at the end of the performance
test. This defaults to 1 (shut it down) when
rcuperf is built into the kernel and to 0 (leave
it running) when rcuperf is built as a module.
refscale is built into the kernel and to 0 (leave
it running) when refscale is built as a module.
refscale.verbose= [KNL]
Enable additional printk() statements.
@@ -4663,6 +4688,98 @@
Format: integer between 0 and 10
Default is 0.
scftorture.holdoff= [KNL]
Number of seconds to hold off before starting
test. Defaults to zero for module insertion and
to 10 seconds for built-in smp_call_function()
tests.
scftorture.longwait= [KNL]
Request ridiculously long waits randomly selected
up to the chosen limit in seconds. Zero (the
default) disables this feature. Please note
that requesting even small non-zero numbers of
seconds can result in RCU CPU stall warnings,
softlockup complaints, and so on.
scftorture.nthreads= [KNL]
Number of kthreads to spawn to invoke the
smp_call_function() family of functions.
The default of -1 specifies a number of kthreads
equal to the number of CPUs.
scftorture.onoff_holdoff= [KNL]
Number seconds to wait after the start of the
test before initiating CPU-hotplug operations.
scftorture.onoff_interval= [KNL]
Number seconds to wait between successive
CPU-hotplug operations. Specifying zero (which
is the default) disables CPU-hotplug operations.
scftorture.shutdown_secs= [KNL]
The number of seconds following the start of the
test after which to shut down the system. The
default of zero avoids shutting down the system.
Non-zero values are useful for automated tests.
scftorture.stat_interval= [KNL]
The number of seconds between outputting the
current test statistics to the console. A value
of zero disables statistics output.
scftorture.stutter_cpus= [KNL]
The number of jiffies to wait between each change
to the set of CPUs under test.
scftorture.use_cpus_read_lock= [KNL]
Use use_cpus_read_lock() instead of the default
preempt_disable() to disable CPU hotplug
while invoking one of the smp_call_function*()
functions.
scftorture.verbose= [KNL]
Enable additional printk() statements.
scftorture.weight_single= [KNL]
The probability weighting to use for the
smp_call_function_single() function with a zero
"wait" parameter. A value of -1 selects the
default if all other weights are -1. However,
if at least one weight has some other value, a
value of -1 will instead select a weight of zero.
scftorture.weight_single_wait= [KNL]
The probability weighting to use for the
smp_call_function_single() function with a
non-zero "wait" parameter. See weight_single.
scftorture.weight_many= [KNL]
The probability weighting to use for the
smp_call_function_many() function with a zero
"wait" parameter. See weight_single.
Note well that setting a high probability for
this weighting can place serious IPI load
on the system.
scftorture.weight_many_wait= [KNL]
The probability weighting to use for the
smp_call_function_many() function with a
non-zero "wait" parameter. See weight_single
and weight_many.
scftorture.weight_all= [KNL]
The probability weighting to use for the
smp_call_function_all() function with a zero
"wait" parameter. See weight_single and
weight_many.
scftorture.weight_all_wait= [KNL]
The probability weighting to use for the
smp_call_function_all() function with a
non-zero "wait" parameter. See weight_single
and weight_many.
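As a purely illustrative combination of the scftorture options above (parameter names taken only from the entries listed here), an automated run exercising just smp_call_function_single() might boot with:
	scftorture.nthreads=8 scftorture.weight_single=1 scftorture.weight_single_wait=1 scftorture.shutdown_secs=600 scftorture.verbose=1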
skew_tick= [KNL] Offset the periodic timer tick per cpu to mitigate
xtime_lock contention on larger systems, and/or RCU lock
contention on all systems with CONFIG_MAXSMP set.


@@ -11,6 +11,7 @@ KUnit - Unit Testing for the Linux Kernel
usage
kunit-tool
api/index
style
faq
What is KUnit?


@@ -0,0 +1,205 @@
.. SPDX-License-Identifier: GPL-2.0
===========================
Test Style and Nomenclature
===========================
To make finding, writing, and using KUnit tests as simple as possible, it's
strongly encouraged that they are named and written according to the guidelines
below. While it's possible to write KUnit tests which do not follow these rules,
they may break some tooling, may conflict with other tests, and may not be run
automatically by testing systems.
It's recommended that you only deviate from these guidelines when:
1. Porting tests to KUnit which are already known with an existing name, or
2. Writing tests which would cause serious problems if automatically run (e.g.,
non-deterministically producing false positives or negatives, or taking an
extremely long time to run).
Subsystems, Suites, and Tests
=============================
In order to make tests as easy to find as possible, they're grouped into suites
and subsystems. A test suite is a group of tests which test a related area of
the kernel, and a subsystem is a set of test suites which test different parts
of the same kernel subsystem or driver.
Subsystems
----------
Every test suite must belong to a subsystem. A subsystem is a collection of one
or more KUnit test suites which test the same driver or part of the kernel. A
rule of thumb is that a test subsystem should match a single kernel module. If
the code being tested can't be compiled as a module, in many cases the subsystem
should correspond to a directory in the source tree or an entry in the
MAINTAINERS file. If unsure, follow the conventions set by tests in similar
areas.
Test subsystems should be named after the code being tested, either after the
module (wherever possible), or after the directory or files being tested. Test
subsystems should be named to avoid ambiguity where necessary.
If a test subsystem name has multiple components, they should be separated by
underscores. *Do not* include "test" or "kunit" directly in the subsystem name
unless you are actually testing other tests or the kunit framework itself.
Example subsystems could be:
``ext4``
Matches the module and filesystem name.
``apparmor``
Matches the module name and LSM name.
``kasan``
Common name for the tool, prominent part of the path ``mm/kasan``
``snd_hda_codec_hdmi``
Has several components (``snd``, ``hda``, ``codec``, ``hdmi``) separated by
underscores. Matches the module name.
Avoid names like these:
``linear-ranges``
Names should use underscores, not dashes, to separate words. Prefer
``linear_ranges``.
``qos-kunit-test``
As well as using underscores, this name should not have "kunit-test" as a
suffix, and ``qos`` is ambiguous as a subsystem name. ``power_qos`` would be a
better name.
``pc_parallel_port``
The corresponding module name is ``parport_pc``, so this subsystem should also
be named ``parport_pc``.
.. note::
The KUnit API and tools do not explicitly know about subsystems. They're
simply a way of categorising test suites and naming modules which
provides a simple, consistent way for humans to find and run tests. This
may change in the future, though.
Suites
------
KUnit tests are grouped into test suites, which cover a specific area of
functionality being tested. Test suites can have shared initialisation and
shutdown code which is run for all tests in the suite.
Not all subsystems will need to be split into multiple test suites (e.g. simple drivers).
Test suites are named after the subsystem they are part of. If a subsystem
contains several suites, the specific area under test should be appended to the
subsystem name, separated by an underscore.
In the event that there are multiple types of test using KUnit within a
subsystem (e.g., both unit tests and integration tests), they should be put into
separate suites, with the type of test as the last element in the suite name.
Unless these tests are actually present, avoid using ``_test``, ``_unittest`` or
similar in the suite name.
The full test suite name (including the subsystem name) should be specified as
the ``.name`` member of the ``kunit_suite`` struct, and forms the base for the
module name (see below).
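A minimal sketch of such a declaration (``ext4_inode`` reused from the example below; ``ext4_inode_test_cases`` is a hypothetical array of test cases):

.. code-block:: c

	static struct kunit_suite ext4_inode_test_suite = {
		.name = "ext4_inode",
		.test_cases = ext4_inode_test_cases,
	};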
Example test suites could include:
``ext4_inode``
Part of the ``ext4`` subsystem, testing the ``inode`` area.
``kunit_try_catch``
Part of the ``kunit`` implementation itself, testing the ``try_catch`` area.
``apparmor_property_entry``
Part of the ``apparmor`` subsystem, testing the ``property_entry`` area.
``kasan``
The ``kasan`` subsystem has only one suite, so the suite name is the same as
the subsystem name.
Avoid names like:
``ext4_ext4_inode``
There's no reason to state the subsystem twice.
``property_entry``
The suite name is ambiguous without the subsystem name.
``kasan_integration_test``
Because there is only one suite in the ``kasan`` subsystem, the suite should
just be called ``kasan``. There's no need to redundantly add
``integration_test``. Should a separate test suite with, for example, unit
tests be added, then that suite could be named ``kasan_unittest`` or similar.
Test Cases
----------
Individual tests consist of a single function which tests a constrained
codepath, property, or function. In the test output, individual tests' results
will show up as subtests of the suite's results.
Tests should be named after what they're testing. This is often the name of the
function being tested, with a description of the input or codepath being tested.
As tests are C functions, they should be named and written in accordance with
the kernel coding style.
.. note::
As tests are themselves functions, their names cannot conflict with
other C identifiers in the kernel. This may require some creative
naming. It's a good idea to make your test functions `static` to avoid
polluting the global namespace.
Example test names include:
``unpack_u32_with_null_name``
Tests the ``unpack_u32`` function when a NULL name is passed in.
``test_list_splice``
Tests the ``list_splice`` macro. It has the prefix ``test_`` to avoid a
name conflict with the macro itself.
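As a short illustrative sketch (loosely modelled on the ``test_list_splice`` example above, not copied from any in-tree test):

.. code-block:: c

	#include <kunit/test.h>
	#include <linux/list.h>

	static void test_list_splice(struct kunit *test)
	{
		LIST_HEAD(list1);
		LIST_HEAD(list2);

		/* Splicing an empty list must leave the target list empty. */
		list_splice(&list1, &list2);
		KUNIT_EXPECT_TRUE(test, list_empty(&list2));
	}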
Should it be necessary to refer to a test outside the context of its test suite,
the *fully-qualified* name of a test should be the suite name followed by the
test name, separated by a colon (i.e. ``suite:test``).
Test Kconfig Entries
====================
Every test suite should be tied to a Kconfig entry.
This Kconfig entry must:
* be named ``CONFIG_<name>_KUNIT_TEST``: where <name> is the name of the test
suite.
* be listed either alongside the config entries for the driver/subsystem being
tested, or be under [Kernel Hacking]→[Kernel Testing and Coverage]
* depend on ``CONFIG_KUNIT``
* be visible only if ``CONFIG_KUNIT_ALL_TESTS`` is not enabled.
* have a default value of ``CONFIG_KUNIT_ALL_TESTS``.
* have a brief description of KUnit in the help text
Unless there's a specific reason not to (e.g. the test is unable to be built as
a module), Kconfig entries for tests should be tristate.
An example Kconfig entry:
.. code-block:: none
config FOO_KUNIT_TEST
tristate "KUnit test for foo" if !KUNIT_ALL_TESTS
depends on KUNIT
default KUNIT_ALL_TESTS
help
This builds unit tests for foo.
For more information on KUnit and unit tests in general, please refer
to the KUnit documentation in Documentation/dev-tools/kunit
If unsure, say N
Test File and Module Names
==========================
KUnit tests can often be compiled as a module. These modules should be named
after the test suite, followed by ``_test``. If this is likely to conflict with
non-KUnit tests, the suffix ``_kunit`` can also be used.
The easiest way of achieving this is to name the file containing the test suite
``<suite>_test.c`` (or, as above, ``<suite>_kunit.c``). This file should be
placed next to the code under test.
If the suite name contains some or all of the name of the test's parent
directory, it may make sense to modify the source filename to reduce redundancy.
For example, a ``foo_firmware`` suite could be in the ``foo/firmware_test.c``
file.
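In either case, a hypothetical ``foo`` suite compiled as a module would typically be wired into the build with a kbuild rule along the lines of ``obj-$(CONFIG_FOO_KUNIT_TEST) += foo_test.o``, matching the Kconfig naming convention described above.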


@@ -211,6 +211,11 @@ KUnit test framework.
.. note::
A test case will only be run if it is associated with a test suite.
``kunit_test_suite(...)`` is a macro which tells the linker to put the specified
test suite in a special linker section so that it can be run by KUnit either
after late_init, or when the test module is loaded (depending on whether the
test was built in or not).
For more information on these types of things see the :doc:`api/test`.
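A minimal, purely illustrative registration (``foo`` and ``foo_test_basic`` are invented names; the test case itself is assumed to be defined earlier in the file):

.. code-block:: c

	static struct kunit_case foo_test_cases[] = {
		KUNIT_CASE(foo_test_basic),
		{}
	};

	static struct kunit_suite foo_test_suite = {
		.name = "foo",
		.test_cases = foo_test_cases,
	};
	kunit_test_suite(foo_test_suite);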
Isolating Behavior


@@ -0,0 +1,135 @@
# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
%YAML 1.2
---
$id: http://devicetree.org/schemas/mailbox/arm,mhu.yaml#
$schema: http://devicetree.org/meta-schemas/core.yaml#
title: ARM MHU Mailbox Controller
maintainers:
- Jassi Brar <jaswinder.singh@linaro.org>
description: |
The ARM's Message-Handling-Unit (MHU) is a mailbox controller that has 3
independent channels/links to communicate with remote processor(s). MHU links
are hardwired on a platform. A link raises interrupt for any received data.
However, there is no specified way of knowing if the sent data has been read
by the remote. This driver assumes the sender polls STAT register and the
remote clears it after having read the data. The last channel is specified to
be a 'Secure' resource, hence can't be used by Linux running NS.
The MHU hardware also allows operations in doorbell mode. The MHU drives the
interrupt signal using a 32-bit register, with all 32-bits logically ORed
together. It provides a set of registers to enable software to set, clear and
check the status of each of the bits of this register independently. The use
of 32 bits per interrupt line enables software to provide more information
about the source of the interrupt. For example, each bit of the register can
be associated with a type of event that can contribute to raising the
interrupt. Each of the 32-bits can be used as "doorbell" to alert the remote
processor.
# We need a select here so we don't match all nodes with 'arm,primecell'
select:
properties:
compatible:
contains:
enum:
- arm,mhu
- arm,mhu-doorbell
required:
- compatible
properties:
compatible:
oneOf:
- description: Data transfer mode
items:
- const: arm,mhu
- const: arm,primecell
- description: Doorbell mode
items:
- const: arm,mhu-doorbell
- const: arm,primecell
reg:
maxItems: 1
interrupts:
items:
- description: low-priority non-secure
- description: high-priority non-secure
- description: Secure
maxItems: 3
clocks:
maxItems: 1
clock-names:
items:
- const: apb_pclk
'#mbox-cells':
description: |
Set to 1 in data transfer mode and represents index of the channel.
Set to 2 in doorbell mode and represents index of the channel and doorbell
number.
enum: [ 1, 2 ]
required:
- compatible
- reg
- interrupts
- '#mbox-cells'
additionalProperties: false
examples:
# Data transfer mode.
- |
soc {
#address-cells = <2>;
#size-cells = <2>;
mhuA: mailbox@2b1f0000 {
#mbox-cells = <1>;
compatible = "arm,mhu", "arm,primecell";
reg = <0 0x2b1f0000 0 0x1000>;
interrupts = <0 36 4>, /* LP-NonSecure */
<0 35 4>, /* HP-NonSecure */
<0 37 4>; /* Secure */
clocks = <&clock 0 2 1>;
clock-names = "apb_pclk";
};
mhu_client_scb: scb@2e000000 {
compatible = "fujitsu,mb86s70-scb-1.0";
reg = <0 0x2e000000 0 0x4000>;
mboxes = <&mhuA 1>; /* HP-NonSecure */
};
};
# Doorbell mode.
- |
soc {
#address-cells = <2>;
#size-cells = <2>;
mhuB: mailbox@2b2f0000 {
#mbox-cells = <2>;
compatible = "arm,mhu-doorbell", "arm,primecell";
reg = <0 0x2b2f0000 0 0x1000>;
interrupts = <0 36 4>, /* LP-NonSecure */
<0 35 4>, /* HP-NonSecure */
<0 37 4>; /* Secure */
clocks = <&clock 0 2 1>;
clock-names = "apb_pclk";
};
mhu_client_scpi: scpi@2f000000 {
compatible = "arm,scpi";
reg = <0 0x2f000000 0 0x200>;
mboxes = <&mhuB 1 4>; /* HP-NonSecure, 5th doorbell */
};
};


@@ -1,43 +0,0 @@
ARM MHU Mailbox Driver
======================
The ARM's Message-Handling-Unit (MHU) is a mailbox controller that has
3 independent channels/links to communicate with remote processor(s).
MHU links are hardwired on a platform. A link raises interrupt for any
received data. However, there is no specified way of knowing if the sent
data has been read by the remote. This driver assumes the sender polls
STAT register and the remote clears it after having read the data.
The last channel is specified to be a 'Secure' resource, hence can't be
used by Linux running NS.
Mailbox Device Node:
====================
Required properties:
--------------------
- compatible: Shall be "arm,mhu" & "arm,primecell"
- reg: Contains the mailbox register address range (base
address and length)
- #mbox-cells Shall be 1 - the index of the channel needed.
- interrupts: Contains the interrupt information corresponding to
each of the 3 links of MHU.
Example:
--------
mhu: mailbox@2b1f0000 {
#mbox-cells = <1>;
compatible = "arm,mhu", "arm,primecell";
reg = <0 0x2b1f0000 0x1000>;
interrupts = <0 36 4>, /* LP-NonSecure */
<0 35 4>, /* HP-NonSecure */
<0 37 4>; /* Secure */
clocks = <&clock 0 2 1>;
clock-names = "apb_pclk";
};
mhu_client: scb@2e000000 {
compatible = "fujitsu,mb86s70-scb-1.0";
reg = <0 0x2e000000 0x4000>;
mboxes = <&mhu 1>; /* HP-NonSecure */
};


@@ -326,6 +326,21 @@ discover the amount of data that has been written to the zone. In the case of a
read-only zone discovered at run-time, as indicated in the previous section.
The size of the zone file is left unchanged from its last updated value.
A zoned block device (e.g. an NVMe Zoned Namespace device) may have limits on
the number of zones that can be active, that is, zones that are in the
implicit open, explicit open or closed conditions. This potential limitation
translates into a risk for applications to see write IO errors due to this
limit being exceeded if the zone of a file is not already active when a write
request is issued by the user.
To avoid these potential errors, the "explicit-open" mount option forces zones
to be made active using an open zone command when a file is opened for writing
for the first time. If the zone open command succeeds, the application is then
guaranteed that write requests can be processed. Conversely, the
"explicit-open" mount option will result in a zone close command being issued
to the device on the last close() of a zone file if the zone is neither full nor
empty.
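For example, a zoned block device at the hypothetical path ``/dev/nvme0n2`` could be mounted with ``mount -t zonefs -o explicit-open /dev/nvme0n2 /mnt`` to obtain this behavior.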
Zonefs User Space Tools
=======================


@@ -17685,8 +17685,9 @@ S: Supported
T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git dev
F: Documentation/RCU/torture.rst
F: kernel/locking/locktorture.c
F: kernel/rcu/rcuperf.c
F: kernel/rcu/rcuscale.c
F: kernel/rcu/rcutorture.c
F: kernel/rcu/refscale.c
F: kernel/torture.c
TOSHIBA ACPI EXTRAS DRIVER


@@ -479,3 +479,4 @@
547 common openat2 sys_openat2
548 common pidfd_getfd sys_pidfd_getfd
549 common faccessat2 sys_faccessat2
550 common process_madvise sys_process_madvise


@@ -17,7 +17,6 @@
#include <asm/cp15.h>
#include <asm/cputype.h>
#include <asm/sections.h>
#include <asm/cachetype.h>
#include <asm/fixmap.h>
#include <asm/sections.h>


@@ -453,3 +453,4 @@
437 common openat2 sys_openat2
438 common pidfd_getfd sys_pidfd_getfd
439 common faccessat2 sys_faccessat2
440 common process_madvise sys_process_madvise


@@ -38,7 +38,7 @@
#define __ARM_NR_compat_set_tls (__ARM_NR_COMPAT_BASE + 5)
#define __ARM_NR_COMPAT_END (__ARM_NR_COMPAT_BASE + 0x800)
#define __NR_compat_syscalls 440
#define __NR_compat_syscalls 441
#endif
#define __ARCH_WANT_SYS_CLONE


@@ -887,6 +887,8 @@ __SYSCALL(__NR_openat2, sys_openat2)
__SYSCALL(__NR_pidfd_getfd, sys_pidfd_getfd)
#define __NR_faccessat2 439
__SYSCALL(__NR_faccessat2, sys_faccessat2)
#define __NR_process_madvise 440
__SYSCALL(__NR_process_madvise, sys_process_madvise)
/*
* Please add new compat syscalls above this comment and update


@@ -40,7 +40,7 @@ obj-y += esi_stub.o # must be in kernel proper
endif
obj-$(CONFIG_INTEL_IOMMU) += pci-dma.o
obj-$(CONFIG_BINFMT_ELF) += elfcore.o
obj-$(CONFIG_ELF_CORE) += elfcore.o
# fp_emulate() expects f2-f5,f16-f31 to contain the user-level state.
CFLAGS_traps.o += -mfixed-range=f2-f5,f16-f31


@@ -360,3 +360,4 @@
437 common openat2 sys_openat2
438 common pidfd_getfd sys_pidfd_getfd
439 common faccessat2 sys_faccessat2
440 common process_madvise sys_process_madvise


@@ -439,3 +439,4 @@
437 common openat2 sys_openat2
438 common pidfd_getfd sys_pidfd_getfd
439 common faccessat2 sys_faccessat2
440 common process_madvise sys_process_madvise


@@ -445,3 +445,4 @@
437 common openat2 sys_openat2
438 common pidfd_getfd sys_pidfd_getfd
439 common faccessat2 sys_faccessat2
440 common process_madvise sys_process_madvise


@@ -378,3 +378,4 @@
437 n32 openat2 sys_openat2
438 n32 pidfd_getfd sys_pidfd_getfd
439 n32 faccessat2 sys_faccessat2
440 n32 process_madvise sys_process_madvise


@@ -354,3 +354,4 @@
437 n64 openat2 sys_openat2
438 n64 pidfd_getfd sys_pidfd_getfd
439 n64 faccessat2 sys_faccessat2
440 n64 process_madvise sys_process_madvise


@@ -427,3 +427,4 @@
437 o32 openat2 sys_openat2
438 o32 pidfd_getfd sys_pidfd_getfd
439 o32 faccessat2 sys_faccessat2
440 o32 process_madvise sys_process_madvise


@@ -437,3 +437,4 @@
437 common openat2 sys_openat2
438 common pidfd_getfd sys_pidfd_getfd
439 common faccessat2 sys_faccessat2
440 common process_madvise sys_process_madvise


@@ -529,3 +529,4 @@
437 common openat2 sys_openat2
438 common pidfd_getfd sys_pidfd_getfd
439 common faccessat2 sys_faccessat2
440 common process_madvise sys_process_madvise


@@ -442,3 +442,4 @@
437 common openat2 sys_openat2 sys_openat2
438 common pidfd_getfd sys_pidfd_getfd sys_pidfd_getfd
439 common faccessat2 sys_faccessat2 sys_faccessat2
440 common process_madvise sys_process_madvise sys_process_madvise


@@ -442,3 +442,4 @@
437 common openat2 sys_openat2
438 common pidfd_getfd sys_pidfd_getfd
439 common faccessat2 sys_faccessat2
440 common process_madvise sys_process_madvise


@@ -485,3 +485,4 @@
437 common openat2 sys_openat2
438 common pidfd_getfd sys_pidfd_getfd
439 common faccessat2 sys_faccessat2
440 common process_madvise sys_process_madvise


@@ -62,12 +62,12 @@ config NR_CPUS
source "arch/$(HEADER_ARCH)/um/Kconfig"
config FORBID_STATIC_LINK
config MAY_HAVE_RUNTIME_DEPS
bool
config STATIC_LINK
bool "Force a static link"
depends on !FORBID_STATIC_LINK
depends on CC_CAN_LINK_STATIC_NO_RUNTIME_DEPS || !MAY_HAVE_RUNTIME_DEPS
help
This option gives you the ability to force a static link of UML.
Normally, UML is linked as a shared binary. This is inconvenient for


@@ -234,7 +234,7 @@ config UML_NET_DAEMON
config UML_NET_VECTOR
bool "Vector I/O high performance network devices"
depends on UML_NET
select FORBID_STATIC_LINK
select MAY_HAVE_RUNTIME_DEPS
help
This User-Mode Linux network driver uses multi-message send
and receive functions. The host running the UML guest must have
@@ -246,7 +246,7 @@ config UML_NET_VECTOR
config UML_NET_VDE
bool "VDE transport (obsolete)"
depends on UML_NET
select FORBID_STATIC_LINK
select MAY_HAVE_RUNTIME_DEPS
help
This User-Mode Linux network transport allows one or more running
UMLs on a single host to communicate with each other and also
@@ -294,7 +294,7 @@ config UML_NET_MCAST
config UML_NET_PCAP
bool "pcap transport (obsolete)"
depends on UML_NET
select FORBID_STATIC_LINK
select MAY_HAVE_RUNTIME_DEPS
help
The pcap transport makes a pcap packet stream on the host look
like an ethernet device inside UML. This is useful for making


@@ -7,6 +7,7 @@
*/
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <errno.h>
#include <sys/types.h>


@@ -32,7 +32,7 @@ static int pcap_user_init(void *data, void *dev)
return 0;
}
static int pcap_open(void *data)
static int pcap_user_open(void *data)
{
struct pcap_data *pri = data;
__u32 netmask;
@@ -44,14 +44,14 @@ static int pcap_open(void *data)
if (pri->filter != NULL) {
err = dev_netmask(pri->dev, &netmask);
if (err < 0) {
printk(UM_KERN_ERR "pcap_open : dev_netmask failed\n");
printk(UM_KERN_ERR "pcap_user_open : dev_netmask failed\n");
return -EIO;
}
pri->compiled = uml_kmalloc(sizeof(struct bpf_program),
UM_GFP_KERNEL);
if (pri->compiled == NULL) {
printk(UM_KERN_ERR "pcap_open : kmalloc failed\n");
printk(UM_KERN_ERR "pcap_user_open : kmalloc failed\n");
return -ENOMEM;
}
@@ -59,14 +59,14 @@ static int pcap_open(void *data)
(struct bpf_program *) pri->compiled,
pri->filter, pri->optimize, netmask);
if (err < 0) {
printk(UM_KERN_ERR "pcap_open : pcap_compile failed - "
printk(UM_KERN_ERR "pcap_user_open : pcap_compile failed - "
"'%s'\n", pcap_geterr(pri->pcap));
goto out;
}
err = pcap_setfilter(pri->pcap, pri->compiled);
if (err < 0) {
printk(UM_KERN_ERR "pcap_open : pcap_setfilter "
printk(UM_KERN_ERR "pcap_user_open : pcap_setfilter "
"failed - '%s'\n", pcap_geterr(pri->pcap));
goto out;
}
@@ -127,7 +127,7 @@ int pcap_user_read(int fd, void *buffer, int len, struct pcap_data *pri)
const struct net_user_info pcap_user_info = {
.init = pcap_user_init,
.open = pcap_open,
.open = pcap_user_open,
.close = NULL,
.remove = pcap_remove,
.add_address = NULL,


@@ -9,7 +9,7 @@
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/termios.h>
#include <termios.h>
#include <sys/wait.h>
#include <net_user.h>
#include <os.h>


@@ -1403,7 +1403,7 @@ static int vector_net_load_bpf_flash(struct net_device *dev,
kfree(vp->bpf->filter);
vp->bpf->filter = NULL;
} else {
vp->bpf = kmalloc(sizeof(struct sock_fprog), GFP_KERNEL);
vp->bpf = kmalloc(sizeof(struct sock_fprog), GFP_ATOMIC);
if (vp->bpf == NULL) {
netdev_err(dev, "failed to allocate memory for firmware\n");
goto flash_fail;
@@ -1415,7 +1415,7 @@ static int vector_net_load_bpf_flash(struct net_device *dev,
if (request_firmware(&fw, efl->data, &vdevice->pdev.dev))
goto flash_fail;
vp->bpf->filter = kmemdup(fw->data, fw->size, GFP_KERNEL);
vp->bpf->filter = kmemdup(fw->data, fw->size, GFP_ATOMIC);
if (!vp->bpf->filter)
goto free_buffer;


@@ -18,9 +18,7 @@
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <net/ethernet.h>
#include <netinet/ip.h>
#include <netinet/ether.h>
#include <linux/if_ether.h>
#include <linux/if_packet.h>
#include <sys/wait.h>
@@ -39,6 +37,7 @@
#define ID_MAX 2
#define TOKEN_IFNAME "ifname"
#define TOKEN_SCRIPT "ifup"
#define TRANS_RAW "raw"
#define TRANS_RAW_LEN strlen(TRANS_RAW)
@@ -55,6 +54,9 @@
#define MAX_UN_LEN 107
static const char padchar[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
static const char *template = "tapXXXXXX";
/* This is very ugly and brute force lookup, but it is done
* only once at initialization so not worth doing hashes or
* anything more intelligent
@@ -191,16 +193,21 @@ raw_fd_cleanup:
return err;
}
static struct vector_fds *user_init_tap_fds(struct arglist *ifspec)
{
int fd = -1;
int fd = -1, i;
char *iface;
struct vector_fds *result = NULL;
bool dynamic = false;
char dynamic_ifname[IFNAMSIZ];
char *argv[] = {NULL, NULL, NULL, NULL};
iface = uml_vector_fetch_arg(ifspec, TOKEN_IFNAME);
if (iface == NULL) {
printk(UM_KERN_ERR "uml_tap: failed to parse interface spec\n");
goto tap_cleanup;
dynamic = true;
iface = dynamic_ifname;
srand(getpid());
}
result = uml_kmalloc(sizeof(struct vector_fds), UM_GFP_KERNEL);
@@ -214,14 +221,30 @@ static struct vector_fds *user_init_tap_fds(struct arglist *ifspec)
result->remote_addr_size = 0;
/* TAP */
do {
if (dynamic) {
strcpy(iface, template);
for (i = 0; i < strlen(iface); i++) {
if (iface[i] == 'X') {
iface[i] = padchar[rand() % strlen(padchar)];
}
}
}
fd = create_tap_fd(iface);
if (fd < 0) {
if ((fd < 0) && (!dynamic)) {
printk(UM_KERN_ERR "uml_tap: failed to create tun interface\n");
goto tap_cleanup;
}
result->tx_fd = fd;
result->rx_fd = fd;
} while (fd < 0);
argv[0] = uml_vector_fetch_arg(ifspec, TOKEN_SCRIPT);
if (argv[0]) {
argv[1] = iface;
run_helper(NULL, NULL, argv);
}
return result;
tap_cleanup:
printk(UM_KERN_ERR "user_init_tap: init failed, error %d", fd);
@@ -233,6 +256,7 @@ static struct vector_fds *user_init_hybrid_fds(struct arglist *ifspec)
{
char *iface;
struct vector_fds *result = NULL;
char *argv[] = {NULL, NULL, NULL, NULL};
iface = uml_vector_fetch_arg(ifspec, TOKEN_IFNAME);
if (iface == NULL) {
@@ -266,6 +290,12 @@ static struct vector_fds *user_init_hybrid_fds(struct arglist *ifspec)
"uml_tap: failed to create paired raw socket: %i\n", result->rx_fd);
goto hybrid_cleanup;
}
argv[0] = uml_vector_fetch_arg(ifspec, TOKEN_SCRIPT);
if (argv[0]) {
argv[1] = iface;
run_helper(NULL, NULL, argv);
}
return result;
hybrid_cleanup:
printk(UM_KERN_ERR "user_init_hybrid: init failed");
@@ -332,7 +362,7 @@ static struct vector_fds *user_init_unix_fds(struct arglist *ifspec, int id)
}
switch (id) {
case ID_BESS:
if (connect(fd, remote_addr, sizeof(struct sockaddr_un)) < 0) {
if (connect(fd, (const struct sockaddr *) remote_addr, sizeof(struct sockaddr_un)) < 0) {
printk(UM_KERN_ERR "bess open:cannot connect to %s %i", remote_addr->sun_path, -errno);
goto unix_cleanup;
}
@@ -399,7 +429,6 @@ static struct vector_fds *user_init_fd_fds(struct arglist *ifspec)
fd_cleanup:
if (fd >= 0)
os_close_file(fd);
if (result != NULL)
kfree(result);
return NULL;
}
@@ -410,6 +439,7 @@ static struct vector_fds *user_init_raw_fds(struct arglist *ifspec)
int err = -ENOMEM;
char *iface;
struct vector_fds *result = NULL;
char *argv[] = {NULL, NULL, NULL, NULL};
iface = uml_vector_fetch_arg(ifspec, TOKEN_IFNAME);
if (iface == NULL)
@@ -432,6 +462,11 @@ static struct vector_fds *user_init_raw_fds(struct arglist *ifspec)
result->remote_addr = NULL;
result->remote_addr_size = 0;
}
argv[0] = uml_vector_fetch_arg(ifspec, TOKEN_SCRIPT);
if (argv[0]) {
argv[1] = iface;
run_helper(NULL, NULL, argv);
}
return result;
raw_cleanup:
printk(UM_KERN_ERR "user_init_raw: init failed, error %d", err);
@@ -789,10 +824,12 @@ void *uml_vector_user_bpf(char *filename)
return false;
}
bpf_prog = uml_kmalloc(sizeof(struct sock_fprog), UM_GFP_KERNEL);
if (bpf_prog != NULL) {
if (bpf_prog == NULL) {
printk(KERN_ERR "Failed to allocate bpf prog buffer");
return NULL;
}
bpf_prog->len = statbuf.st_size / sizeof(struct sock_filter);
bpf_prog->filter = NULL;
}
ffd = os_open_file(filename, of_read(OPENFLAGS()), 0);
if (ffd < 0) {
printk(KERN_ERR "Error %d opening bpf file", -errno);


@@ -35,14 +35,14 @@ int write_sigio_irq(int fd)
}
/* These are called from os-Linux/sigio.c to protect its pollfds arrays. */
static DEFINE_SPINLOCK(sigio_spinlock);
static DEFINE_MUTEX(sigio_mutex);
void sigio_lock(void)
{
spin_lock(&sigio_spinlock);
mutex_lock(&sigio_mutex);
}
void sigio_unlock(void)
{
spin_unlock(&sigio_spinlock);
mutex_unlock(&sigio_mutex);
}


@@ -47,12 +47,10 @@ void show_stack(struct task_struct *task, unsigned long *stack,
if (kstack_end(stack))
break;
if (i && ((i % STACKSLOTS_PER_LINE) == 0))
printk("%s\n", loglvl);
pr_cont("\n");
pr_cont(" %08lx", *stack++);
}
printk("%s\n", loglvl);
printk("%sCall Trace:\n", loglvl);
dump_trace(current, &stackops, (void *)loglvl);
printk("%s\n", loglvl);
}


@@ -70,14 +70,18 @@ static void time_travel_handle_message(struct um_timetravel_msg *msg,
* read of the message and write of the ACK.
*/
if (mode != TTMH_READ) {
while (os_poll(1, &time_travel_ext_fd) != 0) {
if (mode == TTMH_IDLE) {
BUG_ON(!irqs_disabled());
bool disabled = irqs_disabled();
BUG_ON(mode == TTMH_IDLE && !disabled);
if (disabled)
local_irq_enable();
while (os_poll(1, &time_travel_ext_fd) != 0) {
/* nothing */
}
if (disabled)
local_irq_disable();
}
}
}
ret = os_read_file(time_travel_ext_fd, msg, sizeof(*msg));
@@ -102,6 +106,7 @@ static void time_travel_handle_message(struct um_timetravel_msg *msg,
break;
}
resp.seq = msg->seq;
os_write_file(time_travel_ext_fd, &resp, sizeof(resp));
}


@@ -97,7 +97,7 @@ static int remove_files_and_dir(char *dir)
while ((ent = readdir(directory)) != NULL) {
if (!strcmp(ent->d_name, ".") || !strcmp(ent->d_name, ".."))
continue;
len = strlen(dir) + sizeof("/") + strlen(ent->d_name) + 1;
len = strlen(dir) + strlen("/") + strlen(ent->d_name) + 1;
if (len > sizeof(file)) {
ret = -E2BIG;
goto out;
@@ -135,7 +135,7 @@ out:
*/
static inline int is_umdir_used(char *dir)
{
char pid[sizeof("nnnnn\0")], *end, *file;
char pid[sizeof("nnnnnnnnn")], *end, *file;
int dead, fd, p, n, err;
size_t filelen;
@@ -217,10 +217,10 @@ static int umdir_take_if_dead(char *dir)
static void __init create_pid_file(void)
{
char pid[sizeof("nnnnn\0")], *file;
char pid[sizeof("nnnnnnnnn")], *file;
int fd, n;
n = strlen(uml_dir) + UMID_LEN + sizeof("/pid\0");
n = strlen(uml_dir) + UMID_LEN + sizeof("/pid");
file = malloc(n);
if (!file)
return;


@@ -10,7 +10,7 @@
#include <signal.h>
#include <string.h>
#include <termios.h>
#include <wait.h>
#include <sys/wait.h>
#include <sys/mman.h>
#include <sys/utsname.h>
#include <init.h>


@@ -444,3 +444,4 @@
437 i386 openat2 sys_openat2
438 i386 pidfd_getfd sys_pidfd_getfd
439 i386 faccessat2 sys_faccessat2
440 i386 process_madvise sys_process_madvise


@@ -361,6 +361,7 @@
437 common openat2 sys_openat2
438 common pidfd_getfd sys_pidfd_getfd
439 common faccessat2 sys_faccessat2
440 common process_madvise sys_process_madvise
#
# x32-specific system call numbers start at 512 to avoid cache impact


@@ -229,7 +229,8 @@ void kvm_page_track_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
return;
idx = srcu_read_lock(&head->track_srcu);
hlist_for_each_entry_rcu(n, &head->track_notifier_list, node)
hlist_for_each_entry_srcu(n, &head->track_notifier_list, node,
srcu_read_lock_held(&head->track_srcu))
if (n->track_write)
n->track_write(vcpu, gpa, new, bytes, n);
srcu_read_unlock(&head->track_srcu, idx);
@@ -254,7 +255,8 @@ void kvm_page_track_flush_slot(struct kvm *kvm, struct kvm_memory_slot *slot)
return;
idx = srcu_read_lock(&head->track_srcu);
hlist_for_each_entry_rcu(n, &head->track_notifier_list, node)
hlist_for_each_entry_srcu(n, &head->track_notifier_list, node,
srcu_read_lock_held(&head->track_srcu))
if (n->track_flush_slot)
n->track_flush_slot(kvm, slot, n);
srcu_read_unlock(&head->track_srcu, idx);


@@ -52,14 +52,6 @@ static const int reg_offsets[] =
int putreg(struct task_struct *child, int regno, unsigned long value)
{
#ifdef TIF_IA32
/*
* Some code in the 64bit emulation may not be 64bit clean.
* Don't take any chances.
*/
if (test_tsk_thread_flag(child, TIF_IA32))
value &= 0xffffffff;
#endif
switch (regno) {
case R8:
case R9:
@@ -137,10 +129,7 @@ int poke_user(struct task_struct *child, long addr, long data)
unsigned long getreg(struct task_struct *child, int regno)
{
unsigned long mask = ~0UL;
#ifdef TIF_IA32
if (test_tsk_thread_flag(child, TIF_IA32))
mask = 0xffffffff;
#endif
switch (regno) {
case R8:
case R9:


@@ -2,7 +2,7 @@
#include <stdio.h>
#include <stddef.h>
#include <signal.h>
#include <sys/poll.h>
#include <poll.h>
#include <sys/mman.h>
#include <sys/user.h>
#define __FRAME_OFFSETS


@@ -25,6 +25,7 @@
static struct gnttab_vm_area {
struct vm_struct *area;
pte_t **ptes;
int idx;
} gnttab_shared_vm_area, gnttab_status_vm_area;
int arch_gnttab_map_shared(unsigned long *frames, unsigned long nr_gframes,
@@ -90,19 +91,31 @@ void arch_gnttab_unmap(void *shared, unsigned long nr_gframes)
}
}
static int gnttab_apply(pte_t *pte, unsigned long addr, void *data)
{
struct gnttab_vm_area *area = data;
area->ptes[area->idx++] = pte;
return 0;
}
static int arch_gnttab_valloc(struct gnttab_vm_area *area, unsigned nr_frames)
{
area->ptes = kmalloc_array(nr_frames, sizeof(*area->ptes), GFP_KERNEL);
if (area->ptes == NULL)
return -ENOMEM;
area->area = alloc_vm_area(PAGE_SIZE * nr_frames, area->ptes);
if (area->area == NULL) {
area->area = get_vm_area(PAGE_SIZE * nr_frames, VM_IOREMAP);
if (!area->area)
goto out_free_ptes;
if (apply_to_page_range(&init_mm, (unsigned long)area->area->addr,
PAGE_SIZE * nr_frames, gnttab_apply, area))
goto out_free_vm_area;
return 0;
out_free_vm_area:
free_vm_area(area->area);
out_free_ptes:
kfree(area->ptes);
return -ENOMEM;
}
return 0;
}
static void arch_gnttab_vfree(struct gnttab_vm_area *area)


@@ -410,3 +410,4 @@
437 common openat2 sys_openat2
438 common pidfd_getfd sys_pidfd_getfd
439 common faccessat2 sys_faccessat2
440 common process_madvise sys_process_madvise


@@ -25,6 +25,7 @@ config DRM_I915
select CRC32
select SND_HDA_I915 if SND_HDA_CORE
select CEC_CORE if CEC_NOTIFIER
select VMAP_PFN
help
Choose this option if you have a system that has "Intel Graphics
Media Accelerator" or "HD Graphics" integrated graphics,


@@ -162,8 +162,6 @@ static void unmap_object(struct drm_i915_gem_object *obj, void *ptr)
{
if (is_vmalloc_addr(ptr))
vunmap(ptr);
else
kunmap(kmap_to_page(ptr));
}
struct sg_table *
@@ -234,34 +232,21 @@ unlock:
return err;
}
static inline pte_t iomap_pte(resource_size_t base,
dma_addr_t offset,
pgprot_t prot)
{
return pte_mkspecial(pfn_pte((base + offset) >> PAGE_SHIFT, prot));
}
/* The 'mapping' part of i915_gem_object_pin_map() below */
static void *i915_gem_object_map(struct drm_i915_gem_object *obj,
static void *i915_gem_object_map_page(struct drm_i915_gem_object *obj,
enum i915_map_type type)
{
unsigned long n_pte = obj->base.size >> PAGE_SHIFT;
struct sg_table *sgt = obj->mm.pages;
pte_t *stack[32], **mem;
struct vm_struct *area;
unsigned long n_pages = obj->base.size >> PAGE_SHIFT, i;
struct page *stack[32], **pages = stack, *page;
struct sgt_iter iter;
pgprot_t pgprot;
void *vaddr;
if (!i915_gem_object_has_struct_page(obj) && type != I915_MAP_WC)
return NULL;
if (GEM_WARN_ON(type == I915_MAP_WC &&
!static_cpu_has(X86_FEATURE_PAT)))
return NULL;
/* A single page can always be kmapped */
if (n_pte == 1 && type == I915_MAP_WB) {
struct page *page = sg_page(sgt->sgl);
switch (type) {
default:
MISSING_CASE(type);
fallthrough; /* to use PAGE_KERNEL anyway */
case I915_MAP_WB:
/*
* On 32b, highmem using a finite set of indirect PTE (i.e.
* vmap) to provide virtual mappings of the high pages.
@@ -277,33 +262,10 @@ static void *i915_gem_object_map(struct drm_i915_gem_object *obj,
* forever.
*
* So if the page is beyond the 32b boundary, make an explicit
* vmap. On 64b, this check will be optimised away as we can
* directly kmap any page on the system.
* vmap.
*/
if (!PageHighMem(page))
return kmap(page);
}
mem = stack;
if (n_pte > ARRAY_SIZE(stack)) {
/* Too big for stack -- allocate temporary array instead */
mem = kvmalloc_array(n_pte, sizeof(*mem), GFP_KERNEL);
if (!mem)
return NULL;
}
area = alloc_vm_area(obj->base.size, mem);
if (!area) {
if (mem != stack)
kvfree(mem);
return NULL;
}
switch (type) {
default:
MISSING_CASE(type);
fallthrough; /* to use PAGE_KERNEL anyway */
case I915_MAP_WB:
if (n_pages == 1 && !PageHighMem(sg_page(obj->mm.pages->sgl)))
return page_address(sg_page(obj->mm.pages->sgl));
pgprot = PAGE_KERNEL;
break;
case I915_MAP_WC:
@@ -311,30 +273,50 @@ static void *i915_gem_object_map(struct drm_i915_gem_object *obj,
break;
}
if (i915_gem_object_has_struct_page(obj)) {
struct sgt_iter iter;
struct page *page;
pte_t **ptes = mem;
for_each_sgt_page(page, iter, sgt)
**ptes++ = mk_pte(page, pgprot);
} else {
resource_size_t iomap;
struct sgt_iter iter;
pte_t **ptes = mem;
dma_addr_t addr;
iomap = obj->mm.region->iomap.base;
iomap -= obj->mm.region->region.start;
for_each_sgt_daddr(addr, iter, sgt)
**ptes++ = iomap_pte(iomap, addr, pgprot);
if (n_pages > ARRAY_SIZE(stack)) {
/* Too big for stack -- allocate temporary array instead */
pages = kvmalloc_array(n_pages, sizeof(*pages), GFP_KERNEL);
if (!pages)
return NULL;
}
if (mem != stack)
kvfree(mem);
i = 0;
for_each_sgt_page(page, iter, obj->mm.pages)
pages[i++] = page;
vaddr = vmap(pages, n_pages, 0, pgprot);
if (pages != stack)
kvfree(pages);
return vaddr;
}
return area->addr;
static void *i915_gem_object_map_pfn(struct drm_i915_gem_object *obj,
enum i915_map_type type)
{
resource_size_t iomap = obj->mm.region->iomap.base -
obj->mm.region->region.start;
unsigned long n_pfn = obj->base.size >> PAGE_SHIFT;
unsigned long stack[32], *pfns = stack, i;
struct sgt_iter iter;
dma_addr_t addr;
void *vaddr;
if (type != I915_MAP_WC)
return NULL;
if (n_pfn > ARRAY_SIZE(stack)) {
/* Too big for stack -- allocate temporary array instead */
pfns = kvmalloc_array(n_pfn, sizeof(*pfns), GFP_KERNEL);
if (!pfns)
return NULL;
}
i = 0;
for_each_sgt_daddr(addr, iter, obj->mm.pages)
pfns[i++] = (iomap + addr) >> PAGE_SHIFT;
vaddr = vmap_pfn(pfns, n_pfn, pgprot_writecombine(PAGE_KERNEL_IO));
if (pfns != stack)
kvfree(pfns);
return vaddr;
}
/* get, pin, and map the pages of the object into kernel space */
@@ -386,7 +368,13 @@ void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj,
}
if (!ptr) {
ptr = i915_gem_object_map(obj, type);
if (GEM_WARN_ON(type == I915_MAP_WC &&
!static_cpu_has(X86_FEATURE_PAT)))
ptr = NULL;
else if (i915_gem_object_has_struct_page(obj))
ptr = i915_gem_object_map_page(obj, type);
else
ptr = i915_gem_object_map_pfn(obj, type);
if (!ptr) {
err = -ENOMEM;
goto err_unpin;


@@ -49,80 +49,40 @@ struct file *shmem_create_from_object(struct drm_i915_gem_object *obj)
return file;
}
static size_t shmem_npte(struct file *file)
{
return file->f_mapping->host->i_size >> PAGE_SHIFT;
}
static void __shmem_unpin_map(struct file *file, void *ptr, size_t n_pte)
{
unsigned long pfn;
vunmap(ptr);
for (pfn = 0; pfn < n_pte; pfn++) {
struct page *page;
page = shmem_read_mapping_page_gfp(file->f_mapping, pfn,
GFP_KERNEL);
if (!WARN_ON(IS_ERR(page))) {
put_page(page);
put_page(page);
}
}
}
void *shmem_pin_map(struct file *file)
{
const size_t n_pte = shmem_npte(file);
pte_t *stack[32], **ptes, **mem;
struct vm_struct *area;
unsigned long pfn;
struct page **pages;
size_t n_pages, i;
void *vaddr;
mem = stack;
if (n_pte > ARRAY_SIZE(stack)) {
mem = kvmalloc_array(n_pte, sizeof(*mem), GFP_KERNEL);
if (!mem)
n_pages = file->f_mapping->host->i_size >> PAGE_SHIFT;
pages = kvmalloc_array(n_pages, sizeof(*pages), GFP_KERNEL);
if (!pages)
return NULL;
}
area = alloc_vm_area(n_pte << PAGE_SHIFT, mem);
if (!area) {
if (mem != stack)
kvfree(mem);
return NULL;
}
ptes = mem;
for (pfn = 0; pfn < n_pte; pfn++) {
struct page *page;
page = shmem_read_mapping_page_gfp(file->f_mapping, pfn,
for (i = 0; i < n_pages; i++) {
pages[i] = shmem_read_mapping_page_gfp(file->f_mapping, i,
GFP_KERNEL);
if (IS_ERR(page))
if (IS_ERR(pages[i]))
goto err_page;
**ptes++ = mk_pte(page, PAGE_KERNEL);
}
if (mem != stack)
kvfree(mem);
vaddr = vmap(pages, n_pages, VM_MAP_PUT_PAGES, PAGE_KERNEL);
if (!vaddr)
goto err_page;
mapping_set_unevictable(file->f_mapping);
return area->addr;
return vaddr;
err_page:
if (mem != stack)
kvfree(mem);
__shmem_unpin_map(file, area->addr, pfn);
while (--i >= 0)
put_page(pages[i]);
kvfree(pages);
return NULL;
}
void shmem_unpin_map(struct file *file, void *ptr)
{
mapping_clear_unevictable(file->f_mapping);
__shmem_unpin_map(file, ptr, shmem_npte(file));
vfree(ptr);
}
static int __shmem_rw(struct file *file, loff_t off,


@@ -5,7 +5,7 @@ obj-$(CONFIG_MAILBOX) += mailbox.o
obj-$(CONFIG_MAILBOX_TEST) += mailbox-test.o
obj-$(CONFIG_ARM_MHU) += arm_mhu.o
obj-$(CONFIG_ARM_MHU) += arm_mhu.o arm_mhu_db.o
obj-$(CONFIG_IMX_MBOX) += imx-mailbox.o


@@ -113,6 +113,9 @@ static int mhu_probe(struct amba_device *adev, const struct amba_id *id)
struct device *dev = &adev->dev;
int mhu_reg[MHU_CHANS] = {MHU_LP_OFFSET, MHU_HP_OFFSET, MHU_SEC_OFFSET};
if (!of_device_is_compatible(dev->of_node, "arm,mhu"))
return -ENODEV;
/* Allocate memory for device */
mhu = devm_kzalloc(dev, sizeof(*mhu), GFP_KERNEL);
if (!mhu)


@@ -0,0 +1,354 @@
// SPDX-License-Identifier: GPL-2.0-only
/*
* Copyright (C) 2013-2015 Fujitsu Semiconductor Ltd.
* Copyright (C) 2015 Linaro Ltd.
* Based on ARM MHU driver by Jassi Brar <jaswinder.singh@linaro.org>
* Copyright (C) 2020 ARM Ltd.
*/
#include <linux/amba/bus.h>
#include <linux/device.h>
#include <linux/err.h>
#include <linux/interrupt.h>
#include <linux/io.h>
#include <linux/kernel.h>
#include <linux/mailbox_controller.h>
#include <linux/module.h>
#include <linux/of.h>
#include <linux/of_device.h>
#define INTR_STAT_OFS 0x0
#define INTR_SET_OFS 0x8
#define INTR_CLR_OFS 0x10
#define MHU_LP_OFFSET 0x0
#define MHU_HP_OFFSET 0x20
#define MHU_SEC_OFFSET 0x200
#define TX_REG_OFFSET 0x100
#define MHU_CHANS 3 /* Secure, Non-Secure High and Low Priority */
#define MHU_CHAN_MAX 20 /* Max channels to save on unused RAM */
#define MHU_NUM_DOORBELLS 32
struct mhu_db_link {
unsigned int irq;
void __iomem *tx_reg;
void __iomem *rx_reg;
};
struct arm_mhu {
void __iomem *base;
struct mhu_db_link mlink[MHU_CHANS];
struct mbox_controller mbox;
struct device *dev;
};
/**
* ARM MHU Mailbox allocated channel information
*
* @mhu: Pointer to parent mailbox device
* @pchan: Physical channel within which this doorbell resides in
* @doorbell: doorbell number pertaining to this channel
*/
struct mhu_db_channel {
struct arm_mhu *mhu;
unsigned int pchan;
unsigned int doorbell;
};
static inline struct mbox_chan *
mhu_db_mbox_to_channel(struct mbox_controller *mbox, unsigned int pchan,
unsigned int doorbell)
{
int i;
struct mhu_db_channel *chan_info;
for (i = 0; i < mbox->num_chans; i++) {
chan_info = mbox->chans[i].con_priv;
if (chan_info && chan_info->pchan == pchan &&
chan_info->doorbell == doorbell)
return &mbox->chans[i];
}
return NULL;
}
static void mhu_db_mbox_clear_irq(struct mbox_chan *chan)
{
struct mhu_db_channel *chan_info = chan->con_priv;
void __iomem *base = chan_info->mhu->mlink[chan_info->pchan].rx_reg;
writel_relaxed(BIT(chan_info->doorbell), base + INTR_CLR_OFS);
}
static unsigned int mhu_db_mbox_irq_to_pchan_num(struct arm_mhu *mhu, int irq)
{
unsigned int pchan;
for (pchan = 0; pchan < MHU_CHANS; pchan++)
if (mhu->mlink[pchan].irq == irq)
break;
return pchan;
}
static struct mbox_chan *
mhu_db_mbox_irq_to_channel(struct arm_mhu *mhu, unsigned int pchan)
{
unsigned long bits;
unsigned int doorbell;
struct mbox_chan *chan = NULL;
struct mbox_controller *mbox = &mhu->mbox;
void __iomem *base = mhu->mlink[pchan].rx_reg;
bits = readl_relaxed(base + INTR_STAT_OFS);
if (!bits)
/* No IRQs fired in specified physical channel */
return NULL;
/* An IRQ has fired, find the associated channel */
for (doorbell = 0; bits; doorbell++) {
if (!test_and_clear_bit(doorbell, &bits))
continue;
chan = mhu_db_mbox_to_channel(mbox, pchan, doorbell);
if (chan)
break;
dev_err(mbox->dev,
"Channel not registered: pchan: %d doorbell: %d\n",
pchan, doorbell);
}
return chan;
}
static irqreturn_t mhu_db_mbox_rx_handler(int irq, void *data)
{
struct mbox_chan *chan;
struct arm_mhu *mhu = data;
unsigned int pchan = mhu_db_mbox_irq_to_pchan_num(mhu, irq);
while (NULL != (chan = mhu_db_mbox_irq_to_channel(mhu, pchan))) {
mbox_chan_received_data(chan, NULL);
mhu_db_mbox_clear_irq(chan);
}
return IRQ_HANDLED;
}
static bool mhu_db_last_tx_done(struct mbox_chan *chan)
{
struct mhu_db_channel *chan_info = chan->con_priv;
void __iomem *base = chan_info->mhu->mlink[chan_info->pchan].tx_reg;
if (readl_relaxed(base + INTR_STAT_OFS) & BIT(chan_info->doorbell))
return false;
return true;
}
static int mhu_db_send_data(struct mbox_chan *chan, void *data)
{
struct mhu_db_channel *chan_info = chan->con_priv;
void __iomem *base = chan_info->mhu->mlink[chan_info->pchan].tx_reg;
/* Send event to co-processor */
writel_relaxed(BIT(chan_info->doorbell), base + INTR_SET_OFS);
return 0;
}
static int mhu_db_startup(struct mbox_chan *chan)
{
mhu_db_mbox_clear_irq(chan);
return 0;
}
static void mhu_db_shutdown(struct mbox_chan *chan)
{
struct mhu_db_channel *chan_info = chan->con_priv;
struct mbox_controller *mbox = &chan_info->mhu->mbox;
int i;
for (i = 0; i < mbox->num_chans; i++)
if (chan == &mbox->chans[i])
break;
if (mbox->num_chans == i) {
dev_warn(mbox->dev, "Request to free non-existent channel\n");
return;
}
/* Reset channel */
mhu_db_mbox_clear_irq(chan);
kfree(chan->con_priv);
chan->con_priv = NULL;
}
static struct mbox_chan *mhu_db_mbox_xlate(struct mbox_controller *mbox,
const struct of_phandle_args *spec)
{
struct arm_mhu *mhu = dev_get_drvdata(mbox->dev);
struct mhu_db_channel *chan_info;
struct mbox_chan *chan;
unsigned int pchan = spec->args[0];
unsigned int doorbell = spec->args[1];
int i;
/* Bounds checking */
if (pchan >= MHU_CHANS || doorbell >= MHU_NUM_DOORBELLS) {
dev_err(mbox->dev,
"Invalid channel requested pchan: %d doorbell: %d\n",
pchan, doorbell);
return ERR_PTR(-EINVAL);
}
/* Is requested channel free? */
chan = mhu_db_mbox_to_channel(mbox, pchan, doorbell);
if (chan) {
dev_err(mbox->dev, "Channel in use: pchan: %d doorbell: %d\n",
pchan, doorbell);
return ERR_PTR(-EBUSY);
}
/* Find the first free slot */
for (i = 0; i < mbox->num_chans; i++)
if (!mbox->chans[i].con_priv)
break;
if (mbox->num_chans == i) {
dev_err(mbox->dev, "No free channels left\n");
return ERR_PTR(-EBUSY);
}
chan = &mbox->chans[i];
chan_info = devm_kzalloc(mbox->dev, sizeof(*chan_info), GFP_KERNEL);
if (!chan_info)
return ERR_PTR(-ENOMEM);
chan_info->mhu = mhu;
chan_info->pchan = pchan;
chan_info->doorbell = doorbell;
chan->con_priv = chan_info;
dev_dbg(mbox->dev, "mbox: created channel phys: %d doorbell: %d\n",
pchan, doorbell);
return chan;
}
static const struct mbox_chan_ops mhu_db_ops = {
.send_data = mhu_db_send_data,
.startup = mhu_db_startup,
.shutdown = mhu_db_shutdown,
.last_tx_done = mhu_db_last_tx_done,
};
static int mhu_db_probe(struct amba_device *adev, const struct amba_id *id)
{
u32 cell_count;
int i, err, max_chans;
struct arm_mhu *mhu;
struct mbox_chan *chans;
struct device *dev = &adev->dev;
struct device_node *np = dev->of_node;
int mhu_reg[MHU_CHANS] = {
MHU_LP_OFFSET, MHU_HP_OFFSET, MHU_SEC_OFFSET,
};
if (!of_device_is_compatible(np, "arm,mhu-doorbell"))
return -ENODEV;
err = of_property_read_u32(np, "#mbox-cells", &cell_count);
if (err) {
dev_err(dev, "failed to read #mbox-cells in '%pOF'\n", np);
return err;
}
if (cell_count == 2) {
max_chans = MHU_CHAN_MAX;
} else {
dev_err(dev, "incorrect value of #mbox-cells in '%pOF'\n", np);
return -EINVAL;
}
mhu = devm_kzalloc(dev, sizeof(*mhu), GFP_KERNEL);
if (!mhu)
return -ENOMEM;
mhu->base = devm_ioremap_resource(dev, &adev->res);
if (IS_ERR(mhu->base)) {
dev_err(dev, "ioremap failed\n");
return PTR_ERR(mhu->base);
}
chans = devm_kcalloc(dev, max_chans, sizeof(*chans), GFP_KERNEL);
if (!chans)
return -ENOMEM;
mhu->dev = dev;
mhu->mbox.dev = dev;
mhu->mbox.chans = chans;
mhu->mbox.num_chans = max_chans;
mhu->mbox.txdone_irq = false;
mhu->mbox.txdone_poll = true;
mhu->mbox.txpoll_period = 1;
mhu->mbox.of_xlate = mhu_db_mbox_xlate;
amba_set_drvdata(adev, mhu);
mhu->mbox.ops = &mhu_db_ops;
err = devm_mbox_controller_register(dev, &mhu->mbox);
if (err) {
dev_err(dev, "Failed to register mailboxes %d\n", err);
return err;
}
for (i = 0; i < MHU_CHANS; i++) {
int irq = mhu->mlink[i].irq = adev->irq[i];
if (irq <= 0) {
dev_dbg(dev, "No IRQ found for Channel %d\n", i);
continue;
}
mhu->mlink[i].rx_reg = mhu->base + mhu_reg[i];
mhu->mlink[i].tx_reg = mhu->mlink[i].rx_reg + TX_REG_OFFSET;
err = devm_request_threaded_irq(dev, irq, NULL,
mhu_db_mbox_rx_handler,
IRQF_ONESHOT, "mhu_db_link", mhu);
if (err) {
dev_err(dev, "Can't claim IRQ %d\n", irq);
mbox_controller_unregister(&mhu->mbox);
return err;
}
}
dev_info(dev, "ARM MHU Doorbell mailbox registered\n");
return 0;
}
static struct amba_id mhu_ids[] = {
{
.id = 0x1bb098,
.mask = 0xffffff,
},
{ 0, 0 },
};
MODULE_DEVICE_TABLE(amba, mhu_ids);
static struct amba_driver arm_mhu_db_driver = {
.drv = {
.name = "mhu-doorbell",
},
.id_table = mhu_ids,
.probe = mhu_db_probe,
};
module_amba_driver(arm_mhu_db_driver);
MODULE_LICENSE("GPL v2");
MODULE_DESCRIPTION("ARM MHU Doorbell Driver");
MODULE_AUTHOR("Sudeep Holla <sudeep.holla@arm.com>");

View File

@@ -962,9 +962,9 @@ static irqreturn_t pdc_irq_handler(int irq, void *data)
* a DMA receive interrupt. Reenables the receive interrupt.
* @data: PDC state structure
*/
static void pdc_tasklet_cb(unsigned long data)
static void pdc_tasklet_cb(struct tasklet_struct *t)
{
struct pdc_state *pdcs = (struct pdc_state *)data;
struct pdc_state *pdcs = from_tasklet(pdcs, t, rx_tasklet);
pdc_receive(pdcs);
@@ -1589,7 +1589,7 @@ static int pdc_probe(struct platform_device *pdev)
pdc_hw_init(pdcs);
/* Init tasklet for deferred DMA rx processing */
tasklet_init(&pdcs->rx_tasklet, pdc_tasklet_cb, (unsigned long)pdcs);
tasklet_setup(&pdcs->rx_tasklet, pdc_tasklet_cb);
err = pdc_interrupts_init(pdcs);
if (err)
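This hunk is part of the tree-wide conversion from tasklet_init() to tasklet_setup(); a condensed sketch of the pattern with a hypothetical driver structure (my_state, my_tasklet_cb and rx_tasklet are illustrative names) looks like this:
#include <linux/interrupt.h>

struct my_state {
	struct tasklet_struct rx_tasklet;
	/* ... driver data ... */
};

/* New-style callback: receives the tasklet pointer, not an unsigned long cookie. */
static void my_tasklet_cb(struct tasklet_struct *t)
{
	struct my_state *st = from_tasklet(st, t, rx_tasklet);

	/* process deferred work using st */
}

static void my_init(struct my_state *st)
{
	/* Replaces tasklet_init(&st->rx_tasklet, cb, (unsigned long)st). */
	tasklet_setup(&st->rx_tasklet, my_tasklet_cb);
}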

View File

@@ -82,9 +82,12 @@ static void msg_submit(struct mbox_chan *chan)
exit:
spin_unlock_irqrestore(&chan->lock, flags);
if (!err && (chan->txdone_method & TXDONE_BY_POLL))
/* kick start the timer immediately to avoid delays */
if (!err && (chan->txdone_method & TXDONE_BY_POLL)) {
/* but only if not already active */
if (!hrtimer_active(&chan->mbox->poll_hrt))
hrtimer_start(&chan->mbox->poll_hrt, 0, HRTIMER_MODE_REL);
}
}
static void tx_tick(struct mbox_chan *chan, int r)
@@ -122,11 +125,10 @@ static enum hrtimer_restart txdone_hrtimer(struct hrtimer *hrtimer)
struct mbox_chan *chan = &mbox->chans[i];
if (chan->active_req && chan->cl) {
resched = true;
txdone = chan->mbox->ops->last_tx_done(chan);
if (txdone)
tx_tick(chan, 0);
else
resched = true;
}
}

View File

@@ -69,7 +69,7 @@ struct cmdq_task {
struct cmdq {
struct mbox_controller mbox;
void __iomem *base;
u32 irq;
int irq;
u32 thread_nr;
u32 irq_mask;
struct cmdq_thread *thread;
@@ -525,10 +525,8 @@ static int cmdq_probe(struct platform_device *pdev)
}
cmdq->irq = platform_get_irq(pdev, 0);
if (!cmdq->irq) {
dev_err(dev, "failed to get irq\n");
return -EINVAL;
}
if (cmdq->irq < 0)
return cmdq->irq;
plat_data = (struct gce_plat *)of_device_get_match_data(dev);
if (!plat_data) {

View File

@@ -1639,6 +1639,19 @@ int ubi_thread(void *u)
!ubi->thread_enabled || ubi_dbg_is_bgt_disabled(ubi)) {
set_current_state(TASK_INTERRUPTIBLE);
spin_unlock(&ubi->wl_lock);
/*
* Check kthread_should_stop() after we set the task
* state to guarantee that we either see the stop bit
* and exit or the task state is reset to runnable such
* that it's not scheduled out indefinitely and detects
* the stop bit at kthread_should_stop().
*/
if (kthread_should_stop()) {
set_current_state(TASK_RUNNING);
break;
}
schedule();
continue;
}
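The comment added above documents the standard kthread sleep/stop handshake; a stripped-down sketch of that loop outside of UBI (my_thread and my_have_work are illustrative names) would be:
#include <linux/kthread.h>
#include <linux/sched.h>

static bool my_have_work;	/* set by producers; illustrative only */

static int my_thread(void *arg)
{
	while (1) {
		if (!READ_ONCE(my_have_work)) {
			set_current_state(TASK_INTERRUPTIBLE);
			/*
			 * Check the stop bit *after* setting the task state:
			 * kthread_stop() wakes the task, so either we see it
			 * here, or the wakeup puts us back to TASK_RUNNING
			 * and schedule() returns promptly to re-check it.
			 */
			if (kthread_should_stop()) {
				set_current_state(TASK_RUNNING);
				break;
			}
			schedule();
			continue;
		}
		WRITE_ONCE(my_have_work, false);
		/* ... perform one unit of work ... */
	}
	return 0;
}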

View File

@@ -73,16 +73,13 @@ struct map_ring_valloc {
struct xenbus_map_node *node;
/* Why do we need two arrays? See comment of __xenbus_map_ring */
union {
unsigned long addrs[XENBUS_MAX_RING_GRANTS];
pte_t *ptes[XENBUS_MAX_RING_GRANTS];
};
phys_addr_t phys_addrs[XENBUS_MAX_RING_GRANTS];
struct gnttab_map_grant_ref map[XENBUS_MAX_RING_GRANTS];
struct gnttab_unmap_grant_ref unmap[XENBUS_MAX_RING_GRANTS];
unsigned int idx; /* HVM only. */
unsigned int idx;
};
static DEFINE_SPINLOCK(xenbus_valloc_lock);
@@ -686,6 +683,14 @@ int xenbus_unmap_ring_vfree(struct xenbus_device *dev, void *vaddr)
EXPORT_SYMBOL_GPL(xenbus_unmap_ring_vfree);
#ifdef CONFIG_XEN_PV
static int map_ring_apply(pte_t *pte, unsigned long addr, void *data)
{
struct map_ring_valloc *info = data;
info->phys_addrs[info->idx++] = arbitrary_virt_to_machine(pte).maddr;
return 0;
}
static int xenbus_map_ring_pv(struct xenbus_device *dev,
struct map_ring_valloc *info,
grant_ref_t *gnt_refs,
@@ -694,18 +699,15 @@ static int xenbus_map_ring_pv(struct xenbus_device *dev,
{
struct xenbus_map_node *node = info->node;
struct vm_struct *area;
int err = GNTST_okay;
int i;
bool leaked;
bool leaked = false;
int err = -ENOMEM;
area = alloc_vm_area(XEN_PAGE_SIZE * nr_grefs, info->ptes);
area = get_vm_area(XEN_PAGE_SIZE * nr_grefs, VM_IOREMAP);
if (!area)
return -ENOMEM;
for (i = 0; i < nr_grefs; i++)
info->phys_addrs[i] =
arbitrary_virt_to_machine(info->ptes[i]).maddr;
if (apply_to_page_range(&init_mm, (unsigned long)area->addr,
XEN_PAGE_SIZE * nr_grefs, map_ring_apply, info))
goto failed;
err = __xenbus_map_ring(dev, gnt_refs, nr_grefs, node->handles,
info, GNTMAP_host_map | GNTMAP_contains_pte,
&leaked);

View File

@@ -310,7 +310,10 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec,
* Grow the stack manually; some architectures have a limit on how
* far ahead a user-space access may be in order to grow the stack.
*/
if (mmap_read_lock_killable(mm))
return -EINTR;
vma = find_extend_vma(mm, bprm->p);
mmap_read_unlock(mm);
if (!vma)
return -EFAULT;

View File

@@ -842,13 +842,13 @@ struct buffer_head *alloc_page_buffers(struct page *page, unsigned long size,
struct buffer_head *bh, *head;
gfp_t gfp = GFP_NOFS | __GFP_ACCOUNT;
long offset;
struct mem_cgroup *memcg;
struct mem_cgroup *memcg, *old_memcg;
if (retry)
gfp |= __GFP_NOFAIL;
memcg = get_mem_cgroup_from_page(page);
memalloc_use_memcg(memcg);
old_memcg = set_active_memcg(memcg);
head = NULL;
offset = PAGE_SIZE;
@@ -867,7 +867,7 @@ struct buffer_head *alloc_page_buffers(struct page *page, unsigned long size,
set_bh_page(bh, page, offset);
}
out:
memalloc_unuse_memcg();
set_active_memcg(old_memcg);
mem_cgroup_put(memcg);
return head;
/*

View File

@@ -3994,7 +3994,7 @@ static int io_madvise(struct io_kiocb *req, bool force_nonblock)
if (force_nonblock)
return -EAGAIN;
ret = do_madvise(ma->addr, ma->len, ma->advice);
ret = do_madvise(current->mm, ma->addr, ma->len, ma->advice);
if (ret < 0)
req_set_fail_links(req);
io_req_complete(req, ret);

View File

@@ -531,6 +531,7 @@ static struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
struct inode *dirid = fanotify_dfid_inode(mask, data, data_type, dir);
const struct path *path = fsnotify_data_path(data, data_type);
unsigned int fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
struct mem_cgroup *old_memcg;
struct inode *child = NULL;
bool name_event = false;
@@ -580,7 +581,7 @@ static struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
gfp |= __GFP_RETRY_MAYFAIL;
/* Whoever is interested in the event, pays for the allocation. */
memalloc_use_memcg(group->memcg);
old_memcg = set_active_memcg(group->memcg);
if (fanotify_is_perm_event(mask)) {
event = fanotify_alloc_perm_event(path, gfp);
@@ -608,7 +609,7 @@ static struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
event->pid = get_pid(task_tgid(current));
out:
memalloc_unuse_memcg();
set_active_memcg(old_memcg);
return event;
}

View File

@@ -66,6 +66,7 @@ static int inotify_one_event(struct fsnotify_group *group, u32 mask,
int ret;
int len = 0;
int alloc_len = sizeof(struct inotify_event_info);
struct mem_cgroup *old_memcg;
if ((inode_mark->mask & FS_EXCL_UNLINK) &&
path && d_unlinked(path->dentry))
@@ -87,9 +88,9 @@ static int inotify_one_event(struct fsnotify_group *group, u32 mask,
* trigger OOM killer in the target monitoring memcg as it may have
* security repercussion.
*/
memalloc_use_memcg(group->memcg);
old_memcg = set_active_memcg(group->memcg);
event = kmalloc(alloc_len, GFP_KERNEL_ACCOUNT | __GFP_RETRY_MAYFAIL);
memalloc_unuse_memcg();
set_active_memcg(old_memcg);
if (unlikely(!event)) {
/*

View File

@@ -54,7 +54,7 @@ static int ubifs_hash_calc_hmac(const struct ubifs_info *c, const u8 *hash,
* ubifs_prepare_auth_node - Prepare an authentication node
* @c: UBIFS file-system description object
* @node: the node to calculate a hash for
* @hash: input hash of previous nodes
* @inhash: input hash of previous nodes
*
* This function prepares an authentication node for writing onto flash.
* It creates a HMAC from the given input hash and writes it to the node.

View File

@@ -1123,6 +1123,7 @@ int dbg_check_dir(struct ubifs_info *c, const struct inode *dir)
err = PTR_ERR(dent);
if (err == -ENOENT)
break;
kfree(pdent);
return err;
}

View File

@@ -57,10 +57,6 @@
/**
* switch_gc_head - switch the garbage collection journal head.
* @c: UBIFS file-system description object
* @buf: buffer to write
* @len: length of the buffer to write
* @lnum: LEB number written is returned here
* @offs: offset written is returned here
*
* This function switch the GC head to the next LEB which is reserved in
* @c->gc_lnum. Returns %0 in case of success, %-EAGAIN if commit is required,

View File

@@ -134,7 +134,6 @@ static int setflags(struct inode *inode, int flags)
return err;
out_unlock:
ubifs_err(c, "can't modify inode %lu attributes", inode->i_ino);
mutex_unlock(&ui->ui_mutex);
ubifs_release_budget(c, &req);
return err;

View File

@@ -894,6 +894,7 @@ int ubifs_jnl_write_inode(struct ubifs_info *c, const struct inode *inode)
if (err == -ENOENT)
break;
kfree(pxent);
goto out_release;
}
@@ -906,6 +907,7 @@ int ubifs_jnl_write_inode(struct ubifs_info *c, const struct inode *inode)
ubifs_err(c, "dead directory entry '%s', error %d",
xent->name, err);
ubifs_ro_mode(c, err);
kfree(pxent);
kfree(xent);
goto out_release;
}
@@ -936,8 +938,6 @@ int ubifs_jnl_write_inode(struct ubifs_info *c, const struct inode *inode)
inode->i_ino);
release_head(c, BASEHD);
ubifs_add_auth_dirt(c, lnum);
if (last_reference) {
err = ubifs_tnc_remove_ino(c, inode->i_ino);
if (err)
@@ -947,6 +947,8 @@ int ubifs_jnl_write_inode(struct ubifs_info *c, const struct inode *inode)
} else {
union ubifs_key key;
ubifs_add_auth_dirt(c, lnum);
ino_key_init(c, &key, inode->i_ino);
err = ubifs_tnc_add(c, &key, lnum, offs, ilen, hash);
}
@@ -1798,7 +1800,6 @@ int ubifs_jnl_change_xattr(struct ubifs_info *c, const struct inode *inode,
u8 hash[UBIFS_HASH_ARR_SZ];
dbg_jnl("ino %lu, ino %lu", host->i_ino, inode->i_ino);
ubifs_assert(c, host->i_nlink > 0);
ubifs_assert(c, inode->i_nlink > 0);
ubifs_assert(c, mutex_is_locked(&host_ui->ui_mutex));

View File

@@ -173,6 +173,7 @@ int ubifs_add_orphan(struct ubifs_info *c, ino_t inum)
err = PTR_ERR(xent);
if (err == -ENOENT)
break;
kfree(pxent);
return err;
}
@@ -182,6 +183,7 @@ int ubifs_add_orphan(struct ubifs_info *c, ino_t inum)
xattr_orphan = orphan_add(c, xattr_inum, orphan);
if (IS_ERR(xattr_orphan)) {
kfree(pxent);
kfree(xent);
return PTR_ERR(xattr_orphan);
}

View File

@@ -931,8 +931,6 @@ out:
* validate_ref - validate a reference node.
* @c: UBIFS file-system description object
* @ref: the reference node to validate
* @ref_lnum: LEB number of the reference node
* @ref_offs: reference node offset
*
* This function returns %1 if a bud reference already exists for the LEB. %0 is
* returned if the reference node is new, otherwise %-EINVAL is returned if

View File

@@ -1110,14 +1110,20 @@ static int ubifs_parse_options(struct ubifs_info *c, char *options,
break;
}
case Opt_auth_key:
c->auth_key_name = kstrdup(args[0].from, GFP_KERNEL);
if (!is_remount) {
c->auth_key_name = kstrdup(args[0].from,
GFP_KERNEL);
if (!c->auth_key_name)
return -ENOMEM;
}
break;
case Opt_auth_hash_name:
c->auth_hash_name = kstrdup(args[0].from, GFP_KERNEL);
if (!is_remount) {
c->auth_hash_name = kstrdup(args[0].from,
GFP_KERNEL);
if (!c->auth_hash_name)
return -ENOMEM;
}
break;
case Opt_ignore:
break;
@@ -1141,6 +1147,18 @@ static int ubifs_parse_options(struct ubifs_info *c, char *options,
return 0;
}
/*
* ubifs_release_options - release mount parameters which have been dumped.
* @c: UBIFS file-system description object
*/
static void ubifs_release_options(struct ubifs_info *c)
{
kfree(c->auth_key_name);
c->auth_key_name = NULL;
kfree(c->auth_hash_name);
c->auth_hash_name = NULL;
}
/**
* destroy_journal - destroy journal data structures.
* @c: UBIFS file-system description object
@@ -1313,7 +1331,7 @@ static int mount_ubifs(struct ubifs_info *c)
err = ubifs_read_superblock(c);
if (err)
goto out_free;
goto out_auth;
c->probing = 0;
@@ -1325,18 +1343,18 @@ static int mount_ubifs(struct ubifs_info *c)
ubifs_err(c, "'compressor \"%s\" is not compiled in",
ubifs_compr_name(c, c->default_compr));
err = -ENOTSUPP;
goto out_free;
goto out_auth;
}
err = init_constants_sb(c);
if (err)
goto out_free;
goto out_auth;
sz = ALIGN(c->max_idx_node_sz, c->min_io_size) * 2;
c->cbuf = kmalloc(sz, GFP_NOFS);
if (!c->cbuf) {
err = -ENOMEM;
goto out_free;
goto out_auth;
}
err = alloc_wbufs(c);
@@ -1611,6 +1629,8 @@ out_wbufs:
free_wbufs(c);
out_cbuf:
kfree(c->cbuf);
out_auth:
ubifs_exit_authentication(c);
out_free:
kfree(c->write_reserve_buf);
kfree(c->bu.buf);
@@ -1650,8 +1670,7 @@ static void ubifs_umount(struct ubifs_info *c)
ubifs_lpt_free(c, 0);
ubifs_exit_authentication(c);
kfree(c->auth_key_name);
kfree(c->auth_hash_name);
ubifs_release_options(c);
kfree(c->cbuf);
kfree(c->rcvrd_mst_node);
kfree(c->mst_node);
@@ -2221,6 +2240,7 @@ out_umount:
out_unlock:
mutex_unlock(&c->umount_mutex);
out_close:
ubifs_release_options(c);
ubi_close_volume(c->ubi);
out:
return err;

View File

@@ -360,7 +360,6 @@ static int lnc_add_directly(struct ubifs_info *c, struct ubifs_zbranch *zbr,
/**
* lnc_free - remove a leaf node from the leaf node cache.
* @zbr: zbranch of leaf node
* @node: leaf node
*/
static void lnc_free(struct ubifs_zbranch *zbr)
{
@@ -2885,6 +2884,7 @@ int ubifs_tnc_remove_ino(struct ubifs_info *c, ino_t inum)
err = PTR_ERR(xent);
if (err == -ENOENT)
break;
kfree(pxent);
return err;
}
@@ -2898,6 +2898,7 @@ int ubifs_tnc_remove_ino(struct ubifs_info *c, ino_t inum)
fname_len(&nm) = le16_to_cpu(xent->nlen);
err = ubifs_tnc_remove_nm(c, &key1, &nm);
if (err) {
kfree(pxent);
kfree(xent);
return err;
}
@@ -2906,6 +2907,7 @@ int ubifs_tnc_remove_ino(struct ubifs_info *c, ino_t inum)
highest_ino_key(c, &key2, xattr_inum);
err = ubifs_tnc_remove_range(c, &key1, &key2);
if (err) {
kfree(pxent);
kfree(xent);
return err;
}
@@ -3466,7 +3468,7 @@ out_unlock:
/**
* dbg_check_inode_size - check if inode size is correct.
* @c: UBIFS file-system description object
* @inum: inode number
* @inode: inode to check
* @size: inode size
*
* This function makes sure that the inode size (@size) is correct and it does

View File

@@ -522,6 +522,7 @@ int ubifs_purge_xattrs(struct inode *host)
xent->name, err);
ubifs_ro_mode(c, err);
kfree(pxent);
kfree(xent);
return err;
}
@@ -531,6 +532,7 @@ int ubifs_purge_xattrs(struct inode *host)
err = remove_xattr(c, host, xino, &nm);
if (err) {
kfree(pxent);
kfree(xent);
iput(xino);
ubifs_err(c, "cannot remove xattr, error %d", err);
return err;

View File

@@ -24,6 +24,39 @@
#include "zonefs.h"
static inline int zonefs_zone_mgmt(struct inode *inode,
enum req_opf op)
{
struct zonefs_inode_info *zi = ZONEFS_I(inode);
int ret;
lockdep_assert_held(&zi->i_truncate_mutex);
ret = blkdev_zone_mgmt(inode->i_sb->s_bdev, op, zi->i_zsector,
zi->i_zone_size >> SECTOR_SHIFT, GFP_NOFS);
if (ret) {
zonefs_err(inode->i_sb,
"Zone management operation %s at %llu failed %d\n",
blk_op_str(op), zi->i_zsector, ret);
return ret;
}
return 0;
}
static inline void zonefs_i_size_write(struct inode *inode, loff_t isize)
{
struct zonefs_inode_info *zi = ZONEFS_I(inode);
i_size_write(inode, isize);
/*
* A full zone is no longer open/active and does not need
* explicit closing.
*/
if (isize >= zi->i_max_size)
zi->i_flags &= ~ZONEFS_ZONE_OPEN;
}
static int zonefs_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
unsigned int flags, struct iomap *iomap,
struct iomap *srcmap)
@@ -301,6 +334,17 @@ static int zonefs_io_error_cb(struct blk_zone *zone, unsigned int idx,
}
}
/*
* If the filesystem is mounted with the explicit-open mount option, we
* need to clear the ZONEFS_ZONE_OPEN flag if the zone transitioned to
* the read-only or offline condition, to avoid attempting an explicit
* close of the zone when the inode file is closed.
*/
if ((sbi->s_mount_opts & ZONEFS_MNTOPT_EXPLICIT_OPEN) &&
(zone->cond == BLK_ZONE_COND_OFFLINE ||
zone->cond == BLK_ZONE_COND_READONLY))
zi->i_flags &= ~ZONEFS_ZONE_OPEN;
/*
* If error=remount-ro was specified, any error result in remounting
* the volume as read-only.
@@ -315,7 +359,7 @@ static int zonefs_io_error_cb(struct blk_zone *zone, unsigned int idx,
* invalid data.
*/
zonefs_update_stats(inode, data_size);
i_size_write(inode, data_size);
zonefs_i_size_write(inode, data_size);
zi->i_wpoffset = data_size;
return 0;
@@ -328,7 +372,7 @@ static int zonefs_io_error_cb(struct blk_zone *zone, unsigned int idx,
* eventually correct the file size and zonefs inode write pointer offset
* (which can be out of sync with the drive due to partial write failures).
*/
static void zonefs_io_error(struct inode *inode, bool write)
static void __zonefs_io_error(struct inode *inode, bool write)
{
struct zonefs_inode_info *zi = ZONEFS_I(inode);
struct super_block *sb = inode->i_sb;
@@ -342,8 +386,6 @@ static void zonefs_io_error(struct inode *inode, bool write)
};
int ret;
mutex_lock(&zi->i_truncate_mutex);
/*
* Memory allocations in blkdev_report_zones() can trigger a memory
* reclaim which may in turn cause a recursion into zonefs as well as
@@ -359,7 +401,14 @@ static void zonefs_io_error(struct inode *inode, bool write)
zonefs_err(sb, "Get inode %lu zone information failed %d\n",
inode->i_ino, ret);
memalloc_noio_restore(noio_flag);
}
static void zonefs_io_error(struct inode *inode, bool write)
{
struct zonefs_inode_info *zi = ZONEFS_I(inode);
mutex_lock(&zi->i_truncate_mutex);
__zonefs_io_error(inode, write);
mutex_unlock(&zi->i_truncate_mutex);
}
@@ -397,13 +446,27 @@ static int zonefs_file_truncate(struct inode *inode, loff_t isize)
if (isize == old_isize)
goto unlock;
ret = blkdev_zone_mgmt(inode->i_sb->s_bdev, op, zi->i_zsector,
zi->i_zone_size >> SECTOR_SHIFT, GFP_NOFS);
if (ret) {
zonefs_err(inode->i_sb,
"Zone management operation at %llu failed %d",
zi->i_zsector, ret);
ret = zonefs_zone_mgmt(inode, op);
if (ret)
goto unlock;
/*
* If the mount option ZONEFS_MNTOPT_EXPLICIT_OPEN is set,
* take care of open zones.
*/
if (zi->i_flags & ZONEFS_ZONE_OPEN) {
/*
* Truncating a zone to EMPTY or FULL is the equivalent of
* closing the zone. For a truncation to 0, we need to
* re-open the zone to ensure new writes can be processed.
* For a truncation to the maximum file size, the zone is
* closed and writes cannot be accepted anymore, so clear
* the open flag.
*/
if (!isize)
ret = zonefs_zone_mgmt(inode, REQ_OP_ZONE_OPEN);
else
zi->i_flags &= ~ZONEFS_ZONE_OPEN;
}
zonefs_update_stats(inode, isize);
@@ -584,7 +647,7 @@ static int zonefs_file_write_dio_end_io(struct kiocb *iocb, ssize_t size,
mutex_lock(&zi->i_truncate_mutex);
if (i_size_read(inode) < iocb->ki_pos + size) {
zonefs_update_stats(inode, iocb->ki_pos + size);
i_size_write(inode, iocb->ki_pos + size);
zonefs_i_size_write(inode, iocb->ki_pos + size);
}
mutex_unlock(&zi->i_truncate_mutex);
}
@@ -865,8 +928,128 @@ inode_unlock:
return ret;
}
static inline bool zonefs_file_use_exp_open(struct inode *inode, struct file *file)
{
struct zonefs_inode_info *zi = ZONEFS_I(inode);
struct zonefs_sb_info *sbi = ZONEFS_SB(inode->i_sb);
if (!(sbi->s_mount_opts & ZONEFS_MNTOPT_EXPLICIT_OPEN))
return false;
if (zi->i_ztype != ZONEFS_ZTYPE_SEQ)
return false;
if (!(file->f_mode & FMODE_WRITE))
return false;
return true;
}
static int zonefs_open_zone(struct inode *inode)
{
struct zonefs_inode_info *zi = ZONEFS_I(inode);
struct zonefs_sb_info *sbi = ZONEFS_SB(inode->i_sb);
int ret = 0;
mutex_lock(&zi->i_truncate_mutex);
zi->i_wr_refcnt++;
if (zi->i_wr_refcnt == 1) {
if (atomic_inc_return(&sbi->s_open_zones) > sbi->s_max_open_zones) {
atomic_dec(&sbi->s_open_zones);
ret = -EBUSY;
goto unlock;
}
if (i_size_read(inode) < zi->i_max_size) {
ret = zonefs_zone_mgmt(inode, REQ_OP_ZONE_OPEN);
if (ret) {
zi->i_wr_refcnt--;
atomic_dec(&sbi->s_open_zones);
goto unlock;
}
zi->i_flags |= ZONEFS_ZONE_OPEN;
}
}
unlock:
mutex_unlock(&zi->i_truncate_mutex);
return ret;
}
static int zonefs_file_open(struct inode *inode, struct file *file)
{
int ret;
ret = generic_file_open(inode, file);
if (ret)
return ret;
if (zonefs_file_use_exp_open(inode, file))
return zonefs_open_zone(inode);
return 0;
}
static void zonefs_close_zone(struct inode *inode)
{
struct zonefs_inode_info *zi = ZONEFS_I(inode);
int ret = 0;
mutex_lock(&zi->i_truncate_mutex);
zi->i_wr_refcnt--;
if (!zi->i_wr_refcnt) {
struct zonefs_sb_info *sbi = ZONEFS_SB(inode->i_sb);
struct super_block *sb = inode->i_sb;
/*
* If the file zone is full, it is not open anymore and we only
* need to decrement the open count.
*/
if (!(zi->i_flags & ZONEFS_ZONE_OPEN))
goto dec;
ret = zonefs_zone_mgmt(inode, REQ_OP_ZONE_CLOSE);
if (ret) {
__zonefs_io_error(inode, false);
/*
* Leaving zones explicitly open may lead to a state
* where most zones cannot be written (zone resources
* exhausted). So take preventive action by remounting
* read-only.
*/
if (zi->i_flags & ZONEFS_ZONE_OPEN &&
!(sb->s_flags & SB_RDONLY)) {
zonefs_warn(sb, "closing zone failed, remounting filesystem read-only\n");
sb->s_flags |= SB_RDONLY;
}
}
zi->i_flags &= ~ZONEFS_ZONE_OPEN;
dec:
atomic_dec(&sbi->s_open_zones);
}
mutex_unlock(&zi->i_truncate_mutex);
}
static int zonefs_file_release(struct inode *inode, struct file *file)
{
/*
* If we explicitly open a zone we must close it again as well, but the
* zone management operation can fail (either due to an IO error or as
* the zone has gone offline or read-only). Make sure we don't fail the
* close(2) for user-space.
*/
if (zonefs_file_use_exp_open(inode, file))
zonefs_close_zone(inode);
return 0;
}
static const struct file_operations zonefs_file_operations = {
.open = generic_file_open,
.open = zonefs_file_open,
.release = zonefs_file_release,
.fsync = zonefs_file_fsync,
.mmap = zonefs_file_mmap,
.llseek = zonefs_file_llseek,
@@ -890,6 +1073,7 @@ static struct inode *zonefs_alloc_inode(struct super_block *sb)
inode_init_once(&zi->i_vnode);
mutex_init(&zi->i_truncate_mutex);
init_rwsem(&zi->i_mmap_sem);
zi->i_wr_refcnt = 0;
return &zi->i_vnode;
}
@@ -940,7 +1124,7 @@ static int zonefs_statfs(struct dentry *dentry, struct kstatfs *buf)
enum {
Opt_errors_ro, Opt_errors_zro, Opt_errors_zol, Opt_errors_repair,
Opt_err,
Opt_explicit_open, Opt_err,
};
static const match_table_t tokens = {
@@ -948,6 +1132,7 @@ static const match_table_t tokens = {
{ Opt_errors_zro, "errors=zone-ro"},
{ Opt_errors_zol, "errors=zone-offline"},
{ Opt_errors_repair, "errors=repair"},
{ Opt_explicit_open, "explicit-open" },
{ Opt_err, NULL}
};
@@ -984,6 +1169,9 @@ static int zonefs_parse_options(struct super_block *sb, char *options)
sbi->s_mount_opts &= ~ZONEFS_MNTOPT_ERRORS_MASK;
sbi->s_mount_opts |= ZONEFS_MNTOPT_ERRORS_REPAIR;
break;
case Opt_explicit_open:
sbi->s_mount_opts |= ZONEFS_MNTOPT_EXPLICIT_OPEN;
break;
default:
return -EINVAL;
}
@@ -1403,6 +1591,13 @@ static int zonefs_fill_super(struct super_block *sb, void *data, int silent)
sbi->s_gid = GLOBAL_ROOT_GID;
sbi->s_perm = 0640;
sbi->s_mount_opts = ZONEFS_MNTOPT_ERRORS_RO;
sbi->s_max_open_zones = bdev_max_open_zones(sb->s_bdev);
atomic_set(&sbi->s_open_zones, 0);
if (!sbi->s_max_open_zones &&
sbi->s_mount_opts & ZONEFS_MNTOPT_EXPLICIT_OPEN) {
zonefs_info(sb, "No open zones limit. Ignoring explicit_open mount option\n");
sbi->s_mount_opts &= ~ZONEFS_MNTOPT_EXPLICIT_OPEN;
}
ret = zonefs_read_super(sb);
if (ret)
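From user space, the explicit-open behaviour surfaces only through open(2)/close(2) on sequential zone files. A hypothetical sketch (mount point and file path are made up, and the filesystem is assumed to be mounted with -o explicit-open) of what an application would observe when the device's open-zone limit is exhausted:
/* User-space sketch; assumes zonefs mounted with -o explicit-open. */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	/* Opening a seq file for writing triggers REQ_OP_ZONE_OPEN. */
	int fd = open("/mnt/zonefs/seq/0", O_WRONLY);

	if (fd < 0) {
		if (errno == EBUSY)	/* open-zone resources exhausted */
			fprintf(stderr, "open-zone limit reached\n");
		return 1;
	}
	/* ... pwrite() sequentially at the zone write pointer ... */
	close(fd);	/* last writer: triggers REQ_OP_ZONE_CLOSE */
	return 0;
}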

View File

@@ -38,6 +38,8 @@ static inline enum zonefs_ztype zonefs_zone_type(struct blk_zone *zone)
return ZONEFS_ZTYPE_SEQ;
}
#define ZONEFS_ZONE_OPEN (1 << 0)
/*
* In-memory inode data.
*/
@@ -74,6 +76,10 @@ struct zonefs_inode_info {
*/
struct mutex i_truncate_mutex;
struct rw_semaphore i_mmap_sem;
/* guarded by i_truncate_mutex */
unsigned int i_wr_refcnt;
unsigned int i_flags;
};
static inline struct zonefs_inode_info *ZONEFS_I(struct inode *inode)
@@ -154,6 +160,7 @@ enum zonefs_features {
#define ZONEFS_MNTOPT_ERRORS_MASK \
(ZONEFS_MNTOPT_ERRORS_RO | ZONEFS_MNTOPT_ERRORS_ZRO | \
ZONEFS_MNTOPT_ERRORS_ZOL | ZONEFS_MNTOPT_ERRORS_REPAIR)
#define ZONEFS_MNTOPT_EXPLICIT_OPEN (1 << 4) /* Explicit open/close of zones on open/close */
/*
* In-memory Super block information.
@@ -175,6 +182,9 @@ struct zonefs_sb_info {
loff_t s_blocks;
loff_t s_used_blocks;
unsigned int s_max_open_zones;
atomic_t s_open_zones;
};
static inline struct zonefs_sb_info *ZONEFS_SB(struct super_block *sb)

View File

@@ -734,7 +734,8 @@
THERMAL_TABLE(governor) \
EARLYCON_TABLE() \
LSM_TABLE() \
EARLY_LSM_TABLE()
EARLY_LSM_TABLE() \
KUNIT_TABLE()
#define INIT_TEXT \
*(.init.text .init.text.*) \
@@ -932,6 +933,13 @@
KEEP(*(.con_initcall.init)) \
__con_initcall_end = .;
/* Alignment must be consistent with (kunit_suite *) in include/kunit/test.h */
#define KUNIT_TABLE() \
. = ALIGN(8); \
__kunit_suites_start = .; \
KEEP(*(.kunit_test_suites)) \
__kunit_suites_end = .;
#ifdef CONFIG_BLK_DEV_INITRD
#define INIT_RAM_FS \
. = ALIGN(4); \
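KUNIT_TABLE() gathers every statically registered suite pointer between the __kunit_suites_start/__kunit_suites_end symbols. Roughly, a built-in executor could walk that window as sketched below (my_list_builtin_suites is illustrative; each section entry is a pointer to a NULL-terminated array of suites, matching the kunit_test_suites() macro later in this diff):
#include <linux/printk.h>
#include <kunit/test.h>

extern struct kunit_suite * const * const __kunit_suites_start[];
extern struct kunit_suite * const * const __kunit_suites_end[];

static void my_list_builtin_suites(void)
{
	struct kunit_suite * const * const *set;
	struct kunit_suite * const *suite;

	/* Each .kunit_test_suites entry is a NULL-terminated suite array. */
	for (set = __kunit_suites_start; set < __kunit_suites_end; set++)
		for (suite = *set; *suite; suite++)
			pr_info("kunit suite: %s\n", (*suite)->name);
}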

View File

@@ -239,10 +239,19 @@ size_t kunit_suite_num_test_cases(struct kunit_suite *suite);
unsigned int kunit_test_case_num(struct kunit_suite *suite,
struct kunit_case *test_case);
int __kunit_test_suites_init(struct kunit_suite **suites);
int __kunit_test_suites_init(struct kunit_suite * const * const suites);
void __kunit_test_suites_exit(struct kunit_suite **suites);
#if IS_BUILTIN(CONFIG_KUNIT)
int kunit_run_all_tests(void);
#else
static inline int kunit_run_all_tests(void)
{
return 0;
}
#endif /* IS_BUILTIN(CONFIG_KUNIT) */
/**
* kunit_test_suites() - used to register one or more &struct kunit_suite
* with KUnit.
@@ -252,34 +261,57 @@ void __kunit_test_suites_exit(struct kunit_suite **suites);
* Registers @suites_list with the test framework. See &struct kunit_suite for
* more information.
*
* When builtin, KUnit tests are all run as late_initcalls; this means
* that they cannot test anything where tests must run at a different init
* phase. One significant restriction resulting from this is that KUnit
* cannot reliably test anything that is initialized in the late_init phase;
* another is that KUnit is useless to test things that need to be run in
* an earlier init phase.
*
* An alternative is to build the tests as a module. Because modules
* do not support multiple late_initcall()s, we need to initialize an
* array of suites for a module.
*
* TODO(brendanhiggins@google.com): Don't run all KUnit tests as
* late_initcalls. I have some future work planned to dispatch all KUnit
* tests from the same place, and at the very least to do so after
* everything else is definitely initialized.
* If a test suite is built-in, module_init() gets translated into
* an initcall which we don't want as the idea is that for builtins
* the executor will manage execution. So ensure we do not define
* module_{init|exit} functions for the builtin case when registering
* suites via kunit_test_suites() below.
*/
#define kunit_test_suites(suites_list...) \
static struct kunit_suite *suites[] = {suites_list, NULL}; \
static int kunit_test_suites_init(void) \
#ifdef MODULE
#define kunit_test_suites_for_module(__suites) \
static int __init kunit_test_suites_init(void) \
{ \
return __kunit_test_suites_init(suites); \
return __kunit_test_suites_init(__suites); \
} \
late_initcall(kunit_test_suites_init); \
module_init(kunit_test_suites_init); \
\
static void __exit kunit_test_suites_exit(void) \
{ \
return __kunit_test_suites_exit(suites); \
return __kunit_test_suites_exit(__suites); \
} \
module_exit(kunit_test_suites_exit)
#else
#define kunit_test_suites_for_module(__suites)
#endif /* MODULE */
#define __kunit_test_suites(unique_array, unique_suites, ...) \
static struct kunit_suite *unique_array[] = { __VA_ARGS__, NULL }; \
kunit_test_suites_for_module(unique_array); \
static struct kunit_suite **unique_suites \
__used __section(.kunit_test_suites) = unique_array
/**
* kunit_test_suites() - used to register one or more &struct kunit_suite
* with KUnit.
*
* @suites: a statically allocated list of &struct kunit_suite.
*
* Registers @suites with the test framework. See &struct kunit_suite for
* more information.
*
* When builtin, KUnit tests are all run via executor; this is done
* by placing the array of struct kunit_suite * in the .kunit_test_suites
* ELF section.
*
* An alternative is to build the tests as a module. Because modules do not
* support multiple initcall()s, we need to initialize an array of suites for a
* module.
*
*/
#define kunit_test_suites(...) \
__kunit_test_suites(__UNIQUE_ID(array), \
__UNIQUE_ID(suites), \
__VA_ARGS__)
#define kunit_test_suite(suite) kunit_test_suites(&suite)
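A minimal, self-contained consumer of the reworked macro might look like the sketch below (suite, case and function names are made up); the same source lands in the .kunit_test_suites table when builtin, or in module_init()/module_exit() when built as a module:
#include <kunit/test.h>

static void my_add_test(struct kunit *test)
{
	KUNIT_EXPECT_EQ(test, 4, 2 + 2);
}

static struct kunit_case my_test_cases[] = {
	KUNIT_CASE(my_add_test),
	{}
};

static struct kunit_suite my_test_suite = {
	.name = "my-example",
	.test_cases = my_test_cases,
};

kunit_test_suites(&my_test_suite);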

View File

@@ -1531,18 +1531,6 @@ static inline bool memcg_kmem_enabled(void)
return static_branch_likely(&memcg_kmem_enabled_key);
}
static inline bool memcg_kmem_bypass(void)
{
if (in_interrupt())
return true;
/* Allow remote memcg charging in kthread contexts. */
if ((!current->mm || (current->flags & PF_KTHREAD)) &&
!current->active_memcg)
return true;
return false;
}
static inline int memcg_kmem_charge_page(struct page *page, gfp_t gfp,
int order)
{

View File

@@ -2580,7 +2580,7 @@ extern int __do_munmap(struct mm_struct *, unsigned long, size_t,
struct list_head *uf, bool downgrade);
extern int do_munmap(struct mm_struct *, unsigned long, size_t,
struct list_head *uf);
extern int do_madvise(unsigned long start, size_t len_in, int behavior);
extern int do_madvise(struct mm_struct *mm, unsigned long start, size_t len_in, int behavior);
#ifdef CONFIG_MMU
extern int __mm_populate(unsigned long addr, unsigned long len,

View File

@@ -77,6 +77,7 @@ extern const struct file_operations pidfd_fops;
struct file;
extern struct pid *pidfd_pid(const struct file *file);
struct pid *pidfd_get_pid(unsigned int fd, unsigned int *flags);
static inline struct pid *get_pid(struct pid *pid)
{

View File

@@ -63,9 +63,17 @@ static inline void INIT_LIST_HEAD_RCU(struct list_head *list)
RCU_LOCKDEP_WARN(!(cond) && !rcu_read_lock_any_held(), \
"RCU-list traversed in non-reader section!"); \
})
#define __list_check_srcu(cond) \
({ \
RCU_LOCKDEP_WARN(!(cond), \
"RCU-list traversed without holding the required lock!");\
})
#else
#define __list_check_rcu(dummy, cond, extra...) \
({ check_arg_count_one(extra); })
#define __list_check_srcu(cond) ({ })
#endif
/*
@@ -385,6 +393,25 @@ static inline void list_splice_tail_init_rcu(struct list_head *list,
&pos->member != (head); \
pos = list_entry_rcu(pos->member.next, typeof(*pos), member))
/**
* list_for_each_entry_srcu - iterate over rcu list of given type
* @pos: the type * to use as a loop cursor.
* @head: the head for your list.
* @member: the name of the list_head within the struct.
* @cond: lockdep expression for the lock required to traverse the list.
*
* This list-traversal primitive may safely run concurrently with
* the _rcu list-mutation primitives such as list_add_rcu()
* as long as the traversal is guarded by srcu_read_lock().
* The lockdep expression srcu_read_lock_held() can be passed as the
* cond argument from read side.
*/
#define list_for_each_entry_srcu(pos, head, member, cond) \
for (__list_check_srcu(cond), \
pos = list_entry_rcu((head)->next, typeof(*pos), member); \
&pos->member != (head); \
pos = list_entry_rcu(pos->member.next, typeof(*pos), member))
/**
* list_entry_lockless - get the struct for this entry
* @ptr: the &struct list_head pointer.
@@ -683,6 +710,27 @@ static inline void hlist_add_behind_rcu(struct hlist_node *n,
pos = hlist_entry_safe(rcu_dereference_raw(hlist_next_rcu(\
&(pos)->member)), typeof(*(pos)), member))
/**
* hlist_for_each_entry_srcu - iterate over rcu list of given type
* @pos: the type * to use as a loop cursor.
* @head: the head for your list.
* @member: the name of the hlist_node within the struct.
* @cond: lockdep expression for the lock required to traverse the list.
*
* This list-traversal primitive may safely run concurrently with
* the _rcu list-mutation primitives such as hlist_add_head_rcu()
* as long as the traversal is guarded by srcu_read_lock().
* The lockdep expression srcu_read_lock_held() can be passed as the
* cond argument from read side.
*/
#define hlist_for_each_entry_srcu(pos, head, member, cond) \
for (__list_check_srcu(cond), \
pos = hlist_entry_safe(rcu_dereference_raw(hlist_first_rcu(head)),\
typeof(*(pos)), member); \
pos; \
pos = hlist_entry_safe(rcu_dereference_raw(hlist_next_rcu(\
&(pos)->member)), typeof(*(pos)), member))
/**
* hlist_for_each_entry_rcu_notrace - iterate over rcu list of given type (for tracing)
* @pos: the type * to use as a loop cursor.
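Typical intended usage of the new SRCU list iterator, sketched with illustrative names (my_item, my_list, my_srcu): the reader holds an SRCU read lock and passes the matching lockdep expression as the cond argument.
#include <linux/rculist.h>
#include <linux/srcu.h>

struct my_item {
	struct list_head node;
	int val;
};

static LIST_HEAD(my_list);
DEFINE_STATIC_SRCU(my_srcu);

static int my_sum(void)
{
	struct my_item *p;
	int idx, sum = 0;

	idx = srcu_read_lock(&my_srcu);
	list_for_each_entry_srcu(p, &my_list, node,
				 srcu_read_lock_held(&my_srcu))
		sum += p->val;
	srcu_read_unlock(&my_srcu, idx);
	return sum;
}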

View File

@@ -55,6 +55,12 @@ void __rcu_read_unlock(void);
#else /* #ifdef CONFIG_PREEMPT_RCU */
#ifdef CONFIG_TINY_RCU
#define rcu_read_unlock_strict() do { } while (0)
#else
void rcu_read_unlock_strict(void);
#endif
static inline void __rcu_read_lock(void)
{
preempt_disable();
@@ -63,6 +69,7 @@ static inline void __rcu_read_lock(void)
static inline void __rcu_read_unlock(void)
{
preempt_enable();
rcu_read_unlock_strict();
}
static inline int rcu_preempt_depth(void)
@@ -709,8 +716,8 @@ static inline void rcu_read_lock_bh(void)
"rcu_read_lock_bh() used illegally while idle");
}
/*
* rcu_read_unlock_bh - marks the end of a softirq-only RCU critical section
/**
* rcu_read_unlock_bh() - marks the end of a softirq-only RCU critical section
*
* See rcu_read_lock_bh() for more information.
*/
@@ -751,10 +758,10 @@ static inline notrace void rcu_read_lock_sched_notrace(void)
__acquire(RCU_SCHED);
}
/*
* rcu_read_unlock_sched - marks the end of a RCU-classic critical section
/**
* rcu_read_unlock_sched() - marks the end of a RCU-classic critical section
*
* See rcu_read_lock_sched for more information.
* See rcu_read_lock_sched() for more information.
*/
static inline void rcu_read_unlock_sched(void)
{
@@ -945,7 +952,7 @@ static inline void rcu_head_init(struct rcu_head *rhp)
}
/**
* rcu_head_after_call_rcu - Has this rcu_head been passed to call_rcu()?
* rcu_head_after_call_rcu() - Has this rcu_head been passed to call_rcu()?
* @rhp: The rcu_head structure to test.
* @f: The function passed to call_rcu() along with @rhp.
*

View File

@@ -103,7 +103,6 @@ static inline void rcu_scheduler_starting(void) { }
static inline void rcu_end_inkernel_boot(void) { }
static inline bool rcu_inkernel_boot_has_ended(void) { return true; }
static inline bool rcu_is_watching(void) { return true; }
static inline bool __rcu_is_watching(void) { return true; }
static inline void rcu_momentary_dyntick_idle(void) { }
static inline void kfree_rcu_scheduler_running(void) { }
static inline bool rcu_gp_might_be_stalled(void) { return false; }

View File

@@ -64,7 +64,6 @@ extern int rcu_scheduler_active __read_mostly;
void rcu_end_inkernel_boot(void);
bool rcu_inkernel_boot_has_ended(void);
bool rcu_is_watching(void);
bool __rcu_is_watching(void);
#ifndef CONFIG_PREEMPTION
void rcu_all_qs(void);
#endif

View File

@@ -279,39 +279,38 @@ static inline void memalloc_nocma_restore(unsigned int flags)
#endif
#ifdef CONFIG_MEMCG
DECLARE_PER_CPU(struct mem_cgroup *, int_active_memcg);
/**
* memalloc_use_memcg - Starts the remote memcg charging scope.
* set_active_memcg - Starts the remote memcg charging scope.
* @memcg: memcg to charge.
*
* This function marks the beginning of the remote memcg charging scope. All the
* __GFP_ACCOUNT allocations till the end of the scope will be charged to the
* given memcg.
*
* NOTE: This function is not nesting safe.
* NOTE: This function can nest. Users must save the return value and
* reset the previous value after their own charging scope is over.
*/
static inline void memalloc_use_memcg(struct mem_cgroup *memcg)
static inline struct mem_cgroup *
set_active_memcg(struct mem_cgroup *memcg)
{
WARN_ON_ONCE(current->active_memcg);
current->active_memcg = memcg;
}
struct mem_cgroup *old;
/**
* memalloc_unuse_memcg - Ends the remote memcg charging scope.
*
* This function marks the end of the remote memcg charging scope started by
* memalloc_use_memcg().
*/
static inline void memalloc_unuse_memcg(void)
{
current->active_memcg = NULL;
if (in_interrupt()) {
old = this_cpu_read(int_active_memcg);
this_cpu_write(int_active_memcg, memcg);
} else {
old = current->active_memcg;
current->active_memcg = memcg;
}
return old;
}
#else
static inline void memalloc_use_memcg(struct mem_cgroup *memcg)
{
}
static inline void memalloc_unuse_memcg(void)
static inline struct mem_cgroup *
set_active_memcg(struct mem_cgroup *memcg)
{
return NULL;
}
#endif
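The earlier conversions in this diff (alloc_page_buffers(), fanotify, inotify) all follow the save/restore pattern this comment describes; reduced to its essentials (my_charged_alloc is an illustrative name):
#include <linux/memcontrol.h>
#include <linux/sched/mm.h>
#include <linux/slab.h>

static void *my_charged_alloc(struct mem_cgroup *memcg, size_t size)
{
	struct mem_cgroup *old_memcg;
	void *p;

	/* Charge the allocation to @memcg rather than to current's memcg. */
	old_memcg = set_active_memcg(memcg);
	p = kmalloc(size, GFP_KERNEL_ACCOUNT);
	set_active_memcg(old_memcg);	/* restore, so scopes can nest */

	return p;
}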

View File

@@ -26,6 +26,9 @@ struct __call_single_data {
struct {
struct llist_node llist;
unsigned int flags;
#ifdef CONFIG_64BIT
u16 src, dst;
#endif
};
};
smp_call_func_t func;

View File

@@ -61,6 +61,9 @@ struct __call_single_node {
unsigned int u_flags;
atomic_t a_flags;
};
#ifdef CONFIG_64BIT
u16 src, dst;
#endif
};
#endif /* __LINUX_SMP_TYPES_H */

View File

@@ -879,6 +879,8 @@ asmlinkage long sys_munlockall(void);
asmlinkage long sys_mincore(unsigned long start, size_t len,
unsigned char __user * vec);
asmlinkage long sys_madvise(unsigned long start, size_t len, int behavior);
asmlinkage long sys_process_madvise(int pidfd, const struct iovec __user *vec,
size_t vlen, int behavior, unsigned int flags);
asmlinkage long sys_remap_file_pages(unsigned long start, unsigned long size,
unsigned long prot, unsigned long pgoff,
unsigned long flags);
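A user-space sketch of invoking the new syscall before a libc wrapper exists (process_madvise_cold is an illustrative helper; the syscall number 440 matches the uapi hunk later in this diff, and the pidfd is assumed to come from pidfd_open(2)):
/* User-space sketch: advise another process's memory range as "cold". */
#include <sys/syscall.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef __NR_process_madvise
#define __NR_process_madvise 440
#endif
#ifndef MADV_COLD
#define MADV_COLD 20
#endif

static long process_madvise_cold(int pidfd, void *addr, size_t len)
{
	struct iovec iov = { .iov_base = addr, .iov_len = len };

	return syscall(__NR_process_madvise, pidfd, &iov, 1UL, MADV_COLD, 0U);
}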

View File

@@ -24,6 +24,7 @@ struct notifier_block; /* in notifier.h */
#define VM_UNINITIALIZED 0x00000020 /* vm_struct is not fully initialized */
#define VM_NO_GUARD 0x00000040 /* don't add guard page */
#define VM_KASAN 0x00000080 /* has allocated kasan shadow memory */
#define VM_MAP_PUT_PAGES 0x00000100 /* put pages and free array in vfree */
/*
* VM_KASAN is used slighly differently depending on CONFIG_KASAN_VMALLOC.
@@ -121,6 +122,7 @@ extern void vfree_atomic(const void *addr);
extern void *vmap(struct page **pages, unsigned int count,
unsigned long flags, pgprot_t prot);
void *vmap_pfn(unsigned long *pfns, unsigned int count, pgprot_t prot);
extern void vunmap(const void *addr);
extern int remap_vmalloc_range_partial(struct vm_area_struct *vma,
@@ -167,6 +169,7 @@ extern struct vm_struct *__get_vm_area_caller(unsigned long size,
unsigned long flags,
unsigned long start, unsigned long end,
const void *caller);
void free_vm_area(struct vm_struct *area);
extern struct vm_struct *remove_vm_area(const void *addr);
extern struct vm_struct *find_vm_area(const void *addr);
@@ -202,10 +205,6 @@ static inline void set_vm_flush_reset_perms(void *addr)
}
#endif
/* Allocate/destroy a 'vmalloc' VM area. */
extern struct vm_struct *alloc_vm_area(size_t size, pte_t **ptes);
extern void free_vm_area(struct vm_struct *area);
/* for /dev/kmem */
extern long vread(char *buf, char *addr, unsigned long count);
extern long vwrite(char *buf, char *addr, unsigned long count);
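With the new VM_MAP_PUT_PAGES flag a caller can hand its page array over to vmalloc so that a single vfree() drops the page references and frees the array. A sketch under that assumption (my_vmap_pages and the allocation sizes are illustrative):
#include <linux/mm.h>
#include <linux/slab.h>
#include <linux/vmalloc.h>

static void *my_vmap_pages(unsigned int nr)
{
	struct page **pages;
	void *vaddr;
	unsigned int i;

	pages = kvmalloc_array(nr, sizeof(*pages), GFP_KERNEL);
	if (!pages)
		return NULL;

	for (i = 0; i < nr; i++) {
		pages[i] = alloc_page(GFP_KERNEL);
		if (!pages[i])
			goto err;
	}

	/* On success, ownership of @pages and the page refs moves to vmalloc;
	 * a later vfree(vaddr) puts the pages and kvfree()s the array. */
	vaddr = vmap(pages, nr, VM_MAP | VM_MAP_PUT_PAGES, PAGE_KERNEL);
	if (vaddr)
		return vaddr;
err:
	while (i--)
		__free_page(pages[i]);
	kvfree(pages);
	return NULL;
}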

View File

@@ -74,17 +74,17 @@ TRACE_EVENT_RCU(rcu_grace_period,
TP_STRUCT__entry(
__field(const char *, rcuname)
__field(unsigned long, gp_seq)
__field(long, gp_seq)
__field(const char *, gpevent)
),
TP_fast_assign(
__entry->rcuname = rcuname;
__entry->gp_seq = gp_seq;
__entry->gp_seq = (long)gp_seq;
__entry->gpevent = gpevent;
),
TP_printk("%s %lu %s",
TP_printk("%s %ld %s",
__entry->rcuname, __entry->gp_seq, __entry->gpevent)
);
@@ -114,8 +114,8 @@ TRACE_EVENT_RCU(rcu_future_grace_period,
TP_STRUCT__entry(
__field(const char *, rcuname)
__field(unsigned long, gp_seq)
__field(unsigned long, gp_seq_req)
__field(long, gp_seq)
__field(long, gp_seq_req)
__field(u8, level)
__field(int, grplo)
__field(int, grphi)
@@ -124,16 +124,16 @@ TRACE_EVENT_RCU(rcu_future_grace_period,
TP_fast_assign(
__entry->rcuname = rcuname;
__entry->gp_seq = gp_seq;
__entry->gp_seq_req = gp_seq_req;
__entry->gp_seq = (long)gp_seq;
__entry->gp_seq_req = (long)gp_seq_req;
__entry->level = level;
__entry->grplo = grplo;
__entry->grphi = grphi;
__entry->gpevent = gpevent;
),
TP_printk("%s %lu %lu %u %d %d %s",
__entry->rcuname, __entry->gp_seq, __entry->gp_seq_req, __entry->level,
TP_printk("%s %ld %ld %u %d %d %s",
__entry->rcuname, (long)__entry->gp_seq, (long)__entry->gp_seq_req, __entry->level,
__entry->grplo, __entry->grphi, __entry->gpevent)
);
@@ -153,7 +153,7 @@ TRACE_EVENT_RCU(rcu_grace_period_init,
TP_STRUCT__entry(
__field(const char *, rcuname)
__field(unsigned long, gp_seq)
__field(long, gp_seq)
__field(u8, level)
__field(int, grplo)
__field(int, grphi)
@@ -162,14 +162,14 @@ TRACE_EVENT_RCU(rcu_grace_period_init,
TP_fast_assign(
__entry->rcuname = rcuname;
__entry->gp_seq = gp_seq;
__entry->gp_seq = (long)gp_seq;
__entry->level = level;
__entry->grplo = grplo;
__entry->grphi = grphi;
__entry->qsmask = qsmask;
),
TP_printk("%s %lu %u %d %d %lx",
TP_printk("%s %ld %u %d %d %lx",
__entry->rcuname, __entry->gp_seq, __entry->level,
__entry->grplo, __entry->grphi, __entry->qsmask)
);
@@ -197,17 +197,17 @@ TRACE_EVENT_RCU(rcu_exp_grace_period,
TP_STRUCT__entry(
__field(const char *, rcuname)
__field(unsigned long, gpseq)
__field(long, gpseq)
__field(const char *, gpevent)
),
TP_fast_assign(
__entry->rcuname = rcuname;
__entry->gpseq = gpseq;
__entry->gpseq = (long)gpseq;
__entry->gpevent = gpevent;
),
TP_printk("%s %lu %s",
TP_printk("%s %ld %s",
__entry->rcuname, __entry->gpseq, __entry->gpevent)
);
@@ -316,17 +316,17 @@ TRACE_EVENT_RCU(rcu_preempt_task,
TP_STRUCT__entry(
__field(const char *, rcuname)
__field(unsigned long, gp_seq)
__field(long, gp_seq)
__field(int, pid)
),
TP_fast_assign(
__entry->rcuname = rcuname;
__entry->gp_seq = gp_seq;
__entry->gp_seq = (long)gp_seq;
__entry->pid = pid;
),
TP_printk("%s %lu %d",
TP_printk("%s %ld %d",
__entry->rcuname, __entry->gp_seq, __entry->pid)
);
@@ -343,17 +343,17 @@ TRACE_EVENT_RCU(rcu_unlock_preempted_task,
TP_STRUCT__entry(
__field(const char *, rcuname)
__field(unsigned long, gp_seq)
__field(long, gp_seq)
__field(int, pid)
),
TP_fast_assign(
__entry->rcuname = rcuname;
__entry->gp_seq = gp_seq;
__entry->gp_seq = (long)gp_seq;
__entry->pid = pid;
),
TP_printk("%s %lu %d", __entry->rcuname, __entry->gp_seq, __entry->pid)
TP_printk("%s %ld %d", __entry->rcuname, __entry->gp_seq, __entry->pid)
);
/*
@@ -374,7 +374,7 @@ TRACE_EVENT_RCU(rcu_quiescent_state_report,
TP_STRUCT__entry(
__field(const char *, rcuname)
__field(unsigned long, gp_seq)
__field(long, gp_seq)
__field(unsigned long, mask)
__field(unsigned long, qsmask)
__field(u8, level)
@@ -385,7 +385,7 @@ TRACE_EVENT_RCU(rcu_quiescent_state_report,
TP_fast_assign(
__entry->rcuname = rcuname;
__entry->gp_seq = gp_seq;
__entry->gp_seq = (long)gp_seq;
__entry->mask = mask;
__entry->qsmask = qsmask;
__entry->level = level;
@@ -394,7 +394,7 @@ TRACE_EVENT_RCU(rcu_quiescent_state_report,
__entry->gp_tasks = gp_tasks;
),
TP_printk("%s %lu %lx>%lx %u %d %d %u",
TP_printk("%s %ld %lx>%lx %u %d %d %u",
__entry->rcuname, __entry->gp_seq,
__entry->mask, __entry->qsmask, __entry->level,
__entry->grplo, __entry->grphi, __entry->gp_tasks)
@@ -415,19 +415,19 @@ TRACE_EVENT_RCU(rcu_fqs,
TP_STRUCT__entry(
__field(const char *, rcuname)
__field(unsigned long, gp_seq)
__field(long, gp_seq)
__field(int, cpu)
__field(const char *, qsevent)
),
TP_fast_assign(
__entry->rcuname = rcuname;
__entry->gp_seq = gp_seq;
__entry->gp_seq = (long)gp_seq;
__entry->cpu = cpu;
__entry->qsevent = qsevent;
),
TP_printk("%s %lu %d %s",
TP_printk("%s %ld %d %s",
__entry->rcuname, __entry->gp_seq,
__entry->cpu, __entry->qsevent)
);

View File

@@ -859,9 +859,11 @@ __SYSCALL(__NR_openat2, sys_openat2)
__SYSCALL(__NR_pidfd_getfd, sys_pidfd_getfd)
#define __NR_faccessat2 439
__SYSCALL(__NR_faccessat2, sys_faccessat2)
#define __NR_process_madvise 440
__SYSCALL(__NR_process_madvise, sys_process_madvise)
#undef __NR_syscalls
#define __NR_syscalls 440
#define __NR_syscalls 441
/*
* 32 bit systems traditionally used different

View File

@@ -108,6 +108,8 @@
#define CREATE_TRACE_POINTS
#include <trace/events/initcall.h>
#include <kunit/test.h>
static int kernel_init(void *);
extern void init_IRQ(void);
@@ -1513,6 +1515,8 @@ static noinline void __init kernel_init_freeable(void)
do_basic_setup();
kunit_run_all_tests();
console_on_rootfs();
/*

View File

@@ -134,6 +134,8 @@ KASAN_SANITIZE_stackleak.o := n
KCSAN_SANITIZE_stackleak.o := n
KCOV_INSTRUMENT_stackleak.o := n
obj-$(CONFIG_SCF_TORTURE_TEST) += scftorture.o
$(obj)/configs.o: $(obj)/config_data.gz
targets += config_data.gz

View File

@@ -304,7 +304,7 @@ noinstr irqentry_state_t irqentry_enter(struct pt_regs *regs)
* terminate a grace period, if and only if the timer interrupt is
* not nested into another interrupt.
*
* Checking for __rcu_is_watching() here would prevent the nesting
* Checking for rcu_is_watching() here would prevent the nesting
* interrupt to invoke rcu_irq_enter(). If that nested interrupt is
* the tick then rcu_flavor_sched_clock_irq() would wrongfully
* assume that it is the first interupt and eventually claim

View File

@@ -1474,25 +1474,6 @@ end:
return retval;
}
static struct pid *pidfd_get_pid(unsigned int fd, unsigned int *flags)
{
struct fd f;
struct pid *pid;
f = fdget(fd);
if (!f.file)
return ERR_PTR(-EBADF);
pid = pidfd_pid(f.file);
if (!IS_ERR(pid)) {
get_pid(pid);
*flags = f.file->f_flags;
}
fdput(f);
return pid;
}
static long kernel_waitid(int which, pid_t upid, struct waitid_info *infop,
int options, struct rusage *ru)
{

View File

@@ -566,7 +566,7 @@ static struct lock_torture_ops rwsem_lock_ops = {
#include <linux/percpu-rwsem.h>
static struct percpu_rw_semaphore pcpu_rwsem;
void torture_percpu_rwsem_init(void)
static void torture_percpu_rwsem_init(void)
{
BUG_ON(percpu_init_rwsem(&pcpu_rwsem));
}

View File

@@ -521,6 +521,25 @@ struct pid *find_ge_pid(int nr, struct pid_namespace *ns)
return idr_get_next(&ns->idr, &nr);
}
struct pid *pidfd_get_pid(unsigned int fd, unsigned int *flags)
{
struct fd f;
struct pid *pid;
f = fdget(fd);
if (!f.file)
return ERR_PTR(-EBADF);
pid = pidfd_pid(f.file);
if (!IS_ERR(pid)) {
get_pid(pid);
*flags = f.file->f_flags;
}
fdput(f);
return pid;
}
/**
* pidfd_create() - Create a new pid file descriptor.
*

View File

@@ -135,10 +135,12 @@ config RCU_FANOUT
config RCU_FANOUT_LEAF
int "Tree-based hierarchical RCU leaf-level fanout value"
range 2 64 if 64BIT
range 2 32 if !64BIT
range 2 64 if 64BIT && !RCU_STRICT_GRACE_PERIOD
range 2 32 if !64BIT && !RCU_STRICT_GRACE_PERIOD
range 2 3 if RCU_STRICT_GRACE_PERIOD
depends on TREE_RCU && RCU_EXPERT
default 16
default 16 if !RCU_STRICT_GRACE_PERIOD
default 2 if RCU_STRICT_GRACE_PERIOD
help
This option controls the leaf-level fanout of hierarchical
implementations of RCU, and allows trading off cache misses

View File

@@ -23,7 +23,7 @@ config TORTURE_TEST
tristate
default n
config RCU_PERF_TEST
config RCU_SCALE_TEST
tristate "performance tests for RCU"
depends on DEBUG_KERNEL
select TORTURE_TEST
@@ -114,4 +114,19 @@ config RCU_EQS_DEBUG
Say N here if you need ultimate kernel/user switch latencies
Say Y if you are unsure
config RCU_STRICT_GRACE_PERIOD
bool "Provide debug RCU implementation with short grace periods"
depends on DEBUG_KERNEL && RCU_EXPERT
default n
select PREEMPT_COUNT if PREEMPT=n
help
Select this option to build an RCU variant that is strict about
grace periods, making them as short as it can. This limits
scalability, destroys real-time response, degrades battery
lifetime and kills performance. Don't try this on large
machines, as in systems with more than about 10 or 20 CPUs.
But in conjunction with tools like KASAN, it can be helpful
when looking for certain types of RCU usage bugs, for example,
too-short RCU read-side critical sections.
endmenu # "RCU Debugging"

View File

@@ -11,7 +11,7 @@ obj-y += update.o sync.o
obj-$(CONFIG_TREE_SRCU) += srcutree.o
obj-$(CONFIG_TINY_SRCU) += srcutiny.o
obj-$(CONFIG_RCU_TORTURE_TEST) += rcutorture.o
obj-$(CONFIG_RCU_PERF_TEST) += rcuperf.o
obj-$(CONFIG_RCU_SCALE_TEST) += rcuscale.o
obj-$(CONFIG_RCU_REF_SCALE_TEST) += refscale.o
obj-$(CONFIG_TREE_RCU) += tree.o
obj-$(CONFIG_TINY_RCU) += tiny.o
