In this context the uobject is not allowed to be NULL, so type is the same
as uobject->type, and at least for IDR, id is the same as uobject->id.
FD objects should never handle the FD number outside the uAPI boundary
code.
Suggested-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Current DPCM is caring only FE channel configuration. Sometimes
it will be trouble if user selects channel which isn't supported
by BE.
This patch adds new .dpcm_merged_chan on struct snd_soc_dai_link.
DPCM will use FE / BE merged channel if struct snd_soc_dai_link
has it.
Signed-off-by: Jiada Wang <jiada_wang@mentor.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
For dependencies, branch based on 'mellanox/mlx5-next' of
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git
Pull RoCE ICRC counters from Leon Romanovsky:
====================
This series exposes RoCE ICRC counter through existing RDMA hw_counters
sysfs interface.
The first patch has all HW definitions in mlx5_ifc.h file and second patch
is the actual counter implementation.
====================
* branch 'icrc-counter':
IB/mlx5: Support RoCE ICRC encapsulated error counter
net/mlx5: Add RoCE RX ICRC encapsulated counter
Export the clock ids dt-bindings usable by the consumers of the clock
controller and add the documentation for the device tree bindings of
the audio clock controller of the A113 based SoCs.
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
This commit makes BBR use only the MSS (without any headers) to
calculate pacing rates when internal TCP-layer pacing is used.
This is necessary to achieve the correct pacing behavior in this case,
since tcp_internal_pacing() uses only the payload length to calculate
pacing delays.
Signed-off-by: Kevin Yang <yyd@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When receiving multiple packets with the same ts ecr value, only try
to compute rcv_rtt sample with the earliest received packet.
This is because the rcv_rtt calculated by later received packets
could possibly include long idle time or other types of delay.
For example:
(1) server sends last packet of reply with TS val V1
(2) client ACKs last packet of reply with TS ecr V1
(3) long idle time passes
(4) client sends next request data packet with TS ecr V1 (again!)
At this time, the rcv_rtt computed on server with TS ecr V1 will be
inflated with the idle time and should get ignored.
Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using rht_dereference_bucket() to dereference
->future_tbl looks like a type error, and could be confusing.
Using rht_dereference_rcu() to test a pointer for NULL
adds an unnecessary barrier - rcu_access_pointer() is preferred
for NULL tests when no lock is held.
This uses 3 different ways to access ->future_tbl.
- if we know the mutex is held, use rht_dereference()
- if we don't hold the mutex, and are only testing for NULL,
use rcu_access_pointer()
- otherwise (using RCU protection for true dereference),
use rht_dereference_rcu().
Note that this includes a simplification of the call to
rhashtable_last_table() - we don't do an extra dereference
before the call any more.
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 'ht' and 'hash' arguments to INIT_RHT_NULLS_HEAD() are
no longer used - so drop them. This allows us to also
remove the nhash argument from nested_table_alloc().
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This "feature" is unused, undocumented, and untested and so doesn't
really belong. A patch is under development to properly implement
support for detecting when a search gets diverted down a different
chain, which the common purpose of nulls markers.
This patch actually fixes a bug too. The table resizing allows a
table to grow to 2^31 buckets, but the hash is truncated to 27 bits -
any growth beyond 2^27 is wasteful an ineffective.
This patch results in NULLS_MARKER(0) being used for all chains,
and leaves the use of rht_is_a_null() to test for it.
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Due to the use of rhashtables in net namespaces,
rhashtable.h is included in lots of the kernel,
so a small changes can required a large recompilation.
This makes development painful.
This patch splits out rhashtable-types.h which just includes
the major type declarations, and does not include (non-trivial)
inline code. rhashtable.h is no longer included by anything
in the include/ directory.
Common include files only include rhashtable-types.h so a large
recompilation is only triggered when that changes.
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
First feature request for 4.19. Highlights:
- Add initial amdgpu documentation
- Add initial GPU scheduler documention
- GPU scheduler fixes for dying processes
- Add support for the JPEG engine on VCN
- Switch CI to use powerplay by default
- EDC support for CZ
- More powerplay cleanups
- Misc DC fixes
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180621161138.3008-1-alexander.deucher@amd.com
One of my tests compiles the kernel with gcc 4.5.3, and I hit the
following build error:
include/linux/semaphore.h: In function 'sema_init':
include/linux/semaphore.h:35:17: error: unknown field 'val' specified in initializer
include/linux/semaphore.h:35:17: warning: missing braces around initializer
include/linux/semaphore.h:35:17: warning: (near initialization for '(anonymous).raw_lock.<anonymous>.val')
I bisected it down to:
625e88be1f ("locking/qspinlock: Merge 'struct __qspinlock' into 'struct qspinlock'")
... which makes qspinlock have an anonymous union, which makes initializing it special
for older compilers. By adding strategic brackets, it makes the build
happy again.
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Waiman Long <longman@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Fixes: 625e88be1f ("locking/qspinlock: Merge 'struct __qspinlock' into 'struct qspinlock'")
Link: http://lkml.kernel.org/r/20180621203526.172ab5c4@vmware.local.home
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Recent Raspberry Pi firmware provides a mailbox property to detect
under-voltage conditions. Here is the current definition.
The u32 value returned by the firmware is divided into 2 parts:
- lower 16-bits are the live value
- upper 16-bits are the history or sticky value
Bits:
0: undervoltage
1: arm frequency capped
2: currently throttled
16: undervoltage has occurred
17: arm frequency capped has occurred
18: throttling has occurred
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Pull NFS client bugfixes from Trond Myklebust:
"Hightlights include:
- fix an rcu deadlock in nfs_delegation_find_inode()
- fix NFSv4 deadlocks due to not freeing the session slot in
layoutget
- don't send layoutreturn if the layout is already invalid
- prevent duplicate XID allocation
- flexfiles: Don't tie up all the rpciod threads in resends"
* tag 'nfs-for-4.18-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
pNFS/flexfiles: Process writeback resends from nfsiod context as well
pNFS/flexfiles: Don't tie up all the rpciod threads in resends
sunrpc: Prevent duplicate XID allocation
pNFS: Don't send layoutreturn if the layout is already invalid
pNFS: Always free the session slot on error in nfs4_layoutget_handle_exception
NFS: Fix an rcu deadlock in nfs_delegation_find_inode()
Pull ACPI fixes from Rafael Wysocki:
"These fix a suspend/resume regression in the ACPI driver for Intel
SoCs (LPSS), add a new system wakeup quirk to the ACPI EC driver and
fix an inline stub of a function in the ACPI processor driver that
diverged from the original.
Specifics:
- Fix a suspend/resume regression in the ACPI driver for Intel SoCs
(LPSS) to make it work on systems where some power management
quirks should only be applied for runtime PM and suspend-to-idle
and not for suspend-to-RAM (Rafael Wysocki).
- Add a system wakeup quirk for Thinkpad X1 Carbon 6th to the ACPI EC
driver to avoid drainig battery too fast while suspended to idle on
those systems (Mika Westerberg).
- Fix an inline stub of acpi_processor_ppc_has_changed() to match the
original function definition (Brian Norris)"
* tag 'acpi-4.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / processor: Finish making acpi_processor_ppc_has_changed() void
ACPI / EC: Use ec_no_wakeup on Thinkpad X1 Carbon 6th
ACPI / LPSS: Avoid PM quirks on suspend and resume from S3
Add capability bit in PCAM register and RoCE ICRC error counter
to PPCNT register.
Signed-off-by: Talat Batheesh <talatb@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
The meaning of the mode_config max_width/height fields has not been
entirely clear. They are used both as the max framebuffer dimensions,
and they are also used by drm_mode_getconnector() to filter out
any mode whose hdisplay/vdisplay exceed those limits.
Let's put it in writing that max_width/height only refrer to the max
framebuffer dimensions, and should those be higher than the hardware
limits for display timings the driver must validate the latter using
some other means.
We'll keep the max_width/height usage in drm_mode_getconnector()
because setcrtc treats hdisplay/vdisplay also as the primary plane
width, and having a plane bigger than the max fb size doesn't make
much sense (if we ignore scaling that is). It all works out fine
as long as the max fb dimensions are at least equal to the max
timing limits. If the opposite were true we may want to rethink
what drm_mode_getconnector() does. Maybe do the mode filtering
only for non-atomic userspace?
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180615173939.11353-1-ville.syrjala@linux.intel.com
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
The ifdeffery for atomic*_{fetch_,}andnot() is unlike that for all the
other atomics. If atomic*_andnot() is not defined, the corresponding
atomic*_fetch_andnot() is assumed to not be defined.
Additionally, the fallbacks for the various ordering cases are written
much later in atomic.h as static inlines.
This isn't problematic today, but gets in the way of scripting the
generation of atomics. To prepare for scripting, this patch:
* Switches to separate ifdefs for atomic*_andnot() and
atomic*_fetch_andnot(), updating implementations as appropriate.
* Moves the fallbacks into the standards ifdefs, as macro expansions
rather than static inlines.
* Removes trivial andnot implementations from architectures, where these
are superseded by core code.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/lkml/20180621121321.4761-19-mark.rutland@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The conditional inc/dec ops differ for atomic_t and atomic64_t:
- atomic_inc_unless_positive() is optional for atomic_t, and doesn't exist for atomic64_t.
- atomic_dec_unless_negative() is optional for atomic_t, and doesn't exist for atomic64_t.
- atomic_dec_if_positive is optional for atomic_t, and is mandatory for atomic64_t.
Let's make these consistently optional for both. At the same time, let's
clean up the existing fallbacks to use atomic_try_cmpxchg().
The instrumented atomics are updated accordingly.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/lkml/20180621121321.4761-18-mark.rutland@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Some of the atomics return the result of a test applied after the atomic
operation, and almost all architectures implement these as trivial
wrappers around the underlying atomic. Specifically:
* <atomic>_inc_and_test(v) is (<atomic>_inc_return(v) == 0)
* <atomic>_dec_and_test(v) is (<atomic>_dec_return(v) == 0)
* <atomic>_sub_and_test(i, v) is (<atomic>_sub_return(i, v) == 0)
* <atomic>_add_negative(i, v) is (<atomic>_add_return(i, v) < 0)
Rather than have these definitions duplicated in all architectures, with
minor inconsistencies in formatting and documentation, let's make these
operations optional, with default fallbacks as above. Implementations
must now provide a preprocessor symbol.
The instrumented atomics are updated accordingly.
Both x86 and m68k have custom implementations, which are left as-is,
given preprocessor symbols to avoid being overridden.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Palmer Dabbelt <palmer@sifive.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/lkml/20180621121321.4761-16-mark.rutland@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Architectures with atomic64_fetch_add_unless() provide a preprocessor
symbol if they do so, and all other architectures have trivial C
implementations of atomic64_add_unless() which are near-identical.
Let's unify the trivial definitions of atomic64_fetch_add_unless() in
<linux/atomic.h>, so that we always have both
atomic64_fetch_add_unless() and atomic64_add_unless() with less
boilerplate code.
This means that atomic64_add_unless() is always implemented in core
code, and the instrumented atomics are updated accordingly.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/lkml/20180621121321.4761-15-mark.rutland@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Several architectures these have a near-identical implementation based
on atomic_read() and atomic_cmpxchg() which we can instead define in
<linux/atomic.h>, so let's do so, using something close to the existing
x86 implementation with try_cmpxchg().
Where an architecture provides its own atomic_fetch_add_unless(), it
must define a preprocessor symbol for it. The instrumented atomics are
updated accordingly.
Note that arch/arc's existing atomic_fetch_add_unless() had redundant
barriers, as these are already present in its atomic_cmpxchg()
implementation.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Palmer Dabbelt <palmer@sifive.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vineet Gupta <vgupta@synopsys.com>
Link: https://lore.kernel.org/lkml/20180621121321.4761-7-mark.rutland@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We define a trivial fallback for atomic_inc_not_zero(), but don't do
the same for atomic64_inc_not_zero(), leading most architectures to
define the same boilerplate.
Let's add a fallback in <linux/atomic.h>, and remove the redundant
implementations. Note that atomic64_add_unless() is always defined in
<linux/atomic.h>, and promotes its arguments to the requisite types, so
we need not do this explicitly.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Palmer Dabbelt <palmer@sifive.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/lkml/20180621121321.4761-6-mark.rutland@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Some of the atomics return a status value, which is a boolean value
describing whether the operation was performed. To make it clear that
this is a boolean value, let's update the common fallbacks to return
bool, fixing up the return values and comments likewise.
At the same time, let's simplify the description of the operations in
their respective comments.
The instrumented atomics and generic atomic64 implementation are updated
accordingly.
Note that atomic64_dec_if_positive() doesn't follow the usual test op
pattern, and returns the would-be decremented value. This is not
changed.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/lkml/20180621121321.4761-5-mark.rutland@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
While documentation suggests atomic_inc_not_zero_hint() will perform better
than atomic_inc_not_zero(), this is unlikely to be the case. No architectures
implement atomic_inc_not_zero_hint() directly, and thus it either falls back to
atomic_inc_not_zero(), or a loop using atomic_cmpxchg().
Whenever the hint does not match the value in memory, the repeated use of
atomic_cmpxchg() will be more expensive than the read that
atomic_inc_not_zero_hint() attempts to avoid. For architectures with LL/SC
atomics, a read cannot be avoided, and it would always be better to use
atomic_inc_not_zero() directly. For other architectures, their own
atomic_inc_not_zero() is likely to be more optimal than an atomic_cmpxchg()
loop regardless.
Generally, atomic_inc_not_zero_hint() is liable to perform worse than
atomic_inc_not_zero(). Further, atomic_inc_not_zero_hint() only exists
for atomic_t, and not atomic64_t or atomic_long_t, and there is only one
user in the kernel tree.
Given all this, let's remove atomic_inc_not_zero_hint(), and migrate the
existing user over to atomic_inc_not_zero().
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/lkml/20180621121321.4761-4-mark.rutland@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
While __atomic_add_unless() was originally intended as a building-block
for atomic_add_unless(), it's now used in a number of places around the
kernel. It's the only common atomic operation named __atomic*(), rather
than atomic_*(), and for consistency it would be better named
atomic_fetch_add_unless().
This lack of consistency is slightly confusing, and gets in the way of
scripting atomics. Given that, let's clean things up and promote it to
an official part of the atomics API, in the form of
atomic_fetch_add_unless().
This patch converts definitions and invocations over to the new name,
including the instrumented version, using the following script:
----
git grep -w __atomic_add_unless | while read line; do
sed -i '{s/\<__atomic_add_unless\>/atomic_fetch_add_unless/}' "${line%%:*}";
done
git grep -w __arch_atomic_add_unless | while read line; do
sed -i '{s/\<__arch_atomic_add_unless\>/arch_atomic_fetch_add_unless/}' "${line%%:*}";
done
----
Note that we do not have atomic{64,_long}_fetch_add_unless(), which will
be introduced by later patches.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Palmer Dabbelt <palmer@sifive.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/lkml/20180621121321.4761-2-mark.rutland@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Provide a command line and a sysfs knob to control SMT.
The command line options are:
'nosmt': Enumerate secondary threads, but do not online them
'nosmt=force': Ignore secondary threads completely during enumeration
via MP table and ACPI/MADT.
The sysfs control file has the following states (read/write):
'on': SMT is enabled. Secondary threads can be freely onlined
'off': SMT is disabled. Secondary threads, even if enumerated
cannot be onlined
'forceoff': SMT is permanentely disabled. Writes to the control
file are rejected.
'notsupported': SMT is not supported by the CPU
The command line option 'nosmt' sets the sysfs control to 'off'. This
can be changed to 'on' to reenable SMT during runtime.
The command line option 'nosmt=force' sets the sysfs control to
'forceoff'. This cannot be changed during runtime.
When SMT is 'on' and the control file is changed to 'off' then all online
secondary threads are offlined and attempts to online a secondary thread
later on are rejected.
When SMT is 'off' and the control file is changed to 'on' then secondary
threads can be onlined again. The 'off' -> 'on' transition does not
automatically online the secondary threads.
When the control file is set to 'forceoff', the behaviour is the same as
setting it to 'off', but the operation is irreversible and later writes to
the control file are rejected.
When the control status is 'notsupported' then writes to the control file
are rejected.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
After commit 9facc33687 ("bpf: reject any prog that failed read-only lock")
offsetof(struct bpf_binary_header, image) became 3 instead of 4,
breaking powerpc BPF badly, since instructions need to be word aligned.
Fixes: 9facc33687 ("bpf: reject any prog that failed read-only lock")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
If NBD_DISCONNECT_ON_CLOSE is set on a device, then the driver will
issue a disconnect from nbd_release if the device has no remaining
bdev->bd_openers.
Fix ret val so reconfigure with only setting the flag succeeds.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Doron Roberts-Kedes <doronrk@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull rdma fixes from Jason Gunthorpe:
"Here are eight fairly small fixes collected over the last two weeks.
Regression and crashing bug fixes:
- mlx4/5: Fixes for issues found from various checkers
- A resource tracking and uverbs regression in the core code
- qedr: NULL pointer regression found during testing
- rxe: Various small bugs"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
IB/rxe: Fix missing completion for mem_reg work requests
RDMA/core: Save kernel caller name when creating CQ using ib_create_cq()
IB/uverbs: Fix ordering of ucontext check in ib_uverbs_write
IB/mlx4: Fix an error handling path in 'mlx4_ib_rereg_user_mr()'
RDMA/qedr: Fix NULL pointer dereference when running over iWARP without RDMA-CM
IB/mlx5: Fix return value check in flow_counters_set_data()
IB/mlx5: Fix memory leak in mlx5_ib_create_flow
IB/rxe: avoid double kfree skb
Pull networking fixes from David Miller:
1) Fix crash on bpf_prog_load() errors, from Daniel Borkmann.
2) Fix ATM VCC memory accounting, from David Woodhouse.
3) fib6_info objects need RCU freeing, from Eric Dumazet.
4) Fix SO_BINDTODEVICE handling for TCP sockets, from David Ahern.
5) Fix clobbered error code in enic_open() failure path, from
Govindarajulu Varadarajan.
6) Propagate dev_get_valid_name() error returns properly, from Li
RongQing.
7) Fix suspend/resume in davinci_emac driver, from Bartosz Golaszewski.
8) Various act_ife fixes (recursive locking, IDR leaks, etc.) from
Davide Caratti.
9) Fix buggy checksum handling in sungem driver, from Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (40 commits)
ip: limit use of gso_size to udp
stmmac: fix DMA channel hang in half-duplex mode
net: stmmac: socfpga: add additional ocp reset line for Stratix10
net: sungem: fix rx checksum support
bpfilter: ignore binary files
bpfilter: fix build error
net/usb/drivers: Remove useless hrtimer_active check
net/sched: act_ife: preserve the action control in case of error
net/sched: act_ife: fix recursive lock and idr leak
net: ethernet: fix suspend/resume in davinci_emac
net: propagate dev_get_valid_name return code
enic: do not overwrite error code
net/tcp: Fix socket lookups with SO_BINDTODEVICE
ptp: replace getnstimeofday64() with ktime_get_real_ts64()
net/ipv6: respect rcu grace period before freeing fib6_info
net: net_failover: fix typo in net_failover_slave_register()
ipvlan: use ETH_MAX_MTU as max mtu
net: hamradio: use eth_broadcast_addr
enic: initialize enic->rfs_h.lock in enic_probe
MAINTAINERS: Add Sam as the maintainer for NCSI
...
For the L1TF workaround its necessary to limit the swap file size to below
MAX_PA/2, so that the higher bits of the swap offset inverted never point
to valid memory.
Add a mechanism for the architecture to override the swap file size check
in swapfile.c and add a x86 specific max swapfile check function that
enforces that limit.
The check is only enabled if the CPU is vulnerable to L1TF.
In VMs with 42bit MAX_PA the typical limit is 2TB now, on a native system
with 46bit PA it is 32TB. The limit is only per individual swap file, so
it's always possible to exceed these limits with multiple swap files or
partitions.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
For L1TF PROT_NONE mappings are protected by inverting the PFN in the page
table entry. This sets the high bits in the CPU's address space, thus
making sure to point to not point an unmapped entry to valid cached memory.
Some server system BIOSes put the MMIO mappings high up in the physical
address space. If such an high mapping was mapped to unprivileged users
they could attack low memory by setting such a mapping to PROT_NONE. This
could happen through a special device driver which is not access
protected. Normal /dev/mem is of course access protected.
To avoid this forbid PROT_NONE mappings or mprotect for high MMIO mappings.
Valid page mappings are allowed because the system is then unsafe anyways.
It's not expected that users commonly use PROT_NONE on MMIO. But to
minimize any impact this is only enforced if the mapping actually refers to
a high MMIO address (defined as the MAX_PA-1 bit being set), and also skip
the check for root.
For mmaps this is straight forward and can be handled in vm_insert_pfn and
in remap_pfn_range().
For mprotect it's a bit trickier. At the point where the actual PTEs are
accessed a lot of state has been changed and it would be difficult to undo
on an error. Since this is a uncommon case use a separate early page talk
walk pass for MMIO PROT_NONE mappings that checks for this condition
early. For non MMIO and non PROT_NONE there are no changes.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>