Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull RCU updates from Ingo Molnar:
 "This cycle's RCU changes were:

   - A few more RCU flavor consolidation cleanups.

   - Updates to RCU's list-traversal macros improving lockdep usability.

   - Forward-progress improvements for no-CBs CPUs: Avoid ignoring
     incoming callbacks during grace-period waits.

   - Forward-progress improvements for no-CBs CPUs: Use ->cblist
     structure to take advantage of others' grace periods.

   - Also added a small commit that avoids needlessly inflicting
     scheduler-clock ticks on callback-offloaded CPUs.

   - Forward-progress improvements for no-CBs CPUs: Reduce contention on
     ->nocb_lock guarding ->cblist.

   - Forward-progress improvements for no-CBs CPUs: Add ->nocb_bypass
     list to further reduce contention on ->nocb_lock guarding ->cblist.

   - Miscellaneous fixes.

   - Torture-test updates.

   - minor LKMM updates"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (86 commits)
  MAINTAINERS: Update from paulmck@linux.ibm.com to paulmck@kernel.org
  rcu: Don't include <linux/ktime.h> in rcutiny.h
  rcu: Allow rcu_do_batch() to dynamically adjust batch sizes
  rcu/nocb: Don't wake no-CBs GP kthread if timer posted under overload
  rcu/nocb: Reduce __call_rcu_nocb_wake() leaf rcu_node ->lock contention
  rcu/nocb: Reduce nocb_cb_wait() leaf rcu_node ->lock contention
  rcu/nocb: Advance CBs after merge in rcutree_migrate_callbacks()
  rcu/nocb: Avoid synchronous wakeup in __call_rcu_nocb_wake()
  rcu/nocb: Print no-CBs diagnostics when rcutorture writer unduly delayed
  rcu/nocb: EXP Check use and usefulness of ->nocb_lock_contended
  rcu/nocb: Add bypass callback queueing
  rcu/nocb: Atomic ->len field in rcu_segcblist structure
  rcu/nocb: Unconditionally advance and wake for excessive CBs
  rcu/nocb: Reduce ->nocb_lock contention with separate ->nocb_gp_lock
  rcu/nocb: Reduce contention at no-CBs invocation-done time
  rcu/nocb: Reduce contention at no-CBs registry-time CB advancement
  rcu/nocb: Round down for number of no-CBs grace-period kthreads
  rcu/nocb: Avoid ->nocb_lock capture by corresponding CPU
  rcu/nocb: Avoid needless wakeups of no-CBs grace-period kthread
  rcu/nocb: Make __call_rcu_nocb_wake() safe for many callbacks
  ...
Linus Torvalds
2019-09-16 16:28:19 -07:00
49 changed files with 1499 additions and 805 deletions

@@ -42,7 +42,8 @@ linux-kernel.bell and linux-kernel.cat files that make up the formal
 version of the model; they are extremely terse and their meanings are
 far from clear.
-This document describes the ideas underlying the LKMM. It is meant
+This document describes the ideas underlying the LKMM, but excluding
+the modeling of bare C (or plain) shared memory accesses. It is meant
 for people who want to understand how the model was designed. It does
 not go into the details of the code in the .bell and .cat files;
 rather, it explains in English what the code expresses symbolically.
@@ -354,31 +355,25 @@ be extremely complex.
 Optimizing compilers have great freedom in the way they translate
 source code to object code. They are allowed to apply transformations
 that add memory accesses, eliminate accesses, combine them, split them
-into pieces, or move them around. Faced with all these possibilities,
-the LKMM basically gives up. It insists that the code it analyzes
-must contain no ordinary accesses to shared memory; all accesses must
-be performed using READ_ONCE(), WRITE_ONCE(), or one of the other
-atomic or synchronization primitives. These primitives prevent a
-large number of compiler optimizations. In particular, it is
-guaranteed that the compiler will not remove such accesses from the
-generated code (unless it can prove the accesses will never be
-executed), it will not change the order in which they occur in the
-code (within limits imposed by the C standard), and it will not
-introduce extraneous accesses.
+into pieces, or move them around. The use of READ_ONCE(), WRITE_ONCE(),
+or one of the other atomic or synchronization primitives prevents a
+large number of compiler optimizations. In particular, it is guaranteed
+that the compiler will not remove such accesses from the generated code
+(unless it can prove the accesses will never be executed), it will not
+change the order in which they occur in the code (within limits imposed
+by the C standard), and it will not introduce extraneous accesses.
-This explains why the MP and SB examples above used READ_ONCE() and
-WRITE_ONCE() rather than ordinary memory accesses. Thanks to this
-usage, we can be certain that in the MP example, P0's write event to
-buf really is po-before its write event to flag, and similarly for the
-other shared memory accesses in the examples.
+The MP and SB examples above used READ_ONCE() and WRITE_ONCE() rather
+than ordinary memory accesses. Thanks to this usage, we can be certain
+that in the MP example, the compiler won't reorder P0's write event to
+buf and P0's write event to flag, and similarly for the other shared
+memory accesses in the examples.
-Private variables are not subject to this restriction. Since they are
-not shared between CPUs, they can be accessed normally without
-READ_ONCE() or WRITE_ONCE(), and there will be no ill effects. In
-fact, they need not even be stored in normal memory at all -- in
-principle a private variable could be stored in a CPU register (hence
-the convention that these variables have names starting with the
-letter 'r').
+Since private variables are not shared between CPUs, they can be
+accessed normally without READ_ONCE() or WRITE_ONCE(). In fact, they
+need not even be stored in normal memory at all -- in principle a
+private variable could be stored in a CPU register (hence the convention
+that these variables have names starting with the letter 'r').
 A WARNING
@@ -1302,7 +1297,7 @@ followed by an arbitrary number of cumul-fence links, ending with an
 rfe link. You can concoct more exotic examples, containing more than
 one fence, although this quickly leads to diminishing returns in terms
 of complexity. For instance, here's an example containing a coe link
-followed by two fences and an rfe link, utilizing the fact that
+followed by two cumul-fences and an rfe link, utilizing the fact that
 release fences are A-cumulative:
 int x, y, z;
@@ -1334,10 +1329,10 @@ If x = 2, r0 = 1, and r2 = 1 after this code runs then there is a prop
 link from P0's store to its load. This is because P0's store gets
 overwritten by P1's store since x = 2 at the end (a coe link), the
 smp_wmb() ensures that P1's store to x propagates to P2 before the
-store to y does (the first fence), the store to y propagates to P2
+store to y does (the first cumul-fence), the store to y propagates to P2
 before P2's load and store execute, P2's smp_store_release()
 guarantees that the stores to x and y both propagate to P0 before the
-store to z does (the second fence), and P0's load executes after the
+store to z does (the second cumul-fence), and P0's load executes after the
 store to z has propagated to P0 (an rfe link).
 In summary, the fact that the hb relation links memory access events

@@ -167,15 +167,15 @@ scripts Various scripts, see scripts/README.
 LIMITATIONS
 ===========
-The Linux-kernel memory model has the following limitations:
+The Linux-kernel memory model (LKMM) has the following limitations:
-1. Compiler optimizations are not modeled. Of course, the use
-of READ_ONCE() and WRITE_ONCE() limits the compiler's ability
-to optimize, but there is Linux-kernel code that uses bare C
-memory accesses. Handling this code is on the to-do list.
-For more information, see Documentation/explanation.txt (in
-particular, the "THE PROGRAM ORDER RELATION: po AND po-loc"
-and "A WARNING" sections).
+1. Compiler optimizations are not accurately modeled. Of course,
+the use of READ_ONCE() and WRITE_ONCE() limits the compiler's
+ability to optimize, but under some circumstances it is possible
+for the compiler to undermine the memory model. For more
+information, see Documentation/explanation.txt (in particular,
+the "THE PROGRAM ORDER RELATION: po AND po-loc" and "A WARNING"
+sections).
 Note that this limitation in turn limits LKMM's ability to
 accurately model address, control, and data dependencies.

tools/memory-model/scripts/checkghlitmus.sh Normal file → Executable file
tools/memory-model/scripts/checklitmushist.sh Normal file → Executable file
tools/memory-model/scripts/cmplitmushist.sh Normal file → Executable file
tools/memory-model/scripts/initlitmushist.sh Normal file → Executable file
tools/memory-model/scripts/judgelitmus.sh Normal file → Executable file
tools/memory-model/scripts/newlitmushist.sh Normal file → Executable file
tools/memory-model/scripts/parseargs.sh Normal file → Executable file
tools/memory-model/scripts/runlitmushist.sh Normal file → Executable file

@@ -227,7 +227,7 @@ then
 must_continue=yes
 fi
 last_ts="`tail $resdir/console.log | grep '^\[ *[0-9]\+\.[0-9]\+]' | tail -1 | sed -e 's/^\[ *//' -e 's/\..*$//'`"
-if test -z "last_ts"
+if test -z "$last_ts"
 then
 last_ts=0
 fi
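
The one-character fix in this hunk matters because test -z "last_ts" examines the literal string last_ts, which is never empty, so the fallback to 0 could never run. The following is a minimal sketch of the difference, not code from the patched script: buggy_default and fixed_default are hypothetical helper names introduced only for illustration, and each takes the candidate timestamp as its first argument.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the patched script) contrasting the
# old and new checks. Each prints the value last_ts ends up with.

buggy_default () {
	last_ts="$1"
	# Bug: "last_ts" is a literal, non-empty string, so -z is never true.
	if test -z "last_ts"
	then
		last_ts=0
	fi
	echo "$last_ts"
}

fixed_default () {
	last_ts="$1"
	# Fix: "$last_ts" expands the variable, so an empty value is detected.
	if test -z "$last_ts"
	then
		last_ts=0
	fi
	echo "$last_ts"
}

# An empty timestamp, as when console.log contains no timestamped lines.
echo "buggy: '$(buggy_default "")'"   # prints: buggy: ''
echo "fixed: '$(fixed_default "")'"   # prints: fixed: '0'
```

With the old check, an empty last_ts stays empty and can break later arithmetic or comparisons that expect a number; with the new check it falls back to 0 as intended.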

@@ -3,3 +3,4 @@ rcutree.gp_preinit_delay=12
 rcutree.gp_init_delay=3
 rcutree.gp_cleanup_delay=3
 rcutree.kthread_prio=2
+threadirqs