Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (52 commits) sched: fix RCU lockdep splat from task_group() rcu: using ACCESS_ONCE() to observe the jiffies_stall/rnp->qsmask value sched: suppress RCU lockdep splat in task_fork_fair net: suppress RCU lockdep false positive in sock_update_classid rcu: move check from rcu_dereference_bh to rcu_read_lock_bh_held rcu: Add advice to PROVE_RCU_REPEATEDLY kernel config parameter rcu: Add tracing data to support queueing models rcu: fix sparse errors in rcutorture.c rcu: only one evaluation of arg in rcu_dereference_check() unless sparse kernel: Remove undead ifdef CONFIG_DEBUG_LOCK_ALLOC rcu: fix _oddness handling of verbose stall warnings rcu: performance fixes to TINY_PREEMPT_RCU callback checking rcu: upgrade stallwarn.txt documentation for CPU-bound RT processes vhost: add __rcu annotations rcu: add comment stating that list_empty() applies to RCU-protected lists rcu: apply TINY_PREEMPT_RCU read-side speedup to TREE_PREEMPT_RCU rcu: combine duplicate code, courtesy of CONFIG_PREEMPT_RCU rcu: Upgrade srcu_read_lock() docbook about SRCU grace periods rcu: document ways of stalling updates in low-memory situations rcu: repair code-duplication FIXMEs ...
This commit is contained in:
@@ -1645,7 +1645,9 @@ the amount of locking which needs to be done.
|
||||
all the readers who were traversing the list when we deleted the
|
||||
element are finished. We use <function>call_rcu()</function> to
|
||||
register a callback which will actually destroy the object once
|
||||
the readers are finished.
|
||||
all pre-existing readers are finished. Alternatively,
|
||||
<function>synchronize_rcu()</function> may be used to block until
|
||||
all pre-existing are finished.
|
||||
</para>
|
||||
<para>
|
||||
But how does Read Copy Update know when the readers are
|
||||
@@ -1714,7 +1716,7 @@ the amount of locking which needs to be done.
|
||||
- object_put(obj);
|
||||
+ list_del_rcu(&obj->list);
|
||||
cache_num--;
|
||||
+ call_rcu(&obj->rcu, cache_delete_rcu, obj);
|
||||
+ call_rcu(&obj->rcu, cache_delete_rcu);
|
||||
}
|
||||
|
||||
/* Must be holding cache_lock */
|
||||
@@ -1725,14 +1727,6 @@ the amount of locking which needs to be done.
|
||||
if (++cache_num > MAX_CACHE_SIZE) {
|
||||
struct object *i, *outcast = NULL;
|
||||
list_for_each_entry(i, &cache, list) {
|
||||
@@ -85,6 +94,7 @@
|
||||
obj->popularity = 0;
|
||||
atomic_set(&obj->refcnt, 1); /* The cache holds a reference */
|
||||
spin_lock_init(&obj->lock);
|
||||
+ INIT_RCU_HEAD(&obj->rcu);
|
||||
|
||||
spin_lock_irqsave(&cache_lock, flags);
|
||||
__cache_add(obj);
|
||||
@@ -104,12 +114,11 @@
|
||||
struct object *cache_find(int id)
|
||||
{
|
||||
|
||||
@@ -218,13 +218,22 @@ over a rather long period of time, but improvements are always welcome!
|
||||
include:
|
||||
|
||||
a. Keeping a count of the number of data-structure elements
|
||||
used by the RCU-protected data structure, including those
|
||||
waiting for a grace period to elapse. Enforce a limit
|
||||
on this number, stalling updates as needed to allow
|
||||
previously deferred frees to complete.
|
||||
used by the RCU-protected data structure, including
|
||||
those waiting for a grace period to elapse. Enforce a
|
||||
limit on this number, stalling updates as needed to allow
|
||||
previously deferred frees to complete. Alternatively,
|
||||
limit only the number awaiting deferred free rather than
|
||||
the total number of elements.
|
||||
|
||||
Alternatively, limit only the number awaiting deferred
|
||||
free rather than the total number of elements.
|
||||
One way to stall the updates is to acquire the update-side
|
||||
mutex. (Don't try this with a spinlock -- other CPUs
|
||||
spinning on the lock could prevent the grace period
|
||||
from ever ending.) Another way to stall the updates
|
||||
is for the updates to use a wrapper function around
|
||||
the memory allocator, so that this wrapper function
|
||||
simulates OOM when there is too much memory awaiting an
|
||||
RCU grace period. There are of course many other
|
||||
variations on this theme.
|
||||
|
||||
b. Limiting update rate. For example, if updates occur only
|
||||
once per hour, then no explicit rate limiting is required,
|
||||
@@ -365,3 +374,26 @@ over a rather long period of time, but improvements are always welcome!
|
||||
and the compiler to freely reorder code into and out of RCU
|
||||
read-side critical sections. It is the responsibility of the
|
||||
RCU update-side primitives to deal with this.
|
||||
|
||||
17. Use CONFIG_PROVE_RCU, CONFIG_DEBUG_OBJECTS_RCU_HEAD, and
|
||||
the __rcu sparse checks to validate your RCU code. These
|
||||
can help find problems as follows:
|
||||
|
||||
CONFIG_PROVE_RCU: check that accesses to RCU-protected data
|
||||
structures are carried out under the proper RCU
|
||||
read-side critical section, while holding the right
|
||||
combination of locks, or whatever other conditions
|
||||
are appropriate.
|
||||
|
||||
CONFIG_DEBUG_OBJECTS_RCU_HEAD: check that you don't pass the
|
||||
same object to call_rcu() (or friends) before an RCU
|
||||
grace period has elapsed since the last time that you
|
||||
passed that same object to call_rcu() (or friends).
|
||||
|
||||
__rcu sparse checks: tag the pointer to the RCU-protected data
|
||||
structure with __rcu, and sparse will warn you if you
|
||||
access that pointer without the services of one of the
|
||||
variants of rcu_dereference().
|
||||
|
||||
These debugging aids can help you find problems that are
|
||||
otherwise extremely difficult to spot.
|
||||
|
||||
@@ -80,6 +80,24 @@ o A CPU looping with bottom halves disabled. This condition can
|
||||
o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel
|
||||
without invoking schedule().
|
||||
|
||||
o A CPU-bound real-time task in a CONFIG_PREEMPT kernel, which might
|
||||
happen to preempt a low-priority task in the middle of an RCU
|
||||
read-side critical section. This is especially damaging if
|
||||
that low-priority task is not permitted to run on any other CPU,
|
||||
in which case the next RCU grace period can never complete, which
|
||||
will eventually cause the system to run out of memory and hang.
|
||||
While the system is in the process of running itself out of
|
||||
memory, you might see stall-warning messages.
|
||||
|
||||
o A CPU-bound real-time task in a CONFIG_PREEMPT_RT kernel that
|
||||
is running at a higher priority than the RCU softirq threads.
|
||||
This will prevent RCU callbacks from ever being invoked,
|
||||
and in a CONFIG_TREE_PREEMPT_RCU kernel will further prevent
|
||||
RCU grace periods from ever completing. Either way, the
|
||||
system will eventually run out of memory and hang. In the
|
||||
CONFIG_TREE_PREEMPT_RCU case, you might see stall-warning
|
||||
messages.
|
||||
|
||||
o A bug in the RCU implementation.
|
||||
|
||||
o A hardware failure. This is quite unlikely, but has occurred
|
||||
|
||||
@@ -125,6 +125,17 @@ o "b" is the batch limit for this CPU. If more than this number
|
||||
of RCU callbacks is ready to invoke, then the remainder will
|
||||
be deferred.
|
||||
|
||||
o "ci" is the number of RCU callbacks that have been invoked for
|
||||
this CPU. Note that ci+ql is the number of callbacks that have
|
||||
been registered in absence of CPU-hotplug activity.
|
||||
|
||||
o "co" is the number of RCU callbacks that have been orphaned due to
|
||||
this CPU going offline.
|
||||
|
||||
o "ca" is the number of RCU callbacks that have been adopted due to
|
||||
other CPUs going offline. Note that ci+co-ca+ql is the number of
|
||||
RCU callbacks registered on this CPU.
|
||||
|
||||
There is also an rcu/rcudata.csv file with the same information in
|
||||
comma-separated-variable spreadsheet format.
|
||||
|
||||
@@ -180,7 +191,7 @@ o "s" is the "signaled" state that drives force_quiescent_state()'s
|
||||
|
||||
o "jfq" is the number of jiffies remaining for this grace period
|
||||
before force_quiescent_state() is invoked to help push things
|
||||
along. Note that CPUs in dyntick-idle mode thoughout the grace
|
||||
along. Note that CPUs in dyntick-idle mode throughout the grace
|
||||
period will not report on their own, but rather must be check by
|
||||
some other CPU via force_quiescent_state().
|
||||
|
||||
|
||||
Reference in New Issue
Block a user