rcu: Reduce overhead of cond_resched() checks for RCU

Commit ac1bea8578 (Make cond_resched() report RCU quiescent states)
fixed a problem where a CPU looping in the kernel with but one runnable
task would give RCU CPU stall warnings, even if the in-kernel loop
contained cond_resched() calls.  Unfortunately, in so doing, it introduced
performance regressions in Anton Blanchard's will-it-scale "open1" test.
The problem appears to be not so much the increased cond_resched() path
length as an increase in the rate at which grace periods complete, which
increased per-update grace-period overhead.

This commit takes a different approach to fixing this bug, mainly by
moving the RCU-visible quiescent state from cond_resched() to
rcu_note_context_switch(), and by further reducing the check to a
simple non-zero test of a single per-CPU variable.  However, this
approach requires that the force-quiescent-state processing send
resched IPIs to the offending CPUs.  These will be sent only once
the grace period has reached an age specified by the boot/sysfs
parameter rcutree.jiffies_till_sched_qs, or once the grace period
reaches an age halfway to the point at which RCU CPU stall warnings
will be emitted, whichever comes first.

Reported-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Christoph Lameter <cl@gentwo.org>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
[ paulmck: Made rcu_momentary_dyntick_idle() as suggested by the
  ktest build robot.  Also fixed smp_mb() comment as noted by
  Oleg Nesterov. ]

Merge with e552592e (Reduce overhead of cond_resched() checks for RCU)

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
This commit is contained in:
Paul E. McKenney
2014-06-20 16:49:01 -07:00
parent 546a9d8519
commit 4a81e8328d
7 changed files with 125 additions and 90 deletions

View File

@@ -44,7 +44,6 @@
#include <linux/debugobjects.h>
#include <linux/bug.h>
#include <linux/compiler.h>
#include <linux/percpu.h>
#include <asm/barrier.h>
extern int rcu_expedited; /* for sysctl */
@@ -299,41 +298,6 @@ static inline void rcu_user_hooks_switch(struct task_struct *prev,
bool __rcu_is_watching(void);
#endif /* #if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_RCU_TRACE) || defined(CONFIG_SMP) */
/*
* Hooks for cond_resched() and friends to avoid RCU CPU stall warnings.
*/
#define RCU_COND_RESCHED_LIM 256 /* ms vs. 100s of ms. */
DECLARE_PER_CPU(int, rcu_cond_resched_count);
void rcu_resched(void);
/*
* Is it time to report RCU quiescent states?
*
* Note unsynchronized access to rcu_cond_resched_count. Yes, we might
* increment some random CPU's count, and possibly also load the result from
* yet another CPU's count. We might even clobber some other CPU's attempt
* to zero its counter. This is all OK because the goal is not precision,
* but rather reasonable amortization of rcu_note_context_switch() overhead
* and extremely high probability of avoiding RCU CPU stall warnings.
* Note that this function has to be preempted in just the wrong place,
* many thousands of times in a row, for anything bad to happen.
*/
static inline bool rcu_should_resched(void)
{
return raw_cpu_inc_return(rcu_cond_resched_count) >=
RCU_COND_RESCHED_LIM;
}
/*
* Report quiscent states to RCU if it is time to do so.
*/
static inline void rcu_cond_resched(void)
{
if (unlikely(rcu_should_resched()))
rcu_resched();
}
/*
* Infrastructure to implement the synchronize_() primitives in
* TREE_RCU and rcu_barrier_() primitives in TINY_RCU.