rcu/nocb: Add bypass callback queueing

Use of the rcu_data structure's segmented ->cblist for no-CBs CPUs
takes advantage of unrelated grace periods, thus reducing the memory
footprint in the face of floods of call_rcu() invocations.  However,
the ->cblist field is a more-complex rcu_segcblist structure which must
be protected via locking.  Even though there are only three entities
which can acquire this lock (the CPU invoking call_rcu(), the no-CBs
grace-period kthread, and the no-CBs callbacks kthread), the contention
on this lock is excessive under heavy stress.

This commit therefore greatly reduces contention by provisioning
an rcu_cblist structure field named ->nocb_bypass within the
rcu_data structure.  Each no-CBs CPU is permitted only a limited
number of enqueues onto the ->cblist per jiffy, controlled by a new
nocb_nobypass_lim_per_jiffy kernel boot parameter that defaults to
about 16 enqueues per millisecond (16 * 1000 / HZ).  When that limit is
exceeded, the CPU instead enqueues onto the new ->nocb_bypass.
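
For illustration only (not part of this patch), the rate-limit check can be
modeled by the following self-contained userspace sketch.  It captures just
the policy stated above; the nocb_nobypass_last and nocb_nobypass_count
names and the struct are stand-ins for per-CPU rcu_data state, and all
locking is omitted.

#include <stdbool.h>

#define HZ 1000	/* assumed value for the sketch */
static int nocb_nobypass_lim_per_jiffy = 16 * 1000 / HZ;

struct nocb_state {				/* stand-in for per-CPU state */
	unsigned long nocb_nobypass_last;	/* jiffy of current rate window */
	int nocb_nobypass_count;		/* ->cblist enqueues this jiffy */
};

/* Return true if this call_rcu() should enqueue onto ->nocb_bypass. */
static bool nocb_use_bypass(struct nocb_state *st, unsigned long jiffies)
{
	if (st->nocb_nobypass_last != jiffies) {
		st->nocb_nobypass_last = jiffies;	/* new jiffy: reset window */
		st->nocb_nobypass_count = 0;
	}
	return ++st->nocb_nobypass_count > nocb_nobypass_lim_per_jiffy;
}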

The ->nocb_bypass is flushed into the ->cblist every jiffy or when
the number of callbacks on ->nocb_bypass exceeds qhimark, whichever
happens first.  During call_rcu() floods, this flushing is carried out
by the CPU during the course of its call_rcu() invocations.  However,
a CPU could simply stop invoking call_rcu() at any time.  The no-CBs
grace-period kthread therefore carries out less-aggressive flushing
(every few jiffies or when the number of callbacks on ->nocb_bypass
exceeds (2 * qhimark), whichever comes first).  This means that the
no-CBs grace-period kthread cannot be permitted to do unbounded waits
while there are callbacks on ->nocb_bypass.  A ->nocb_bypass_timer is
used to provide the needed wakeups.
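
The two flush triggers reduce to simple predicates, sketched below under
the same caveats: the parameter names and the three-jiffy stand-in for
"every few jiffies" are assumptions for illustration, not the patch's code.

#include <stdbool.h>

/* call_rcu() path: flush if a jiffy has passed since the last flush
 * or the bypass list has reached qhimark callbacks. */
static bool caller_should_flush(unsigned long jiffies, unsigned long last_flush,
				long bypass_ncbs, long qhimark)
{
	return jiffies - last_flush >= 1 || bypass_ncbs >= qhimark;
}

/* no-CBs grace-period kthread: less aggressive, flushing only every
 * few jiffies (3 here, an assumed value) or at 2 * qhimark callbacks. */
static bool gp_kthread_should_flush(unsigned long jiffies, unsigned long last_flush,
				    long bypass_ncbs, long qhimark)
{
	return jiffies - last_flush >= 3 || bypass_ncbs >= 2 * qhimark;
}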

[ paulmck: Apply Coverity feedback reported by Colin Ian King. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
commit d1b222c6be
parent eda669a6a2
Author: Paul E. McKenney <paulmck@linux.ibm.com>
Date:   2019-07-02 16:03:33 -07:00
5 changed files with 396 additions and 42 deletions

kernel/rcu/rcu_segcblist.c
@@ -36,6 +36,36 @@ void rcu_cblist_enqueue(struct rcu_cblist *rclp, struct rcu_head *rhp)
 	WRITE_ONCE(rclp->len, rclp->len + 1);
 }
 
+/*
+ * Flush the second rcu_cblist structure onto the first one, obliterating
+ * any contents of the first.  If rhp is non-NULL, enqueue it as the sole
+ * element of the second rcu_cblist structure, but ensuring that the second
+ * rcu_cblist structure, if initially non-empty, always appears non-empty
+ * throughout the process.  If rhp is NULL, the second rcu_cblist structure
+ * is instead initialized to empty.
+ */
+void rcu_cblist_flush_enqueue(struct rcu_cblist *drclp,
+			      struct rcu_cblist *srclp,
+			      struct rcu_head *rhp)
+{
+	drclp->head = srclp->head;
+	if (drclp->head)
+		drclp->tail = srclp->tail;
+	else
+		drclp->tail = &drclp->head;
+	drclp->len = srclp->len;
+	drclp->len_lazy = srclp->len_lazy;
+	if (!rhp) {
+		rcu_cblist_init(srclp);
+	} else {
+		rhp->next = NULL;
+		srclp->head = rhp;
+		srclp->tail = &rhp->next;
+		WRITE_ONCE(srclp->len, 1);
+		srclp->len_lazy = 0;
+	}
+}
+
 /*
  * Dequeue the oldest rcu_head structure from the specified callback
  * list.  This function assumes that the callback is non-lazy, but
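
As a concrete illustration of rcu_cblist_flush_enqueue()'s semantics, the
following self-contained userspace harness (a demo, not kernel code:
WRITE_ONCE is stubbed and the structure layouts are reproduced only as far
as the hunk above uses them) drains a two-element list while leaving a
newly arrived callback as the source list's sole element, so a non-empty
source never appears empty.

#include <stdio.h>
#include <stddef.h>

#define WRITE_ONCE(x, v) ((x) = (v))	/* userspace stand-in for the kernel macro */

struct rcu_head {
	struct rcu_head *next;
};

struct rcu_cblist {
	struct rcu_head *head;
	struct rcu_head **tail;
	long len;
	long len_lazy;
};

static void rcu_cblist_init(struct rcu_cblist *rclp)
{
	rclp->head = NULL;
	rclp->tail = &rclp->head;
	rclp->len = 0;
	rclp->len_lazy = 0;
}

static void rcu_cblist_enqueue(struct rcu_cblist *rclp, struct rcu_head *rhp)
{
	*rclp->tail = rhp;
	rclp->tail = &rhp->next;
	WRITE_ONCE(rclp->len, rclp->len + 1);
}

/* Body as in the hunk above. */
static void rcu_cblist_flush_enqueue(struct rcu_cblist *drclp,
				     struct rcu_cblist *srclp,
				     struct rcu_head *rhp)
{
	drclp->head = srclp->head;
	if (drclp->head)
		drclp->tail = srclp->tail;
	else
		drclp->tail = &drclp->head;
	drclp->len = srclp->len;
	drclp->len_lazy = srclp->len_lazy;
	if (!rhp) {
		rcu_cblist_init(srclp);
	} else {
		rhp->next = NULL;
		srclp->head = rhp;
		srclp->tail = &rhp->next;
		WRITE_ONCE(srclp->len, 1);
		srclp->len_lazy = 0;
	}
}

int main(void)
{
	struct rcu_cblist bypass, drain;
	struct rcu_head cb1 = { NULL }, cb2 = { NULL }, cb3 = { NULL };

	rcu_cblist_init(&bypass);
	rcu_cblist_init(&drain);
	rcu_cblist_enqueue(&bypass, &cb1);
	rcu_cblist_enqueue(&bypass, &cb2);

	/* Move cb1 and cb2 onto "drain" while making cb3 the sole
	 * element of "bypass", which thus never appears empty. */
	rcu_cblist_flush_enqueue(&drain, &bypass, &cb3);
	printf("drain: %ld CBs, bypass: %ld CB\n", drain.len, bypass.len);
	/* Prints "drain: 2 CBs, bypass: 1 CB". */
	return 0;
}

This mirrors how a bypass flush can move the accumulated ->nocb_bypass
callbacks toward the ->cblist while the just-arrived callback stays behind
as the bypass list's sole element, keeping it visibly non-empty.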