kernel.h: Add non_block_start/end()

In some special cases we must not block, but there's not a spinlock,
preempt-off, irqs-off or similar critical section already that arms the
might_sleep() debug checks. Add a non_block_start/end() pair to annotate
these.

This will be used in the oom paths of mmu-notifiers, where blocking is not
allowed to make sure there's forward progress. Quoting Michal:

"The notifier is called from quite a restricted context - oom_reaper -
which shouldn't depend on any locks or sleepable conditionals. The code
should be swift as well but we mostly do care about it to make a forward
progress. Checking for sleepable context is the best thing we could come
up with that would describe these demands at least partially."

Peter also asked whether we want to catch spinlocks on top, but Michal
said those are less of a problem because spinlocks can't have an indirect
dependency upon the page allocator and hence close the loop with the oom
reaper.

Suggested by Michal Hocko.

Link: https://lore.kernel.org/r/20190826201425.17547-4-daniel.vetter@ffwll.ch
Acked-by: Christian König <christian.koenig@amd.com> (v1)
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This commit is contained in:
Daniel Vetter
2019-08-26 22:14:23 +02:00
committed by Jason Gunthorpe
parent f2bc09e951
commit 312364f353
3 changed files with 40 additions and 6 deletions

View File

@@ -3700,13 +3700,22 @@ static noinline void __schedule_bug(struct task_struct *prev)
/*
* Various schedule()-time debugging checks and statistics:
*/
static inline void schedule_debug(struct task_struct *prev)
static inline void schedule_debug(struct task_struct *prev, bool preempt)
{
#ifdef CONFIG_SCHED_STACK_END_CHECK
if (task_stack_end_corrupted(prev))
panic("corrupted stack end detected inside scheduler\n");
#endif
#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
if (!preempt && prev->state && prev->non_block_count) {
printk(KERN_ERR "BUG: scheduling in a non-blocking section: %s/%d/%i\n",
prev->comm, prev->pid, prev->non_block_count);
dump_stack();
add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
}
#endif
if (unlikely(in_atomic_preempt_off())) {
__schedule_bug(prev);
preempt_count_set(PREEMPT_DISABLED);
@@ -3813,7 +3822,7 @@ static void __sched notrace __schedule(bool preempt)
rq = cpu_rq(cpu);
prev = rq->curr;
schedule_debug(prev);
schedule_debug(prev, preempt);
if (sched_feat(HRTICK))
hrtick_clear(rq);
@@ -6570,7 +6579,7 @@ void ___might_sleep(const char *file, int line, int preempt_offset)
rcu_sleep_check();
if ((preempt_count_equals(preempt_offset) && !irqs_disabled() &&
!is_idle_task(current)) ||
!is_idle_task(current) && !current->non_block_count) ||
system_state == SYSTEM_BOOTING || system_state > SYSTEM_RUNNING ||
oops_in_progress)
return;
@@ -6586,8 +6595,8 @@ void ___might_sleep(const char *file, int line, int preempt_offset)
"BUG: sleeping function called from invalid context at %s:%d\n",
file, line);
printk(KERN_ERR
"in_atomic(): %d, irqs_disabled(): %d, pid: %d, name: %s\n",
in_atomic(), irqs_disabled(),
"in_atomic(): %d, irqs_disabled(): %d, non_block: %d, pid: %d, name: %s\n",
in_atomic(), irqs_disabled(), current->non_block_count,
current->pid, current->comm);
if (task_stack_end_corrupted(current))