sched/numa: Skip some page migrations after a shared fault

Shared faults can lead to lots of unnecessary page migrations,
slowing down the system, and causing private faults to hit the
per-pgdat migration ratelimit.

This patch adds the sysctl numa_balancing_migrate_deferred, which specifies
how many further shared page migrations to skip unconditionally after a
page migration has been skipped because the fault was a shared fault.
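
The deferral described above can be sketched in user-space C. This is an
illustration only, not the code added by this patch: struct task_sketch,
sketch_should_migrate() and sketch_migrate_deferred are hypothetical names,
and the sketch assumes that private faults are never deferred.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the new sysctl; 16 is the default in this patch. */
static unsigned int sketch_migrate_deferred = 16;

/* Hypothetical per-task state: shared migrations still to be skipped. */
struct task_sketch {
	unsigned int deferred_left;
};

/*
 * Return true if the faulting page should be migrated.
 *
 * private_fault:       the page was last accessed by this same task
 * policy_skips_shared: the placement policy declines to migrate this page
 *                      because the fault is shared
 */
static bool sketch_should_migrate(struct task_sketch *t, bool private_fault,
				  bool policy_skips_shared)
{
	/* Assumption: private faults are never deferred. */
	if (private_fault)
		return true;

	/* Inside a deferral window: skip unconditionally and count down. */
	if (t->deferred_left > 0) {
		t->deferred_left--;
		return false;
	}

	/* A migration skipped as a shared fault opens a new window. */
	if (policy_skips_shared) {
		t->deferred_left = sketch_migrate_deferred;
		return false;
	}

	return true;
}
```

With the value set to N, one skipped shared migration is followed by N more
skipped shared migrations before the policy is consulted again.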

This reduces the number of page migrations back and forth in
shared fault situations. It also gives a strong preference to
the tasks that are already running where most of the memory is,
and to moving the other tasks to near the memory.

Testing this with a much higher scan rate than the default
still seems to result in fewer page migrations than before.

Memory seems to be somewhat better consolidated than previously,
with multi-instance specjbb runs on a 4 node system.

Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1381141781-10992-62-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Rik van Riel
2013-10-07 11:29:39 +01:00
committed by Ingo Molnar
parent 1e3646ffc6
commit de1c9ce6f0
5 changed files: 75 additions and 3 deletions


@@ -833,6 +833,14 @@ unsigned int sysctl_numa_balancing_scan_size = 256;
/* Scan @scan_size MB every @scan_period after an initial @scan_delay in ms */
unsigned int sysctl_numa_balancing_scan_delay = 1000;
/*
 * After skipping a page migration on a shared page, skip N more numa page
 * migrations unconditionally. This reduces the number of NUMA migrations
 * in shared memory workloads, and has the effect of pulling tasks towards
 * where their memory lives, over pulling the memory towards the task.
 */
unsigned int sysctl_numa_balancing_migrate_deferred = 16;

static unsigned int task_nr_scan_windows(struct task_struct *p)
{
	unsigned long rss = 0;