mm: avoid false sharing of mm_counter
Considering the nature of per mm stats, it's the shared object among threads and can be a cache-miss point in the page fault path. This patch adds per-thread cache for mm_counter. RSS value will be counted into a struct in task_struct and synchronized with mm's one at events. Now, in this patch, the event is the number of calls to handle_mm_fault. Per-thread value is added to mm at each 64 calls. rough estimation with small benchmark on parallel thread (2threads) shows [before] 4.5 cache-miss/faults [after] 4.0 cache-miss/faults Anyway, the most contended object is mmap_sem if the number of threads grows. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This commit is contained in:

committed by
Linus Torvalds

parent
d559db086f
commit
34e55232e5
@@ -188,6 +188,12 @@ memory usage. Its seven fields are explained in Table 1-3. The stat file
|
||||
contains details information about the process itself. Its fields are
|
||||
explained in Table 1-4.
|
||||
|
||||
(for SMP CONFIG users)
|
||||
For making accounting scalable, RSS related information are handled in
|
||||
asynchronous manner and the vaule may not be very precise. To see a precise
|
||||
snapshot of a moment, you can see /proc/<pid>/smaps file and scan page table.
|
||||
It's slow but very precise.
|
||||
|
||||
Table 1-2: Contents of the statm files (as of 2.6.30-rc7)
|
||||
..............................................................................
|
||||
Field Content
|
||||
|
Reference in New Issue
Block a user