android_kernel_xiaomi_sm8450/include/linux
KAMEZAWA Hiroyuki 89c06bd52f memcg: use new logic for page stat accounting
Currently, per-memcg page statistics are recorded in a per-page_cgroup flag
by duplicating the page's status into that flag.  The reason is that memcg
can move a page from one group to another, and there is a race between
"move" and "page stat accounting".

Under current logic, assume CPU-A and CPU-B.  CPU-A does "move" and CPU-B
does "page stat accounting".

When CPU-A goes 1st,

            CPU-A                           CPU-B
                                    update "struct page" info.
    move_lock_mem_cgroup(memcg)
    see pc->flags
    copy page stat to new group
    overwrite pc->mem_cgroup.
    move_unlock_mem_cgroup(memcg)
                                    move_lock_mem_cgroup(mem)
                                    set pc->flags
                                    update page stat accounting
                                    move_unlock_mem_cgroup(mem)

Stat accounting is guarded by move_lock_mem_cgroup(), and the "move" logic
(CPU-A) doesn't see changes in the "struct page" information.

But it's costly to keep the same information in both 'struct page' and
'struct page_cgroup'.  And there is a potential problem.

For example, assume we have PG_dirty accounting in memcg.
PG_xxx is a flag for struct page.
PCG_xxx is a flag for struct page_cgroup.
(This is just an example.  The same problem can be found in any
 kind of page stat accounting.)

      CPU-A                          CPU-B
      TestSet PG_dirty
      (delay)                        TestClear PG_dirty
                                     if (TestClear(PCG_dirty))
                                          memcg->nr_dirty--
      if (TestSet(PCG_dirty))
          memcg->nr_dirty++

Here, memcg->nr_dirty ends up at +1 although the page is clean, which is
wrong.  This race was reported by Greg Thelen <gthelen@google.com>.
Currently only FILE_MAPPED is supported and, fortunately, it is serialized
by the page table lock, so this is not a real bug _now_.
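
The interleaving above can be replayed deterministically in userspace.
The following is only a sketch: the sim_* types and the
set_if_clear()/clear_if_set() helpers are hypothetical stand-ins, not
kernel code, and they read the diagram's "if (TestSet(...))" as "if this
call changed the flag".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel objects named in the text. */
struct sim_page  { bool pg_dirty;  };  /* models PG_dirty in struct page */
struct sim_pc    { bool pcg_dirty; };  /* models PCG_dirty in struct page_cgroup */
struct sim_memcg { long nr_dirty;  };

/* Set the flag; return true only if this call changed it. */
static bool set_if_clear(bool *flag)
{
    bool changed = !*flag;
    *flag = true;
    return changed;
}

/* Clear the flag; return true only if this call changed it. */
static bool clear_if_set(bool *flag)
{
    bool changed = *flag;
    *flag = false;
    return changed;
}

/* Replays the CPU-A / CPU-B steps in the order shown in the diagram. */
long replay_duplicated_flag_race(void)
{
    struct sim_page page = { false };
    struct sim_pc pc = { false };
    struct sim_memcg memcg = { 0 };

    set_if_clear(&page.pg_dirty);     /* CPU-A: TestSet PG_dirty, then delayed */
    clear_if_set(&page.pg_dirty);     /* CPU-B: TestClear PG_dirty */
    if (clear_if_set(&pc.pcg_dirty))  /* CPU-B: PCG_dirty not set yet, no decrement */
        memcg.nr_dirty--;
    if (set_if_clear(&pc.pcg_dirty))  /* CPU-A resumes and sets the stale copy */
        memcg.nr_dirty++;

    assert(!page.pg_dirty);           /* the page itself is clean again */
    return memcg.nr_dirty;            /* but the counter still says one dirty page */
}
```

The counter and the page flag disagree at the end because the two flags
are updated in separate steps with no lock covering both.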

If this potential problem is caused by having duplicated information in
struct page and struct page_cgroup, we may be able to fix it by using the
original 'struct page' information.  But then we have a problem in "move
account".

Assume we use only PG_dirty.

         CPU-A                   CPU-B
    TestSet PG_dirty
    (delay)                    move_lock_mem_cgroup()
                               if (PageDirty(page))
                                      new_memcg->nr_dirty++
                               pc->mem_cgroup = new_memcg;
                               move_unlock_mem_cgroup()
    move_lock_mem_cgroup()
    memcg = pc->mem_cgroup
    new_memcg->nr_dirty++

Accounting information may be double-counted.  This was the original reason
to have PCG_xxx flags, but it seems PCG_xxx has another problem.
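
The double count can be reproduced with a single-threaded replay of the
diagram.  This is a sketch under stated assumptions: the sim_* types are
hypothetical, the move lock is omitted because the steps run in a fixed
order, and only the operations shown in the diagram are modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; "owner" models pc->mem_cgroup. */
struct sim_memcg { long nr_dirty; };
struct sim_page  { bool pg_dirty; struct sim_memcg *owner; };

/* Replays the CPU-A / CPU-B steps in the order shown in the diagram. */
long replay_move_double_count(void)
{
    struct sim_memcg old_memcg = { 0 }, new_memcg = { 0 };
    struct sim_page page = { false, &old_memcg };
    struct sim_memcg *memcg;

    page.pg_dirty = true;        /* CPU-A: TestSet PG_dirty, then delayed */

    /* CPU-B: "move", done under the move lock in the diagram */
    if (page.pg_dirty)
        new_memcg.nr_dirty++;    /* the mover already counts the dirty page */
    page.owner = &new_memcg;     /* pc->mem_cgroup = new_memcg */

    /* CPU-A resumes and accounts the same dirty page again */
    memcg = page.owner;          /* memcg = pc->mem_cgroup */
    assert(memcg == &new_memcg); /* it now sees the new owner */
    memcg->nr_dirty++;

    return new_memcg.nr_dirty;   /* one dirty page, counted twice */
}
```

Both CPUs take the lock correctly here; the problem is that the flag
update on CPU-A happens outside the locked region.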

I think we need a bigger lock, as in:

     move_lock_mem_cgroup(page)
     TestSetPageDirty(page)
     update page stats (without any checks)
     move_unlock_mem_cgroup(page)

This fixes both problems, and we don't have to duplicate the page flag into
page_cgroup.  Please note: move_lock_mem_cgroup() is held only when there
is a possibility of "account move" in the system.  So, in most paths,
status updates will go without atomic locks.

This patch introduces mem_cgroup_begin_update_page_stat() and
mem_cgroup_end_update_page_stat().  Both should be called when modifying
'struct page' information that memcg accounts, as follows:

     mem_cgroup_begin_update_page_stat()
     modify page information
     mem_cgroup_update_page_stat()
     => never check any 'struct page' info, just update counters.
     mem_cgroup_end_update_page_stat().
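
The begin/update/end sequence above can be modelled in userspace roughly
as follows.  This is a minimal sketch, not the patch's implementation:
every sim_* name is hypothetical, a single C11 atomic_flag spinlock stands
in for move_lock_mem_cgroup(), and the lock is always taken, so the
"only when an account move is possible" fast path is not modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins; "owner" models pc->mem_cgroup. */
struct sim_memcg { long nr_dirty; };
struct sim_page  { bool pg_dirty; struct sim_memcg *owner; };

/* One global spinlock standing in for the memcg move lock. */
static atomic_flag sim_move_lock = ATOMIC_FLAG_INIT;

static void sim_begin_update_page_stat(void)
{
    while (atomic_flag_test_and_set_explicit(&sim_move_lock,
                                             memory_order_acquire))
        ; /* spin until the lock is free */
}

static void sim_end_update_page_stat(void)
{
    atomic_flag_clear_explicit(&sim_move_lock, memory_order_release);
}

/* Never checks any page flag; just adjusts the owner's counter. */
static void sim_update_page_stat(struct sim_page *page, long delta)
{
    page->owner->nr_dirty += delta;
}

/* The flag change and the counter update sit inside one locked region. */
void sim_set_page_dirty(struct sim_page *page)
{
    sim_begin_update_page_stat();
    if (!page->pg_dirty) {
        page->pg_dirty = true;
        sim_update_page_stat(page, 1);
    }
    sim_end_update_page_stat();
}

void sim_clear_page_dirty(struct sim_page *page)
{
    sim_begin_update_page_stat();
    if (page->pg_dirty) {
        page->pg_dirty = false;
        sim_update_page_stat(page, -1);
    }
    sim_end_update_page_stat();
}

/* "Move" takes the same lock, so it sees flag and counter in step. */
void sim_move_account(struct sim_page *page, struct sim_memcg *to)
{
    assert(to);
    sim_begin_update_page_stat();
    if (page->pg_dirty) {
        page->owner->nr_dirty--;
        to->nr_dirty++;
    }
    page->owner = to;
    sim_end_update_page_stat();
}
```

Because the mover and the updaters serialize on the same lock, a page
that is dirtied, moved, and then cleaned leaves both groups' counters at
zero.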

This patch is slow because we need to call begin_update_page_stat()/
end_update_page_stat() regardless of whether the accounted state will
change.  A following patch adds an easy optimization and reduces the cost.

[akpm@linux-foundation.org: s/lock/locked/]
[hughd@google.com: fix deadlock by avoiding stat lock when anon]
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Greg Thelen <gthelen@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Ying Han <yinghan@google.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-21 17:55:01 -07:00