mm, thp: add new defer+madvise defrag option
There is no thp defrag option that currently allows MADV_HUGEPAGE regions to do direct compaction and reclaim while all other thp allocations simply trigger kswapd and kcompactd in the background and fail immediately.

The "defer" setting simply triggers background reclaim and compaction for all regions, regardless of MADV_HUGEPAGE, which makes it unusable for our userspace, where MADV_HUGEPAGE is used to indicate that the application is willing to wait for the work needed to make thp memory available.

The "madvise" setting will do direct compaction and reclaim for these MADV_HUGEPAGE regions, but does not trigger kswapd and kcompactd in the background for anybody else.

For reasonable usage, there needs to be a mesh between the two options. This patch introduces a fifth mode, "defer+madvise", that will do direct reclaim and compaction for MADV_HUGEPAGE regions and trigger background reclaim and compaction for everybody else so that hugepages may be available in the near future.

A proposal to allow direct reclaim and compaction for MADV_HUGEPAGE regions as part of the "defer" mode, which would make it a very powerful setting and avoid breaking userspace, was offered: http://marc.info/?t=148236612700003 This additional mode is a compromise.

A second proposal to allow both "defer" and "madvise" to be selected at the same time was also offered: http://marc.info/?t=148357345300001. This is possible, but there was a concern that it might break existing userspaces that parse the output of the defrag mode, so the fifth option was introduced instead.

This patch also cleans up the helper function for storing to "enabled" and "defrag", since the former supports three modes while the latter supports five and triple_flag_store() was getting unnecessarily messy.
Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1701101614330.41805@chino.kir.corp.google.com
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
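
The behaviour of the five modes described in the message can be summarised with a small illustrative shell function. This is only a sketch, not kernel code: the name thp_fault_action and the labels "direct", "background" and "none" are made up for illustration, and the "never" row is an assumption taken from the existing documentation rather than from this commit message.

```shell
# Hypothetical helper (not part of the kernel or of this patch).
# $1: defrag mode, $2: "yes" if the region used madvise(MADV_HUGEPAGE).
# Prints what a thp allocation that cannot be satisfied immediately does:
#   direct     - enter direct reclaim and compaction
#   background - wake kswapd and kcompactd, then fail immediately
#   none       - fail immediately without waking anything
thp_fault_action() {
    mode=$1
    madvised=$2
    case "$mode" in
        always)
            echo direct ;;
        defer)
            echo background ;;
        defer+madvise)
            if [ "$madvised" = yes ]; then echo direct; else echo background; fi ;;
        madvise)
            if [ "$madvised" = yes ]; then echo direct; else echo none; fi ;;
        never)
            # Assumption: "never" neither stalls nor wakes background threads.
            echo none ;;
        *)
            echo "unknown mode: $mode" >&2
            return 1 ;;
    esac
}
```

For example, thp_fault_action defer+madvise no prints "background", while the same call with "madvise" as the mode prints "none", which is the gap the new mode is meant to close.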
committed by Linus Torvalds
parent ba81f83842
commit 21440d7eb9
@@ -110,6 +110,7 @@ MADV_HUGEPAGE region.
 
 echo always >/sys/kernel/mm/transparent_hugepage/defrag
 echo defer >/sys/kernel/mm/transparent_hugepage/defrag
+echo defer+madvise >/sys/kernel/mm/transparent_hugepage/defrag
 echo madvise >/sys/kernel/mm/transparent_hugepage/defrag
 echo never >/sys/kernel/mm/transparent_hugepage/defrag
 
@@ -120,10 +121,15 @@ that benefit heavily from THP use and are willing to delay the VM start
 to utilise them.
 
 "defer" means that an application will wake kswapd in the background
-to reclaim pages and wake kcompact to compact memory so that THP is
+to reclaim pages and wake kcompactd to compact memory so that THP is
 available in the near future. It's the responsibility of khugepaged
 to then install the THP pages later.
 
+"defer+madvise" will enter direct reclaim and compaction like "always", but
+only for regions that have used madvise(MADV_HUGEPAGE); all other regions
+will wake kswapd in the background to reclaim pages and wake kcompactd to
+compact memory so that THP is available in the near future.
+
 "madvise" will enter direct reclaim like "always" but only for regions
 that are have used madvise(MADV_HUGEPAGE). This is the default behaviour.
 
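
Since the concern about userspace that parses the defrag file motivated adding a fifth keyword, a minimal sketch of such a parser may be useful. It assumes the sysfs file lists every mode on one line with the active one in square brackets (for example "always defer defer+madvise [madvise] never"). That format is not shown in this patch, and the function name active_thp_mode is hypothetical.

```shell
# Hypothetical helper: print the bracketed (active) mode from a line such as
# "always defer defer+madvise [madvise] never". Prints nothing if no word
# is wrapped in brackets.
active_thp_mode() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Only touch sysfs when the file exists, so the sketch is harmless on
# kernels built without transparent hugepage support.
defrag_file=/sys/kernel/mm/transparent_hugepage/defrag
if [ -r "$defrag_file" ]; then
    active_thp_mode "$(cat "$defrag_file")"
fi
```

A parser written this way does not depend on how many modes are listed, so it keeps working when "defer+madvise" appears as an additional word.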