mm: make swapin readahead skip over holes

Ever since abandoning the virtual scan of processes, for scalability
reasons, swap space has been a little more fragmented than before.  This
can lead to the situation where a large memory user is killed, swap space
ends up full of "holes" and swapin readahead is totally ineffective.

On my home system, after killing a leaky firefox it took over an hour to
page just under 2GB of memory back in, slowing the virtual machines down
to a crawl.

This patch makes swapin readahead simply skip over holes, instead of
stopping at them.  This allows the system to swap things back in at rates
of several MB/second, instead of a few hundred kB/second.
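
The difference between stopping at a hole and skipping over it can be sketched in userspace. This is a toy model, not kernel code: the `in_use` array and the `readahead_count` helper are hypothetical names made up for illustration, and the "old" branch simplifies what valid_swaphandles did to "give up at the first hole".

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model, not kernel code: in_use[i] is true when swap slot i holds
 * data and false when it is a hole.  Returns how many slots in the
 * inclusive range [start, end] a readahead loop would read.
 */
static size_t readahead_count(const bool *in_use, size_t start, size_t end,
			      bool skip_holes)
{
	size_t nread = 0;

	for (size_t i = start; i <= end; i++) {
		if (!in_use[i]) {
			if (skip_holes)
				continue;	/* patched behaviour: step over the hole */
			break;			/* simplified old behaviour: stop here */
		}
		nread++;
	}
	return nread;
}
```

With a slot map of { used, hole, used, used, hole, used, used, used }, the stopping variant reads one slot out of the window and the skipping variant reads six, which is the effect the patch relies on when swap is full of holes.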

The checks done in valid_swaphandles are already done in
read_swap_cache_async as well, allowing us to remove a fair amount of
code.

[akpm@linux-foundation.org: fix it for page_cluster >= 32]
Signed-off-by: Rik van Riel <riel@redhat.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Adrian Drzewiecki <z@drze.net>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Author: Rik van Riel
Date: 2012-03-21 16:33:50 -07:00
Committed by: Linus Torvalds
Parent: c38446cc65
Commit: 67f96aa252
3 changed files with 11 additions and 66 deletions

@@ -372,25 +372,23 @@ struct page *read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 struct page *swapin_readahead(swp_entry_t entry, gfp_t gfp_mask,
 			struct vm_area_struct *vma, unsigned long addr)
 {
-	int nr_pages;
 	struct page *page;
-	unsigned long offset;
-	unsigned long end_offset;
+	unsigned long offset = swp_offset(entry);
+	unsigned long start_offset, end_offset;
+	unsigned long mask = (1UL << page_cluster) - 1;
 
-	/*
-	 * Get starting offset for readaround, and number of pages to read.
-	 * Adjust starting address by readbehind (for NUMA interleave case)?
-	 * No, it's very unlikely that swap layout would follow vma layout,
-	 * more likely that neighbouring swap pages came from the same node:
-	 * so use the same "addr" to choose the same node for each swap read.
-	 */
-	nr_pages = valid_swaphandles(entry, &offset);
-	for (end_offset = offset + nr_pages; offset < end_offset; offset++) {
+	/* Read a page_cluster sized and aligned cluster around offset. */
+	start_offset = offset & ~mask;
+	end_offset = offset | mask;
+	if (!start_offset)	/* First page is swap header. */
+		start_offset++;
+
+	for (offset = start_offset; offset <= end_offset ; offset++) {
 		/* Ok, do the async read-ahead now */
 		page = read_swap_cache_async(swp_entry(swp_type(entry), offset),
 						gfp_mask, vma, addr);
 		if (!page)
-			break;
+			continue;
 		page_cache_release(page);
 	}
 	lru_add_drain();	/* Push any new pages onto the LRU now */
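
The window arithmetic in the hunk above can be tried in isolation. The following is a minimal userspace sketch, assuming page_cluster is smaller than the bit width of unsigned long; `struct ra_window` and `readahead_window` are hypothetical names used only for illustration and do not exist in the kernel.

```c
/* Inclusive range of swap offsets covered by one readahead cluster. */
struct ra_window {
	unsigned long start;
	unsigned long end;
};

/*
 * Compute the page_cluster sized and aligned window around offset,
 * mirroring the arithmetic in the patched swapin_readahead().
 * Assumes page_cluster < number of bits in unsigned long.
 */
static struct ra_window readahead_window(unsigned long offset,
					 unsigned int page_cluster)
{
	/* 1UL keeps the shift in unsigned long width rather than int. */
	unsigned long mask = (1UL << page_cluster) - 1;
	struct ra_window w;

	w.start = offset & ~mask;	/* round down to the cluster boundary */
	w.end = offset | mask;		/* last offset inside the same cluster */
	if (!w.start)			/* offset 0 is the swap header, skip it */
		w.start++;
	return w;
}
```

For example, with page_cluster = 3 the mask is 7, so a fault at offset 13 yields the window 8..15, and a fault at offset 5 yields 1..7 because offset 0 holds the swap header. The window depends only on the faulting offset, not on which neighbouring slots are in use, which is why the loop can leave the per-slot validity checks to read_swap_cache_async.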