android_kernel_xiaomi_sm8450/include/linux
Naoya Horiguchi e66f17ff71 mm/hugetlb: take page table lock in follow_huge_pmd()
There is a race between move_pages() and the freeing of hugepages:
move_pages() calls follow_page(FOLL_GET) on hugepages internally and
tries to take a reference without preventing a concurrent free.  This
race crashes the kernel, so this patch fixes it by moving the FOLL_GET
code for hugepages into follow_huge_pmd(), where it runs under the page
table lock.
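
The locking idea can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel code: `model_page`, `model_pmd`, `model_follow_huge_pmd()` and `model_unmap()` are hypothetical names, a pthread mutex stands in for the page table lock, and a plain counter stands in for the page refcount. The point it shows is that the lookup checks the entry and takes its reference while holding the same lock that the unmap path holds when it clears the entry.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; none of these are kernel APIs. */
struct model_page {
        int refcount;                   /* models the page reference count */
};

struct model_pmd {
        pthread_mutex_t ptl;            /* models the page table lock */
        struct model_page *page;        /* NULL once the mapping is gone */
};

/* Lookup path: inspect the entry and take the reference under the lock. */
static struct model_page *model_follow_huge_pmd(struct model_pmd *pmd, bool get)
{
        struct model_page *page;

        pthread_mutex_lock(&pmd->ptl);
        page = pmd->page;
        if (page && get) {
                /* The mapping still holds its reference while we hold the lock. */
                assert(page->refcount > 0);
                page->refcount++;
        }
        pthread_mutex_unlock(&pmd->ptl);
        return page;
}

/*
 * Unmap path: clear the entry and drop the mapping's reference under the
 * same lock.  Returns the remaining reference count, or -1 if nothing was
 * mapped.
 */
static int model_unmap(struct model_pmd *pmd)
{
        struct model_page *page;
        int remaining = -1;

        pthread_mutex_lock(&pmd->ptl);
        page = pmd->page;
        pmd->page = NULL;
        if (page)
                remaining = --page->refcount;
        pthread_mutex_unlock(&pmd->ptl);
        return remaining;
}
```

In this model a lookup that wins the race leaves the page with an extra reference, so the unmap path sees a nonzero count and must not free it; a lookup that loses the race simply sees NULL.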

This patch intentionally removes the page==NULL check after pte_page().
This is justified because pte_page() never returns NULL on any
architecture or configuration.

This patch also changes the behavior of follow_huge_pmd() for tail
pages: tail pages can now be pinned and returned, so the caller must be
changed to handle returned tail pages properly.
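
As a rough illustration of why tail pages show up, the arithmetic below maps an address inside a hugepage to the index of the base page it falls in, and back to the start of the hugepage. It is a sketch that assumes the 2 MiB hugepage size and 4 KiB base page size used by the reproducer (HPS and PS); `subpage_index()` and `head_addr()` are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

#define HPS 0x200000UL  /* hugepage size, as in the reproducer */
#define PS  0x1000UL    /* base page size, as in the reproducer */

/* Index of the base page within its hugepage; 0 means the head page. */
static unsigned long subpage_index(uint64_t addr)
{
        unsigned long idx = (unsigned long)((addr & (HPS - 1)) / PS);

        assert(idx < HPS / PS);
        return idx;
}

/* Address of the start of the hugepage that contains addr. */
static uint64_t head_addr(uint64_t addr)
{
        return addr & ~(uint64_t)(HPS - 1);
}
```

Any address whose index is nonzero corresponds to a tail page, which is the case the caller now has to handle.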

We could add similar locking to follow_huge_(addr|pud) for consistency,
but it is not necessary because these functions currently do not support
the FOLL_GET flag, so let's leave that for future development.

Here is the reproducer:

  $ cat movepages.c
  #include <err.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/types.h>
  #include <numa.h>
  #include <numaif.h>

  #define ADDR_INPUT      0x700000000000UL
  #define HPS             0x200000
  #define PS              0x1000

  int main(int argc, char *argv[]) {
          int i;
          int nr_hp, nr_p;
          int ret;
          void **addrs;
          int *status;
          int *nodes;
          pid_t pid;

          if (argc < 3)
                  errx(1, "usage: %s <nr_hugepages> <pid>", argv[0]);

          nr_hp = strtol(argv[1], NULL, 0);
          nr_p  = nr_hp * HPS / PS;
          pid = strtol(argv[2], NULL, 0);

          /* one entry per base page covered by the hugepages */
          addrs  = malloc(sizeof(*addrs) * nr_p);
          status = malloc(sizeof(*status) * nr_p);
          nodes  = malloc(sizeof(*nodes) * nr_p);
          if (!addrs || !status || !nodes)
                  err(1, "malloc");

          while (1) {
                  /* move every page to node 1 ... */
                  for (i = 0; i < nr_p; i++) {
                          addrs[i] = (char *)ADDR_INPUT + (size_t)i * PS;
                          nodes[i] = 1;
                          status[i] = 0;
                  }
                  ret = numa_move_pages(pid, nr_p, addrs, nodes, status,
                                        MPOL_MF_MOVE_ALL);
                  if (ret == -1)
                          err(1, "move_pages");

                  /* ... and then back to node 0 */
                  for (i = 0; i < nr_p; i++) {
                          addrs[i] = (char *)ADDR_INPUT + (size_t)i * PS;
                          nodes[i] = 0;
                          status[i] = 0;
                  }
                  ret = numa_move_pages(pid, nr_p, addrs, nodes, status,
                                        MPOL_MF_MOVE_ALL);
                  if (ret == -1)
                          err(1, "move_pages");
          }
          return 0;
  }

  $ cat hugepage.c
  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/mman.h>
  #include <string.h>

  #define ADDR_INPUT      0x700000000000UL
  #define HPS             0x200000

  int main(int argc, char *argv[]) {
          int nr_hp = strtol(argv[1], NULL, 0);
          char *p;

          while (1) {
                  p = mmap((void *)ADDR_INPUT, nr_hp * HPS, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
                  if (p != (void *)ADDR_INPUT) {
                          perror("mmap");
                          break;
                  }
                  memset(p, 0, nr_hp * HPS);
                  munmap(p, nr_hp * HPS);
          }
  }

  $ sysctl vm.nr_hugepages=40
  $ ./hugepage 10 &
  $ ./movepages 10 $(pgrep -f hugepage)

Fixes: e632a938d9 ("mm: migrate: add hugepage migration code to move_pages()")
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reported-by: Hugh Dickins <hughd@google.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Steve Capper <steve.capper@linaro.org>
Cc: <stable@vger.kernel.org>	[3.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-11 17:06:01 -08:00