android_kernel_xiaomi_sm8450/include/linux
Mel Gorman c67fe3752a mm: compaction: Abort async compaction if locks are contended or taking too long
Jim Schutt reported a problem that pointed at compaction contending
heavily on locks.  The workload is straightforward and, in his own words:

	The systems in question have 24 SAS drives spread across 3 HBAs,
	running 24 Ceph OSD instances, one per drive.  FWIW these servers
	are dual-socket Intel 5675 Xeons w/48 GB memory.  I've got ~160
	Ceph Linux clients doing dd simultaneously to a Ceph file system
	backed by 12 of these servers.

Early in the test everything looks fine

  procs -------------------memory------------------ ---swap-- -----io---- --system-- -----cpu-------
   r  b       swpd       free       buff      cache   si   so    bi    bo   in   cs  us sy  id wa st
  31 15          0     287216        576   38606628    0    0     2  1158    2   14   1  3  95  0  0
  27 15          0     225288        576   38583384    0    0    18 2222016 203357 134876  11 56  17 15  0
  28 17          0     219256        576   38544736    0    0    11 2305932 203141 146296  11 49  23 17  0
   6 18          0     215596        576   38552872    0    0     7 2363207 215264 166502  12 45  22 20  0
  22 18          0     226984        576   38596404    0    0     3 2445741 223114 179527  12 43  23 22  0

and then it goes to pot

  procs -------------------memory------------------ ---swap-- -----io---- --system-- -----cpu-------
   r  b       swpd       free       buff      cache   si   so    bi    bo   in   cs  us sy  id wa st
  163  8          0     464308        576   36791368    0    0    11 22210  866  536   3 13  79  4  0
  207 14          0     917752        576   36181928    0    0   712 1345376 134598 47367   7 90   1  2  0
  123 12          0     685516        576   36296148    0    0   429 1386615 158494 60077   8 84   5  3  0
  123 12          0     598572        576   36333728    0    0  1107 1233281 147542 62351   7 84   5  4  0
  622  7          0     660768        576   36118264    0    0   557 1345548 151394 59353   7 85   4  3  0
  223 11          0     283960        576   36463868    0    0    46 1107160 121846 33006   6 93   1  1  0

Note that system CPU usage is very high and the number of blocks being
written out has dropped by 42%. He analysed this with perf and found

  perf record -g -a sleep 10
  perf report --sort symbol --call-graph fractal,5
    34.63%  [k] _raw_spin_lock_irqsave
            |
            |--97.30%-- isolate_freepages
            |          compaction_alloc
            |          unmap_and_move
            |          migrate_pages
            |          compact_zone
            |          compact_zone_order
            |          try_to_compact_pages
            |          __alloc_pages_direct_compact
            |          __alloc_pages_slowpath
            |          __alloc_pages_nodemask
            |          alloc_pages_vma
            |          do_huge_pmd_anonymous_page
            |          handle_mm_fault
            |          do_page_fault
            |          page_fault
            |          |
            |          |--87.39%-- skb_copy_datagram_iovec
            |          |          tcp_recvmsg
            |          |          inet_recvmsg
            |          |          sock_recvmsg
            |          |          sys_recvfrom
            |          |          system_call
            |          |          __recv
            |          |          |
            |          |           --100.00%-- (nil)
            |          |
            |           --12.61%-- memcpy
             --2.70%-- [...]

There was other data but primarily it is all showing that compaction is
contended heavily on the zone->lock and zone->lru_lock.

commit [b2eef8c0: mm: compaction: minimise the time IRQs are disabled
while isolating pages for migration] noted that it was possible for
migration to hold the lru_lock for an excessive amount of time. Very
broadly speaking this patch expands the concept.

This patch introduces compact_checklock_irqsave() to check if a lock
is contended or the process needs to be scheduled. If either condition
is true then async compaction is aborted and the caller is informed.
The page allocator will fail a THP allocation if compaction failed due
to contention. This patch also introduces compact_trylock_irqsave()
which will acquire the lock only if it is not contended and the process
does not need to schedule.
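
The behaviour described above can be illustrated with a small userspace
sketch. It is not the code from this patch: sim_lock, sim_control,
sim_checklock() and sim_trylock() are hypothetical names, the "lock" is a
plain struct rather than a spinlock, and contention and the need to
reschedule are simulated with ordinary fields. It only models the decision
logic: drop the lock when there is contention or a reschedule is due, abort
and report to the caller in the async case, otherwise (re)take the lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a spinlock; no real locking is done. */
struct sim_lock {
	bool held;	/* true while the caller owns the lock */
	int waiters;	/* simulated number of other contexts waiting */
};

/* Hypothetical stand-in for the compaction control state. */
struct sim_control {
	bool sync;		/* true for synchronous compaction */
	bool need_resched;	/* simulated "process needs to be scheduled" */
	bool *contended;	/* optional flag used to inform the caller */
};

/*
 * Returns true with the lock held if compaction may continue.
 * Returns false with the lock released if async compaction should abort.
 */
static bool sim_checklock(struct sim_lock *lock, bool locked,
			  struct sim_control *cc)
{
	assert(lock != NULL && cc != NULL);

	if (cc->need_resched || lock->waiters > 0) {
		/* Never keep the lock while contended or overdue. */
		if (locked) {
			lock->held = false;
			locked = false;
		}

		/* Async aborts and tells the caller why. */
		if (!cc->sync) {
			if (cc->contended)
				*cc->contended = true;
			return false;
		}

		/* Sync: pretend we yielded the CPU, then carry on. */
		cc->need_resched = false;
	}

	/*
	 * Simplification: a real lock would wait for its turn here;
	 * the simulation just marks the lock as held.
	 */
	if (!locked)
		lock->held = true;
	return true;
}

/* Acquire only if uncontended and no reschedule is pending (async case). */
static bool sim_trylock(struct sim_lock *lock, struct sim_control *cc)
{
	return sim_checklock(lock, false, cc);
}
```

In this sketch a caller doing async work would call sim_checklock()
periodically while holding the lock and stop as soon as it returns false.
The flag behind cc->contended is what would let an allocator-side caller
decide to fail the allocation rather than retry.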

Reported-by: Jim Schutt <jaschut@sandia.gov>
Tested-by: Jim Schutt <jaschut@sandia.gov>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-08-21 16:45:03 -07:00