=================
Scheduler debugfs
=================

Booting a kernel with CONFIG_SCHED_DEBUG=y will give access to
scheduler-specific debug files under /sys/kernel/debug/sched. Some of
those files are described below.

numa_balancing
==============

The `numa_balancing` directory holds files that control the NUMA
balancing feature. If the system overhead from the feature is too
high, then the rate at which the kernel samples for NUMA hinting faults
may be controlled by the `scan_period_min_ms, scan_delay_ms,
scan_period_max_ms, scan_size_mb` files.

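
As a rough illustration of how these files might be inspected, the sketch
below reads the four tunables from the directory named above. It assumes that
debugfs is mounted at /sys/kernel/debug, that the caller has permission to
read it, and that each file holds a single decimal integer. The helper name
``read_tunables`` is hypothetical and not part of any kernel tooling.

```python
from pathlib import Path

# Directory named in the text above. Assumption: debugfs is mounted at
# /sys/kernel/debug and readable by the caller (usually root only).
NUMA_BALANCING_DIR = Path("/sys/kernel/debug/sched/numa_balancing")

# The four tunables described in this document.
TUNABLES = (
    "scan_period_min_ms",
    "scan_delay_ms",
    "scan_period_max_ms",
    "scan_size_mb",
)


def read_tunables(base=NUMA_BALANCING_DIR):
    """Return {name: int} for every tunable file that can be read under base."""
    values = {}
    for name in TUNABLES:
        path = Path(base) / name
        try:
            # Assumption: the file contains one decimal integer.
            values[name] = int(path.read_text().strip())
        except (FileNotFoundError, PermissionError):
            # File absent (option not enabled, debugfs not mounted)
            # or caller lacks permission: skip it.
            continue
    return values


if __name__ == "__main__":
    for name, value in read_tunables().items():
        print(f"{name} = {value}")
```

On a system where the files are missing or unreadable, the function simply
returns an empty dictionary rather than raising.
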
scan_period_min_ms, scan_delay_ms, scan_period_max_ms, scan_size_mb
-------------------------------------------------------------------

Automatic NUMA balancing scans a task's address space and unmaps pages to
detect if pages are properly placed or if the data should be migrated to a
memory node local to where the task is running. Every "scan delay" the task
scans the next "scan size" number of pages in its address space. When the
end of the address space is reached the scanner restarts from the beginning.

In combination, the "scan delay" and "scan size" determine the scan rate.
When "scan delay" decreases, the scan rate increases. The scan delay and
hence the scan rate of every task is adaptive and depends on historical
behaviour. If pages are properly placed then the scan delay increases,
otherwise the scan delay decreases. The "scan size" is not adaptive but
the higher the "scan size", the higher the scan rate.

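
The relationship between "scan delay", "scan size" and scan rate can be made
concrete with a small model. The sketch below is a simplification for
illustration only: it treats the scan rate as "scan size" divided by "scan
delay", and it adapts the delay by a fixed factor within hypothetical bounds
``lo_ms`` and ``hi_ms``. The factor, the bounds and the function names are
assumptions, not the kernel's actual adaptation rule.

```python
def scan_rate_mb_per_s(scan_size_mb, scan_delay_ms):
    """Simplified model: scan_size_mb megabytes are scanned once per delay."""
    return scan_size_mb / (scan_delay_ms / 1000.0)


def adapt_delay(delay_ms, well_placed, lo_ms, hi_ms, factor=2.0):
    """Toy adaptation step; the factor is hypothetical, not the kernel's rule.

    The delay grows when pages are well placed and shrinks otherwise,
    and is kept within [lo_ms, hi_ms].
    """
    new_delay = delay_ms * factor if well_placed else delay_ms / factor
    return max(lo_ms, min(hi_ms, new_delay))


if __name__ == "__main__":
    # Illustrative numbers only.
    print(scan_rate_mb_per_s(256, 1000))  # 256.0
    # Pages badly placed: the delay shrinks, so the scan rate rises.
    delay = adapt_delay(1000, well_placed=False, lo_ms=100, hi_ms=60000)
    print(delay, scan_rate_mb_per_s(256, delay))  # 500.0 512.0
```

In this model, halving the delay doubles the scan rate, while a larger "scan
size" raises the rate at any given delay.
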
Higher scan rates incur higher system overhead as page faults must be
trapped and potentially data must be migrated. However, the higher the scan
rate, the more quickly a task's memory is migrated to a local node if the
workload pattern changes, which minimises the performance impact of remote
memory accesses. These files control the thresholds for scan delays and
the number of pages scanned.

``scan_period_min_ms`` is the minimum time in milliseconds to scan a
task's virtual memory. It effectively controls the maximum scanning
rate for each task.

``scan_delay_ms`` is the starting "scan delay" used for a task when it
initially forks.

``scan_period_max_ms`` is the maximum time in milliseconds to scan a
task's virtual memory. It effectively controls the minimum scanning
rate for each task.

``scan_size_mb`` is how many megabytes worth of pages are scanned for
a given scan.

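
Adjusting one of these files might look like the sketch below. It assumes
that the files accept a decimal integer written as text and that the caller
has write permission (typically root). The helper ``write_tunable`` and its
validation rules are hypothetical. The value shown in the usage comment is
illustrative and not a recommended setting.

```python
from pathlib import Path

# Assumption: debugfs is mounted at /sys/kernel/debug.
NUMA_BALANCING_DIR = Path("/sys/kernel/debug/sched/numa_balancing")

TUNABLES = {
    "scan_period_min_ms",
    "scan_delay_ms",
    "scan_period_max_ms",
    "scan_size_mb",
}


def write_tunable(name, value, base=NUMA_BALANCING_DIR):
    """Write a positive integer to one of the four tunable files."""
    if name not in TUNABLES:
        raise ValueError(f"unknown tunable: {name}")
    if not isinstance(value, int) or value <= 0:
        raise ValueError("value must be a positive integer")
    path = Path(base) / name
    # Refuse to create new files; only existing tunables are written.
    if not path.is_file():
        raise FileNotFoundError(path)
    path.write_text(f"{value}\n")


# Example (needs root and a kernel exposing these files):
#   write_tunable("scan_size_mb", 128)
```

Raising ``scan_period_min_ms`` or lowering ``scan_size_mb`` would, per the
descriptions above, cap or reduce the scan rate and hence the overhead.
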