.. SPDX-License-Identifier: GPL-2.0

.. include:: <isonum.txt>

===============================
Bus lock detection and handling
===============================

:Copyright: |copy| 2021 Intel Corporation
:Authors: - Fenghua Yu <[email protected]>
          - Tony Luck <[email protected]>

Problem
=======

A split lock is any atomic operation whose operand crosses two cache lines.
Since the operand spans two cache lines and the operation must be atomic,
the system locks the bus while the CPU accesses the two cache lines.

A bus lock is acquired through either split locked access to writeback (WB)
memory or any locked access to non-WB memory. This is typically thousands of
cycles slower than an atomic operation within a cache line. It also disrupts
performance on other cores and brings the whole system to its knees.

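
Whether an atomic operand crosses two cache lines depends only on its
address and size. The following is a minimal sketch of that check, assuming
64-byte cache lines; the function name and the ``CACHE_LINE_SIZE`` constant
are illustrative and do not come from the kernel.

```python
CACHE_LINE_SIZE = 64  # assumption: 64-byte cache lines, typical for x86


def crosses_cache_line(address, size, line_size=CACHE_LINE_SIZE):
    """Return True if the operand [address, address + size) spans two cache lines."""
    if size <= 0:
        raise ValueError("size must be positive")
    first_line = address // line_size               # line holding the first byte
    last_line = (address + size - 1) // line_size   # line holding the last byte
    return first_line != last_line
```

For example, an 8-byte operand at offset 60 covers bytes 60 to 67 and so
touches two lines, while the same operand at offset 56 stays within one line.
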
Detection
=========

Intel processors may support either or both of the following hardware
mechanisms to detect split locks and bus locks.

#AC exception for split lock detection
--------------------------------------

Beginning with the Tremont Atom CPU, a split lock operation may raise an
Alignment Check (#AC) exception when it is attempted.

#DB exception for bus lock detection
------------------------------------

Some CPUs have the ability to notify the kernel by a #DB trap after a user
instruction that acquires a bus lock has executed. This allows the kernel to
terminate the application or to enforce throttling.

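
One way to see which mechanism a machine offers is to look at the CPU flags
the kernel exports. The sketch below assumes the flags appear in
``/proc/cpuinfo`` as ``split_lock_detect`` (#AC) and ``bus_lock_detect``
(#DB); those names are not stated in this document, so treat them as an
assumption. The helper name and the keys of the returned dictionary are
hypothetical.

```python
def detection_support(cpuinfo_text):
    """Report which detection mechanisms the first 'flags' line advertises."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags = set(value.split())
            break  # every CPU normally lists the same flags; one line is enough
    return {
        # Assumed flag names; not taken from this document.
        "split_lock_ac": "split_lock_detect" in flags,
        "bus_lock_db": "bus_lock_detect" in flags,
    }
```

On a Linux system the function could be called with the contents of
``/proc/cpuinfo``, for example ``detection_support(open("/proc/cpuinfo").read())``.
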
Software handling
=================

The kernel #AC and #DB handlers handle bus locks based on the kernel
parameter "split_lock_detect". Here is a summary of the different options:

+------------------+----------------------------+-----------------------+
|split_lock_detect=|#AC for split lock          |#DB for bus lock       |
+------------------+----------------------------+-----------------------+
|off               |Do nothing                  |Do nothing             |
+------------------+----------------------------+-----------------------+
|warn              |Kernel OOPs                 |Warn once per task and |
|(default)         |Warn once per task and      |continue to run.       |
|                  |disable future checking     |                       |
|                  |When both features are      |                       |
|                  |supported, warn in #AC      |                       |
+------------------+----------------------------+-----------------------+
|fatal             |Kernel OOPs                 |Send SIGBUS to user.   |
|                  |Send SIGBUS to user         |                       |
|                  |When both features are      |                       |
|                  |supported, fatal in #AC     |                       |
+------------------+----------------------------+-----------------------+
|ratelimit:N       |Do nothing                  |Limit bus lock rate to |
|(0 < N <= 1000)   |                            |N bus locks per second |
|                  |                            |system wide and warn on|
|                  |                            |bus locks.             |
+------------------+----------------------------+-----------------------+

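
The accepted values of the parameter can be summarized as a small
validation routine. This is only an illustrative sketch of the value syntax
listed in the table, not the kernel's own parser; the function name is
hypothetical, and how the kernel treats an invalid value is not covered
here.

```python
def parse_split_lock_detect(value):
    """Return (mode, rate) for a split_lock_detect= value.

    rate is None for every mode except "ratelimit".
    """
    if value in ("off", "warn", "fatal"):
        return value, None
    prefix = "ratelimit:"
    if value.startswith(prefix):
        digits = value[len(prefix):]
        if digits.isascii() and digits.isdigit():
            rate = int(digits)
            if 0 < rate <= 1000:  # range given in the table: 0 < N <= 1000
                return "ratelimit", rate
    raise ValueError(f"invalid split_lock_detect value: {value!r}")
```

With this sketch, ``parse_split_lock_detect("ratelimit:100")`` yields
``("ratelimit", 100)``, while ``"ratelimit:0"`` and ``"ratelimit:1001"`` are
rejected.
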
Usages
======

Detecting and handling bus locks is useful in several areas:

It is critical for real time system designers who build consolidated real
time systems. These systems run hard real time code on some cores and run
"untrusted" user processes on other cores. The hard real time code cannot
afford to have any bus lock from the untrusted processes hurt real time
performance. To date the designers have been unable to deploy these
solutions as they have no way to prevent the "untrusted" user code from
generating split locks and bus locks that block the hard real time code
from accessing memory while the bus is locked.

It's also useful for general computing to prevent guests or user
applications from slowing down the overall system by executing instructions
that take a bus lock.

Guidance
========

off
---

Disable checking for split lock and bus lock. This option can be useful if
there are legacy applications that trigger these events at a low rate so
that mitigation is not needed.

warn
----

A warning is emitted when a bus lock is detected, which makes it possible to
identify the offending application. This is the default behavior.

fatal
-----

In this case, the bus lock is not tolerated and the process is killed.

ratelimit
---------

A system wide bus lock rate limit N is specified where 0 < N <= 1000. This
allows a bus lock rate up to N bus locks per second. When the bus lock rate
is exceeded, any task which is caught via the bus lock #DB exception is
throttled by enforced sleeps until the rate goes under the limit again.

This is an effective mitigation in cases where a minimal impact can be
tolerated, but an eventual Denial of Service attack has to be prevented. It
also makes it possible to identify the offending processes and analyze
whether they are malicious or just badly written.

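
The throttling behavior can be illustrated with a small simulation. The
sketch below assumes a fixed one-second window and a 20 ms enforced sleep;
both are illustrative choices rather than a description of the kernel
implementation, and the class and function names are hypothetical. Time is
passed in as integer milliseconds so the example is deterministic.

```python
WINDOW_MS = 1000  # the limit is expressed per second
SLEEP_MS = 20     # hypothetical length of one enforced sleep


class BusLockLimiter:
    """Allow at most `limit` bus locks per one-second window, system wide."""

    def __init__(self, limit, now_ms=0):
        if not 0 < limit <= 1000:
            raise ValueError("limit must satisfy 0 < N <= 1000")
        self.limit = limit
        self.window_start = now_ms
        self.count = 0

    def allow(self, now_ms):
        # Start a fresh window once a full second has passed.
        if now_ms - self.window_start >= WINDOW_MS:
            self.window_start = now_ms
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


def handle_bus_lock(limiter, now_ms):
    """Return the simulated time at which the offending task resumes."""
    while not limiter.allow(now_ms):
        now_ms += SLEEP_MS  # enforced sleep, simulated by advancing the clock
    return now_ms
```

With a limit of 2, bus locks at 0 ms and 100 ms proceed immediately, while a
third one at 200 ms is held back until the next window opens at 1000 ms.
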
Selecting a rate limit of 1000 allows the bus to be locked for up to about
seven million cycles each second (assuming 7000 cycles for each bus
lock). On a 2 GHz processor that would be about 0.35% system slowdown.

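
The estimate can be reproduced with a few lines of arithmetic. The sketch
uses the figures from the text (7000 cycles per bus lock, a 2 GHz clock);
the function name and parameter names are illustrative.

```python
def bus_lock_overhead(locks_per_second, cycles_per_lock=7000, cpu_hz=2_000_000_000):
    """Return (locked cycles per second, fraction of each second spent locked)."""
    locked_cycles = locks_per_second * cycles_per_lock
    return locked_cycles, locked_cycles / cpu_hz


cycles, fraction = bus_lock_overhead(1000)
print(cycles, f"{fraction:.2%}")  # 7000000 0.35%
```

Lower limits scale linearly: under the same assumptions, a limit of 100
corresponds to about 0.035%.
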