123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126 |
- .. SPDX-License-Identifier: GPL-2.0
- .. include:: <isonum.txt>
- ===============================
- Bus lock detection and handling
- ===============================
- :Copyright: |copy| 2021 Intel Corporation
- :Authors: - Fenghua Yu <[email protected]>
- - Tony Luck <[email protected]>
- Problem
- =======
- A split lock is any atomic operation whose operand crosses two cache lines.
- Since the operand spans two cache lines and the operation must be atomic,
- the system locks the bus while the CPU accesses the two cache lines.
- A bus lock is acquired through either split locked access to writeback (WB)
- memory or any locked access to non-WB memory. This is typically thousands of
- cycles slower than an atomic operation within a cache line. It also disrupts
- performance on other cores and brings the whole system to its knees.
- Detection
- =========
- Intel processors may support either or both of the following hardware
- mechanisms to detect split locks and bus locks.
- #AC exception for split lock detection
- --------------------------------------
- Beginning with the Tremont Atom CPU split lock operations may raise an
- Alignment Check (#AC) exception when a split lock operation is attemped.
- #DB exception for bus lock detection
- ------------------------------------
- Some CPUs have the ability to notify the kernel by an #DB trap after a user
- instruction acquires a bus lock and is executed. This allows the kernel to
- terminate the application or to enforce throttling.
- Software handling
- =================
- The kernel #AC and #DB handlers handle bus lock based on the kernel
- parameter "split_lock_detect". Here is a summary of different options:
- +------------------+----------------------------+-----------------------+
- |split_lock_detect=|#AC for split lock |#DB for bus lock |
- +------------------+----------------------------+-----------------------+
- |off |Do nothing |Do nothing |
- +------------------+----------------------------+-----------------------+
- |warn |Kernel OOPs |Warn once per task and |
- |(default) |Warn once per task and |and continues to run. |
- | |disable future checking | |
- | |When both features are | |
- | |supported, warn in #AC | |
- +------------------+----------------------------+-----------------------+
- |fatal |Kernel OOPs |Send SIGBUS to user. |
- | |Send SIGBUS to user | |
- | |When both features are | |
- | |supported, fatal in #AC | |
- +------------------+----------------------------+-----------------------+
- |ratelimit:N |Do nothing |Limit bus lock rate to |
- |(0 < N <= 1000) | |N bus locks per second |
- | | |system wide and warn on|
- | | |bus locks. |
- +------------------+----------------------------+-----------------------+
- Usages
- ======
- Detecting and handling bus lock may find usages in various areas:
- It is critical for real time system designers who build consolidated real
- time systems. These systems run hard real time code on some cores and run
- "untrusted" user processes on other cores. The hard real time cannot afford
- to have any bus lock from the untrusted processes to hurt real time
- performance. To date the designers have been unable to deploy these
- solutions as they have no way to prevent the "untrusted" user code from
- generating split lock and bus lock to block the hard real time code to
- access memory during bus locking.
- It's also useful for general computing to prevent guests or user
- applications from slowing down the overall system by executing instructions
- with bus lock.
- Guidance
- ========
- off
- ---
- Disable checking for split lock and bus lock. This option can be useful if
- there are legacy applications that trigger these events at a low rate so
- that mitigation is not needed.
- warn
- ----
- A warning is emitted when a bus lock is detected which allows to identify
- the offending application. This is the default behavior.
- fatal
- -----
- In this case, the bus lock is not tolerated and the process is killed.
- ratelimit
- ---------
- A system wide bus lock rate limit N is specified where 0 < N <= 1000. This
- allows a bus lock rate up to N bus locks per second. When the bus lock rate
- is exceeded then any task which is caught via the buslock #DB exception is
- throttled by enforced sleeps until the rate goes under the limit again.
- This is an effective mitigation in cases where a minimal impact can be
- tolerated, but an eventual Denial of Service attack has to be prevented. It
- allows to identify the offending processes and analyze whether they are
- malicious or just badly written.
- Selecting a rate limit of 1000 allows the bus to be locked for up to about
- seven million cycles each second (assuming 7000 cycles for each bus
- lock). On a 2 GHz processor that would be about 0.35% system slowdown.
|