What:		/sys/devices/system/machinecheck/machinecheckX/tolerant
Contact:	Borislav Petkov <[email protected]>
Date:		Dec, 2021
Description:
		Unused and obsolete since the advent of recoverable machine
		checks (see the last sentence below), which have been present
		since 2010 (Nehalem).

		Original description:

		The entries appear for each CPU, but they are truly shared
		between all CPUs.
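
The per-CPU layout described above can be inspected with a short script. The following is a minimal sketch, not part of the ABI: the helper names `read_tolerant` and `shared_tolerant` are hypothetical, and it assumes each `tolerant` file holds a single decimal integer. On kernels where the attribute no longer exists, it simply finds no entries.

```python
import glob
import os


def read_tolerant(sysfs_root="/sys/devices/system/machinecheck"):
    """Return {cpu_dir_name: value} for every tolerant file under sysfs_root."""
    values = {}
    pattern = os.path.join(sysfs_root, "machinecheck*", "tolerant")
    for path in sorted(glob.glob(pattern)):
        cpu = os.path.basename(os.path.dirname(path))
        with open(path) as f:
            # Assumption: the file contains one decimal integer and a newline.
            values[cpu] = int(f.read().strip())
    return values


def shared_tolerant(values):
    """Return the one value shared by all entries, or None if there are none."""
    distinct = set(values.values())
    if not distinct:
        return None  # attribute absent, e.g. on kernels that removed it
    if len(distinct) > 1:
        raise ValueError(f"per-CPU entries disagree: {values}")
    return distinct.pop()
```

Calling `shared_tolerant(read_tolerant())` returns the shared tolerance level where the attribute is present, and `None` otherwise.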

		Tolerance level. When a machine check exception occurs for an
		uncorrected machine check, the kernel can take different
		actions.

		Since machine check exceptions can happen at any time, it is
		sometimes risky for the kernel to kill a process because doing
		so defies normal kernel locking rules. The tolerance level
		configures how hard the kernel tries to recover even at some
		risk of deadlock. Higher tolerant values trade potentially
		better uptime against the risk of a crash or even corruption
		(for tolerant >= 3).

		==  ===========================================================
		 0  always panic on uncorrected errors, log corrected errors
		 1  panic or SIGBUS on uncorrected errors, log corrected errors
		 2  SIGBUS or log uncorrected errors, log corrected errors
		 3  never panic or SIGBUS, log all errors (for testing only)
		==  ===========================================================

		Default: 1
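
As an illustration only, the table above can be restated as a lookup. This sketch is not kernel code: the names `TOLERANT_ACTIONS`, `DEFAULT_TOLERANT` and `possible_actions` are hypothetical, and the tuples list only the actions the table names for each level, not how the kernel chose between them.

```python
# Illustrative restatement of the tolerance table; not kernel code.
# Each level maps an error class to the actions the table lists for it.
TOLERANT_ACTIONS = {
    0: {"uncorrected": ("panic",), "corrected": ("log",)},
    1: {"uncorrected": ("panic", "SIGBUS"), "corrected": ("log",)},
    2: {"uncorrected": ("SIGBUS", "log"), "corrected": ("log",)},
    3: {"uncorrected": ("log",), "corrected": ("log",)},
}

DEFAULT_TOLERANT = 1  # "Default: 1" in the description


def possible_actions(level=DEFAULT_TOLERANT, corrected=False):
    """Return the actions the table lists for this level and error class."""
    if level not in TOLERANT_ACTIONS:
        raise ValueError(f"tolerant level {level} is not in the table")
    key = "corrected" if corrected else "uncorrected"
    return TOLERANT_ACTIONS[level][key]
```

For example, `possible_actions(2)` yields `("SIGBUS", "log")`, matching the row for level 2.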

		Note this only makes a difference if the CPU allows recovery
		from a machine check exception. Current x86 CPUs generally
		do not.