12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364656667686970717273747576777879808182838485868788899091929394959697 |
- What: /sys/devices/system/machinecheck/machinecheckX/
- Contact: Andi Kleen <[email protected]>
- Date: Feb, 2007
- Description:
- (X = CPU number)
- Machine checks report internal hardware error conditions
- detected by the CPU. Uncorrected errors typically cause a
- machine check (often with panic), corrected ones cause a
- machine check log entry.
- For more details about the x86 machine check architecture
- see the Intel and AMD architecture manuals from their
- developer websites.
- For more details about the architecture
- see http://one.firstfloor.org/~andi/mce.pdf
- Each CPU has its own directory.
- What: /sys/devices/system/machinecheck/machinecheckX/bank<Y>
- Contact: Andi Kleen <[email protected]>
- Date: Feb, 2007
- Description:
- (Y bank number)
- 64bit Hex bitmask enabling/disabling specific subevents for
- bank Y.
- When a bit in the bitmask is zero then the respective
- subevent will not be reported.
- By default all events are enabled.
- Note that BIOS maintain another mask to disable specific events
- per bank. This is not visible here
- What: /sys/devices/system/machinecheck/machinecheckX/check_interval
- Contact: Andi Kleen <[email protected]>
- Date: Feb, 2007
- Description:
- The entries appear for each CPU, but they are truly shared
- between all CPUs.
- How often to poll for corrected machine check errors, in
- seconds (Note output is hexadecimal). Default 5 minutes.
- When the poller finds MCEs it triggers an exponential speedup
- (poll more often) on the polling interval. When the poller
- stops finding MCEs, it triggers an exponential backoff
- (poll less often) on the polling interval. The check_interval
- variable is both the initial and maximum polling interval.
- 0 means no polling for corrected machine check errors
- (but some corrected errors might be still reported
- in other ways)
- What: /sys/devices/system/machinecheck/machinecheckX/trigger
- Contact: Andi Kleen <[email protected]>
- Date: Feb, 2007
- Description:
- The entries appear for each CPU, but they are truly shared
- between all CPUs.
- Program to run when a machine check event is detected.
- This is an alternative to running mcelog regularly from cron
- and allows to detect events faster.
- What: /sys/devices/system/machinecheck/machinecheckX/monarch_timeout
- Contact: Andi Kleen <[email protected]>
- Date: Feb, 2007
- Description:
- How long to wait for the other CPUs to machine check too on a
- exception. 0 to disable waiting for other CPUs.
- Unit: us
- What: /sys/devices/system/machinecheck/machinecheckX/ignore_ce
- Contact: Hidetoshi Seto <[email protected]>
- Date: Jun 2009
- Description:
- Disables polling and CMCI for corrected errors.
- All corrected events are not cleared and kept in bank MSRs.
- What: /sys/devices/system/machinecheck/machinecheckX/dont_log_ce
- Contact: Hidetoshi Seto <[email protected]>
- Date: Jun 2009
- Description:
- Disables logging for corrected errors.
- All reported corrected errors will be cleared silently.
- This option will be useful if you never care about corrected
- errors.
- What: /sys/devices/system/machinecheck/machinecheckX/cmci_disabled
- Contact: Hidetoshi Seto <[email protected]>
- Date: Jun 2009
- Description:
- Disables the CMCI feature.
|