.. SPDX-License-Identifier: GPL-2.0

============================================================
Hardware-Feedback Interface for scheduling on Intel Hardware
============================================================

Overview
--------

Intel has described the Hardware Feedback Interface (HFI) in the Intel 64 and
IA-32 Architectures Software Developer's Manual (Intel SDM) Volume 3 Section
14.6 [1]_.

The HFI gives the operating system performance and energy efficiency
capability data for each CPU in the system. Linux can use the information from
the HFI to influence task placement decisions.


The Hardware Feedback Interface
-------------------------------

The Hardware Feedback Interface provides the operating system with information
about the performance and energy efficiency of each CPU in the system. Each
capability is given as a unit-less quantity in the range [0-255]. Higher values
indicate higher capability. Energy efficiency and performance are reported as
separate capabilities. Even though on some systems these two metrics may be
related, they are specified as independent capabilities in the Intel SDM.

These capabilities may change at runtime as a result of changes in the
operating conditions of the system or the action of external factors. The rate
at which these capabilities are updated is specific to each processor model. On
some models, capabilities are set at boot time and never change. On others,
capabilities may change every tens of milliseconds. For instance, a remote
mechanism may be used to lower Thermal Design Power. Such a change can be
reflected in the HFI. Likewise, if the system needs to be throttled due to
excessive heat, the HFI may reflect reduced performance on specific CPUs.

The kernel or a userspace policy daemon can use these capabilities to modify
task placement decisions. For instance, if either the performance or the energy
efficiency capability of a given logical processor becomes zero, the hardware
is recommending that the operating system not schedule any tasks on that
processor for performance or energy efficiency reasons, respectively.

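
As an illustration of the zero-capability rule, the following is a minimal
userspace sketch in Python. It is not the logic of the kernel or of any
existing daemon. The names ``HfiCaps`` and ``schedulable_cpus`` are
hypothetical, and the capability values in the usage lines are made up for the
example. The sketch encodes only two statements from this section:
capabilities lie in the range [0-255], and a capability of zero is a hint to
avoid that CPU for the corresponding criterion.

.. code-block:: python

   from dataclasses import dataclass


   @dataclass(frozen=True)
   class HfiCaps:
       """Hypothetical per-CPU record of the two HFI capabilities."""
       perf: int  # performance capability, 0-255
       ee: int    # energy efficiency capability, 0-255

       def __post_init__(self):
           # Reject values outside the range the HFI can report.
           for name in ("perf", "ee"):
               value = getattr(self, name)
               if not 0 <= value <= 255:
                   raise ValueError(f"{name}={value} is outside [0-255]")


   def schedulable_cpus(caps, criterion):
       """Return the CPU ids whose capability for `criterion` is non-zero,
       ordered from most to least capable."""
       if criterion not in ("perf", "ee"):
           raise ValueError("criterion must be 'perf' or 'ee'")
       usable = [cpu for cpu, c in caps.items() if getattr(c, criterion) > 0]
       return sorted(usable, key=lambda cpu: getattr(caps[cpu], criterion),
                     reverse=True)


   # Made-up values: CPU 1 reports ee == 0 and CPU 2 reports perf == 0.
   caps = {
       0: HfiCaps(perf=200, ee=90),
       1: HfiCaps(perf=255, ee=0),
       2: HfiCaps(perf=0, ee=180),
   }
   print(schedulable_cpus(caps, "perf"))  # CPU 2 is left out
   print(schedulable_cpus(caps, "ee"))    # CPU 1 is left out

Because the two capabilities are independent, the sketch filters on one
criterion at a time: a CPU excluded for performance reasons may still be a
valid choice for energy efficiency.
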

Implementation details for Linux
--------------------------------

The infrastructure to handle thermal event interrupts has two parts. The Local
Vector Table of a CPU's local APIC contains the Thermal Monitor Register. This
register controls how interrupts are delivered to a CPU when the thermal
monitor generates an interrupt. Further details can be found in the Intel SDM
Vol. 3 Section 10.5 [1]_.

The thermal monitor may generate interrupts per CPU or per package. The HFI
generates package-level interrupts. This monitor is configured and initialized
via a set of model-specific registers (MSRs). Specifically, the HFI interrupt
and status are controlled via designated bits in the
IA32_PACKAGE_THERM_INTERRUPT and IA32_PACKAGE_THERM_STATUS registers,
respectively. There exists one HFI table per package. Further details can be
found in the Intel SDM Vol. 3 Section 14.9 [1]_.

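
To make the role of the two registers concrete, here is a small Python sketch
that decodes raw register values. The MSR addresses and bit positions below
are assumptions: they are believed to match the definitions in
``arch/x86/include/asm/msr-index.h`` and should be verified against the Intel
SDM before being relied upon. The helper names are hypothetical, and how the
raw values are read from the hardware is outside the scope of the sketch.

.. code-block:: python

   # Assumed values; verify against the Intel SDM and msr-index.h.
   MSR_IA32_PACKAGE_THERM_STATUS = 0x1B1
   MSR_IA32_PACKAGE_THERM_INTERRUPT = 0x1B2
   PACKAGE_THERM_STATUS_HFI_UPDATED = 1 << 26  # status: HFI table was updated
   PACKAGE_THERM_INT_HFI_ENABLE = 1 << 25      # control: HFI interrupt enabled


   def hfi_interrupt_enabled(therm_interrupt):
       """True if the HFI enable bit is set in a raw
       IA32_PACKAGE_THERM_INTERRUPT value."""
       return bool(therm_interrupt & PACKAGE_THERM_INT_HFI_ENABLE)


   def hfi_table_updated(therm_status):
       """True if the HFI status bit is set in a raw
       IA32_PACKAGE_THERM_STATUS value."""
       return bool(therm_status & PACKAGE_THERM_STATUS_HFI_UPDATED)


   # Both registers are package-scoped, so one raw value describes the
   # whole package rather than a single CPU.
   print(hfi_interrupt_enabled(PACKAGE_THERM_INT_HFI_ENABLE))
   print(hfi_table_updated(0))
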

The hardware issues an HFI interrupt after it has updated the HFI table and the
table is ready for the operating system to consume. CPUs receive such an
interrupt via the thermal entry in the Local APIC's Local Vector Table.

When servicing such an interrupt, the HFI driver parses the updated table and
relays the update to userspace using the thermal notification framework. Given
that there may be many HFI updates every second, the updates relayed to
userspace are throttled to at most one every CONFIG_HZ jiffies, that is, one
per second.

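
The throttling idea can be sketched as follows. This is a Python model of one
possible way to coalesce bursts of updates, not the driver's code. The class
``UpdateThrottle`` and its methods are hypothetical, and time is passed in
explicitly so that the behavior is easy to follow. The first update after an
idle period arms a deadline one interval later. Updates that arrive before the
deadline only replace the pending table, so a burst results in a single
delivery that carries the newest data.

.. code-block:: python

   class UpdateThrottle:
       """Coalesce bursts of updates into at most one delivery per interval."""

       def __init__(self, interval_s, deliver):
           self.interval_s = interval_s
           self.deliver = deliver  # callback that relays one update
           self.pending = None     # newest table not yet relayed
           self.deadline = None    # time at which `pending` becomes due

       def on_update(self, now, table):
           # Always keep the newest table; arm the deadline only when idle.
           self.pending = table
           if self.deadline is None:
               self.deadline = now + self.interval_s

       def tick(self, now):
           # Relay the pending table once its deadline has passed.
           if self.deadline is not None and now >= self.deadline:
               self.deliver(self.pending)
               self.pending = None
               self.deadline = None


   sent = []
   throttle = UpdateThrottle(interval_s=1.0, deliver=sent.append)
   for t, table in [(0.0, "A"), (0.25, "B"), (0.5, "C")]:
       throttle.on_update(t, table)
       throttle.tick(t)
   throttle.tick(1.0)  # interval elapsed: only the newest table is relayed
   print(sent)

With a one-second interval, the three updates in the usage lines produce a
single delivery of table ``"C"``. The trade-off is latency: userspace may see
an update up to one interval after the hardware produced it.
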

References
----------

.. [1] https://www.intel.com/sdm