.. SPDX-License-Identifier: GPL-2.0

Overview
========
The Linux kernel contains a variety of code for running as a fully
enlightened guest on Microsoft's Hyper-V hypervisor. Hyper-V
consists primarily of a bare-metal hypervisor plus a virtual machine
management service running in the parent partition (roughly
equivalent to KVM and QEMU, for example). Guest VMs run in child
partitions. In this documentation, references to Hyper-V usually
encompass both the hypervisor and the VMM service without making a
distinction about which functionality is provided by which
component.

Hyper-V runs on x86/x64 and arm64 architectures, and Linux guests
are supported on both. The functionality and behavior of Hyper-V is
generally the same on both architectures unless noted otherwise.

Linux Guest Communication with Hyper-V
--------------------------------------
Linux guests communicate with Hyper-V in four different ways (the
hypercall and synthetic register mechanisms are sketched in code
after this list):

* Implicit traps: As defined by the x86/x64 or arm64 architecture,
  some guest actions trap to Hyper-V. Hyper-V emulates the action
  and returns control to the guest. This behavior is generally
  invisible to the Linux kernel.

* Explicit hypercalls: Linux makes an explicit function call to
  Hyper-V, passing parameters. Hyper-V performs the requested action
  and returns control to the caller. Parameters are passed in
  processor registers or in memory shared between the Linux guest
  and Hyper-V. On x86/x64, hypercalls use a Hyper-V specific calling
  sequence. On arm64, hypercalls use the ARM standard SMCCC calling
  sequence.

* Synthetic register access: Hyper-V implements a variety of
  synthetic registers. On x86/x64 these registers appear as MSRs in
  the guest, and the Linux kernel can read or write these MSRs using
  the normal mechanisms defined by the x86/x64 architecture. On
  arm64, these synthetic registers must be accessed using explicit
  hypercalls.

* VMbus: VMbus is a higher-level software construct that is built on
  the other 3 mechanisms. It is a message passing interface between
  the Hyper-V host and the Linux guest. It uses memory that is
  shared between Hyper-V and the guest, along with various signaling
  mechanisms.
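
Neither hypercalls nor synthetic register accesses are normally
open-coded; the kernel provides wrappers. The sketch below is
illustrative only: it assumes the input page already holds a
properly formatted HVCALL_POST_MESSAGE request, and the wrapper and
constant names shown (hv_do_hypercall(), hv_result_success(),
HV_X64_MSR_GUEST_OS_ID) are the x86/x64 forms, which have varied
somewhat across kernel versions::

    #include <linux/errno.h>
    #include <asm/mshyperv.h>

    /*
     * Sketch: issue an explicit hypercall on x86/x64.
     * hv_do_hypercall() hides the Hyper-V specific calling sequence.
     * 'input' is assumed to be a guest page already formatted as
     * HVCALL_POST_MESSAGE input (an assumption for illustration).
     */
    static int example_hypercall(void *input)
    {
        u64 status = hv_do_hypercall(HVCALL_POST_MESSAGE, input, NULL);

        return hv_result_success(status) ? 0 : -EIO;
    }

    /*
     * Sketch: access a synthetic register. On x86/x64 it appears as
     * an MSR and the architecture's normal MSR instructions work; on
     * arm64 the same register must be reached via a hypercall.
     */
    static void example_synthetic_msr(void)
    {
        u64 guest_os_id;

        rdmsrl(HV_X64_MSR_GUEST_OS_ID, guest_os_id);
        wrmsrl(HV_X64_MSR_GUEST_OS_ID, guest_os_id);
    }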

The first three communication mechanisms are documented in the
`Hyper-V Top Level Functional Spec (TLFS)`_. The TLFS describes
general Hyper-V functionality and provides details on the hypercalls
and synthetic registers. The TLFS is currently written for the
x86/x64 architecture only.

.. _Hyper-V Top Level Functional Spec (TLFS): https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/tlfs

VMbus is not documented. This documentation provides a high-level
overview of VMbus and how it works, but the details can be discerned
only from the code.

Sharing Memory
--------------
Many aspects of communication between Hyper-V and Linux are based
on sharing memory. Such sharing is generally accomplished as
follows (see the sketch after this list):

* Linux allocates memory from its physical address space using
  standard Linux mechanisms.

* Linux tells Hyper-V the guest physical address (GPA) of the
  allocated memory. Many shared areas are kept to 1 page so that a
  single GPA is sufficient. Larger shared areas require a list of
  GPAs, which usually do not need to be contiguous in the guest
  physical address space. How Hyper-V is told about the GPA or list
  of GPAs varies. In some cases, a single GPA is written to a
  synthetic register. In other cases, a GPA or list of GPAs is sent
  in a VMbus message.

* Hyper-V translates the GPAs into "real" physical memory addresses,
  and creates a virtual mapping that it can use to access the
  memory.

* Linux can later revoke sharing it has previously established by
  telling Hyper-V to set the shared GPA to zero.
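
The single-GPA-in-a-synthetic-register case looks roughly like the
sketch below, loosely modeled on the SynIC message page setup in
drivers/hv/hv.c. The accessor names (hv_get_register() and
hv_set_register()) and the register constant have changed across
kernel versions, so treat this as a sketch rather than current API::

    #include <linux/gfp.h>
    #include <linux/io.h>
    #include <asm/mshyperv.h>

    /*
     * Sketch: share one guest page with Hyper-V by writing its
     * 4-Kbyte guest page number to a synthetic register (here the
     * SynIC message page register, HV_REGISTER_SIMP).
     */
    static void *example_share_page(void)
    {
        void *page = (void *)get_zeroed_page(GFP_KERNEL);
        union hv_synic_simp simp;

        if (!page)
            return NULL;

        simp.as_uint64 = hv_get_register(HV_REGISTER_SIMP);
        simp.base_simp_gpa = virt_to_phys(page) >> HV_HYP_PAGE_SHIFT;
        simp.simp_enabled = 1;
        hv_set_register(HV_REGISTER_SIMP, simp.as_uint64);
        return page;
    }

    /* Sketch: revoke the sharing by clearing the GPA and enable bit. */
    static void example_revoke_page(void)
    {
        union hv_synic_simp simp;

        simp.as_uint64 = hv_get_register(HV_REGISTER_SIMP);
        simp.simp_enabled = 0;
        simp.base_simp_gpa = 0;
        hv_set_register(HV_REGISTER_SIMP, simp.as_uint64);
    }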

Hyper-V operates with a page size of 4 Kbytes. GPAs communicated to
Hyper-V may be in the form of page numbers, and always describe a
range of 4 Kbytes. Since the Linux guest page size on x86/x64 is
also 4 Kbytes, the mapping from guest page to Hyper-V page is
1-to-1. On arm64, Hyper-V supports guests with 4/16/64 Kbyte pages
as defined by the arm64 architecture. If Linux is using 16 or 64
Kbyte pages, Linux code must be careful to communicate with Hyper-V
only in terms of 4 Kbyte pages. HV_HYP_PAGE_SIZE and related macros
are used in code that communicates with Hyper-V so that it works
correctly in all configurations.
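
The helpers in include/linux/hyperv.h do these conversions in
HV_HYP_PAGE_SIZE units rather than PAGE_SIZE units. A minimal sketch
of their use (example_hv_page_count() is hypothetical)::

    #include <linux/hyperv.h>

    /*
     * Sketch: how many 4-Kbyte Hyper-V pages are needed to describe
     * a guest buffer. This stays correct when the guest PAGE_SIZE is
     * 16 or 64 Kbytes on arm64, because offset_in_hvpage() and
     * HVPFN_UP() work in HV_HYP_PAGE_SIZE units.
     */
    static unsigned long example_hv_page_count(void *buf, size_t len)
    {
        return HVPFN_UP(offset_in_hvpage(buf) + len);
    }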

As described in the TLFS, a few memory pages shared between Hyper-V
and the Linux guest are "overlay" pages. With overlay pages, Linux
uses the usual approach of allocating guest memory and telling
Hyper-V the GPA of the allocated memory. But Hyper-V then replaces
that physical memory page with a page it has allocated, and the
original physical memory page is no longer accessible in the guest
VM. Linux may access the memory normally as if it were the memory
that it originally allocated. The "overlay" behavior is visible
only because the contents of the page (as seen by Linux) change at
the time that Linux originally establishes the sharing and the
overlay page is inserted. Similarly, the contents change if Linux
revokes the sharing, in which case Hyper-V removes the overlay page,
and the guest page originally allocated by Linux becomes visible
again.

Before Linux does a kexec to a kdump kernel or any other kernel,
memory shared with Hyper-V should be revoked. Hyper-V could modify
a shared page or remove an overlay page after the new kernel is
using the page for a different purpose, corrupting the new kernel.
Hyper-V does not provide a single "unset everything" operation to
guest VMs, so Linux code must individually revoke all sharing before
doing kexec. See hv_kexec_handler() and hv_crash_handler(). But
the crash/panic path still has holes in cleanup because some shared
pages are set using per-CPU synthetic registers and there's no
mechanism to revoke the shared pages for CPUs other than the CPU
running the panic path.
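
The shape of that cleanup is sketched below. hv_kexec_handler() and
hv_crash_handler() are the real entry points; example_revoke_page()
is the hypothetical helper from the Sharing Memory sketch above::

    /*
     * Sketch: before kexec, individually revoke every page shared
     * with Hyper-V, since no single "unset everything" operation
     * exists.
     */
    static void example_kexec_cleanup(void)
    {
        example_revoke_page();
        /*
         * ...and likewise for every other shared or overlay page.
         * Pages set via per-CPU synthetic registers on other CPUs
         * cannot be revoked from the panic path.
         */
    }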

CPU Management
--------------
Hyper-V does not have the ability to hot-add or hot-remove a CPU
from a running VM. However, Windows Server 2019 Hyper-V and
earlier versions may provide guests with ACPI tables that indicate
more CPUs than are actually present in the VM. As is normal, Linux
treats these additional CPUs as potential hot-add CPUs, and reports
them as such even though Hyper-V will never actually hot-add them.
Starting in Windows Server 2022 Hyper-V, the ACPI tables reflect
only the CPUs actually present in the VM, so Linux does not report
any hot-add CPUs.

A Linux guest CPU may be taken offline using the normal Linux
mechanisms, provided no VMbus channel interrupts are assigned to
the CPU. See the section on VMbus Interrupts for more details
on how VMbus channel interrupts can be re-assigned to permit
taking a CPU offline.

32-bit and 64-bit
-----------------
On x86/x64, Hyper-V supports 32-bit and 64-bit guests, and Linux
will build and run in either version. While the 32-bit version is
expected to work, it is used rarely and may suffer from undetected
regressions.

On arm64, Hyper-V supports only 64-bit guests.

Endian-ness
-----------
All communication between Hyper-V and guest VMs uses Little-Endian
format on both x86/x64 and arm64. Big-endian format on arm64 is not
supported by Hyper-V, and Linux code does not use endian-ness macros
when accessing data shared with Hyper-V.

Versioning
----------
Current Linux kernels operate correctly with older versions of
Hyper-V back to Windows Server 2012 Hyper-V. Support for running
on the original Hyper-V release in Windows Server 2008/2008 R2
has been removed.

A Linux guest reports in dmesg the version of Hyper-V it is running
on. This version is in the form of a Windows build number and is
for display purposes only. Linux code does not test this version
number at runtime to determine available features and functionality.
Hyper-V indicates feature/function availability via flags in
synthetic MSRs that Hyper-V provides to the guest, and the guest
code tests these flags.
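
For instance, a driver that wants the Hyper-V crash MSRs tests the
cached feature bits rather than any version number. The ms_hyperv
structure and HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE flag are real;
the wrapper function below is hypothetical::

    #include <asm/mshyperv.h>

    /*
     * Sketch: test a feature flag that Hyper-V exposed to the guest.
     * The kernel caches these flags in ms_hyperv at boot.
     */
    static bool example_have_crash_msrs(void)
    {
        return !!(ms_hyperv.misc_features &
                  HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE);
    }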

VMbus has its own protocol version that is negotiated during the
initial VMbus connection from the guest to Hyper-V. This version
number is also output to dmesg during boot. This version number
is checked in a few places in the code to determine if specific
functionality is present.
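
Such a check looks like the sketch below. vmbus_proto_version and
the VERSION_* constants are real (see include/linux/hyperv.h), but
the feature being gated here is hypothetical::

    #include <linux/hyperv.h>

    /* Sketch: gate optional behavior on the negotiated VMbus version. */
    static bool example_feature_available(void)
    {
        return vmbus_proto_version >= VERSION_WIN10_V4_1;
    }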

Furthermore, each synthetic device on VMbus also has a protocol
version that is separate from the VMbus protocol version. Device
drivers for these synthetic devices typically negotiate the device
protocol version, and may test that protocol version to determine
if specific device functionality is present.

Code Packaging
--------------
Hyper-V related code appears in the Linux kernel code tree in three
main areas:

1. drivers/hv
2. arch/x86/hyperv and arch/arm64/hyperv
3. individual device driver areas such as drivers/scsi, drivers/net,
   drivers/clocksource, etc.

A few miscellaneous files appear elsewhere. See the full list under
"Hyper-V/Azure CORE AND DRIVERS" and "DRM DRIVER FOR HYPERV
SYNTHETIC VIDEO DEVICE" in the MAINTAINERS file.

The code in #1 and #2 is built only when CONFIG_HYPERV is set.
Similarly, the code for most Hyper-V related drivers is built only
when CONFIG_HYPERV is set.

Most Hyper-V related code in #1 and #3 can be built as a module.
The architecture specific code in #2 must be built-in. Also,
drivers/hv/hv_common.c is low-level code that is common across
architectures and must be built-in.