=========
Livepatch
=========

This document outlines basic information about kernel livepatching.

.. Table of Contents:

.. contents:: :local:
1. Motivation
=============

There are many situations where users are reluctant to reboot a system. It may
be because their system is performing complex scientific computations or under
heavy load during peak usage. In addition to keeping systems up and running,
users also want to have a stable and secure system. Livepatching gives users
both by allowing function calls to be redirected, so that critical functions
can be fixed without a system reboot.
2. Kprobes, Ftrace, Livepatching
================================

There are multiple mechanisms in the Linux kernel that are directly related
to redirection of code execution; namely: kernel probes, function tracing,
and livepatching:

  - Kernel probes are the most generic. The code can be redirected by
    putting a breakpoint instruction in place of practically any instruction.

  - The function tracer calls the code from a predefined location that is
    close to the function entry point. This location is generated by the
    compiler using the '-pg' gcc option.

  - Livepatching typically needs to redirect the code at the very beginning
    of the function entry before the function parameters or the stack
    are in any way modified.

All three approaches need to modify the existing code at runtime. Therefore
they need to be aware of each other and must not step on each other's toes.
Most of these problems are solved by using the dynamic ftrace framework as
a base. A Kprobe is registered as an ftrace handler when the function entry
is probed, see CONFIG_KPROBES_ON_FTRACE. Also an alternative function from
a live patch is called with the help of a custom ftrace handler. But there are
some limitations, see below.
3. Consistency model
====================

Functions are there for a reason. They take some input parameters, get or
release locks, read, process, and even write some data in a defined way,
and have return values. In other words, each function has defined semantics.

Many fixes do not change the semantics of the modified functions. For
example, they add a NULL pointer or a boundary check, fix a race by adding
a missing memory barrier, or add some locking around a critical section.
Most of these changes are self-contained and the function presents itself
the same way to the rest of the system. In this case, the functions might
be updated independently one by one.

But there are more complex fixes. For example, a patch might change
the ordering of locking in multiple functions at the same time. Or a patch
might exchange the meaning of some temporary structures and update
all the relevant functions. In this case, the affected unit
(thread, whole kernel) needs to start using all the new versions of
the functions at the same time. Also the switch must happen only
when it is safe to do so, e.g. when the affected locks are released
or no data are stored in the modified structures at the moment.

The theory about how to apply new functions in a safe way is rather complex.
The aim is to define a so-called consistency model. It attempts to define
the conditions under which the new implementation can be used so that the
system stays consistent.

Livepatch has a consistency model which is a hybrid of kGraft and
kpatch: it uses kGraft's per-task consistency and syscall barrier
switching combined with kpatch's stack trace switching. There are also
a number of fallback options which make it quite flexible.

Patches are applied on a per-task basis, when the task is deemed safe to
switch over. When a patch is enabled, livepatch enters into a
transition state where tasks are converging to the patched state.
Usually this transition state can complete in a few seconds. The same
sequence occurs when a patch is disabled, except the tasks converge from
the patched state to the unpatched state.

An interrupt handler inherits the patched state of the task it
interrupts. The same is true for forked tasks: the child inherits the
patched state of the parent.

Livepatch uses several complementary approaches to determine when it's
safe to patch tasks:
1. The first and most effective approach is stack checking of sleeping
   tasks. If no affected functions are on the stack of a given task,
   the task is patched. In most cases this will patch most or all of
   the tasks on the first try. Otherwise it'll keep trying
   periodically. This option is only available if the architecture has
   reliable stacks (HAVE_RELIABLE_STACKTRACE).

2. The second approach, if needed, is kernel exit switching. A
   task is switched when it returns to user space from a system call, a
   user space IRQ, or a signal. It's useful in the following cases:

   a) Patching I/O-bound user tasks which are sleeping on an affected
      function. In this case you have to send SIGSTOP and SIGCONT to
      force it to exit the kernel and be patched.
   b) Patching CPU-bound user tasks. If the task is highly CPU-bound
      then it will get patched the next time it gets interrupted by an
      IRQ.

3. For idle "swapper" tasks, since they don't ever exit the kernel, they
   instead have a klp_update_patch_state() call in the idle loop which
   allows them to be patched before the CPU enters the idle state.

   (Note there's not yet such an approach for kthreads.)
Architectures which don't have HAVE_RELIABLE_STACKTRACE rely solely on
the second approach. It's highly likely that some tasks may still be
running with an old version of the function, until that function
returns. In this case you would have to signal the tasks. This
especially applies to kthreads. They may not be woken up and would need
to be forced. See below for more information.

Unless we can come up with another way to patch kthreads, architectures
without HAVE_RELIABLE_STACKTRACE are not considered fully supported by
kernel livepatching.

The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
is in transition. Only a single patch can be in transition at a given
time. A patch can remain in transition indefinitely, if any of the tasks
are stuck in the initial patch state.

A transition can be reversed and effectively canceled by writing the
opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
the transition is in progress. Then all the tasks will attempt to
converge back to the original patch state.

There's also a /proc/<pid>/patch_state file which can be used to
determine which tasks are blocking completion of a patching operation.
If a patch is in transition, this file shows 0 to indicate the task is
unpatched and 1 to indicate it's patched. Otherwise, if no patch is in
transition, it shows -1. Any tasks which are blocking the transition
can be signaled with SIGSTOP and SIGCONT to force them to change their
patched state. This may be harmful to the system though. Sending a fake signal
to all remaining blocking tasks is a better alternative. No proper signal is
actually delivered (there is no data in signal pending structures). Tasks are
interrupted or woken up, and forced to change their patched state. The fake
signal is automatically sent every 15 seconds.
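
The following is only an illustrative user space sketch of how the blocking
tasks could be listed; it is not an existing tool and relies solely on the
/proc/<pid>/patch_state semantics described above::

  #include <ctype.h>
  #include <dirent.h>
  #include <stdio.h>

  /*
   * List tasks that are still unpatched (patch_state == 0) while a patch
   * transition is in progress. Purely illustrative.
   */
  int main(void)
  {
          DIR *proc = opendir("/proc");
          struct dirent *de;
          char path[64];
          FILE *f;
          int state;

          if (!proc)
                  return 1;

          while ((de = readdir(proc))) {
                  if (!isdigit((unsigned char)de->d_name[0]))
                          continue;
                  snprintf(path, sizeof(path), "/proc/%s/patch_state",
                           de->d_name);
                  f = fopen(path, "r");
                  if (!f)
                          continue;
                  if (fscanf(f, "%d", &state) == 1 && state == 0)
                          printf("task %s is still unpatched\n", de->d_name);
                  fclose(f);
          }
          closedir(proc);
          return 0;
  }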
The administrator can also affect a transition through the
/sys/kernel/livepatch/<patch>/force attribute. Writing 1 there clears the
TIF_PATCH_PENDING flag of all tasks and thus forces the tasks to the patched
state. Important note! The force attribute is intended for cases when the
transition gets stuck for a long time because of a blocking task. The
administrator is expected to collect all necessary data (namely stack traces
of such blocking tasks) and request clearance from the patch distributor to
force the transition. Unauthorized usage may cause harm to the system. It
depends on the nature of the patch, which functions are (un)patched, and
which functions the blocking tasks are sleeping in (/proc/<pid>/stack may
help here). Removal (rmmod) of patch modules is permanently disabled when
the force feature is used. It cannot be guaranteed that there is no task
sleeping in such a module. This implies an unbounded reference count if a
patch module is disabled and enabled in a loop.

Moreover, the usage of force may also affect future applications of live
patches and cause even more harm to the system. The administrator should
first consider simply cancelling the transition (see above). If force is
used, a reboot should be planned and no more live patches applied.
3.1 Adding consistency model support to new architectures
----------------------------------------------------------

For adding consistency model support to new architectures, there are a
few options:

1) Add CONFIG_HAVE_RELIABLE_STACKTRACE. This means porting objtool, and
   for non-DWARF unwinders, also making sure there's a way for the stack
   tracing code to detect interrupts on the stack.

2) Alternatively, ensure that every kthread has a call to
   klp_update_patch_state() in a safe location. Kthreads are typically
   in an infinite loop which does some action repeatedly. The safe
   location to switch the kthread's patch state would be at a designated
   point in the loop where there are no locks taken and all data
   structures are in a well-defined state (see the sketch at the end of
   this section).

   The location is clear when using workqueues or the kthread worker
   API. These kthreads process independent actions in a generic loop.

   It's much more complicated with kthreads which have a custom loop.
   There the safe location must be carefully selected on a case-by-case
   basis.

   In that case, arches without HAVE_RELIABLE_STACKTRACE would still be
   able to use the non-stack-checking parts of the consistency model:

   a) patching user tasks when they cross the kernel/user space
      boundary; and

   b) patching kthreads and idle tasks at their designated patch points.

   This option isn't as good as option 1 because it requires signaling
   user tasks and waking kthreads to patch them. But it could still be
   a good backup option for those architectures which don't have
   reliable stack traces yet.
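
For orientation only, a minimal sketch of what option 2 might look like for a
kthread with a custom loop. The kthread function and its helper are
hypothetical; klp_update_patch_state() is the real livepatch hook::

  #include <linux/kthread.h>
  #include <linux/livepatch.h>
  #include <linux/sched.h>

  /* Hypothetical per-iteration work; stands in for the kthread's real job. */
  static void do_one_unit_of_work(void *data)
  {
  }

  static int my_kthread_fn(void *data)
  {
          while (!kthread_should_stop()) {
                  do_one_unit_of_work(data);

                  /*
                   * Designated safe point: no locks are held and all data
                   * structures are in a well-defined state, so this task's
                   * patch state may be switched here.
                   */
                  klp_update_patch_state(current);
          }
          return 0;
  }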
4. Livepatch module
===================

Livepatches are distributed using kernel modules, see
samples/livepatch/livepatch-sample.c.

The module includes a new implementation of the functions that we want
to replace. In addition, it defines some structures describing the
relation between the original and the new implementation. Then there
is code that makes the kernel start using the new code when the livepatch
module is loaded. Also there is code that cleans up before the
livepatch module is removed. All this is explained in more detail in
the next sections.

4.1. New functions
------------------

New versions of functions are typically just copied from the original
sources. A good practice is to add a prefix to the names so that they
can be distinguished from the original ones, e.g. in a backtrace. Also
they can be declared as static because they are not called directly
and do not need global visibility (see the example below).

The patch contains only functions that are really modified. But they
might want to access functions or data from the original source file
that may only be locally accessible. This can be solved by a special
relocation section in the generated livepatch module, see
Documentation/livepatch/module-elf-format.rst for more details.
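
As an illustration, the replacement function in
samples/livepatch/livepatch-sample.c looks roughly like this. It patches
cmdline_proc_show() and follows the naming convention described above::

  #include <linux/seq_file.h>

  /*
   * New implementation of cmdline_proc_show(): static, with a prefixed
   * name. It is never called directly; calls to the original function
   * are redirected here by the livepatch ftrace handler.
   */
  static int livepatch_cmdline_proc_show(struct seq_file *m, void *v)
  {
          seq_printf(m, "%s\n", "this has been live patched");
          return 0;
  }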
4.2. Metadata
-------------

The patch is described by several structures that split the information
into three levels:

  - struct klp_func is defined for each patched function. It describes
    the relation between the original and the new implementation of a
    particular function.

    The structure includes the name, as a string, of the original function.
    The function address is found via kallsyms at runtime.

    Then it includes the address of the new function. It is defined
    directly by assigning the function pointer. Note that the new
    function is typically defined in the same source file.

    As an optional parameter, the symbol position in the kallsyms database
    can be used to disambiguate functions of the same name. This is not the
    absolute position in the database, but rather the order in which the
    symbol is found within a particular object (vmlinux or a kernel module).
    Note that kallsyms allows for searching symbols according to the object
    name.

  - struct klp_object defines an array of patched functions (struct
    klp_func) in the same object, where the object is either vmlinux
    (NULL) or a module name.

    The structure helps to group and handle functions for each object
    together. Note that patched modules might be loaded later than
    the patch itself and the relevant functions might be patched
    only when they are available.

  - struct klp_patch defines an array of patched objects (struct
    klp_object).

    This structure handles all patched functions consistently and
    eventually, synchronously. The whole patch is applied only when all
    patched symbols are found. The only exception is symbols from objects
    (kernel modules) that have not been loaded yet.

For more details on how the patch is applied on a per-task basis,
see the "Consistency model" section. An example of how these structures
are typically filled in follows below.
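
For illustration, the three levels of metadata in
samples/livepatch/livepatch-sample.c look roughly like this. The optional
symbol position (old_sympos) is only mentioned in a comment because the
sample does not need it::

  #include <linux/livepatch.h>
  #include <linux/module.h>

  static struct klp_func funcs[] = {
          {
                  /* name of the original function, resolved via kallsyms */
                  .old_name = "cmdline_proc_show",
                  /* address of the new implementation */
                  .new_func = livepatch_cmdline_proc_show,
                  /* .old_sympos would disambiguate duplicate symbol names */
          }, { }
  };

  static struct klp_object objs[] = {
          {
                  /* NULL name means that the patched object is vmlinux */
                  .funcs = funcs,
          }, { }
  };

  static struct klp_patch patch = {
          .mod = THIS_MODULE,
          .objs = objs,
  };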
5. Livepatch life-cycle
=======================

Livepatching can be described by five basic operations:
loading, enabling, replacing, disabling, removing.

The replacing and the disabling operations are mutually
exclusive. They have the same result for the given patch but
not for the system.

5.1. Loading
------------

The only reasonable way is to enable the patch when the livepatch kernel
module is being loaded. For this, klp_enable_patch() has to be called
in the module_init() callback. There are two main reasons:

First, only the module has easy access to the related struct klp_patch.

Second, the error code might be used to refuse loading the module when
the patch cannot get enabled.
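
A minimal sketch of the corresponding module boilerplate, again following
samples/livepatch/livepatch-sample.c and assuming the struct klp_patch from
the previous example::

  static int livepatch_init(void)
  {
          /* Refuse to load the module if the patch cannot be enabled. */
          return klp_enable_patch(&patch);
  }

  static void livepatch_exit(void)
  {
          /* Nothing to do here; disabling is driven via sysfs. */
  }

  module_init(livepatch_init);
  module_exit(livepatch_exit);
  MODULE_LICENSE("GPL");
  MODULE_INFO(livepatch, "Y");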
5.2. Enabling
-------------

The livepatch gets enabled by calling klp_enable_patch() from
the module_init() callback. The system will start using the new
implementation of the patched functions at this stage.

First, the addresses of the patched functions are found according to their
names. The special relocations, mentioned in the section "New functions",
are applied. The relevant entries are created under
/sys/kernel/livepatch/<name>. The patch is rejected when any of the above
operations fails.

Second, livepatch enters into a transition state where tasks are converging
to the patched state. If an original function is patched for the first
time, a function-specific struct klp_ops is created and a universal
ftrace handler is registered\ [#]_. This stage is indicated by a value of '1'
in /sys/kernel/livepatch/<name>/transition. For more information about
this process, see the "Consistency model" section.

Finally, once all tasks have been patched, the 'transition' value changes
to '0'.

.. [#]

   Note that functions might be patched multiple times. The ftrace handler
   is registered only once for a given function. Further patches just add
   an entry to the list (see field `func_stack`) of the struct klp_ops.
   The right implementation is selected by the ftrace handler, see
   the "Consistency model" section.

   That said, it is highly recommended to use cumulative livepatches
   because they help keep all changes consistent. In this case,
   functions might be patched twice only during the transition period.
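
For orientation, the bookkeeping structure mentioned above looks roughly
like this; it is internal to the livepatch core (see
kernel/livepatch/patch.h) and its exact layout may differ between kernel
versions::

  struct klp_ops {
          struct list_head node;          /* entry in the list of klp_ops */
          struct list_head func_stack;    /* struct klp_func entries patching
                                           * the same original function; the
                                           * ftrace handler picks the top one */
          struct ftrace_ops fops;         /* the registered ftrace handler */
  };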
5.3. Replacing
--------------

All enabled patches might get replaced by a cumulative patch that
has the .replace flag set.

Once the new patch is enabled and the 'transition' finishes, then
all the functions (struct klp_func) associated with the replaced
patches are removed from the corresponding struct klp_ops. Also
the ftrace handler is unregistered and the struct klp_ops is
freed when the related function is not modified by the new patch
and the func_stack list becomes empty.

See Documentation/livepatch/cumulative-patches.rst for more details.
5.4. Disabling
--------------

Enabled patches might get disabled by writing '0' to
/sys/kernel/livepatch/<name>/enabled.

First, livepatch enters into a transition state where tasks are converging
to the unpatched state. The system starts using either the code from
the previously enabled patch or even the original one. This stage is
indicated by a value of '1' in /sys/kernel/livepatch/<name>/transition.
For more information about this process, see the "Consistency model"
section.

Second, once all tasks have been unpatched, the 'transition' value changes
to '0'. All the functions (struct klp_func) associated with the to-be-disabled
patch are removed from the corresponding struct klp_ops. The ftrace handler
is unregistered and the struct klp_ops is freed when the func_stack list
becomes empty.

Third, the sysfs interface is destroyed.
5.5. Removing
-------------

Module removal is only safe when there are no users of functions provided
by the module. This is the reason why the force feature permanently
disables the removal. Only when the system has successfully transitioned
to a new patch state (patched/unpatched) without being forced is it
guaranteed that no task sleeps or runs in the old code.
6. Sysfs
========

Information about the registered patches can be found under
/sys/kernel/livepatch. The patches can be enabled and disabled
by writing there.

The /sys/kernel/livepatch/<patch>/force attribute allows the administrator
to affect a patching operation.

See Documentation/ABI/testing/sysfs-kernel-livepatch for more details.
7. Limitations
==============

The current Livepatch implementation has several limitations:

  - Only functions that can be traced can be patched.

    Livepatch is based on dynamic ftrace. In particular, functions
    implementing ftrace or the livepatch ftrace handler cannot be
    patched. Otherwise, the code would end up in an infinite loop. This
    potential mistake is prevented by marking the problematic functions
    "notrace" (see the sketch at the end of this section).

  - Livepatch works reliably only when the dynamic ftrace location is at
    the very beginning of the function.

    The function needs to be redirected before the stack or the function
    parameters are modified in any way. For example, livepatch requires
    using the -mfentry gcc compiler option on x86_64.

    One exception is the PPC port. It uses relative addressing and TOC.
    Each function has to handle TOC and save LR before it can call
    the ftrace handler. This operation has to be reverted on return.
    Fortunately, the generic ftrace code has the same problem and all
    this is handled on the ftrace level.

  - Kretprobes using the ftrace framework conflict with the patched
    functions.

    Both kretprobes and livepatches use an ftrace handler that modifies
    the return address. The first user wins. Either the probe or the patch
    is rejected when the handler is already in use by the other.

  - Kprobes in the original function are ignored when the code is
    redirected to the new implementation.

    There is work in progress to add warnings about this situation.
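
To illustrate the first limitation above: functions that implement ftrace or
the livepatch ftrace handler are annotated with "notrace" so that they never
become ftrace (and therefore livepatch) targets. A hypothetical example::

  #include <linux/compiler.h>     /* notrace */

  /*
   * Hypothetical helper used by an ftrace handler. Marking it notrace
   * keeps it out of the ftrace call sites, so it can be neither traced
   * nor livepatched, which avoids an infinite recursion.
   */
  static void notrace handler_internal_helper(void)
  {
  }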