Assembler Annotations
=====================

Copyright (c) 2017-2019 Jiri Slaby

This document describes the new macros for annotation of data and code in
assembly. In particular, it contains information about ``SYM_FUNC_START``,
``SYM_FUNC_END``, ``SYM_CODE_START``, and similar.

Rationale
---------

Some code like entries, trampolines, or boot code needs to be written in
assembly. As in C, such code is grouped into functions and accompanied by
data. Standard assemblers do not force users to mark these pieces precisely
as code or data, or even to specify their length. Nevertheless, assemblers
provide developers with such annotations to aid debuggers throughout
assembly. On top of that, developers also want to mark some functions as
*global* in order to be visible outside of their translation units.

Over time, the Linux kernel has adopted macros from various projects (like
``binutils``) to facilitate such annotations. So for historic reasons,
developers have been using ``ENTRY``, ``END``, ``ENDPROC``, and other
annotations in assembly. Because the macros lack documentation, they are used
in wrong contexts at some locations. ``ENTRY`` was intended to denote the
beginning of global symbols (be it data or code). ``END`` was used to mark
the end of data or the end of special functions with a *non-standard* calling
convention. In contrast, ``ENDPROC`` should annotate only the ends of
*standard* functions.

When these macros are used correctly, they help assemblers generate a nice
object with both sizes and types set correctly. For example, the result of
``arch/x86/lib/putuser.S``::

   Num:    Value          Size Type    Bind   Vis      Ndx Name
    25: 0000000000000000    33 FUNC    GLOBAL DEFAULT    1 __put_user_1
    29: 0000000000000030    37 FUNC    GLOBAL DEFAULT    1 __put_user_2
    32: 0000000000000060    36 FUNC    GLOBAL DEFAULT    1 __put_user_4
    35: 0000000000000090    37 FUNC    GLOBAL DEFAULT    1 __put_user_8

This is not only important for debugging purposes. When there are properly
annotated objects like this, tools can be run on them to generate more useful
information. In particular, on properly annotated objects, ``objtool`` can be
run to check and fix the object if needed. Currently, ``objtool`` can report
missing frame pointer setup/destruction in functions. It can also
automatically generate annotations for the ORC unwinder
(Documentation/x86/orc-unwinder.rst)
for most code. Both of these are especially important to support reliable
stack traces which are in turn necessary for kernel live patching
(Documentation/livepatch/livepatch.rst).

Caveat and Discussion
---------------------

As one might realize, there were only three macros previously. That is indeed
insufficient to cover all the combinations of cases:

* standard/non-standard function
* code/data
* global/local symbol

There was a discussion_ and, rather than extending the current ``ENTRY/END*``
macros, it was decided that brand new macros should be introduced::

    So how about using macro names that actually show the purpose, instead
    of importing all the crappy, historic, essentially randomly chosen
    debug symbol macro names from the binutils and older kernels?

.. _discussion: https://lore.kernel.org/r/[email protected]

Macros Description
------------------

The new macros are prefixed with the ``SYM_`` prefix and can be divided into
three main groups:

1. ``SYM_FUNC_*`` -- to annotate C-like functions. This means functions with
   standard C calling conventions. For example, on x86, this means that the
   stack contains a return address at the predefined place and a return from
   the function can happen in a standard way. When frame pointers are enabled,
   save/restore of frame pointer shall happen at the start/end of a function,
   respectively, too.

   Checking tools like ``objtool`` should ensure such marked functions conform
   to these rules. The tools can also easily annotate these functions with
   debugging information (like *ORC data*) automatically.

2. ``SYM_CODE_*`` -- special functions called with a special stack, be it
   interrupt handlers with special stack content, trampolines, or startup
   functions.

   Checking tools mostly skip these functions, but some debug information can
   still be generated automatically. For correct debug data, this code needs
   hints like ``UNWIND_HINT_REGS`` provided by developers.

3. ``SYM_DATA*`` -- obviously data belonging to ``.data`` sections and not to
   ``.text``. Data do not contain instructions, so they have to be treated
   specially by the tools: they should not treat the bytes as instructions,
   nor assign any debug information to them.

Instruction Macros
~~~~~~~~~~~~~~~~~~

This section covers ``SYM_FUNC_*`` and ``SYM_CODE_*`` enumerated above.

``objtool`` requires that all code be contained in an ELF symbol. Symbol
names that have a ``.L`` prefix do not emit symbol table entries. ``.L``
prefixed symbols can be used within a code region, but should be avoided for
denoting a range of code via ``SYM_*_START/END`` annotations.
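
The ``.L`` rule can be illustrated with a minimal sketch. It assumes x86-64
AT&T syntax, and ``copy_bytes`` is a hypothetical function invented for this
example, not a kernel symbol. The ``.L`` labels serve only as jump targets
inside the annotated range, while the range itself is named by a regular
symbol::

    SYM_FUNC_START(copy_bytes)      /* hypothetical: dst=%rdi, src=%rsi, len=%rdx */
            testq   %rdx, %rdx      /* nothing to copy for a zero length */
            jz      .Ldone
    .Lcopy:                         /* .L label: no symbol table entry is emitted */
            movb    (%rsi), %al
            movb    %al, (%rdi)
            incq    %rsi
            incq    %rdi
            decq    %rdx
            jnz     .Lcopy
    .Ldone:
            retq
    SYM_FUNC_END(copy_bytes)

By contrast, opening the range with a ``.L`` prefixed name (for example a
hypothetical ``SYM_CODE_START_LOCAL(.Lcopy_bytes)``) would leave the code
without a symbol table entry, which is the situation the paragraph above
advises against.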

* ``SYM_FUNC_START`` and ``SYM_FUNC_START_LOCAL`` are supposed to be **the
  most frequent markings**. They are used for functions with standard calling
  conventions -- global and local. Like in C, they both align the functions to
  architecture specific ``__ALIGN`` bytes. There are also ``_NOALIGN`` variants
  for special cases where developers do not want this implicit alignment.

  ``SYM_FUNC_START_WEAK`` and ``SYM_FUNC_START_WEAK_NOALIGN`` markings are
  also offered as an assembler counterpart to the *weak* attribute known from
  C.

  All of these **shall** be coupled with ``SYM_FUNC_END``. First, it marks
  the sequence of instructions as a function and records its size in the
  generated object file. Second, it also eases checking and processing such
  object files as the tools can trivially find exact function boundaries.

  So in most cases, developers should write something like in the following
  example, having some asm instructions in between the macros, of course::

    SYM_FUNC_START(memset)
        ... asm insns ...
    SYM_FUNC_END(memset)

  In fact, this kind of annotation corresponds to the now deprecated ``ENTRY``
  and ``ENDPROC`` macros.

* ``SYM_FUNC_ALIAS``, ``SYM_FUNC_ALIAS_LOCAL``, and ``SYM_FUNC_ALIAS_WEAK`` can
  be used to define multiple names for a function. The typical use is::

    SYM_FUNC_START(__memset)
        ... asm insns ...
    SYM_FUNC_END(__memset)
    SYM_FUNC_ALIAS(memset, __memset)

  In this example, one can call ``__memset`` or ``memset`` with the same
  result, except the debug information for the instructions is emitted into
  the object file only once -- for the non-``ALIAS`` case.

* ``SYM_CODE_START`` and ``SYM_CODE_START_LOCAL`` should be used only in
  special cases -- if you know what you are doing. They are used exclusively
  for interrupt handlers and similar where the calling convention is not the C
  one. ``_NOALIGN`` variants exist too. The use is the same as for the ``FUNC``
  category above::

    SYM_CODE_START_LOCAL(bad_put_user)
        ... asm insns ...
    SYM_CODE_END(bad_put_user)

  Again, every ``SYM_CODE_START*`` **shall** be coupled with ``SYM_CODE_END``.

  To some extent, this category corresponds to the deprecated ``ENTRY`` and
  ``END``, except that ``END`` had several other meanings too.

* ``SYM_INNER_LABEL*`` is used to denote a label between some
  ``SYM_{CODE,FUNC}_START`` and ``SYM_{CODE,FUNC}_END``. Such labels are very
  similar to C labels, except they can be made global. An example of use::

    SYM_CODE_START(ftrace_caller)
        /* save_mcount_regs fills in first two parameters */
        ...

    SYM_INNER_LABEL(ftrace_caller_op_ptr, SYM_L_GLOBAL)
        /* Load the ftrace_ops into the 3rd parameter */
        ...

    SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL)
        call ftrace_stub
        ...
        retq
    SYM_CODE_END(ftrace_caller)
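
To make the effect of the instruction macros more concrete, the following is a
rough sketch of the raw assembler directives that the ``__memset``/``memset``
alias example and an inner label might boil down to. It is an approximation
under stated assumptions, not the authoritative expansion: x86 GNU ``as``
syntax is assumed, ``.p2align 4`` merely stands in for the architecture
specific ``__ALIGN``, and the exact directives depend on the architecture and
kernel version::

    /* SYM_FUNC_START(__memset), roughly: */
            .globl  __memset                /* global linkage */
            .p2align 4                      /* assumed stand-in for __ALIGN */
    __memset:
            /* ... asm insns ... */

    /* SYM_FUNC_END(__memset), roughly: */
            .type   __memset, @function     /* Type column becomes FUNC */
            .size   __memset, . - __memset  /* Size column: bytes since the label */

    /* SYM_FUNC_ALIAS(memset, __memset), roughly: */
            .globl  memset
            .set    memset, __memset        /* second name for the same address */

    /* SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL), roughly: */
            .type   ftrace_call, @notype
            .globl  ftrace_call
    ftrace_call:

The ``.type`` and ``.size`` directives are what produce the ``FUNC`` type and
the non-zero sizes in a symbol table listing; without the closing ``*_END``
macro, that information would be missing from the object file.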

Data Macros
~~~~~~~~~~~

Similar to instructions, there are a couple of macros to describe data in the
assembly.

* ``SYM_DATA_START`` and ``SYM_DATA_START_LOCAL`` mark the start of some data
  and shall be used in conjunction with either ``SYM_DATA_END``, or
  ``SYM_DATA_END_LABEL``. The latter also adds a label to the end, so that
  people can use ``lstack`` and (local) ``lstack_end`` in the following
  example::

    SYM_DATA_START_LOCAL(lstack)
        .skip 4096
    SYM_DATA_END_LABEL(lstack, SYM_L_LOCAL, lstack_end)

* ``SYM_DATA`` and ``SYM_DATA_LOCAL`` are variants for simple, mostly one-line
  data::

    SYM_DATA(HEAP,     .long rm_heap)
    SYM_DATA(heap_end, .long rm_stack)

  In the end, they expand to ``SYM_DATA_START`` with ``SYM_DATA_END``
  internally.
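
As a sketch of what the data macros amount to, the one-line ``SYM_DATA`` form
and its longer spelling are shown side by side, followed by an approximate
raw-directive view of the ``lstack`` example. The raw directives are an
assumption for illustration (x86 GNU ``as`` syntax) and may differ between
architectures and kernel versions::

    /* One-line form ... */
    SYM_DATA(HEAP, .long rm_heap)

    /* ... and the equivalent paired form it expands to internally */
    SYM_DATA_START(HEAP)
            .long rm_heap
    SYM_DATA_END(HEAP)

    /* lstack example, roughly, in raw directives (assumed expansion): */
    lstack:                                 /* local: no .globl, no alignment */
            .skip   4096
            .type   lstack_end, @object
    lstack_end:                             /* extra label marking the end */
            .type   lstack, @object         /* Type column becomes OBJECT */
            .size   lstack, . - lstack      /* 4096 bytes */

Because the symbol is typed as an object rather than a function, tools have
what they need to avoid decoding these bytes as instructions.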

Support Macros
~~~~~~~~~~~~~~

All of the above ultimately reduce to some invocation of ``SYM_START``,
``SYM_END``, or ``SYM_ENTRY``. Normally, developers should avoid using
these.

Further, in the above examples, one could see ``SYM_L_LOCAL``. There are also
``SYM_L_GLOBAL`` and ``SYM_L_WEAK``. All are intended to denote linkage of a
symbol marked by them. They are used either in ``_LABEL`` variants of the
earlier macros, or in ``SYM_START``.
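
For illustration only, a direct use of the support macros could look like the
sketch below. The symbol ``my_weak_helper`` is hypothetical, and the alignment
argument ``SYM_A_ALIGN`` is an assumption taken from
``include/linux/linkage.h`` rather than something described in this document.
In practice, ``SYM_FUNC_START_WEAK`` with ``SYM_FUNC_END`` should be preferred
for this case::

    /* hypothetical symbol; shown only to illustrate the argument order */
    SYM_START(my_weak_helper, SYM_L_WEAK, SYM_A_ALIGN)
            ... asm insns ...
    SYM_END(my_weak_helper, SYM_T_FUNC)

Here the second argument of ``SYM_START`` selects the linkage, and the second
argument of ``SYM_END`` selects the symbol type.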

Overriding Macros
~~~~~~~~~~~~~~~~~

Architectures can also override any of the macros in their own
``asm/linkage.h``, including macros specifying the type of a symbol
(``SYM_T_FUNC``, ``SYM_T_OBJECT``, and ``SYM_T_NONE``). As every macro
described in this file is surrounded by ``#ifndef`` + ``#endif``, it is enough
to define the macros differently in the aforementioned architecture-dependent
header.
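
A minimal sketch of such an override follows. It assumes that the generic
header guards each definition with ``#ifndef`` and that the architecture
header is included before the generic definitions. ``ARCH_LANDING_PAD`` is a
hypothetical macro invented for this example, standing for some instruction
an architecture might want at every function entry::

    /* arch/<arch>/include/asm/linkage.h: hypothetical override */
    #define SYM_FUNC_START(name)                            \
            SYM_START(name, SYM_L_GLOBAL, SYM_A_ALIGN)      \
            ARCH_LANDING_PAD

    /* generic header: this definition is skipped when the arch defined it */
    #ifndef SYM_FUNC_START
    #define SYM_FUNC_START(name)                            \
            SYM_START(name, SYM_L_GLOBAL, SYM_A_ALIGN)
    #endif

With this arrangement, assembly sources keep using ``SYM_FUNC_START``
unchanged, and only the architecture header decides what the macro emits.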