.. SPDX-License-Identifier: GPL-2.0+

.. |u8| replace:: :c:type:`u8 <u8>`
.. |u16| replace:: :c:type:`u16 <u16>`
.. |TYPE| replace:: ``TYPE``
.. |LEN| replace:: ``LEN``
.. |SEQ| replace:: ``SEQ``
.. |SYN| replace:: ``SYN``
.. |NAK| replace:: ``NAK``
.. |ACK| replace:: ``ACK``
.. |DATA| replace:: ``DATA``
.. |DATA_SEQ| replace:: ``DATA_SEQ``
.. |DATA_NSQ| replace:: ``DATA_NSQ``
.. |TC| replace:: ``TC``
.. |TID| replace:: ``TID``
.. |IID| replace:: ``IID``
.. |RQID| replace:: ``RQID``
.. |CID| replace:: ``CID``

===========================
Surface Serial Hub Protocol
===========================

The Surface Serial Hub (SSH) is the central communication interface for the
embedded Surface Aggregator Module controller (SAM or EC), found on newer
Surface generations. We will refer to this protocol and interface as
SAM-over-SSH, as opposed to SAM-over-HID for the older generations.

On Surface devices with SAM-over-SSH, SAM is connected to the host via UART
and defined in ACPI as a device with ID ``MSHW0084``. On these devices,
significant functionality is provided via SAM, including access to battery
and power information and events, thermal read-outs and events, and much
more. For Surface Laptops, keyboard input is handled via HID directed
through SAM; on the Surface Laptop 3 and Surface Book 3, this also includes
touchpad input.

Note that the standard disclaimer for this subsystem also applies to this
document: All of this has been reverse-engineered and may thus be erroneous
and/or incomplete.

All CRCs used in the following are two-byte ``crc_ccitt_false(0xffff, ...)``.
All multi-byte values are little-endian, and there is no implicit padding
between values.

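
As a rough illustration of the CRC convention above, the following Python
sketch re-implements CRC-16/CCITT-FALSE (polynomial ``0x1021``, initial value
``0xffff``, most significant bit first, no final XOR). The Python function
names and argument order are hypothetical and only mirror the kernel helper
of the same name; this is not the code used by the driver.

```python
def crc_ccitt_false(data: bytes, crc: int = 0xFFFF) -> int:
    """CRC-16/CCITT-FALSE over data, starting from the given initial value."""
    for byte in data:
        crc ^= byte << 8                  # feed the next byte into the high bits
        for _ in range(8):
            if crc & 0x8000:              # top bit set: shift and apply polynomial
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def crc_bytes(data: bytes) -> bytes:
    """Hypothetical helper: the CRC as it appears on the wire (little-endian)."""
    return crc_ccitt_false(data).to_bytes(2, "little")
```

With no input bytes the function returns the initial value, so an empty
payload yields ``0xffff``.
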
SSH Packet Protocol: Definitions
================================

The fundamental communication unit of the SSH protocol is a frame
(:c:type:`struct ssh_frame <ssh_frame>`). A frame consists of the following
fields, packed together and in order:

.. flat-table:: SSH Frame
   :widths: 1 1 4
   :header-rows: 1

   * - Field
     - Type
     - Description

   * - |TYPE|
     - |u8|
     - Type identifier of the frame.

   * - |LEN|
     - |u16|
     - Length of the payload associated with the frame.

   * - |SEQ|
     - |u8|
     - Sequence ID (see explanation below).

Each frame structure is followed by a CRC over this structure. The CRC over
the frame structure (|TYPE|, |LEN|, and |SEQ| fields) is placed directly
after the frame structure and before the payload. The payload is followed by
its own CRC (over all payload bytes). If the payload is not present (i.e.
the frame has ``LEN=0``), the CRC of the payload is still present and will
evaluate to ``0xffff``. The |LEN| field does not include any of the CRCs; it
equals the number of bytes between the CRC of the frame and the CRC of the
payload.

Additionally, the following fixed two-byte sequences are used:

.. flat-table:: SSH Byte Sequences
   :widths: 1 1 4
   :header-rows: 1

   * - Name
     - Value
     - Description

   * - |SYN|
     - ``[0xAA, 0x55]``
     - Synchronization bytes.

A message consists of |SYN|, followed by the frame (|TYPE|, |LEN|, |SEQ| and
CRC) and, if specified in the frame (i.e. ``LEN > 0``), the payload bytes.
It always ends with the payload CRC, regardless of whether a payload is
present. The messages corresponding to an exchange are, in part, identified
by having the same sequence ID (|SEQ|), stored inside the frame (more on this
in the next section). The sequence ID is a wrapping counter.

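
The message layout described above can be sketched in Python as follows. The
helper names ``crc16`` and ``build_message`` are hypothetical and exist only
for this illustration; the field order, the little-endian encoding, and the
placement of the two CRCs follow the description in this section.

```python
import struct

SYN = bytes([0xAA, 0x55])  # synchronization bytes


def crc16(data: bytes, crc: int = 0xFFFF) -> int:
    """CRC-16/CCITT-FALSE (polynomial 0x1021, initial value 0xffff)."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc


def build_message(frame_type: int, seq: int, payload: bytes = b"") -> bytes:
    """Assemble SYN, frame, frame CRC, payload, and payload CRC."""
    # TYPE (u8), LEN (u16, little-endian), SEQ (u8), without padding.
    frame = struct.pack("<BHB", frame_type, len(payload), seq & 0xFF)
    return (
        SYN
        + frame
        + struct.pack("<H", crc16(frame))
        + payload
        + struct.pack("<H", crc16(payload))  # 0xffff if the payload is empty
    )
```

For example, ``build_message(0x80, 1, b"\x01\x02\x03")`` produces a 13-byte
message whose |LEN| field is three, while a message without payload is always
ten bytes long and ends in ``0xff 0xff``.
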
A frame can have the following types
(:c:type:`enum ssh_frame_type <ssh_frame_type>`):

.. flat-table:: SSH Frame Types
   :widths: 1 1 4
   :header-rows: 1

   * - Name
     - Value
     - Short Description

   * - |NAK|
     - ``0x04``
     - Sent on error in previously received message.

   * - |ACK|
     - ``0x40``
     - Sent to acknowledge receipt of |DATA| frame.

   * - |DATA_SEQ|
     - ``0x80``
     - Sent to transfer data. Sequenced.

   * - |DATA_NSQ|
     - ``0x00``
     - Same as |DATA_SEQ|, but does not need to be ACKed.

Both |NAK|- and |ACK|-type frames are used to control the flow of messages
and thus do not carry a payload. |DATA_SEQ|- and |DATA_NSQ|-type frames, on
the other hand, must carry a payload. The flow sequence and interaction of
different frame types will be described in more depth in the next section.

SSH Packet Protocol: Flow Sequence
==================================

Each exchange begins with |SYN|, followed by a |DATA_SEQ|- or
|DATA_NSQ|-type frame, followed by its CRC, payload, and payload CRC. In
case of a |DATA_NSQ|-type frame, the exchange is then finished. In case of a
|DATA_SEQ|-type frame, the receiving party has to acknowledge receipt of
the frame by responding with a message containing an |ACK|-type frame with
the same sequence ID as the |DATA| frame. In other words, the sequence ID of
the |ACK| frame specifies the |DATA| frame to be acknowledged. In case of an
error, e.g. an invalid CRC, the receiving party responds with a message
containing a |NAK|-type frame. As the sequence ID of the previous data
frame, for which an error is indicated via the |NAK| frame, cannot be relied
upon, the sequence ID of the |NAK| frame should not be used and is set to
zero. After receipt of a |NAK| frame, the sending party should re-send all
outstanding (non-ACKed) messages.

Sequence IDs are not synchronized between the two parties, meaning that they
are managed independently for each party. Identifying the messages
corresponding to a single exchange thus relies on the sequence ID as well as
the type of the message, and the context. Specifically, the sequence ID is
used to associate an ``ACK`` with its ``DATA_SEQ``-type frame, but not
``DATA_SEQ``- or ``DATA_NSQ``-type frames with other ``DATA``-type frames.

An example exchange might look like this:

::

  tx: -- SYN FRAME(D) CRC(F) PAYLOAD CRC(P) -----------------------------
  rx: ------------------------------------- SYN FRAME(A) CRC(F) CRC(P) --

where both frames have the same sequence ID (``SEQ``). Here, ``FRAME(D)``
indicates a |DATA_SEQ|-type frame, ``FRAME(A)`` an ``ACK``-type frame,
``CRC(F)`` the CRC over the previous frame, ``CRC(P)`` the CRC over the
previous payload. In case of an error, the exchange would look like this:

::

  tx: -- SYN FRAME(D) CRC(F) PAYLOAD CRC(P) -----------------------------
  rx: ------------------------------------- SYN FRAME(N) CRC(F) CRC(P) --

upon which the sender should re-send the message. ``FRAME(N)`` indicates a
|NAK|-type frame. Note that the sequence ID of the |NAK|-type frame is fixed
to zero. For |DATA_NSQ|-type frames, both exchanges are the same:

::

  tx: -- SYN FRAME(DATA_NSQ) CRC(F) PAYLOAD CRC(P) ----------------------
  rx: -------------------------------------------------------------------

Here, an error can be detected, but not corrected or indicated to the
sending party. These exchanges are symmetric, i.e. switching ``rx`` and
``tx`` results again in a valid exchange. Currently, no exchanges longer than
these are known.

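
The receiving side of these exchanges might be sketched in Python as shown
below. All function names are hypothetical. The sketch also makes one
assumption that this document does not settle: a message whose frame CRC is
invalid is answered with a |NAK|, since the frame type cannot be trusted in
that case.

```python
import struct

SYN = b"\xaa\x55"
NAK, ACK, DATA_SEQ, DATA_NSQ = 0x04, 0x40, 0x80, 0x00


def crc16(data: bytes, crc: int = 0xFFFF) -> int:
    """CRC-16/CCITT-FALSE (polynomial 0x1021, initial value 0xffff)."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc


def build_message(frame_type: int, seq: int, payload: bytes = b"") -> bytes:
    """Assemble SYN, frame, frame CRC, payload, and payload CRC."""
    frame = struct.pack("<BHB", frame_type, len(payload), seq)
    return (SYN + frame + struct.pack("<H", crc16(frame))
            + payload + struct.pack("<H", crc16(payload)))


def respond(message: bytes):
    """Return the reply message for a received message, or None."""
    nak = build_message(NAK, 0)  # the sequence ID of a NAK is fixed to zero
    if len(message) < 10 or message[:2] != SYN:
        return nak
    frame = message[2:6]
    if crc16(frame) != struct.unpack_from("<H", message, 6)[0]:
        return nak  # assumption: frame type is unknown here, so always NAK
    frame_type, length, seq = struct.unpack("<BHB", frame)
    payload_ok = (
        len(message) == 10 + length
        and crc16(message[8:8 + length])
        == struct.unpack_from("<H", message, 8 + length)[0]
    )
    if frame_type == DATA_SEQ:
        # Sequenced data is ACKed with the same SEQ, or NAKed on error.
        return build_message(ACK, seq) if payload_ok else nak
    # DATA_NSQ errors cannot be indicated; ACK and NAK are never answered.
    return None
```

A valid |DATA_SEQ| message with ``SEQ=7`` thus yields an |ACK| with ``SEQ=7``,
the same message with a corrupted payload CRC yields a |NAK| with ``SEQ=0``,
and a |DATA_NSQ| message yields no reply at all.
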
Commands: Requests, Responses, and Events
=========================================

Commands are sent as payload inside a data frame. Currently, this is the
only known payload type of |DATA| frames, with a payload-type value of
``0x80`` (:c:type:`SSH_PLD_TYPE_CMD <ssh_payload_type>`).

The command-type payload (:c:type:`struct ssh_command <ssh_command>`)
consists of an eight-byte command structure, followed by optional and
variable-length command data. The length of this optional data is derived
from the frame payload length given in the corresponding frame, i.e. it is
``frame.len - sizeof(struct ssh_command)``. The command struct contains the
following fields, packed together and in order:

.. flat-table:: SSH Command
   :widths: 1 1 4
   :header-rows: 1

   * - Field
     - Type
     - Description

   * - |TYPE|
     - |u8|
     - Type of the payload. For commands always ``0x80``.

   * - |TC|
     - |u8|
     - Target category.

   * - |TID| (out)
     - |u8|
     - Target ID for outgoing (host to EC) commands.

   * - |TID| (in)
     - |u8|
     - Target ID for incoming (EC to host) commands.

   * - |IID|
     - |u8|
     - Instance ID.

   * - |RQID|
     - |u16|
     - Request ID.

   * - |CID|
     - |u8|
     - Command ID.

The command struct and data, in general, do not contain any failure
detection mechanism (e.g. CRCs); this is solely done on the frame level.

Command-type payloads are used by the host to send commands and requests to
the EC as well as by the EC to send responses and events back to the host.
We differentiate between requests (sent by the host), responses (sent by the
EC in response to a request), and events (sent by the EC without a preceding
request).

Commands and events are uniquely identified by their target category
(``TC``) and command ID (``CID``). The target category specifies a general
category for the command (e.g. system in general, vs. battery and AC, vs.
temperature, and so on), while the command ID specifies the command inside
that category. Only the combination of |TC| + |CID| is unique. Additionally,
commands have an instance ID (``IID``), which is used to differentiate
between different sub-devices. For example, ``TC=3`` ``CID=1`` is a
request to get the temperature on a thermal sensor, where |IID| specifies
the respective sensor. If the instance ID is not used, it should be set to
zero. If instance IDs are used, they, in general, start with a value of one,
whereas zero may be used for instance-independent queries, if applicable. A
response to a request should have the same target category, command ID, and
instance ID as the corresponding request.

Responses are matched to their corresponding request via the request ID
(``RQID``) field. This is a 16-bit wrapping counter similar to the sequence
ID on the frames. Note that the sequence IDs of the frames for a
request-response pair do not match. Only the request ID has to match.
On the frame-protocol level, these are two separate exchanges, and they may
even be separated, e.g. by an event being sent after the request but before
the response. Not all commands produce a response, and this is not detectable
by |TC| + |CID|. It is the responsibility of the issuing party to wait for a
response (or signal this to the communication framework, as is done in
SAN/ACPI via the ``SNC`` flag).

Events are identified by unique and reserved request IDs. These IDs should
not be used by the host when sending a new request. They are used on the
host to, first, detect events and, second, match them with a registered
event handler. Request IDs for events are chosen by the host and directed to
the EC when setting up and enabling an event source (via the
enable-event-source request). The EC then uses the specified request ID for
events sent from the respective source. Note that an event should still be
identified by its target category, command ID, and, if applicable, instance
ID, as a single event source can send multiple different event types. In
general, however, a single target category should map to a single reserved
event request ID.

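
The matching rules above might be sketched on the host side as follows. The
``RequestTracker`` class and its methods are hypothetical, and the set of
reserved event request IDs is passed in by the caller because this document
does not specify which values are reserved.

```python
class RequestTracker:
    """Match incoming commands to pending requests or event handlers."""

    def __init__(self, reserved_event_rqids):
        self._reserved = set(reserved_event_rqids)
        self._next = 0        # 16-bit wrapping request ID counter
        self._pending = {}    # rqid -> (tc, cid, iid) of the request
        self._handlers = {}   # reserved rqid -> event handler

    def register_event_handler(self, rqid, handler):
        if rqid not in self._reserved:
            raise ValueError("event request IDs must be reserved")
        self._handlers[rqid] = handler

    def submit(self, tc, cid, iid=0):
        """Pick a free, non-reserved request ID for a new request."""
        for _ in range(0x10000):
            rqid = self._next
            self._next = (self._next + 1) & 0xFFFF
            if rqid not in self._reserved and rqid not in self._pending:
                self._pending[rqid] = (tc, cid, iid)
                return rqid
        raise RuntimeError("no free request ID")

    def dispatch(self, rqid, tc, cid, iid, data):
        """Classify an incoming command as event or response."""
        if rqid in self._reserved:
            handler = self._handlers.get(rqid)
            if handler is None:
                return ("unhandled-event", None)
            return ("event", handler(tc, cid, iid, data))
        # A response must repeat TC, CID, and IID of its request.
        if self._pending.get(rqid) != (tc, cid, iid):
            return ("unexpected", None)
        del self._pending[rqid]
        return ("response", data)
```

With ``RequestTracker(range(1, 33))``, an arbitrary reservation chosen only
for this example, the first request receives request ID zero and the second
one skips the reserved range.
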
Furthermore, requests, responses, and events have an associated target ID
(``TID``). This target ID is split into output (host to EC) and input (EC to
host) fields, with the respective other field (e.g. output field on incoming
messages) set to zero. Two ``TID`` values are known: Primary (``0x01``) and
secondary (``0x02``). In general, the response to a request should have the
same ``TID`` value. However, the field (output vs. input) should be used in
accordance with the direction in which the response is sent (i.e. the input
field, as responses are generally sent from the EC to the host).

Note that, even though requests and events should be uniquely identifiable
by target category and command ID alone, the EC may require specific
target ID and instance ID values to accept a command. A command that is
accepted for ``TID=1``, for example, may not be accepted for ``TID=2``
and vice versa.

Limitations and Observations
============================

The protocol can, in theory, handle up to ``U8_MAX`` frames in parallel,
with up to ``U16_MAX`` pending requests (neglecting request IDs reserved for
events). In practice, however, this is more limited. From our testing
(although via a Python, and thus user-space, program), it seems that the EC
can handle up to four requests (mostly) reliably in parallel at any given
time. With five or more requests in parallel, consistent discarding of
commands (ACKed frame but no command response) has been observed. For five
simultaneous commands, this reproducibly resulted in one command being
dropped and four commands being handled.

However, it has also been noted that, even with three requests in parallel,
occasional frame drops happen. Apart from this, with a limit of three
pending requests, no dropped commands (i.e. command being dropped but frame
carrying command being ACKed) have been observed. In any case, frames (and
possibly also commands) should be re-sent by the host if a certain timeout
is exceeded. This is done by the EC for frames with a timeout of one second,
up to two re-tries (i.e. three transmissions in total). The limit of
re-tries also applies to received NAKs, and, in a worst-case scenario, can
lead to entire messages being dropped.

While parallel operation also seems to work fine for pending data frames as
long as no transmission failures occur, their implementation and handling
seem to depend on the assumption that there is only one non-acknowledged
data frame. In particular, the detection of repeated frames relies on the
last sequence number. This means that, if a frame that has been successfully
received by the EC is sent again, e.g. due to the host not receiving an
|ACK|, the EC will only detect this if it has the sequence ID of the last
frame received by the EC. As an example: When sending two frames with
``SEQ=0`` and ``SEQ=1`` followed by a repetition of ``SEQ=0``, the EC will
not detect the second ``SEQ=0`` frame as a repetition, and will thus execute
the command in this frame each time it has been received, i.e. twice in this
example. When sending ``SEQ=0``, ``SEQ=1`` and then repeating ``SEQ=1``, the
EC will detect the second ``SEQ=1`` as a repetition of the first one and
ignore it, thus executing the contained command only once.

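
The two examples above can be reproduced with a small Python model. It is
only a sketch of the behavior inferred from these observations, not the
actual EC firmware logic, and the class name is hypothetical.

```python
class LastSeqReceiver:
    """Model a receiver that remembers only the last sequence ID."""

    def __init__(self):
        self.last_seq = None   # sequence ID of the last accepted frame
        self.executed = []     # commands that were actually executed

    def receive(self, seq, command):
        """Return True if the command was executed, False if ignored."""
        if seq == self.last_seq:
            return False       # recognized as a repetition, not executed
        self.last_seq = seq
        self.executed.append(command)
        return True


# SEQ=0, SEQ=1, SEQ=0 again: the repetition goes unnoticed.
first = LastSeqReceiver()
for seq, command in [(0, "a"), (1, "b"), (0, "a")]:
    first.receive(seq, command)

# SEQ=0, SEQ=1, SEQ=1 again: the repetition is detected and ignored.
second = LastSeqReceiver()
for seq, command in [(0, "a"), (1, "b"), (1, "b")]:
    second.receive(seq, command)
```

In the first run, command ``a`` ends up in ``executed`` twice; in the second
run, each command appears exactly once.
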
In conclusion, this suggests a limit of at most one pending un-ACKed frame
(per party, effectively leading to synchronous communication regarding
frames) and at most three pending commands. The limit to synchronous frame
transfers seems to be consistent with behavior observed on Windows.