Introduction
============

The goal of this debug feature is to provide a reliable, responsive,
accurate and secure debug capability to developers interested in
debugging MSM subsystem processor images without the use of a hardware
debugger.

The Debug Agent along with the Remote Debug Driver implements a shared
memory based transport mechanism that allows a debugger (e.g. GDB)
running on a host PC to communicate with a remote stub running on
peripheral subsystems such as the ADSP, MODEM etc.

The diagram below depicts, end to end, the components involved in
supporting remote debugging:

:               :
:   HOST (PC)   :                     MSM
:  ,--------,   :   ,-------,
:  |        |   :   | Debug |                         ,--------,
:  |Debugger|<--:-->| Agent |                         | Remote |
:  |        |   :   | App   |                  +----->| Debug  |
:  `--------`   :   |-------|    ,--------,    |      | Stub   |
:               :   | Remote|    |        |<---+      `--------`
:               :   | Debug |<-->|--------|
:               :   | Driver|    |        |<---+      ,--------,
:               :   `-------`    `--------`    |      | Remote |
:               :      LA          Shared      +----->| Debug  |
:               :                  Memory             | Stub   |
:               :                                     `--------`
:               :                               Peripheral Subsystems
:               :                                 (ADSP, MODEM, ...)

Debugger:       Debugger application running on the host PC that
                communicates with the remote stub.
                Examples: GDB, LLDB

Debug Agent:    Software that runs on the Linux Android platform
                and provides connectivity from the MSM to the
                host PC. This involves two portions:
                1) User mode Debug Agent application that discovers
                processes running on the subsystems and creates
                TCP/IP sockets for the host to connect to. In
                addition, it creates an info port that users can
                connect to in order to discover the various
                processes and their corresponding debug ports.

Remote Debug    A character based driver that the Debug Agent uses
Driver:         to transport the payload received from the host to
                the debug stub running on the subsystem processor
                over shared memory, and vice versa.

Shared Memory:  Shared memory from the SMEM pool that is accessible
                from the Applications Processor (AP) and the
                subsystem processors.

Remote Debug    Privileged code that runs in the kernels of the
Stub:           subsystem processors. It receives debug commands
                from the debugger running on the host and acts on
                them. These commands include reading and writing
                registers and memory belonging to the subsystem's
                address space, setting breakpoints, single stepping
                etc.

Hardware description
====================

The Remote Debug Driver interfaces with the Remote Debug stubs
running on the subsystem processors and does not drive or
manage any hardware resources.

Software description
====================

The debugger and the remote stubs use the Remote Serial Protocol (RSP)
to communicate with each other. This protocol is widely used by both
software and hardware debuggers. RSP is an ASCII based protocol
and is used when it is not possible to run a GDB server on the target
under debug.

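To make the ASCII framing concrete, here is a minimal sketch of how an RSP
packet is built: the payload is wrapped as "$<payload>#<checksum>", where the
checksum is the sum of the payload bytes modulo 256, written as two hex
digits. The helper names rsp_checksum() and rsp_frame() are hypothetical and
are not part of the driver or the Debug Agent; escaping of special characters
inside the payload is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Sum of all payload bytes modulo 256, as carried in the RSP trailer. */
static unsigned char rsp_checksum(const char *payload)
{
	unsigned char sum = 0;

	assert(payload != NULL);
	while (*payload)
		sum += (unsigned char)*payload++;
	return sum;
}

/*
 * Frame a payload as "$<payload>#<two hex digits>".
 * Returns the packet length, or -1 if the output buffer is too small.
 * Note: this sketch does not escape '$', '#' or '}' inside the payload.
 */
static int rsp_frame(const char *payload, char *out, size_t out_len)
{
	int n = snprintf(out, out_len, "$%s#%02x", payload,
			 (unsigned int)rsp_checksum(payload));

	if (n < 0 || (size_t)n >= out_len)
		return -1;
	return n;
}
```

For example, the "read registers" command "g" would be framed as "$g#67",
since the ASCII code of 'g' is 0x67.
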
The Debug Agent application along with the Remote Debug Driver
is responsible for establishing a bi-directional connection from
the debugger application running on the host to the remote debug
stub running on a subsystem. The Debug Agent establishes connectivity
to the host PC via TCP/IP sockets.

This feature uses ADB port forwarding to establish connectivity
between the debugger running on the host and the target under debug.

Please note the Debug Agent does not expose HLOS memory to the
remote subsystem processors.

Design
======

Here is the overall flow:

1) When the Debug Agent application starts up, it opens up a shared memory
based transport channel to the various subsystem processor images.

2) The Debug Agent application sends messages across to the remote stubs
to discover the various processes that are running on the subsystem and
creates debug sockets for each of them.

3) Whenever a process running on a subsystem exits, the Debug Agent
is notified by the stub so that the debug port and other resources
can be reclaimed.

4) The Debug Agent uses the services of the Remote Debug Driver to
transport payload from the host debugger to the remote stub and vice versa.

5) Communication between the Remote Debug Driver and the Remote Debug stub
running on the subsystem processor is done over shared memory (see figure).
SMEM services are used to allocate the shared memory that will
be readable and writeable by the AP and the subsystem image under debug.

A separate SMEM allocation takes place for each subsystem processor
involved in remote debugging. The remote stub running on each of the
subsystems allocates an SMEM buffer using a unique identifier so that both
the AP and subsystem get the same physical block of memory. It should be
noted that subsystem images can be restarted at any time.
However, when a subsystem comes back up, its stub uses the same unique
SMEM identifier to allocate the SMEM block. This does not result in a
new allocation; rather, the same block of memory that was allocated during
the first bootup instance is handed back to the stub running on the
subsystem.

An 8KB chunk of shared memory is allocated and used for communication
per subsystem. For multi-process capable subsystems, a 16KB chunk of shared
memory is allocated to allow for simultaneous debugging of more than one
process running on a single subsystem.

The shared memory is used as a circular ring buffer in each direction.
Thus we have a bi-directional shared memory channel between the AP
and a subsystem. We call this SMQ. Each memory channel contains a header,
data and a control mechanism that is used to synchronize reads and writes
of data between the AP and the remote subsystem.

Overall SMQ memory view:
:
:    +------------------------------------------------+
:    | SMEM buffer                                    |
:    |-----------------------+------------------------|
:    |Producer: LA           | Producer: Remote       |
:    |Consumer: Remote       |           subsystem    |
:    |          subsystem    | Consumer: LA           |
:    |                       |                        |
:    |               Producer|                Consumer|
:    +-----------------------+------------------------+
:    |                       |
:    |                       |
:    |                       +--------------------------------------+
:    |                                                              |
:    |                                                              |
:    v                                                              v
:    +--------------------------------------------------------------+
:    |  Header   |      Data       |            Control             |
:    +-----------+---+---+---+-----+----+--+--+-----+---+--+--+-----+
:    |           | b | b | b |     | S  |n |n |     | S |n |n |     |
:    |  Producer | l | l | l |     | M  |o |o |     | M |o |o |     |
:    |    Ver    | o | o | o |     | Q  |d |d |     | Q |d |d |     |
:    |-----------| c | c | c | ... |    |e |e | ... |   |e |e | ... |
:    |           | k | k | k |     | O  |  |  |     | I |  |  |     |
:    |  Consumer |   |   |   |     | u  |0 |1 |     | n |0 |1 |     |
:    |    Ver    | 0 | 1 | 2 |     | t  |  |  |     |   |  |  |     |
:    +-----------+---+---+---+-----+----+--+--+-----+---+--+--+-----+
:                                          |               |
:                                          +               |
:                                                          |
:                                 +------------------------+
:                                 |
:                                 v
:                       +----+----+----+----+
:                       |     SMQ Nodes     |
:                       |----|----|----|----|
:                Node # |  0 |  1 |  2 | ...|
:                       |----|----|----|----|
:Starting Block Index # |  0 |  3 |  8 | ...|
:                       |----|----|----|----|
:           # of blocks |  3 |  5 |  1 | ...|
:                       +----+----+----+----+
:

Header: Contains version numbers for software compatibility to ensure
that both producers and consumers on the AP and subsystems know how to
read from and write to the queue.
Both the producer and consumer versions are 1.
:     +---------+-------------------+
:     | Size    | Field             |
:     +---------+-------------------+
:     | 1 byte  | Producer Version  |
:     +---------+-------------------+
:     | 1 byte  | Consumer Version  |
:     +---------+-------------------+

Data: The data portion contains multiple blocks [0..N] of a fixed size.
The block size SM_BLOCKSIZE is fixed to 128 bytes for header version #1.
Payload sent from the debug agent app is split (if necessary) and placed
in these blocks. The first data block is placed at the next 8 byte aligned
address after the header.

The number of blocks for a given SMEM allocation is derived as follows:
Number of Blocks = ((Total Size - Alignment - Size of Header
                - Size of SMQIn - Size of SMQOut)/(SM_BLOCKSIZE))

The producer maintains a private block map of each of these blocks to
determine which blocks in the queue are in use and which are free.

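The block count formula above can be sketched as follows. This is an
illustration under stated assumptions, not the driver's code: the 2-byte
header and the 16-byte SMQOut and SMQIn sizes come from the field tables in
this section, "Alignment" is taken to be the padding that moves the first
block to the next 8 byte boundary after the header, and "Total Size" is
taken to be the size of one direction of the channel. The names
smq_align_pad() and smq_num_blocks() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SM_BLOCKSIZE    128u	/* block size for header version 1 */
#define SMQ_HDR_SIZE      2u	/* producer + consumer version bytes */
#define SMQ_OUT_SIZE     16u	/* four 4-byte fields */
#define SMQ_IN_SIZE      16u	/* four 4-byte fields */
#define SMQ_DATA_ALIGN    8u	/* first block is 8-byte aligned */

/*
 * Padding after the header so the first block is 8-byte aligned.
 * Assumes the channel base address is itself 8-byte aligned.
 */
static size_t smq_align_pad(void)
{
	return (SMQ_DATA_ALIGN - SMQ_HDR_SIZE % SMQ_DATA_ALIGN) %
		SMQ_DATA_ALIGN;
}

/* Number of whole data blocks that fit, per the formula in the text. */
static size_t smq_num_blocks(size_t total_size)
{
	size_t overhead = smq_align_pad() + SMQ_HDR_SIZE +
			  SMQ_OUT_SIZE + SMQ_IN_SIZE;

	assert(SM_BLOCKSIZE != 0);
	if (total_size < overhead)
		return 0;
	return (total_size - overhead) / SM_BLOCKSIZE;
}
```

Under these assumptions the fixed overhead is 40 bytes, so a 4096 byte
channel would hold 31 blocks. The formula as written does not subtract the
storage used by the node lists, and neither does this sketch.
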
Control:
The control portion contains a list of nodes [0..N] where N is the number
of available data blocks. Each node identifies the data
block indexes that contain a particular debug message to be transferred,
and the number of blocks it took to hold the contents of the message.

Each node has the following structure:
:     +---------+----------------------+
:     | Size    | Field                |
:     +---------+----------------------+
:     | 2 bytes | Starting Block Index |
:     +---------+----------------------+
:     | 2 bytes | Number of Blocks     |
:     +---------+----------------------+

The producer and the consumer update different parts of the control channel
(SMQOut and SMQIn respectively). Each of these control data structures
contains information about the last node that was written / read, and the
actual nodes that were written / read.

SMQOut Structure (R/W by producer, R by consumer):
:     +---------+-------------------+
:     | Size    | Field             |
:     +---------+-------------------+
:     | 4 bytes | Magic Init Number |
:     +---------+-------------------+
:     | 4 bytes | Reset             |
:     +---------+-------------------+
:     | 4 bytes | Last Sent Index   |
:     +---------+-------------------+
:     | 4 bytes | Index Free Read   |
:     +---------+-------------------+

SMQIn Structure (R/W by consumer, R by producer):
:     +---------+-------------------+
:     | Size    | Field             |
:     +---------+-------------------+
:     | 4 bytes | Magic Init Number |
:     +---------+-------------------+
:     | 4 bytes | Reset ACK         |
:     +---------+-------------------+
:     | 4 bytes | Last Read Index   |
:     +---------+-------------------+
:     | 4 bytes | Index Free Write  |
:     +---------+-------------------+

Magic Init Number:
Both SMQ Out and SMQ In initialize this field with a predefined magic
number to indicate that the producer and consumer control blocks
have been fully initialized and hold valid data in the shared memory
control area.
	Producer Magic #: 0xFF00FF01
	Consumer Magic #: 0xFF00FF02

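The field tables and magic numbers above map naturally onto fixed-width C
structures. The sketch below is one possible rendering, assuming natural
alignment with no padding and the same byte order on both processors; the
structure names, field names and the smq_channel_ready() helper are
hypothetical and chosen only to mirror the tables.

```c
#include <assert.h>
#include <stdint.h>

#define SMQ_MAGIC_PRODUCER 0xFF00FF01u
#define SMQ_MAGIC_CONSUMER 0xFF00FF02u

struct smq_hdr {
	uint8_t producer_version;	/* 1 for this layout */
	uint8_t consumer_version;	/* 1 for this layout */
};

struct smq_node {
	uint16_t index_block;		/* Starting Block Index */
	uint16_t num_blocks;		/* Number of Blocks */
};

struct smq_out {			/* R/W by producer, R by consumer */
	uint32_t init;			/* Magic Init Number */
	uint32_t reset;
	uint32_t last_sent_index;
	uint32_t index_free_read;
};

struct smq_in {				/* R/W by consumer, R by producer */
	uint32_t init;			/* Magic Init Number */
	uint32_t reset_ack;
	uint32_t last_read_index;
	uint32_t index_free_write;
};

/* Compile-time checks that the layout matches the sizes in the tables. */
_Static_assert(sizeof(struct smq_hdr) == 2, "header is 2 bytes");
_Static_assert(sizeof(struct smq_node) == 4, "node is 4 bytes");
_Static_assert(sizeof(struct smq_out) == 16, "SMQOut is 16 bytes");
_Static_assert(sizeof(struct smq_in) == 16, "SMQIn is 16 bytes");

/* Nonzero once both sides have published their magic numbers. */
static int smq_channel_ready(const struct smq_out *out,
			     const struct smq_in *in)
{
	assert(out && in);
	return out->init == SMQ_MAGIC_PRODUCER &&
	       in->init == SMQ_MAGIC_CONSUMER;
}
```

Checking both magic numbers before touching the indexes gives either side a
cheap way to detect a peer that has not finished initializing its half of
the control area.
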
SMQ Out's Last Sent Index and Index Free Read:
	Only a producer can write to these indexes and they are updated whenever
there is new payload to be inserted into the SMQ in order to be sent to a
consumer.

The number of blocks required for the SMQ allocation is determined as:
	(payload size + SM_BLOCKSIZE - 1) / SM_BLOCKSIZE

The private block map is searched for a large enough contiguous set of blocks
and the user data is copied into the data blocks.

The starting index of the block(s) just allocated is written to SMQOut's
Last Sent Index. This update keeps track of which index was last written to
and the producer uses it to determine where the next allocation could be done.

On every allocation, a producer updates the Index Free Read from its
collaborating consumer's Index Free Write field (if they are unequal).
This index value indicates that the consumer has read all blocks associated
with an allocation on the SMQ and that the producer can reuse these blocks for
subsequent allocations since this is a circular queue.

At cold boot and restart, these indexes are initialized to zero and all
blocks are marked as available for allocation.

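The allocation steps above (round the payload up to whole blocks, search the
private block map for a contiguous run, later release the run once the
consumer is done with it) can be sketched as a small next-fit allocator. The
map layout, the SMQ_MAX_BLOCKS bound and the function names are assumptions
made for illustration; the text does not specify how the driver stores its
block map.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SM_BLOCKSIZE   128u
#define SMQ_MAX_BLOCKS  64u	/* hypothetical bound for this sketch */

struct smq_block_map {
	unsigned int num_blocks;		/* blocks in this channel */
	unsigned int next;			/* next-fit search start */
	unsigned char used[SMQ_MAX_BLOCKS];	/* 1 if block is in use */
	unsigned int run_len[SMQ_MAX_BLOCKS];	/* length, at run start */
};

/* Blocks needed to hold a payload, rounding up. */
static size_t smq_blocks_for_payload(size_t payload_size)
{
	return (payload_size + SM_BLOCKSIZE - 1) / SM_BLOCKSIZE;
}

/*
 * Next-fit search for n contiguous free blocks, starting where the
 * previous allocation ended. Returns the starting index, or -1.
 */
static int smq_alloc(struct smq_block_map *m, unsigned int n)
{
	unsigned int tried, start, i;

	assert(m && m->num_blocks <= SMQ_MAX_BLOCKS);
	if (n == 0 || n > m->num_blocks)
		return -1;

	for (tried = 0; tried < m->num_blocks; tried++) {
		start = (m->next + tried) % m->num_blocks;
		if (start + n > m->num_blocks)
			continue;	/* a run may not wrap past the end */
		for (i = 0; i < n && !m->used[start + i]; i++)
			;
		if (i == n) {
			memset(&m->used[start], 1, n);
			m->run_len[start] = n;
			m->next = (start + n) % m->num_blocks;
			return (int)start;
		}
	}
	return -1;
}

/* Release the run starting at 'start' (the index the consumer freed). */
static void smq_free(struct smq_block_map *m, unsigned int start)
{
	assert(m && start < m->num_blocks && m->used[start]);
	memset(&m->used[start], 0, m->run_len[start]);
	m->run_len[start] = 0;
}
```

With an 8 block map, allocating 3 and then 5 blocks yields starting indexes
0 and 3, which matches the first two nodes in the SMQ Nodes figure. A
producer following the text would call something like smq_free() when it
sees that the consumer's Index Free Write differs from its own Index Free
Read.
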
SMQ In's Last Read Index and Index Free Write:
	These indexes are written to only by a consumer and are updated whenever
there is new payload to be read from the SMQ. The Last Read Index keeps
track of which index was last read by the consumer and, using this, the
consumer determines where the next read should be done.
After completing a read, Last Read Index is incremented to the
next block index. A consumer updates Index Free Write to the starting
index of an allocation whenever it has completed processing the blocks.
Deferring this update until processing is complete is an optimization: it
avoids an additional copy of data from the queue into a client's data
buffer, because the data in the queue itself can be used.

Once Index Free Write is updated, the collaborating producer (on the next
data allocation) reads the updated Index Free Write value. It then
updates its corresponding SMQ Out's Index Free Read and marks the blocks
associated with that index as available for allocation. At cold boot and
restart, these indexes are initialized to zero.

SMQ Out Reset# and SMQ In Reset ACK#:
	Since subsystems can restart at any time, the data blocks and control
channel can be in an inconsistent state when a producer or consumer comes up.
We use Reset and Reset ACK to manage this. At cold boot, the producer
initializes the Reset# to a known number, e.g. 1. On every subsequent reset
that the producer undergoes, the Reset# is simply incremented by 1. All the
producer indexes are reset.
When the producer notifies the consumer of data availability, the consumer
reads the producer's Reset# and copies it into its SMQ In Reset ACK#
field when they differ. When that occurs, the consumer resets its
indexes to 0.

6) Asynchronous notifications between a producer and consumer are
done using the SMP2P service which is interrupt based.

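The Reset / Reset ACK handshake described under item 5 can be sketched as
two small routines, one per side. This is a simplified model: the structure
and function names are hypothetical, only the fields involved in the
handshake are shown, and the memory barriers and SMP2P notification that a
real shared memory implementation would need are omitted.

```c
#include <assert.h>
#include <stdint.h>

struct smq_out_ctl {		/* producer-owned control fields */
	uint32_t reset;
	uint32_t last_sent_index;
	uint32_t index_free_read;
};

struct smq_in_ctl {		/* consumer-owned control fields */
	uint32_t reset_ack;
	uint32_t last_read_index;
	uint32_t index_free_write;
};

/* Producer: run at cold boot and again after every restart. */
static void smq_producer_reset(struct smq_out_ctl *out, int cold_boot)
{
	assert(out);
	out->reset = cold_boot ? 1 : out->reset + 1;
	out->last_sent_index = 0;
	out->index_free_read = 0;
}

/*
 * Consumer: run when notified of new data. Returns 1 if a producer
 * reset was detected and acknowledged, 0 if nothing changed.
 */
static int smq_consumer_check_reset(const struct smq_out_ctl *out,
				    struct smq_in_ctl *in)
{
	assert(out && in);
	if (in->reset_ack == out->reset)
		return 0;
	in->reset_ack = out->reset;
	in->last_read_index = 0;
	in->index_free_write = 0;
	return 1;
}
```

Because the counter lives in the SMEM block that survives a subsystem
restart, incrementing it is enough for the consumer to notice that its
cached indexes are stale.
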
Power Management
================

None

SMP/multi-core
==============

The driver uses a completion to wake up the Debug Agent client threads.

Security
========

From the perspective of the subsystem, the AP is untrusted. The remote
stubs consult the secure debug fuses to determine whether or not
remote debugging will be enabled at the subsystem.

If the hardware debug fuses indicate that debugging is disabled, the
remote stubs will not be functional on the subsystem. Writes to the
queue will only be done if the driver sees that the remote stub has been
initialized on the subsystem.

Therefore, even if untrusted software running on the AP requests
the services of the Remote Debug Driver and injects RSP messages
into the shared memory buffer, these RSP messages will be discarded and
an appropriate error code will be sent up to the invoking application.

Performance
===========

During operation, the Remote Debug Driver copies RSP messages
asynchronously sent from the host debugger to the remote stub and vice
versa. The debug messages are ASCII based and relatively short
(<25 bytes), and may occasionally reach a maximum of 700 bytes
depending on the command the user requested. Thus we do not
anticipate any major performance impact. Moreover, in a typical
functional debug scenario performance should not be a concern.

Interface
=========

The Remote Debug Driver is a character based device that manages
a piece of shared memory that is used as a bi-directional
single producer/consumer circular queue using a next fit allocator.
Every subsystem has its own shared memory buffer that is managed
like a separate device.

The driver distinguishes each subsystem processor's buffer by
registering a node with a different minor number.

For each subsystem that is supported, the driver exposes a user space
interface through the following node:
    - /dev/rdbg-<subsystem>
    Ex. /dev/rdbg-adsp (for the ADSP subsystem)

The standard open(), close(), read() and write() API set is
implemented.

The open() syscall will fail if a subsystem is not present or supported
by the driver or a shared memory buffer cannot be allocated for the
AP - subsystem communication. It will also fail if the subsystem has
not initialized the queue on its side. Here are the error codes returned
in case a call to open() fails:
ENODEV - memory was not yet allocated for the device
EEXIST - device is already opened
ENOMEM - SMEM allocation failed
ECOMM - Subsystem queue is not yet set up
ENOMEM - Failure to initialize SMQ

read() is a blocking call that will return with the number of bytes written
by the subsystem whenever the subsystem sends it some payload. Here are the
error codes returned in case a call to read() fails:
EINVAL - Invalid input
ENODEV - Device has not been opened yet
ERESTARTSYS - call to wait_for_completion_interruptible is interrupted
ENODATA - call to smq_receive failed

write() attempts to send user mode payload out to the subsystem. It can fail
if the SMQ is full. The number of bytes written is returned back to the user.
Here are the error codes returned in case a call to write() fails:
EINVAL - Invalid input
ECOMM - SMQ send failed

In the close() syscall, the control information state of the SMQ is
initialized to zero thereby preventing any further communication between
the AP and the subsystem. Here is the error code returned in case
a call to close() fails:
ENODEV - device wasn't opened/initialized

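From user space, the open() / write() / read() / close() sequence described
above might be exercised as in the following sketch. It is only an
illustration: rdbg_exchange() is a hypothetical helper, opening the node
with O_RDWR is an assumption, and short writes are not handled. Failures
reach user space through errno, which the helper returns as a negative
value.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Send one request to the stub behind dev_path (for example
 * "/dev/rdbg-adsp") and wait for a reply.
 * Returns the reply length in bytes, or a negative errno value.
 */
static ssize_t rdbg_exchange(const char *dev_path,
			     const void *req, size_t req_len,
			     void *reply, size_t reply_len)
{
	ssize_t n;
	int err;
	int fd;

	assert(dev_path && req && reply);

	fd = open(dev_path, O_RDWR);	/* O_RDWR is an assumption */
	if (fd < 0)
		return -errno;		/* e.g. ENODEV, EEXIST, ECOMM */

	n = write(fd, req, req_len);	/* may fail if the SMQ is full */
	if (n >= 0)
		n = read(fd, reply, reply_len);	/* blocks until data arrives */

	err = errno;			/* save before close() can change it */
	close(fd);
	return n < 0 ? -err : n;
}
```

Since read() blocks until the subsystem sends a payload, a real client
would pair a call like this with its own timeout handling.
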
The Remote Debug Driver uses SMP2P for bi-directional AP to subsystem
notification. Notifications are sent to indicate that there are new
debug messages available for processing. Each subsystem that is
supported will need to add a device tree entry per the usage
specification of the SMP2P driver.

If the remote stub becomes non-operational or the security configuration
on the subsystem does not permit debugging, any messages put in the SMQ will
not be responded to. It is the responsibility of the Debug Agent app and the
host debugger application such as GDB to time out and notify the user that
remote debugging is not available.

Driver parameters
=================

None

Config options
==============

The driver is configured with a device tree entry to map an SMP2P entry
to the device. The SMP2P entry name used is "rdbg". Please see
kernel\Documentation\arm\msm\msm_smp2p.txt for information about the
device tree entry required to configure SMP2P.

The driver uses the SMEM allocation type SMEM_LC_DEBUGGER to allocate memory
for the queue that is used to share data with the subsystems.

Dependencies
============

The Debug Agent driver requires services of SMEM to
allocate shared memory buffers.

SMP2P is used as a bi-directional notification
mechanism between the AP and a subsystem processor.

User space utilities
====================

This driver is meant to be used in conjunction with the user mode
Remote Debug Agent application.

Other
=====

None

Known issues
============

For targets with an external subsystem, we cannot use
shared memory for communication and would have to use the prevailing
transport mechanisms that exist between the AP and the external subsystem.

This driver cannot be leveraged for such targets.

To do
=====

None