Fix: Avoid a hang in scanning Cortex-A of STM32MP15x #1628
Merged
Detailed description
BMD at current main busy-loops (hangs, effectively a DoS) if the user attempts to scan an STM32MP15x device with any of the supported adapters, requiring a BMDA restart or a probe reboot. This behaviour was enabled recently by Cortex-A related PRs. Previously BMD did not attempt to attach, halt, or otherwise touch the CA7 during scan/probe. It took three consecutive days of log analysis from different adapters and gdbserver software, with help and suggestions from Dragonmux, to come up with a working fix.
Our working hypothesis is that even in Engineering Boot, the primary bootloader (BootROM) enables OSLock, which per the Architecture Reference Manual prohibits external debugger access to some registers. OpenOCD clears it, so when working through an adapter that both servers support, I could observe BMDA working fine with this target after running OpenOCD once and before power-cycling the DUT (there is no good SRST/Reset button solution without dropping Vdd+Vddcore, see errata).
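To illustrate the hypothesis, here is a minimal sketch of an OS Lock check-and-clear sequence. It is not the code from this patchset: `fake_core_s`, `dbg_read32`, `dbg_write32` and `os_lock_clear` are hypothetical names, and the register accessors are a software stand-in for the real debug bus. The offsets and bit positions are the ARMv7-A memory-mapped ones for DBGOSLAR and DBGOSLSR as I understand them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ARMv7-A debug register offsets within a core's memory-mapped debug block */
#define DBG_OSLAR      0x300U      /* OS Lock Access Register (write-only) */
#define DBG_OSLSR      0x304U      /* OS Lock Status Register (read-only) */
#define DBG_OSLSR_OSLK (1U << 1U)  /* Set while the OS Lock is held */
#define DBG_OSLAR_KEY  0xc5acce55U /* Writing the key locks, any other value unlocks */

/* Hypothetical stand-in for one core's debug block, so the sketch runs on a host */
typedef struct fake_core {
	bool os_locked;
} fake_core_s;

static uint32_t dbg_read32(const fake_core_s *core, uint32_t offset)
{
	if (offset == DBG_OSLSR)
		return core->os_locked ? DBG_OSLSR_OSLK : 0U;
	return 0U;
}

static void dbg_write32(fake_core_s *core, uint32_t offset, uint32_t value)
{
	if (offset == DBG_OSLAR)
		core->os_locked = value == DBG_OSLAR_KEY;
}

/* Clear the OS Lock if it is held; returns true when the lock is clear on exit */
static bool os_lock_clear(fake_core_s *core)
{
	assert(core != NULL);
	if (dbg_read32(core, DBG_OSLSR) & DBG_OSLSR_OSLK)
		dbg_write32(core, DBG_OSLAR, 0U);
	/* Re-read the status rather than assuming the write took effect */
	return !(dbg_read32(core, DBG_OSLSR) & DBG_OSLSR_OSLK);
}
```

Re-reading the status register after the write lets the caller bail out of a scan instead of spinning on a core whose lock cannot be cleared.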
I tried coding a few different things across cortexa.c and most of them ended up here. Namely, the OSLock check, the HALT_DBG_ENABLE + ITR_ENABLE typo fix, the unused LAR/DSCCR/DSMCR clears, macro definitions for these, and last but not least, my helper routine for printing human-readable names for the set bits of a register, to save developers from the hassle of keeping the Architecture Reference Manual and a calculator open to understand some hex dumps.
Note that because the problem happens in cortexa_probe, earlier than cortexa_attach, I had to move the HDBGen+ITRen writes up there. This is less clean because the user is not guaranteed to attach (so BMD then runs detach cleanup), but I submit just the bug fix for now.
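As a rough illustration of what such a bit-name helper can look like, here is a self-contained sketch. It is not the routine from this patchset: `bit_name_s`, `dscr_bits` and `format_set_bits` are hypothetical names, and the table lists only a handful of DBGDSCR bits (HALTED, RESTARTED, ITRen, HDBGen) at their ARMv7-A positions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* One entry per named bit of a register */
typedef struct bit_name {
	uint32_t mask;
	const char *name;
} bit_name_s;

/* A few DBGDSCR bits (ARMv7-A); deliberately incomplete, for illustration only */
static const bit_name_s dscr_bits[] = {
	{1U << 0U, "HALTED"},
	{1U << 1U, "RESTARTED"},
	{1U << 13U, "ITRen"},
	{1U << 14U, "HDBGen"},
};

/*
 * Write the space-separated names of all bits set in `value` into `buf`.
 * Returns the number of characters written, excluding the terminator.
 * Names that do not fit are dropped whole rather than cut mid-word.
 */
static size_t format_set_bits(
	char *buf, size_t buf_len, uint32_t value, const bit_name_s *table, size_t count)
{
	assert(buf != NULL && buf_len > 0U);
	size_t used = 0U;
	buf[0] = '\0';
	for (size_t i = 0U; i < count; ++i) {
		if (!(value & table[i].mask))
			continue;
		const int written =
			snprintf(buf + used, buf_len - used, "%s%s", used ? " " : "", table[i].name);
		if (written < 0 || (size_t)written >= buf_len - used) {
			buf[used] = '\0'; /* Discard the partially written name */
			break;
		}
		used += (size_t)written;
	}
	return used;
}
```

With this table, formatting the value 0x6001 yields "HALTED ITRen HDBGen", which is easier to read in a debug log than the raw hex.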
This patchset as it stands was successfully tested by me against an STM32MP157D-DK1 with its onboard STLink/V2-1 (SWD-only), and also against custom hardware connected to {STLink/V2 standalone; JLink V8; CMSIS-DAP v1/v2 free-dap on rp2040; finally a BMP(-compatible) blackpill-f411ce}. On all debuggers the hang is no longer seen, GDB attach and run/continue work, and correct general register values are shown. On BMP, the hang in *_scan, which required pressing the probe's reset button, is gone.
I understand this patchset may not be as clean as it could be and is subject to renames and nitpicks, but a) I needed to record a known good working solution in VCS; b) this problem blocks my PR1546, in which the CM4 target support code would otherwise not be reachable anymore. I may agree to rebase away all changes except OSLock, which is the core change, and/or reformat the DSCR-touching stuff.
Users of Cortex-A targets are welcome to test this patchset on their setups in case I introduce a regression for them by fixing the regression I face.
Your checklist for this pull request
make PROBE_HOST=native
make PROBE_HOST=hosted
Closing issues
Not quite fixes but helps #1546.