Looking at #2041, I noticed there is an alloca use later in the affected code (line 153), and it can be optimized out by reusing the packet buffer.
Since we already check that the requested memory will fit in the packet buffer after encoding, we can safely use the packet buffer itself for the read. Care must be taken, because building the packet afterwards could corrupt the read data (each hex-encoded byte takes 2 bytes in the buffer), but we can work around this by reading only into the upper half of the buffer, which should allow the in-place encoding to be done safely.
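To make the idea concrete, here is a minimal sketch of the upper-half approach. It is not the actual handler code: `PACKET_BUFFER_SIZE`, `packet_buffer`, `fake_target_mem_read` and `read_mem_hex_in_place` are all hypothetical names, and the target read is replaced by a stand-in that fills in a predictable byte pattern.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical size and buffer, standing in for the real GDB packet buffer. */
#define PACKET_BUFFER_SIZE 1024U

static char packet_buffer[PACKET_BUFFER_SIZE];

static const char hex_digits[] = "0123456789abcdef";

/* Stand-in for the real target memory read: byte i is the low 8 bits of addr + i. */
static void fake_target_mem_read(uint8_t *dest, uint32_t addr, size_t len)
{
	for (size_t i = 0; i < len; ++i)
		dest[i] = (uint8_t)(addr + i);
}

/*
 * Read len raw bytes into the upper half of the packet buffer, then hex-encode
 * them in place starting at offset 0. Returns the number of hex characters
 * written, or 0 if the encoded data would not fit.
 */
static size_t read_mem_hex_in_place(uint32_t addr, size_t len)
{
	if (len > PACKET_BUFFER_SIZE / 2U)
		return 0U;

	uint8_t *const raw = (uint8_t *)packet_buffer + PACKET_BUFFER_SIZE / 2U;
	fake_target_mem_read(raw, addr, len);

	for (size_t i = 0; i < len; ++i) {
		/* Load the raw byte first: for the last byte of a full-size read,
		 * the two output characters overwrite the byte being encoded. */
		const uint8_t byte = raw[i];
		packet_buffer[2U * i] = hex_digits[byte >> 4U];
		packet_buffer[2U * i + 1U] = hex_digits[byte & 0x0fU];
	}
	return 2U * len;
}
```

The reason this should be safe: encoding byte i writes to offsets 2i and 2i + 1, which land on raw bytes at or below index i once the write position reaches the upper half, so only bytes that have already been consumed get overwritten.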
If there were a secondary half-sized buffer for raw bytes, I can see how it could be handled easily. But alloca/VLA does not result in heap fragmentation, and I don't have BMF telemetry to confirm stack usage at every alloca callsite; 4096 seemed good enough.
That said, I only aimed to restore functionality before the impending freeze for the release candidate, so I didn't look deeper into this (the entire m-packet handler or adjacent similar handlers). If you can fully get rid of these stack allocations so that, e.g., I could bump the GDB packet buffer (a compile-time setting) from 1024 to 4096 without blowing past the stlinkv3 stack size (which is a USB HS device and benefits from 512-byte bulk packets), then I'd be happy to test it.
But then again, what are the use cases of long block reads (and writes)? I can think of some, like performing RAM and Flash dumps, extracting ETF/MTB contents, otherwise scanning memory, and inspecting large (>512 byte) in-DUT arrays. This is usually automated with the help of an IDE (Eclipse, VSCode, etc.) or (gdb) dump binary memory, not launched by manual commands. The compare-sections and blank-check features process data in BMD without sending the contents to GDB, only checksums or an "is blank" result.
Unrelated: I also looked into implementing m-response RLE compression in BMF, and that needed a second buffer for memcpy/memmove. I can push that branch online if you're interested. Because Hclk > USBFS clk > SWD clk, this can optionally speed up communications at the price of complicating packet logs.
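For reference, a rough sketch of what RLE encoding of a hex response looks like under the GDB remote protocol rules: a run is sent as the character, then `*`, then a count character equal to (additional repeats + 29), and the count characters `#` and `$` must be avoided. This is not the code from my branch; `rle_encode` is a hypothetical name, it writes into a separate output buffer, and it assumes the input is plain hex digits (no `*`, `#` or `$` in the payload).

```c
#include <stddef.h>

/*
 * Run-length encode src into dst following the GDB remote protocol scheme.
 * Returns the number of bytes written, or 0 if dst is too small
 * (an empty input also yields 0).
 */
static size_t rle_encode(const char *src, size_t src_len, char *dst, size_t dst_size)
{
	size_t out = 0;
	size_t i = 0;
	while (i < src_len) {
		/* Measure the run of identical characters starting at i */
		size_t run = 1;
		while (i + run < src_len && src[i + run] == src[i])
			++run;

		/* Repeat count = copies after the first; count + 29 must stay <= 126 */
		size_t repeats = run - 1U;
		if (repeats > 97U)
			repeats = 97U;
		/* Counts 6 and 7 would encode as '#' and '$', so fall back to 5 */
		if (repeats == 6U || repeats == 7U)
			repeats = 5U;

		if (repeats < 3U) {
			/* Too short to be worth encoding: emit one literal character */
			if (out + 1U > dst_size)
				return 0U;
			dst[out++] = src[i];
			i += 1U;
		} else {
			if (out + 3U > dst_size)
				return 0U;
			dst[out++] = src[i];
			dst[out++] = '*';
			dst[out++] = (char)(repeats + 29U);
			i += repeats + 1U;
		}
	}
	return out;
}
```

With this scheme "0000" becomes `0* ` (a space is 32, i.e. 3 extra repeats), and "00000000" becomes `0*"00`, which matches the examples in the GDB documentation. It also shows why packet logs get harder to read.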