Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use of alloca in gdb m-packet handling can be optimized out #2042

Open
perigoso opened this issue Jan 6, 2025 · 2 comments
Open

Use of alloca in gdb m-packet handling can be optimized out #2042

perigoso opened this issue Jan 6, 2025 · 2 comments
Labels
Enhancement General project improvement GDB Issue/PR related to GDB

Comments

@perigoso
Copy link
Contributor

perigoso commented Jan 6, 2025

Looking at #2041 I noticed there is a alloca use later in the affected code (Line 153), and it can be optimized out by reusing the packet buffer.

Since we check previously that the requested memory will fit in the packet buffer after encoding we can use the packet buffer itself for the read safely. Care must be taken because later building the packet could possibly corrupt the read data (because each hex encoded byte will take 2 bytes in the buffer), but we can work around this by only using the upper half of the buffer, which should allow the encoding in place to be done safely.

@ALTracer
Copy link
Contributor

ALTracer commented Jan 6, 2025

If there was a secondary half-sized buffer for raw bytes, then I see how it could be handled easily. But alloca/VLA does not result in on-heap fragmentation, and I don't have BMF telemetry to confirm stack usage at every alloca callsite -- 4096 seemed good enough.

That said, I only aimed to restore functionality before the impending freeze for release-candidate, so I didn't look deeper in this (entire m-packet handler or adjacent similar handlers). If you can fully get rid of these stack allocations so that e.g. I could bump GDB packet buffer (compile-time setting) from 1024 to 4096 without blowing past stlinkv3 stack size (which is a USB HS device and benefits from 512 byte bulk packets) then I'd be happy to test it.

But then again, what are the usecases of Long block reads (and writes)? I can think of some, like performing RAM and Flash dumps, extracting ETF/MTB contents, scanning memory otherwise, and inspecting large (>512 byte) in-DUT arrays. This is usually automated with help of Eclipse/VSCode/etc. IDE or (gdb) dump binary memory, not launched by manual commands. compare-sections and blank-check features are processing data in BMD without sending contents to GDB, only checksums/"Is blank".

Unrelated, I also looked into trying to implement m-response RLE compression in BMF and that needed a second buffer for memcpy/memmove. I can push that branch online if you're interested. Because Hclk > USBFS clk > SWD clk, this can optionally speed up communications with the price of complicating packet logs.

@dragonmux
Copy link
Member

Sounds good to us perigoso! Really like this plan as a way for getting rid of a rather nasty alloca() and getting some more memory safety.

@dragonmux dragonmux added Enhancement General project improvement GDB Issue/PR related to GDB labels Jan 8, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement General project improvement GDB Issue/PR related to GDB
Projects
None yet
Development

No branches or pull requests

3 participants