-
-
Notifications
You must be signed in to change notification settings - Fork 69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
__debug function performance #282
Comments
I shortly looked at the CPU diagnostics of displaying 100 of your Strings and the debugger mostly waits for the call to be performed in the other process. AFAICT this needs multiple inter-process-communication (devenv.exe <-> msvcmon.exe <-> debuggee.exe) so I suspect there is some inefficient waiting going on. |
Hmmm, well, in its current state it's almost unusable. I think I'm going to have to turn the __debug feature off, because it's just too slow :/ If it's about ipc waiting, then finding any opportunities for batching up requests seems like the way to go... is the code structured in such a way that you can gather the requests rather than resolving them immediately, and then send them all to the debuggee in one batch? That might require a lib compiled into the debuggee offering a function which can receive a bundle of requests from the debugger, and then the debuggee might iterate the bundle and resolve them all in one big go? :/ |
Maybe we could put a function in druntime that's present when building with symbols... at very least, I wouldn't be upset to link a lib. |
The swapped GC is inside a DLL loaded into the target process, so that could contain helper functions. The calls by VS to evaluate locals or expressions are rather fine grained and probably not easily combinable (especially not delayed until more requests come in), but if an enumerator is returned to VS, you can predict that it is pretty likely that it will get enumerated to some extend, so the elements could be evaluated in larger chunks and be cached. |
Yeah, this was my concern. I can a Microsoft API being very event/response based... with an expectation of immediate responses :/
I'm not sure I follow, but I'll take your word for it. Can the debugger accept and retain an enumerator object across requests, and how would that have an IPC advantage? |
On second look, the API does allow to return results asynchronously, so it could allow bundling multiple requests to the target process.
For structs and arrays an enumerator is returned when asked for "children", and VS then calls something similar to On the other hand, calling a function in the process is set up via an abstract stack language that gets compiled into the target process. This currently happens for each evaluation. Maybe it is the compilation that takes most of the time (not the execution) and can be avoided by reusing the same compiled instruction sequence with only the this-pointer exchanged. |
Okay, it sounds like there's multiple promising paths forwards. |
FYI: I tried executing the same "compiled" instructions multiple times, but that won't help speeding up evaluation as each execution takes 40-50 ms. So that idea can be dismissed. Then I added partial experimental support for the asynchronous interface of the debugger functions. As a result each function call executed in the target process is added to a work list provided by the debugger host. The work list gets executed in another thread, but it is still sequential and takes about the same time for each operation. My implementation isn't yet complete but I assume that the advantage of this will be that it doesn't block the UI and operations can be cancelled before completion and you can continue stepping through the program without waiting for all debug windows being updated. Fully supporting that seems realistic, but I expect it's still quite some work to do. I guess that might help for one of the pain points, but execution will still be slow. To avoid that we could reimplement the worklist and combine requests ourselves before sending them to the target process. Not sure how feasible this actually is, though. Anyway, it still needs the same asynchronous evaluation as above, so the changes above would not be lost. A completely different approach could be the implementation of something similar to C++'s natvis definitions, that allows expressions and some conditions on the debugger side and just has to read target process memory. That might get tricky as well especially with respect to templated structs/classes, though. |
Where you say 'full support', you mean batching them and sending them to the debuggee all at once? It's definitely a huge improvement if it stops locking up the UI, even if it still takes a while to populate the window when you stop. The slow stepping speed is definitely a show-stopper. natvis does seem to be very fast, but it's certainly a weird definition language and it's really hard to implement some data structures. Your idea with __debug functions is definitely a far superior idea, if it can be made to work efficiently. |
I meant full support for the asynchronous interface. My partial implementation is currently limited to displaying arrays of structs with __debugOverview definitions, other enumerations (struct members, locals, etc.) are still using synchronous calls. It seems feasible and worthwhile to adapt these similarly, but still a bit of work. I'll try that. Implementing something like natvis requires a larger effort, indeed. The expression evaluation part is there, though. I can imagine evaluating a static __debugOverviewExpr struct member and evaluating that as if it was a watch expression, but it get's more elaborate if control flow with statements is needed. |
If you're spending some time with VisualD at the moment, do you also have any theories why go-to definition only works around 50% of the time? I guess the frontend is bailing out and much of the code has no semantic markup? Is there a way we can get a clear error output in cases where the semantic bails out? auto-complete and goto definition are really flakey now... I guess it's from the change to DMD-FE? |
Actually, just to add to that something I hadn't noticed earlier; I was just playing around trying to identify patterns why completion suggestions weren't working... I half-typed a symbol (an enum key in this case), and then ctrl-space which should auto-complete or pop-up the completion suggestions, but it was doing nothing like usual... and then I got distracted for a moment, and then a good few seconds later the completion suggestions popped up. So, that's encouraging, but what could possibly disrupt the completion from working quickly? It doesn't feel like something that should be involving IPC or any of those latent operations. |
While you edit the code you change the semantics at the current caret location, so the new source has to be evaluated first. At least the current file has to be analyzed semantically again. The dmdserver tries to minimize analyzing imported modules, but not sure how well this works. In my experience, browsing the source with goto definition works quite ok as you don't edit it in that situation, but completion often is rather slow (I also found a few regressions regarding class members and fields). IIRC it is slower than when actually compiling, not sure what's causing that, I'll have to profile that. I have prepared a version of the dmdserver that can log Visual D's requests and responses, maybe that can shed some insight. I want to complete the async debugger interface before the next release, though. |
You can try the new build of MagoNatCC.dll from https://ci.appveyor.com/project/rainers/mago/builds/50930366/artifacts. This uses the async evaluation for __debugOverview calls, but doesn't work correctly in combination with switching the GC. It seems pretty responsive in my tests, but if I failed to handle completion or cancellation requests, it usually freezes the debugging session at some point, though. |
They're all under |
Just like the last time, only this one needs to be replaced: "c:\Program Files\Microsoft Visual Studio\2022\Preview\Common7\Packages\Debugger\MagoNatCC.dll" |
Too bad. Is this one of the places where you would expect one of your custom strings to be displayed? If something else, could you specify more details about its type? |
Ah, classes instead of structs seem broken. I'll take a look... |
There is a new build here: https://ci.appveyor.com/project/rainers/mago/builds/50957984/artifacts that fixes class display and a few other issues. |
…nchronously * mago: fixed switching GC for funtion evaluation and adapted it to dmd 2.109 * dmdserver: improve completion for member variables, code in switch statement * dmdseerver: added option to display type size and alignment in tool tip * added separate version of dbuild.dll linked against Microsoft.Build.CPPTasks.Common for VS 17.12
I fixed several more issues caused by the conversion to the async interface. It is now released as part of https://github.com/dlang/visuald/releases/tag/v1.4.0-rc3 |
So I've been running with this new code for a while now. Not blocking the debugger is certainly a lot better, but it's still too slow to be practical to actually use in a lot of my code. |
That's not the case for me, I can step instructions immediately while the watch/local windows are still showing "busy". Maybe your use case still hits an expression that causes synchronous calls (tuples come to mind). |
Interesting. I haven't seen any cases where the pattern is different. My main issue with tuples is that they always show like an empty struct. Just |
I noticed that I must have been looking mostly at locals or arrays of structs with |
Are there any opportunities to optimise or improve the performance of these __debug functions?
They are extremely slow. I have a custom string type and if a struct contains a few strings, stepping feels really sluggish, and if there's an array anywhere in view; it takes seconds to 10s of seconds each step.
Why are they so slow? It doesn't seem right that setting up the call should be that much trouble? If I can understand the performance characteristics, maybe I can make improvements on the app side...?
I have NOT enabled the "switch GC" option, since I am confident all my __debug functions are @nogc.
The text was updated successfully, but these errors were encountered: