Notes on vogl limitations
All debuggers have limitations. Most of the time, you don't really know what they are until they pop up at 3am. So here's a list of known limitations (some we'll fix, some are very low priority or just not worth it):
- Right now vogl is only a GL v1-v3.3 core/compat profile tracing/snapshotting framework and toolset, and the UI is a total work in progress.
We do already support a bunch of GL 4.x APIs, all GL 4.x texture target types, and some things like compute shaders and image unit bindings. But there's a ton of GL 4.x stuff not supported yet. GL 4.x support is our next major goal after 3.x support is stable and validated in more apps.
The entire UI is still very, very new. The texture, renderbuffer, and default framebuffer viewers in particular are very basic. We've just added support for viewing traces that have multiple contexts, but this feature is still under construction. There are dragons in there on top of dragons.
So if you are expecting something really slick like PIX then you're in for a bit of a wait. But we're confident the foundation and general approach we're using to build the debugger are reasonably solid and have legs.
Peter Lohrmann is working on bootstrapping the UI. We're currently using it to help us debug the debugger itself (which is progress of sorts), but there's a bunch of work left before you could use it to debug a title.
- We don't support LD_PRELOAD-style tracing on Optimus setups (popular on laptops).
We do support manually loading our tracer (libvogltrace32/64.so) on Optimus, but it's not something we've had the time to test much. To do this, manually load libvogltrace and dlsym() the gliGetProcAddressRAD() function (to be renamed to voglGetProcAddress()).
We would like to support LD_PRELOAD-style tracing on Optimus, but honestly it's challenging enough to do this on vanilla desktop stacks. Once you throw 2 drivers in there, all bets are off with all the tracers I've tried (including apitrace, which just core dumped when I tried it a few months ago). Any help in this area would be great.
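The manual loading route described above could be sketched roughly as follows. This is only an illustration: the helper name load_vogl_get_proc_address() and both typedefs are made up for this example, and the signature of gliGetProcAddressRAD() is an assumption modeled on glXGetProcAddress(), so check vogl's own headers for the real declaration.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* ASSUMPTION: the entry point takes a function name and returns a generic
   function pointer, like glXGetProcAddress(). Not confirmed by the notes above. */
typedef void (*vogl_generic_func)(void);
typedef vogl_generic_func (*vogl_get_proc_address_func)(const unsigned char *name);

/* Hypothetical helper: loads the tracer library at tracer_path and looks up
   gliGetProcAddressRAD(). Returns NULL if either step fails. */
static vogl_get_proc_address_func load_vogl_get_proc_address(const char *tracer_path)
{
    /* dlopen(NULL) would return a handle to the main program, not the tracer. */
    assert(tracer_path != NULL);

    void *lib = dlopen(tracer_path, RTLD_NOW);
    if (lib == NULL)
        return NULL;

    /* Store through a void** so the object-to-function pointer conversion
       stays within what POSIX dlsym() usage allows. */
    vogl_get_proc_address_func get_proc = NULL;
    *(void **)(&get_proc) = dlsym(lib, "gliGetProcAddressRAD");

    /* Keep the library loaded on success; unload it if the symbol is missing. */
    if (get_proc == NULL)
        dlclose(lib);
    return get_proc;
}
```

An app would call this once at startup with the path to libvogltrace32.so or libvogltrace64.so (matching its own bitness) and then resolve its GL entry points through the returned function instead of going to the driver directly. On older glibc versions you may need to link with -ldl.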
- Can't take state snapshots during tracing while any buffers are currently mapped.
We can during replaying now (while trimming or debugging), just not during tracing. This is typically not a problem because almost all apps just map a buffer, poke around inside the mapped region (reading and/or writing with the CPU), then unmap and move on.
Reliably removing this limitation during tracing in all scenarios seems challenging, but we will be adding support for snapshotting during tracing when buffers are completely mapped (which is always the case in the apps we've seen that do this).
- GL 4.x is not supported for full-stream or snapshotting
There's a lot of GL 4.x stuff that will work, but it's not been a priority to support the latest bleeding edge stuff. Almost all shipped GL products we're seeing only use GL 3.x, at best. Interestingly, the biggest/most ported releases tend to use a very conservative set of GL v2/v3.
Cubemap arrays are now supported for snapshotting, but this target type hasn't been tested much yet. Here's the list of texture types we can snapshot: 1D, 2D, RECTANGLE, CUBE_MAP, CUBE_MAP_ARRAY, 1D_ARRAY, 2D_ARRAY, 3D, BUFFER, 2D_MULTISAMPLE, and 2D_MULTISAMPLE_ARRAY. Incomplete textures are OK, but you'll get a warning if you haven't properly set GL_TEXTURE_MAX_LEVEL (which you most definitely should always do because not doing so is unreliable in practice).
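As a rough illustration of setting GL_TEXTURE_MAX_LEVEL properly, the helper below computes the index of the last level in a full mip chain for a 2D texture. The name last_mip_level() is made up for this sketch; the GL calls appear only in a comment because they need a current context.

```c
#include <assert.h>

/* Hypothetical helper: index of the last mip level in a full chain for a
   width x height 2D texture, i.e. floor(log2(max(width, height))). */
static int last_mip_level(int width, int height)
{
    assert(width > 0 && height > 0);

    int largest = width > height ? width : height;
    int level = 0;
    while (largest > 1) {
        largest >>= 1; /* each mip level halves the size, rounding down */
        ++level;
    }
    return level;
}

/* With a GL context current and the texture bound, an app would then do
   something like:
     glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_BASE_LEVEL, 0);
     glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAX_LEVEL, last_mip_level(w, h));
   If the app only ever uploads level 0, it should pass 0 instead. */
```

The point is simply that GL_TEXTURE_MAX_LEVEL should match the levels the app actually defines, rather than being left at its default of 1000.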
- Abuse of GL handles+multiple contexts
Sadly GL handles behave in interesting and obscure ways once you introduce sharelists. So before you delete textures (and most other objects) you should make sure they are not bound on other contexts, otherwise you're going down a direction that you'll probably regret (and that will give vogl headaches). vogl will give you errors on this scenario when you try to snapshot. For example:
Let's say you create a second context that shares with your first context. The app gens a texture (handle=1), binds it on both contexts, calls glTexStorage() to initialize it, then deletes the texture on the 1st context. Everything appears as expected on the 1st context: the texture becomes auto-unbound, glIsTexture() reports false, and I can't retrieve the texture's width anymore (using glGetTexLevelParameteriv()). All nice and neat.
But on the 2nd context, the texture remains bound, glIsTexture() returns false, but I can still retrieve the texture's width. If I call glGenTextures(), handle 1 gets immediately reused, even though it's still bound (as reported by glGet() on GL_TEXTURE_BINDING_2D) and even though I can retrieve texture 1's width. At this point handle 1 means two different things (!) on this specific context, which is most wonderful. If I then rebind texture handle 1 (which was just re-genned) I can no longer retrieve the width.
- Can't snapshot textures after they are deleted (but still bound elsewhere)
We support snapshotting shaders that have been attached to programs and then immediately deleted. We also support snapshotting programs that have been deleted but are still bound. These are pretty common GL patterns. At program link time we make a deep copy of all attached shaders (called the "link time snapshot" in the code), so we can guarantee we can snapshot and recreate the program's actual linked state no matter what the app does with the shaders after linking.
However, there are other scenarios (such as binding a texture to a FBO, then deleting the texture but keeping it bound to the FBO) that we don't fully support for snapshotting. This scenario may never be fully supported: the last time I tried I couldn't query state of deleted (but still bound) textures on at least one driver, and we're not going to deeply shadow all texture state to work around this. Luckily, I've only ever seen this done purposely in one app so far, and the attached texture was not actually used for rendering purposes after the deletion. (They kept it attached to keep their hands on the GPU memory so the driver wouldn't reclaim it.)
vogl will spit out an error and typically try to continue snapshotting when it encounters a handle attached to an object that has been deleted (and we've lost track of). You'll get a handle remap error, because we won't know how to remap the handle from the GL replay domain back into the trace domain. The snapshot may cause the replayer to diverge, though.
- Deleted buffers that are still bound can become unbound while snapshotting
The snapshot system temporarily binds various buffers on each context to snapshot them. If a deleted buffer is bound on a context, it may become permanently unbound after a snapshot (because when we bind a new buffer the old, deleted buffer on that binding point goes away). One workaround is to make sure you always unbind your buffers before deleting them. Longer term, we're probably going to be moving to a system which uses a temporary helper context (that shares lists/objects with each app's context) to avoid manipulating the app's contexts as much as possible.
- During replaying the default (GLX) framebuffer is always 32-bit RGBA, no MSAA, with a 24/8 depth stencil buffer.
On the todo list, but this hasn't been a problem so far. Apps that use MSAA tend to use renderbuffers or maybe MSAA textures, probably because this is more portable (vs. mucking around with the default GLX framebuffer's setup). It's possible for an app replay to diverge if the default framebuffer has a configuration that it didn't have during tracing, but in practice I haven't seen this happen.
- Replay window auto-resizing can be a problem in some apps
Unlike apitrace, we only use a single replay window and resize it as needed. The auto-resize logic can get stuck resizing too much. This problem pops up most often in GLUT/FreeGLUT apps. We can capture/replay them, but the replayer's window code tends to get confused by the GLUT UI window activity. It'll still replay properly, but slowly as the replayer auto-resizes the replay window.
If the window auto-resizes too much, use "-lock_window_dimensions -width X -height Y" on the voglreplay command line to lock the replay window to a fixed size.
We may switch to apitrace-style multiple windows, or maybe pbuffers, to work around this (needs investigation).
- Can't snapshot inside of glBegin/glEnd regions.
We didn't think it was worth the extra complexity to be able to snapshot/restore within glBegin/glEnd sequences, so either snapshot right before or right after the region. (Hey, at least we support snapshotting apps that use glBegin at all!)
- Display list limitations
Display lists can't be recursive, and textures are the only resources that can be bound inside a display list. We do support around 400 APIs inside of display lists. GL display lists are ancient APIs at this point, so I don't think we'll do much more in this area unless a big title from the past uses them. (We do already support Doom3's usage of GL display lists, though.)
- Be careful deleting contexts that share lists with other contexts
We support tracing/replaying/snapshotting/restoring the state of multiple contexts. vogl has the concept of "root" contexts and "sharelist groups". A sharelist group is 2 or more contexts that share objects, and the first context created in this group (that doesn't, and can't, share with anything else) is marked as the "root" context for that group.
vogl can't snapshot state if the "root" context of a sharelist group is destroyed while other leaf contexts are still present. Either snapshot immediately after all the leaf contexts are destroyed, or reorder your context deletions so the root gets killed last. In 99% of cases none of this matters; most apps just delete all their contexts at once or just leak them at exit.
- Forking while tracing
We've encountered problems with this on some apps (mostly Mono ones I think). Needs investigation: we haven't tested it, and it's low priority unless an important enough engine or title does it.
- Try to delete your contexts before exiting
We've got several hooks in there to make sure the trace is properly flushed and closed when apps exit and leak their contexts. These hooks work most of the time, but it's best if you properly tear down your contexts when you exit.
The replayer does support unflushed traces (with no trace archive at the end), but there are no guarantees.
Also, not properly tearing down your contexts before exiting actually makes it very difficult for us to fully flush any in-progress asynchronous PBO readbacks (used for real-time JPEG capturing).
- Driver compat
We've tested the most on NVidia, a moderate amount on AMD, and (unfortunately) very very little on Intel's open source driver so far. (Not purposely - it's just a time limitation.) We mostly ping-pong between NVidia and AMD as driver bugs pop up and we wait for the vendor to provide us with fixes. A developer at LunarG is now helping us get vogl working on Intel's open source driver.
- Program binary gotchas
If you trace a 32-bit app that uses program binaries, on at least 1 driver (NVidia) you must replay using the 32-bit replayer (same for 64-bit). You can forcefully disable the app's usage of program binaries while tracing using --vogl_disable_gl_program_binary. This flag causes the tracer to remove the GL_ARB_get_program_binary extension string, and it'll also force the driver to always fail links with program bins (in case you don't check the string).
We've gone back and forth with always disabling program binaries by default in the tracer, but at the end of the day we take the policy of changing the app's behavior during tracing as little as possible unless you have purposely chosen to override something.
Note program binaries are usually extremely fragile, so traces containing program binaries may only be replayable on the exact driver version you captured them on.
- Can't take a snapshot while tracing if other threads have contexts current
We take the snapshot immediately after the next glXSwapBuffers() call. The tracer will attempt to make each context current on the same thread that calls glXSwapBuffers() so it can take a snapshot, but it won't be able to do this if the app has any helper contexts current on the other thread(s). So don't leave your helper contexts current across swaps if you want to take a snapshot. (We couldn't think of a reliable/robust way around this limitation.)
To snapshot during tracing, write a file named __trigger_capture__ to the app's current directory and the tracer will immediately take a snapshot. You can take as many snapshots as you want while tracing. (Of course, you can't have specified "--vogl_tracefile X" on your command line, which would have put the tracer into full-stream mode.) I'll better document this within a day or so; for now just search the code in vogl_intercept.cpp.
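A minimal sketch of dropping the trigger file from a helper tool, assuming an empty file is enough (the notes above only say the file has to be written, not what it should contain). The function name request_vogl_snapshot() is made up for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: creates an empty __trigger_capture__ file in app_dir,
   which must be the traced app's current directory.
   Returns 0 on success, -1 on failure. */
static int request_vogl_snapshot(const char *app_dir)
{
    assert(app_dir != NULL);

    char path[4096];
    int n = snprintf(path, sizeof(path), "%s/__trigger_capture__", app_dir);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1; /* formatting error, or path too long for the buffer */

    /* ASSUMPTION: the file's contents don't matter, so just create it. */
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;
    return fclose(f) == 0 ? 0 : -1;
}
```

From a shell, creating the file by hand in the app's working directory does the same thing.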
- Replayer whitelist
If the tracer encounters a GL/GLX function that it knows the replayer won't be able to handle, it'll give you an error at that call. The call will be written to the trace as best the tracer can, and the call will go directly to the driver, but the replayer will ignore it (after spitting out an error message). When you exit the traced app, you'll get a list of non-whitelisted funcs that were actually called during tracing. The func whitelist is the union of the APIs contained in two files:
https://github.com/ValveSoftware/vogl/blob/master/glspec/gl_glx_whitelisted_funcs.txt
https://github.com/ValveSoftware/vogl/blob/master/glspec/gl_glx_simple_replay_funcs.txt
You can still try to replay this trace, but it may diverge or horribly fail. To see the full whitelist run the "voglgen" tool with the -debug option in the glspec directory.
Some of the newer GL debug related funcs aren't in the whitelist yet; I'll be adding them very soon.
You'll get warnings if you call GetProcAddress() on GL/GLX functions that are not in the whitelist. This is typically harmless: most apps use GL extension libraries that retrieve the addresses of hundreds to thousands of GL funcs they never actually call.
- Replayer always renders to a window
If your desktop resolution is limited to (say) 1024x768, but the trace wants a 1600x1200 window, it'll play back OK except the default framebuffer's backbuffer will not be fully retrievable. This causes problems for things like the regression test, which wants to read the backbuffer's full contents and compute a CRC. We're guessing pbuffers will solve this.
- Tracing on driver X and playing back (or snapshotting) on driver Y is not well tested
There are probably dragons here. We're testing this more often now that we have a regression test, but this scenario can be challenging. We have successfully played back a number of titles on AMD that were recorded on NVidia, for example. But some diverge for obvious reasons, like Doom3 which uses a number of NVidia extensions missing on AMD.
- If the traced app crashes, or exits without destroying its contexts, the tracer can't save the last ~2 screen captures when using --vogl_dump_jpeg_screenshots or --vogl_dump_png_screenshots
Currently the tracer always uses glReadPixels() with a set of pixel pack buffers to avoid stalling the pipeline when screen capturing is enabled. We couldn't find a reliable way of retrieving the final few frames when the app crashes, or when it fails to destroy its contexts on exit. It seemed cleaner to just not write the final few screenshots vs. risking core dumping the app as we attempt to make GL calls to retrieve the final few frames. We'll eventually add a mode that always does synchronous glReadPixels() if this is important.
- All trace packets during trimming must currently fit into memory
This is not an issue in 64-bit builds, but in 32-bit builds on some traces the replayer's snapshot code can run out of memory if you try to make trims containing large numbers of packets.
- Tracer temporarily requires ~50 MB of free memory while snapshotting
This can be a problem in 32-bit processes (such as Team Fortress 2) that are under extreme memory pressure. We'll be fixing this by better splitting up the binary JSON document serialization process.
- Default framebuffer snapshot problems on AMD
The default framebuffer snapshot code was developed on NVidia and works fine there, but we're having some issues snapshotting the depth/stencil backbuffer on AMD. Sorry - we're looking into it.