-
Notifications
You must be signed in to change notification settings - Fork 573
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add library to cache decode instr info in analysis tools #7113
Labels
Comments
abhinav92003
added a commit
that referenced
this issue
Dec 9, 2024
Adds a new library to cache information about decoded instructions. This can be used by analysis tools that need to decode the instr encodings in the trace. The library allows the tools to specify what information they need to cache. Refactors the invariant checker tool to use this library. Issue: #7113
abhinav92003
added a commit
that referenced
this issue
Dec 11, 2024
Moves the module mapper and related functionality into raw2trace_shared. This is in preparation for further changes that include the module mapper logic in a new instr_decode_cache_t. To avoid having to pull in the whole drmemtrace_raw2trace implementation into the instr decode cache library, which will end up including it in the invariant checker and everything that depends on, we separate out the module mapper into raw2trace_shared which is intended for such cases. Issue: #7113
abhinav92003
added a commit
that referenced
this issue
Dec 11, 2024
This is in preparation for further changes that include the module mapper logic in a new library (drmemtrace_instr_decode_cache) that provides an instr decode cache (instr_decode_cache_t). To avoid having to pull in the whole drmemtrace_raw2trace implementation into drmemtrace_instr_decode_cache, which will end up including it in the invariant checker and everything that depends on, we separate out the module mapper into raw2trace_shared which is intended for such cases. Issue: #7113
abhinav92003
added a commit
that referenced
this issue
Dec 12, 2024
…7123) Moves the module mapper and related functionality into raw2trace_shared. This is in preparation for further changes that include the module mapper logic in a new library (drmemtrace_instr_decode_cache) that provides an instr decode cache (instr_decode_cache_t). To avoid having to pull in the whole drmemtrace_raw2trace implementation into drmemtrace_instr_decode_cache, which will end up including it in the invariant checker and everything that depends on, we separate out the module mapper into raw2trace_shared which is intended for such cases. This is a pure code move, no changes were made to logic or style. Issue: #7113
abhinav92003
added a commit
that referenced
this issue
Dec 15, 2024
Moves the module read functionality into raw2trace_shared. Refactors opcode_mix and view to use the new function instead of creating a raw2trace_directory_t. This is in preparation for further changes that include the module read logic in a new library (drmemtrace_decode_cache) that provides an instr decode cache (decode_cache_t). To avoid having to pull in the whole drmemtrace_raw2trace implementation into drmemtrace_decode_cache, which will end up including it in the invariant checker and everything that depends on, we separate out the module read logic into raw2trace_shared which is intended for such cases. While the raw2trace and raw2trace_shared implementation usually is made to deal with only file streams, we make an exception here for read_module_file() which uses DR file APIs instead. This is likely fine because this function is not intended for use with code that wants file streams, and it is better than other alternatives: moving raw2trace-using functionality out of raw2trace_directory_t so we can use raw2trace_directory_t in decode_cache_t (this would still add zlib/lz4 etc unnecessary file libs to drmemtrace_decode_cache), create yet another raw2trace_shared_non_stream.cpp for shared code that uses DR file APIs, or duplicating the read_module_file impl in decode_cache_t. This also removes raw2trace_directory_t::initialize_module_file() as that functionality is now provided by read_module_file in raw2trace_shared. Issue: #7113
abhinav92003
added a commit
that referenced
this issue
Jan 24, 2025
Adds a new drmemtrace_decode_cache library to cache information about decoded instructions using decode_cache_t. This can be used by analysis tools that need to decode the instr encodings in the trace, to avoid overhead of redundant decodes which can get expensive, and also to avoid duplication of various related logic. The library allows the tools to decide what information they need to cache and add implementation for how to obtain it. Also, it uses instr_noalloc_t when possible to reduce heap usage and allocation/deallocation overhead. If the trace does not include embedded encodings or if the user wants to get encodings from the app binaries using module_mapper_t instead, they can provide the module file path to the init API on the decode_cache_t object. decode_cache_t keeps a single initialized module_mapper_t at any time, which is shared between all decode_cache_t objects (even the ones of different template types); this is done by tracking the count of active objects that use the module mapper. decode_cache_t provides the clear_cache() API which can be used in parallel_shard_exit() to keep memory consumption in check by freeing up cached decode info that may not be needed for result computation in later print_results() which has to wait until all shards are done. Refactors the invariant checker and opcode mix tools to use this library. Modifies add_encodings_to_memrefs to support a mode where encodings are not set in the generated test memrefs but only the instr addr and size fields are set. Makes the opcode cache in opcode_mix_t per-shard instead of per-worker. Decode info must not be cached per-worker as that may cause stale encodings for non-first shards processed by the worker. This means the worker init and worker exit APIs can be removed now from opcode_mix_t. Adds decode_cache_test and opcode_mix_test unit tests that verify operation of the decode_cache_t. Issue: #7113
abhinav92003
added a commit
that referenced
this issue
Jan 27, 2025
Refactors the view_t to use the new decode_cache_t instead of duplicating the logic. Modifies decode_info_base_t::set_decode_info_derived to allow it to return the error string. This is useful for cases where the decode_info_t object encountered some error during its operation, which is especially possible when used with a decode_cache_t with include_decoded_instr_=false in which case the decode_info_t also performs the decoding on its own. Adds new tests to decode_cache_test to verify some related cases. Issue: #7113
abhinav92003
added a commit
that referenced
this issue
Jan 27, 2025
Fixes the opcode_mix tool to get the filetype from the memtrace_stream_t interface instead of the TRACE_MARKER_TYPE_FILETYPE marker memref which may not be seen if -skip_instrs is applied by the user. Adds a new unit test that runs the opcode_mix tool with -skip_instrs, which would fail because of an unseen filetype prior to this fix. Issue: #7224, #7113
abhinav92003
added a commit
that referenced
this issue
Jan 28, 2025
Fixes the opcode_mix tool to get the filetype from the memtrace_stream_t interface instead of the TRACE_MARKER_TYPE_FILETYPE marker memref which may not be seen if -skip_instrs is applied by the user. Modifies opcode_mix_t to use initialize_stream instead of initialize, to get the serial stream. Adds a new unit test that runs the opcode_mix tool with -skip_instrs, which would fail because of an unseen filetype prior to this fix. Augments decode_cache_t::init() documentation to suggest a reliable way to obtain the filetype. Issue: #7224, #7113
abhinav92003
added a commit
that referenced
this issue
Jan 30, 2025
Moves the check that compares the current build enviroment with the arch bit of the trace file to decode_cache_t::init. Due to some complexities in how build_target_arch_type() is declared, it had to be wrapped by a new API in decode_cache_base_t. build_target_arch_type() is defined conditionally in trace_entry.h only if dr_defines.h has already defined certain symbols. It is hard to ensure that for all builds, so we ensure it just for decode_cache.cpp which is packaged as a separate lib unit drmemtrace_decode_cache. Issue: #7113
abhinav92003
added a commit
that referenced
this issue
Jan 30, 2025
Refactors the view_t tool to use the new decode_cache_t instead of duplicating the instr decoding logic. Modifies decode_info_base_t::set_decode_info_derived to allow it to return the error string. This is useful for cases where there was an error during its operation, which is especially possible when used with a decode_cache_t with include_decoded_instr_ = false in which case the decode_info_t also performs the decoding on its own. Adds new tests to decode_cache_test to verify some related cases. Not adding a new release note for the API signature change as there was no intervening release since it was initially added. Also fixes the view_t logic to get filetype, which should be obtained from the memtrace_stream_t object instead of TRACE_MARKER_TYPE_FILETYPE to handle cases where the filetype marker is not seen by the tool because of analyzer options like -skip_instrs that skip over a certain initial part of the trace. Issue: #7113, #7224
abhinav92003
added a commit
that referenced
this issue
Jan 30, 2025
Moves the check that compares the current build enviroment with the arch bit of the trace file to decode_cache_t::init. Due to some complexities in how build_target_arch_type() is declared in trace_entry.h, it had to be wrapped by a new API in decode_cache_base_t. build_target_arch_type() is defined conditionally in trace_entry.h only if dr_defines.h has already defined certain symbols. It is hard to ensure that for all builds dr_defines.h is included before trace_entry.h, so we ensure it just for decode_cache.cpp which is packaged as a separate lib unit drmemtrace_decode_cache. This also means that the new API could not be in decode_cache_t which is a template class that has to be defined in the header. Issue: #7113, #7236
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Various analyzers require decoding instr encodings, e.g. opcode_mix and invariant_checker. Some want to remember the full decoded instr_t, whereas others want to record only specific information about the decoded instr. It would be nice to have a shared library to obviate the need for new analysis tools to re-implement the logic.
The text was updated successfully, but these errors were encountered: