All notable changes to this project will be documented in this file.
This is the combined CHANGELOG for all packages: iai-callgrind, iai-callgrind-runner and iai-callgrind-macros. iai-callgrind and iai-callgrind-runner use the same version, which is the version used here. iai-callgrind-macros uses a different version number but is not a standalone package, so its changes are also listed here.
The format is based on Keep a Changelog and this project adheres to Semantic Versioning.
- (#160): Add --separate-targets (env: IAI_CALLGRIND_SEPARATE_TARGETS). Using this option causes the compilation target to be included in the iai-callgrind output directory tree to mitigate issues when running benchmarks on multiple targets. For example, instead of having all output files under target/iai, using this option puts all files under the directory target/iai/x86_64-unknown-linux-gnu if running the benchmarks on the x86_64-unknown-linux-gnu target.
- (#188): Add the option --home (env: IAI_CALLGRIND_HOME) to be able to change the default home directory target/iai.
- (#192): The #[bench] attribute now accepts a setup parameter similarly to the #[benches] attribute. The #[bench] and #[benches] attributes accept a new teardown parameter. The teardown function is called with the return value of the benchmark function. The #[library_benchmark] attribute now accepts global setup and teardown parameters which are applied to all following #[bench] and #[benches] attributes if they don't specify one of these parameters themselves.
- (#194): Add the --nocapture (env: IAI_CALLGRIND_NOCAPTURE) option to tell iai-callgrind to not capture callgrind terminal output of benchmark functions. For all possible values see the README.
- (#201): Add support for generic benchmark functions fixing #198 (Generic bench arguments cause compilation failure).
- Update locked dependencies: syn -> 2.0.72, cc -> 1.1.5, serde -> 1.0.204
- Update minimal version of syn -> 2.0.32
- (#201): The BinaryBenchmarkConfig::entry_point and Run::entry_point functions now use glob patterns as argument with * as placeholder for any amount of characters.
- (#203): Improve error messages during the initialization phase of the iai-callgrind-runner, get rid of a lot of unwraps and include a solution hint. These errors mainly happen if the iai-callgrind library has a different version than the iai-callgrind-runner binary.
- (#192): Fix a wrongly issued compiler error when the setup parameter was specified before the args parameter and the number of elements of the args parameter did not match the number of arguments of the benchmark function.
- (#192): Fix the error span of wrong user supplied argument types or wrong number of arguments. The compiler errors now point to the exact location of any wrong arguments instead of the generic call-site of the #[library_benchmark] attribute. If there is a setup function involved, we leave it to the rust compiler to point to the location of the setup function and the wrong arguments.
- (#169): Clarify documentation about the scope of uniqueness of benchmark ids. Thanks to @peter-kehl
- (#175): Mark iai-callgrind build dependencies required only by the client_requests_defs feature as optional. Solve cargo's --check-cfg warnings if the currently active rust version is >= 1.80.0. Thanks to @DaniPopes
- Update some locked dependencies
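The setup and teardown parameters described for #192 above can be pictured as plain function composition. The following is a minimal sketch in ordinary Rust, not iai-callgrind's implementation: run_bench, make_input, sum and check are hypothetical names chosen for illustration. It only shows the order of calls, with teardown receiving the return value of the benchmark function.

```rust
/// Hypothetical driver (an assumption for illustration, not part of iai-callgrind):
/// `setup` turns the arguments into the benchmark input, `bench` is the
/// benchmark function and `teardown` is called with the value `bench` returns.
fn run_bench<A, I, O, R>(
    args: A,
    setup: impl FnOnce(A) -> I,
    bench: impl FnOnce(I) -> O,
    teardown: impl FnOnce(O) -> R,
) -> R {
    let input = setup(args);
    let output = bench(input);
    teardown(output)
}

/// Hypothetical setup function: builds the input from the argument.
fn make_input(len: usize) -> Vec<u64> {
    (1..=len as u64).collect()
}

/// Hypothetical benchmark function.
fn sum(values: Vec<u64>) -> u64 {
    values.iter().sum()
}

/// Hypothetical teardown function: inspects the benchmark's return value.
fn check(result: u64) -> bool {
    result == 55
}

fn main() {
    let ok = run_bench(10, make_input, sum, check);
    println!("teardown accepted the result: {ok}");
}
```

In this picture only the middle call would be measured; setup and teardown run around it.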
The default EventKind for RegressionConfig and FlamegraphConfig changed to EventKind::Ir, so if you're updating from a previous version of iai-callgrind, please read carefully!
- (#71): Add a DHAT cost summary similar to the summary of callgrind events in the benchmark run output. Thanks to @dewert99.
- (#80): Add pre-built iai-callgrind-runner binaries for most valgrind supported targets to the github release pages. iai-callgrind-runner can now also be installed with cargo binstall.
- (#88): Support filtering benchmarks by name. This is a command-line option only and the filter can be given as positional argument in cargo bench -- FILTER. Specifying command-line arguments in addition to the FILTER still works.
- (#144): Verify compatibility with latest valgrind release 3.23.0 and update client requests to newly supported target arm64/freebsd.
- (#152): Support comparison of benches in library benchmark functions by id.
- (#158): Support environment variable IAI_CALLGRIND_<TRIPLE>_VALGRIND_INCLUDE with <TRIPLE> being the host's target triple. This variable takes precedence over the more generic IAI_CALLGRIND_VALGRIND_INCLUDE environment variable. Thanks to @qRoC
- (#94): Support running iai-callgrind benchmarks without cache simulation (--cache-sim=no). Previously, specifying this option emitted a warning. Note that running the benchmarks with --cache-sim=no implies that there is also no estimated cycles calculation.
- (#106): Due to #94, the default EventKind for RegressionConfig and FlamegraphConfig changed from EventKind::EstimatedCycles to EventKind::Ir.
- Updated locked dependencies to their most recent version
- Due to backwards incompatible changes to the summary schema the schema version was updated v1 -> v2. The current schema file is stored in iai-callgrind-runner/schemas/summary.v2.schema.json
- (#86): Fix positional arguments meant as a filter, as in cargo bench -- FILTER, causing iai-callgrind to crash.
- (#110): Fix example in README. Thanks to @jembishop
- (#145): Fixed an error on freebsd when copying fixtures in binary benchmarks.
- Update locked dependencies
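The precedence rule from #158 above (target-specific variable first, generic variable as fallback) can be sketched as follows. This is an illustration only and not the build script's code: valgrind_include is a hypothetical helper, the environment is modelled as a map so the example is deterministic, and MY_TRIPLE as well as the paths are placeholders, since the exact spelling of <TRIPLE> inside the variable name is not specified here.

```rust
use std::collections::HashMap;

/// Hypothetical lookup (assumption, for illustration): prefer the
/// target-specific variable and fall back to the generic one.
/// `triple_part` is whatever text replaces `<TRIPLE>` in the variable name.
fn valgrind_include(env: &HashMap<String, String>, triple_part: &str) -> Option<String> {
    let specific = format!("IAI_CALLGRIND_{triple_part}_VALGRIND_INCLUDE");
    env.get(&specific)
        .or_else(|| env.get("IAI_CALLGRIND_VALGRIND_INCLUDE"))
        .cloned()
}

fn main() {
    let mut env = HashMap::new();
    // Placeholder values; a real setup would read the process environment.
    env.insert(
        "IAI_CALLGRIND_VALGRIND_INCLUDE".to_string(),
        "/generic/include".to_string(),
    );
    env.insert(
        "IAI_CALLGRIND_MY_TRIPLE_VALGRIND_INCLUDE".to_string(),
        "/specific/include".to_string(),
    );
    // The target-specific entry wins because it is looked up first.
    println!("{:?}", valgrind_include(&env, "MY_TRIPLE"));
}
```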
- (#84): Fix an error when --load-baseline loads the dataset from the --baseline argument. This error led to a comparison of the --baseline dataset with itself.
- Update env_logger and which dependencies in Cargo.toml
- Update locked dependencies
- (#81): Fix security advisory RUSTSEC-2024-0006 of shlex dependency and update shlex to 1.3.0. Use shlex::try_join instead of deprecated shlex::join.
- (#42): Support valgrind client requests. The client requests are available in the iai-callgrind package and can be activated via feature flags (client_requests and client_requests_defs).
- (#38): Add support for specifying multiple library benchmarks in one go with the #[benches] attribute. This attribute also accepts a setup argument which takes a path to a function, so the args are passed as parameter to the setup function instead of the benchmarking function.
- (#48): Update MSRV from 1.60.0 to 1.66.0. Make use of new language features.
- (#48): Update dependencies. Use latest possible versions (with our MSRV) of which, cargo_metadata, indexmap, clap and others.
- (#48): Change our implementation of black_box to wrap std::hint::black_box which is stable since 1.66.0. The usage of iai_callgrind::black_box is deprecated and std::hint::black_box should be used directly.
- (#48): The lazy_static dependency of iai-callgrind-runner is now optional and not unnecessarily installed with the iai-callgrind package.
- (#31): Machine readable output. This feature adds an environment variable IAI_CALLGRIND_SAVE_SUMMARY and command line argument --save-summary to create a summary.json next to the usual output files of a benchmark which contains all the terminal output data and more in a machine readable output format. The json schema for the json summary file is stored in iai-callgrind-runner/schemas/*.json. In addition to --save-summary and saving the summary to a file it's possible with --output-format=default|json|pretty-json to specify the output format for the terminal output.
- Add command line arguments --allow-aslr, --regression and --regression-fail-fast which have higher precedence than their environment variable counterparts IAI_CALLGRIND_ALLOW_ASLR, IAI_CALLGRIND_REGRESSION and IAI_CALLGRIND_REGRESSION_FAIL_FAST
- (#29): Add support to compare against baselines instead of the usual *.old output files. This adds command-line arguments --save-baseline=BASELINE, --load-baseline=BASELINE and --baseline=BASELINE and their environment variable counterparts IAI_CALLGRIND_SAVE_BASELINE, IAI_CALLGRIND_LOAD_BASELINE and IAI_CALLGRIND_BASELINE.
- (#30): Add environment variable IAI_CALLGRIND_CALLGRIND_ARGS as complement to --callgrind-args
- As discussed in #31, the parsing of command line arguments for iai-callgrind in cargo bench ... -- ARGS had to change. Instead of interpreting all ARGS as Callgrind arguments, Callgrind arguments can now be passed with the --callgrind-args=... option, so other iai-callgrind arguments are now possible, for example the --save-summary=... option in #31 or even --help and --version.
- The names of output files and directories of binary benchmarks changed the order from ID.BINARY to BINARY.ID to match the file naming scheme FUNCTION.ID of library benchmarks.
- (#35): The terminal output of other valgrind tool runs (like Memcheck, DRD, ...) is now more informative and also shows the content of the log file, if any. If not specified otherwise, Memcheck, DRD and Helgrind now run with --error-exitcode=201. If any errors are detected by these tools, setting this option to an exit code different from 0 causes the benchmark run to fail immediately and show the whole logging output.
- The output file names of flamegraphs had to change due to #29.
- All output not being part of the summary terminal output now goes to stderr. This change affects the logging output at info level and the regression check output.
- The iai-callgrind-runner dependencies regex and glob were removed from the iai-callgrind dependencies.
- The stderr output from a valgrind run wasn't shown in case of an error during the benchmark run because of the change to use --log-file to store valgrind output in log files. However, not all valgrind output goes into the log file in case of an error, so it is still necessary to print the stderr output after the log file content to see all error output of valgrind.
- Update the yanked wasm-bindgen 0.2.88 to 0.2.89
- (#6): Show and fail benchmarks on performance regressions. Configuration of regression checks can be done with RegressionConfig or with the new environment variables IAI_CALLGRIND_REGRESSION and IAI_CALLGRIND_REGRESSION_FAIL_FAST
- (#26): Show event kinds which are not associated with callgrind's cache simulation if available. For example, running callgrind with flags like --collect-systime (SysCount, SysTime, SysCpuTime), ...
- (#18): Add support for DHAT, Massif, BBV, Memcheck, Helgrind, DRD. It's now possible to run each of these tools for each benchmark (in addition to callgrind). The output files of the profiling tools DHAT, Massif and BBV can be found next to the usual callgrind output files.
- The output format was reworked and now shows the old event counts next to the new event counts instead of just the new event counts.
- The output format now shows the factor in addition to the percentage difference when comparing the new benchmark run with the old benchmark run. The factor can be more intuitive when trying to estimate performance improvements.
- The output format also received some small improvements in case a cost is not recorded either in the new benchmark run or in the old benchmark run.
- The percentage difference is now a digit shorter to equalize the widths of the different other string outputs within the parentheses.
- Due to the additional possible output files from tools like DHAT, Massif, etc. (but also flamegraphs), the output of benchmark runs is now nested one level deeper into a directory for each benchmark id instead of putting all output files into the group directory.
- Passing short options (like -v) to LibraryBenchmarkConfig::raw_callgrind_args, BinaryBenchmarkConfig::raw_callgrind_args, Run::raw_callgrind_args, Tool::args is now possible
- The output of iai-callgrind when running multiple tools was adjusted
- --log-file for callgrind runs is now ignored because the log files are now created and placed next to the usual output files of iai-callgrind
- -q, --quiet arguments are now ignored because they are known to cause problems when parsing log file output for example for DHAT.
- Fix examples README to show the correct summary costs of events
- Fix error handling if valgrind terminates abnormally or with a signal instead of an exit code
- Fixed missing flamegraph creation when running setup, after, before and teardown functions in binary benchmarks if bench is set to true.
- Running callgrind with --compress-pos=yes is currently incompatible with iai-callgrind's parsing of callgrind output files. If this option is given, it will be ignored.
- Running iai-callgrind with valgrind's options --help, -h, --help-debug, --help-dyn-options, --version may cause problems and these arguments are now ignored.
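The factor shown next to the percentage difference, mentioned in the output format changes above, can be illustrated with a small sketch. The definition used here is an assumption and not taken from this changelog: the factor is the ratio of the larger to the smaller count, with a negative sign when the new count is lower. percentage_diff and factor are hypothetical helper names.

```rust
/// Percentage difference of `new` relative to `old`.
/// Returns `None` when `old` is zero because the ratio is undefined.
fn percentage_diff(new: u64, old: u64) -> Option<f64> {
    if old == 0 {
        return None;
    }
    Some((new as f64 - old as f64) / old as f64 * 100.0)
}

/// Assumed definition of the factor (not confirmed by the changelog):
/// larger count divided by smaller count, negative when the count dropped.
fn factor(new: u64, old: u64) -> Option<f64> {
    if new == 0 || old == 0 {
        return None;
    }
    if new >= old {
        Some(new as f64 / old as f64)
    } else {
        Some(-(old as f64 / new as f64))
    }
}

fn main() {
    // Halving the count: -50% corresponds to a factor of -2 in this sketch.
    println!("{:?} {:?}", percentage_diff(100, 200), factor(100, 200));
    // Growing by half: +50% corresponds to a factor of 1.5.
    println!("{:?} {:?}", percentage_diff(150, 100), factor(150, 100));
}
```

Under this assumption a run that halves the event count reads as a factor of 2 rather than as 50 percent, which is the kind of reading the changelog entry describes as more intuitive.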
- Update repository to use github organization iai-callgrind/iai-callgrind
- Lower the locked inferno dependency to 0.11.12 to work around the yanked ahash version 0.8.3
- (#23): Create regular and differential flamegraphs from callgrind output.
- (#22): Clarify how to update iai-callgrind-runner
- Some small fixes of parsing callgrind output files in the event that no records are present.
- (#20): Clearing the environment variables with env_clear may break finding valgrind.
The old api to set up library benchmarks using only the main! macro is deprecated and was removed. See the README for a description of the new api. Also, the api to set up binary benchmarks only with the main! macro is now deprecated and was removed. Please use the builder api using the binary_benchmark_groups! and Run. The old binary benchmark api lacked the rich possibilities of the builder api and maintaining two such different apis adds a lot of unnecessary complexity.
Additionally, the scheme to set up binary benchmarks and specify configuration options was reworked and is now closer to the scheme how library benchmarks are set up. It's now possible to specify a BinaryBenchmarkConfig at group level:
binary_benchmark_group!(
name = some_name;
config = BinaryBenchmarkConfig::default();
benchmark = ...
)
BinaryBenchmarkConfig and Run received a lot of new methods to configure a binary benchmark run at all levels from top-level main! via binary_benchmark_group down to Run.
- (#5): Use a new attribute macro (#[library_benchmark]) based api to set up library benchmarks. Also, bring the library benchmark api closer to the binary benchmark api and use a library_benchmark_group! macro together with main!(library_benchmark_groups = ...)
- BinaryBenchmarkConfig has new methods: sandbox, fixtures, env, envs, pass_through_env, pass_through_envs, env_clear, entry_point, current_dir, exit_with
- Run has new methods: pass_through_env, pass_through_envs, env_clear, entry_point, current_dir, exit_with, raw_callgrind_args
- It's now possible to specify a BinaryBenchmarkConfig at group level in the binary_benchmark_group! macro with the argument config = ...
- IAI_CALLGRIND_COLOR environment variable which controls the color output of iai-callgrind. This variable is now checked first before the usual CARGO_TERM_COLOR.
- The output line L1 Data Hits changed to L1 Hits and in consequence now shows the event count for instruction and data hits
- (#7): Clear environment variables before running library benchmarks. With that change comes the possibility to influence that behavior with the LibraryBenchmarkConfig::env_clear method and set custom environment variables with LibraryBenchmarkConfig::envs.
- (#15): Use IAI_CALLGRIND prefix for iai-callgrind environment variables. IAI_ALLOW_ASLR -> IAI_CALLGRIND_ALLOW_ASLR, RUST_LOG -> IAI_CALLGRIND_LOG.
- If the IAI_CALLGRIND_LOG level is DEBUG, Callgrind is now run with --verbose (this flag isn't documented in the official documentation of Callgrind)
- The signature of Run::env changed from env(var: ...) to env(key: ... , value: ...)
- The signature of Run::envs changed from envs(vars: [String]) to envs(vars: [(Into<OsString>, Into<OsString>)])
- The signatures of Arg::new, Run::args, Run::with_args, Run::with_cmd_args changed their usage of AsRef<[...]> to IntoIterator<Item = ...>
- The old api from before #5 using only the main! macro is now deprecated and the functionality was removed. Using the old api produces a compile error. For migrating library benchmarks to the new api see the README.
- Run::options and the Options struct were removed and all methods of this struct moved into Run directly but are now also available in BinaryBenchmarkConfig.
- BinaryBenchmarkGroup::fixtures and BinaryBenchmarkGroup::sandbox were removed and they moved to BinaryBenchmarkConfig::fixtures and BinaryBenchmarkConfig::sandbox
- (#19): Library benchmark functions with equal bodies produce event counts of zero.
- If the Callgrind arguments --dump-instr=yes and --dump-line=yes were used together, the event counters were summed up incorrectly.
- The Callgrind argument --dump-every-bb and similar arguments causing multiple file outputs cannot be handled by iai-callgrind and therefore --combine-dumps=yes is now set per default. This flag cannot be unset.
- --compress-strings is now ignored, because the parser needs the uncompressed strings or else produces event counts of zero.
- Some debugging output was printed to stdout instead of stderr
- Adjust parsing of yes/no values from LibraryBenchmarkConfig and BinaryBenchmarkConfig raw callgrind arguments to callgrind's parsing of command-line arguments. Now, only exact matches of yes and no are considered to be valid command-line arguments.
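The stricter yes/no handling described in the last entry above can be sketched like this. It is a minimal illustration under the assumption that matching is case-sensitive and whitespace-sensitive; parse_yes_no is a hypothetical name and not the function used in iai-callgrind.

```rust
/// Hypothetical parser (illustration only): accept exactly "yes" or "no"
/// and reject every other spelling.
fn parse_yes_no(value: &str) -> Option<bool> {
    match value {
        "yes" => Some(true),
        "no" => Some(false),
        _ => None,
    }
}

fn main() {
    for candidate in ["yes", "no", "Yes", "true", " yes"] {
        println!("{candidate:?} -> {:?}", parse_yes_no(candidate));
    }
}
```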
- The dependency version requirements in all packages are loosened and more openly formulated. Especially, the upper bounds were updated to include the latest versions. However, the Cargo.lock file locks the dependencies to versions which are compatible with the current MSRV 1.60.0.
- The iai-callgrind package was unnecessarily using all the dependencies of the iai-callgrind-runner although only dependent on the api feature of the runner. Also, the direct serde dependency was removed because serde is already part of the api feature of the runner.
- Changed the license from Apache-2.0 AND MIT to Apache-2.0 OR MIT in Cargo.toml files of all packages
- (#4): The destination directory of iai callgrind output files changes from /workspace/$CARGO_PKG_NAME/target/iai to /workspace/target/iai/$CARGO_PKG_NAME and respects the CARGO_TARGET_DIR environment variable
- (#3): builder api for binary benchmarks
- BREAKING: an id for args in the macro api is now mandatory
- binary benchmarks: The filename of callgrind output for benchmarked setup, teardown, before and after functions changed to callgrind.$id.$function.out.
- binary benchmarks: The filename of callgrind output for benchmarked binaries does not include the arguments for the binary anymore.
- The filename for callgrind output files is now truncated to a maximum of 255 bytes
- library benchmarks: Fix event counting to include costs of inlined functions
- (#2): Benchmarking binaries of a crate. Added a full description of this benchmarking scheme in the README
- IAI_CALLGRIND_RUNNER environment variable which may specify the path to the iai-callgrind-runner binary
- The error output changed and duplicate information was removed when running the iai-callgrind-runner fails
- The architecture detection changed from using uname -m to using rust's std::env::consts::ARCH
- The cfg_if dependency was removed
- If running with ASLR disabled, proccontrol on freebsd was missing to run the valgrind binary
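The 255-byte limit on callgrind output file names mentioned in an earlier entry can be sketched as below. This is an assumption-laden illustration, not the runner's code: truncate_file_name and MAX_FILE_NAME_BYTES are hypothetical names, and the sketch simply cuts at the last character boundary that fits, which may differ from how the runner treats extensions.

```rust
/// Assumed limit taken from the changelog entry (255 bytes).
const MAX_FILE_NAME_BYTES: usize = 255;

/// Hypothetical helper: shorten `name` to at most 255 bytes without
/// splitting a multi-byte UTF-8 character.
fn truncate_file_name(name: &str) -> &str {
    if name.len() <= MAX_FILE_NAME_BYTES {
        return name;
    }
    let mut end = MAX_FILE_NAME_BYTES;
    // Step back until `end` is a valid character boundary (index 0 always is).
    while !name.is_char_boundary(end) {
        end -= 1;
    }
    &name[..end]
}

fn main() {
    let long_name = "a".repeat(300);
    println!("{} bytes", truncate_file_name(&long_name).len());
}
```

Counting bytes rather than characters matters because common file systems limit the length of a file name in bytes.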
BREAKING: Counting of events changed and therefore event counters are incompatible with versions before 0.4.0. Usually, event counters are now lower and more precise than before.
- Instead of counting all events within the benchmarking function, only events of function calls (cfn entries) within the benchmarking functions are attributed to the final event counts.
- MSRV changed from v1.56.0 -> v1.60.0
- Bump log dependency 0.4.17 -> 0.4.19
- Counting of events was sometimes summarizing the events of the main function instead of the benchmarking function
- Add output of Callgrind at RUST_LOG=info level but also more debug and trace output.
- The version mismatch check should cause an error when the library version is < 0.3.0
This version is incompatible with previous versions due to changes in the main! macro which is passing additional arguments to the runner. However, benchmarks written with a version before v0.3.0 don't need any changes but can take advantage of some new features.
- The toggle-collect callgrind argument now accumulates multiple occurrences instead of replacing them. The default toggle-collect for the benchmark function cannot be replaced anymore.
- A version mismatch of the iai-callgrind library and the iai-callgrind-runner is now an error.
- Fix, update and extend the README. Add more real-world examples.
- The main! macro has two forms now, with the first having the ability to pass arguments to callgrind.
- More examples in the benches folder
- Use the RUST_LOG environment variable to control the verbosity level of the runner.
- Add colored output. The CARGO_TERM_COLOR variable can be used to disable colors.
- A cargo filter argument which is a positional argument caused the runner to crash.
This version is mostly compatible with v0.1.0 but needs some additional setup. See Installation in the README. Benchmarks created with v0.1.0 should not need any changes but can maybe be improved with the additional features from this version.
- The repository layout changed and this package is now separated in a library (iai-callgrind) with the main macro and the black_box and the binary package (iai-callgrind-runner) with the runner needed to run the benchmarks
- It's now possible to pass additional arguments to callgrind
- The output of the collected event counters and metrics has changed
- Other improvements to stabilize the metrics across different systems
- Initial migration from Iai