
Do a GC run before running the benchmarks #72

Open

wants to merge 2 commits into main
Conversation

@udesou commented Oct 10, 2023

When I was running the benchmarks with MMTk, I noticed that, for some reason, running them through a script would trigger a GC inside the benchmark, whereas running them from the command line wouldn't.
The fix: make sure a full GC is run before the benchmark starts, to get rid of any garbage that may already exist. Since this may also be a problem for the stock Julia GC, I'm opening this PR.

NB: when collecting statistics via MMTk's harness methods, a GC is always performed before the benchmark, precisely to address this issue. However, we are using Julia's GC stats (as much as we can), which is why it was problematic in our case.
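
For context, the fix amounts to something like the following sketch (`run_benchmark()` is a placeholder for the actual benchmark entry point, not a function in this repo):

```julia
# Force a full collection so garbage left over from the wrapper script
# doesn't show up in the benchmark's GC statistics.
GC.gc(true)            # full = true: collect all generations

# ...then measure the benchmark itself.
stats = @timed run_benchmark()   # run_benchmark() is a placeholder
```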

@qinsoon commented Oct 11, 2023

A bit more information on this PR: the GCBenchmarks are executed by a wrapper (run_benchmarks.jl) and the gctime macro in Julia. Any objects created in this Julia code before the actual benchmark commences will also be present in the heap. These objects aren't allocated by the benchmark itself, but they may influence GC decisions and statistics for the benchmark.

In practice, we found a scenario where we observed different GC decisions (1 GC vs. 0 GCs) when executing run_benchmarks.jl through a Python script compared to running it directly in a terminal with the same command line. We suspect that Julia may allocate resources differently depending on the runtime environment; different environments can therefore produce different initial heap states and, in turn, different GC behavior.

Performing an explicit GC before the actual benchmark clears any residual garbage and minimizes this discrepancy. This adjustment resolved the specific issue we encountered.
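
To make the mechanism concrete, here is a rough sketch (not the actual run_benchmarks.jl code; `measure` and `workload` are illustrative names) of how Julia's built-in GC counters, which @timed and similar macros read, can be skewed by residual garbage:

```julia
# Sketch: read Julia's GC counters around a workload, as @timed does
# internally. An explicit full collection first clears residual garbage
# from the wrapper, so it cannot tip the benchmark from 0 GCs to 1 GC.
function measure(workload)
    GC.gc(true)                      # clear pre-existing garbage
    before = Base.gc_num()
    workload()
    diff = Base.GC_Diff(Base.gc_num(), before)
    return (pauses = diff.pause, gc_time_ns = diff.total_time)
end
```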

@oscardssmith (Collaborator) commented
I think this looks relatively reasonable.

@steveblackburn commented

Wait. I'm confused by this.

The key question is whether the workload is properly harnessed. This is a key concept in our methodology: we control precisely what we measure. We typically do not 'harness' the entire execution; rather, we harness the core of the measurable workload and (optionally) run that harnessed core multiple times.

It sounds to me like you're comparing a harnessed workload to the total wall clock time for the workload.

If that's correct, this is not an MMTk issue; it's just a methodological problem (which is easily fixed :-).
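
For illustration, the harnessing pattern being described might look like the sketch below; `setup`, `kernel`, `harness_begin`, and `harness_end` are hypothetical names standing in for the benchmark's core and the harness hooks (e.g. MMTk's), not functions that exist in this repo:

```julia
# Hypothetical harnessed run: only the core of the workload is measured,
# and setup/wrapper allocations stay outside the measured region.
function run_harnessed(n_iterations)
    setup()                  # untimed: wrapper work, loading data, etc.
    for _ in 1:n_iterations
        harness_begin()      # hypothetical hook: force GC, reset stats
        kernel()             # the measurable core of the workload
        harness_end()        # hypothetical hook: stop measuring, report
    end
end
```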
