
Pi0 and photon #39

Closed
wants to merge 22 commits into from

Conversation

sebouh137
Contributor

Briefly, what does this PR introduce?

Adds photon and pi0 benchmarks

What kind of change does this PR introduce?

  • Bug fix (issue #__)
  • New feature (issue #__)
  • Documentation update
  • Other: __

Please check if this PR fulfills the following:

  • Tests for the changes have been added
  • Documentation has been added / updated
  • Changes have been communicated to collaborators

Does this PR introduce breaking changes? What changes might users need to make to their code?

no

Does this PR change default behavior?

Yes. It adds new benchmarks.

@sebouh137 sebouh137 marked this pull request as ready for review September 17, 2024 00:27
@sebouh137 sebouh137 requested a review from wdconinc September 17, 2024 00:29
@wdconinc
Contributor

wdconinc commented Sep 17, 2024

We don't have the operational capacity to add more pipelines like this. @veprbl

Typical jobs (data: trace.json, viewable in https://ui.perfetto.dev):
[screenshot: pipeline job waterfall]

ZDC jobs (same time window):
[screenshot: ZDC job waterfall]

These jobs occupy limited job slots for long stretches. We sometimes lag behind by as much as 8 hours while pipelines catch up, and that backlog is causing people to ignore the benchmarks entirely.

Before we add more pipelines, one or more of several things need to happen:

  • reduce load from current jobs:
    • reduce the overall number of events
    • adaptive or scheduled pipelines with larger numbers of events (smaller on PRs, larger on main)
    • run the simulation and/or analysis more efficiently
    • cache and reuse simulation artifacts when appropriate
    • prevent concurrent runs from many commits on the same branch: a new commit kills older jobs
  • expand runners:
    • provide additional venues to run jobs on demand, e.g. runners on OSG
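Two of the mitigations above (killing superseded jobs, and adaptive event counts per branch) map directly onto existing GitLab CI features. A minimal sketch, assuming these benchmarks run on a GitLab CI instance; the job name, script, and `N_EVENTS` values are illustrative, not taken from the actual pipeline:

```yaml
# Sketch of a benchmark job using two load-reduction features:
zdc_pi0_sim:
  # Allow GitLab's "auto-cancel redundant pipelines" setting to kill this
  # job when a newer commit is pushed to the same branch.
  interruptible: true
  # Adaptive statistics: fewer events on merge requests, more on the
  # default branch.
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
      variables:
        N_EVENTS: "1000"
    - if: '$CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH'
      variables:
        N_EVENTS: "10000"
  script:
    - ./run_benchmark.sh --events ${N_EVENTS}
```

Note that `interruptible: true` only takes effect if auto-cancellation of redundant pipelines is enabled in the project settings.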

@wdconinc
Contributor

Note in the waterfall graphs above that the benchmarks don't even start until 3h45 after the start of the pipeline, and don't complete until 6h30. That's the backlog I'm talking about... We used to start benchmarks after 30 minutes and complete within 2 hours, with most results available early and only 18x275 DIS NC taking a long time.

@veprbl
Member

veprbl commented Sep 17, 2024

Please move these to detector_benchmarks. Thanks.

@veprbl
Member

veprbl commented Sep 17, 2024

Regarding performance: most of the items are out of scope for this PR. For this PR we just need to push the simulation time down as much as possible. Hopefully that won't make the benchmark completely useless. We should be able to scale the statistics up later by implementing some of the steps that @wdconinc mentioned.

@sebouh137
Contributor Author

@wdconinc To be fair, though, the benchmarks in this pull request are expected to run considerably faster than the lambda and sigma ones, since they involve electromagnetic rather than hadronic showers, and a larger fraction of the generated events end up in the analysis samples (so the simulated statistics don't need to be as high). These two new benchmarks take a little over an hour running locally on my laptop (which is probably slower than the CI runners); once the benchmark runs here I can determine how much to reduce the statistics to make it take, say, 30 or 15 minutes. Does this sound reasonable?
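The "determine how much to reduce the statistics" step is simple arithmetic if one assumes wall time scales roughly linearly with event count. A sketch (the event counts and timings below are hypothetical, not the actual benchmark numbers):

```python
def scaled_event_count(current_events: int,
                       current_minutes: float,
                       target_minutes: float) -> int:
    """Estimate how many events fit in a target wall-time budget,
    assuming runtime scales linearly with event count."""
    return int(current_events * target_minutes / current_minutes)

# e.g. if 10000 events take ~65 minutes, a 30-minute budget allows roughly:
print(scaled_event_count(10000, 65, 30))  # 4615
```

In practice one would also keep a floor on the event count so the benchmark retains enough statistics to be meaningful.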

@sebouh137
Contributor Author

sebouh137 commented Sep 19, 2024

It's about half an hour for each of these two ZDC electromagnetic-calorimetry benchmarks. I will see what I can do about reducing the statistics on the other benchmarks (lambda and sigma in the ZDC, and neutron in the insert) without diminishing their quality; if there is something that can be done, it will be in a future pull request.

@wdconinc
Contributor

It's down to about half an hour for each of these two benchmarks.

Half an hour for the longest job or half a core-hour for all jobs even if they run simultaneously?

@sebouh137 sebouh137 enabled auto-merge (squash) September 19, 2024 15:02
@sebouh137
Contributor Author

It's down to about half an hour for each of these two benchmarks.

Half an hour for the longest job or half a core-hour for all jobs even if they run simultaneously?

Half an hour for the longest job. Let me know if this is sufficient; if not, I will reduce statistics by a factor of 2.

@sebouh137
Contributor Author

Please move these to detector_benchmarks. Thanks.

OK, I will move this to detector_benchmarks.
