Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[tuner] Run baseline benchmarks first with reusable helper function and regression detection #789

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

bangtianliu
Copy link
Contributor

@bangtianliu bangtianliu commented Jan 8, 2025

This PR is relevant to address the task in: #783.

  1. Run the baseline benchmark at the start to quickly assess whether the tuner is making progress.
  2. Run the baseline benchmark both at the beginning and the end of tuning to detect potential performance regressions during the process of tuning (e.g., device overheating).

This PR adds a reusable helper function that takes a list of candidate indices and returns benchmarking results, allowing the same function to be used for baseline and candidate benchmark results.

Furthermore, this PR adds detection and logging for several conditions:

  • Uniqueness of baseline results: Verify the uniqueness of baseline results and logging a warning if duplicates are detected.
  • Consistency of device IDs across baseline runs: Ensure that the device IDs from two baseline runs match, logging a warning if any discrepancies are found.
  • Performance regression detection: Identify performance regressions between two baseline runs and logging a warning if regressions are encountered.
    Also, unit tests are added to show the corner cases.

@bangtianliu bangtianliu requested review from kuhar and Max191 January 8, 2025 19:36
@bangtianliu bangtianliu force-pushed the benchmark_baseline branch 2 times, most recently from bd88a9c to 46d6a87 Compare January 8, 2025 19:40
Copy link
Contributor

@Max191 Max191 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good. Just a few nits about the logging and code reuse. Could you also add a more informative PR description (essentially explain the reasoning for benchmarking baselines twice like in the linked issue).

tuner/tuner/libtuner.py Outdated Show resolved Hide resolved
tuner/tuner/libtuner.py Outdated Show resolved Hide resolved
tuner/tuner/libtuner.py Outdated Show resolved Hide resolved
@bangtianliu bangtianliu requested a review from Max191 January 9, 2025 00:37
tuner/tuner/libtuner.py Outdated Show resolved Hide resolved
tuner/tuner/libtuner.py Outdated Show resolved Hide resolved
tuner/tuner/libtuner.py Outdated Show resolved Hide resolved
tuner/tuner/libtuner.py Outdated Show resolved Hide resolved
tuner/tuner/libtuner.py Outdated Show resolved Hide resolved
tuner/tuner/libtuner.py Outdated Show resolved Hide resolved
tuner/tuner/libtuner.py Outdated Show resolved Hide resolved
@bangtianliu bangtianliu requested a review from kuhar January 9, 2025 02:54
Copy link
Contributor

@Max191 Max191 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks! Just one nit.

tuner/tuner/libtuner.py Outdated Show resolved Hide resolved
tuner/tuner/libtuner.py Outdated Show resolved Hide resolved
tuner/tuner/libtuner.py Outdated Show resolved Hide resolved
tuner/tuner/libtuner.py Outdated Show resolved Hide resolved
tuner/tuner/libtuner.py Outdated Show resolved Hide resolved
@kuhar
Copy link
Member

kuhar commented Jan 9, 2025

Ideally, these corner cases should be covered by unit tests.

@bangtianliu
Copy link
Contributor Author

WIP: need to rebase on top of #799, and continue to work.

@bangtianliu bangtianliu requested a review from kuhar January 9, 2025 23:17
@bangtianliu bangtianliu force-pushed the benchmark_baseline branch 3 times, most recently from 30ab7e4 to 6c88791 Compare January 9, 2025 23:22
Copy link
Member

@kuhar kuhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you edit the PR description and explain what error conditions you considered and how they are handled?

tuner/tuner/candidate_gen.py Outdated Show resolved Hide resolved
tuner/tuner/candidate_gen.py Outdated Show resolved Hide resolved
tuner/tuner/libtuner.py Outdated Show resolved Hide resolved
tuner/tuner/libtuner.py Show resolved Hide resolved
tuner/tuner/libtuner.py Outdated Show resolved Hide resolved
tuner/tuner/libtuner.py Outdated Show resolved Hide resolved
tuner/tuner/libtuner.py Outdated Show resolved Hide resolved
tuner/tuner/libtuner_test.py Outdated Show resolved Hide resolved
tuner/tuner/libtuner_test.py Outdated Show resolved Hide resolved
@bangtianliu bangtianliu force-pushed the benchmark_baseline branch 4 times, most recently from f4be426 to 6966557 Compare January 10, 2025 04:49
@bangtianliu bangtianliu changed the title [tuner] Run baseline benchmarks first and check regression [tuner] Run baseline benchmarks first with reusable helper function and regression detection Jan 10, 2025
@bangtianliu bangtianliu requested a review from kuhar January 10, 2025 15:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants