Benchmarking Future Work #165

Open
deboer-tim opened this issue Dec 6, 2024 · 0 comments
Labels
📐 benchmark Benchmarking granite

This epic lists future ideas for model benchmarking from the onsite meeting.

Benchmarks

Investigate the following benchmarks, which are easy for the model team to run. The expectation is that we can start by adapting BigCode with our own tests, and use the others as necessary if BigCode can't handle some of the specific tests below.

Tests

Aspects to test:

  • chat
  • code completion
  • file editing
  • multi-file editing
  • multiple languages (not within one test)
  • capture memory use and performance for each test
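Capturing memory use and performance for each test could be done with a thin wrapper around every test function. A minimal sketch, assuming a Python test harness; `run_with_metrics` and the stand-in workload are hypothetical, and `tracemalloc` only tracks Python-heap allocations (model/GPU memory would need a separate probe):

```python
import time
import tracemalloc

def run_with_metrics(test_fn, *args, **kwargs):
    """Run one benchmark test; capture wall-clock time and peak Python-heap memory.

    `test_fn` stands in for any of the test kinds above (chat, code
    completion, file editing, ...). Note: tracemalloc only sees Python
    allocations, not native/GPU memory used by the model itself.
    """
    tracemalloc.start()
    start = time.perf_counter()
    result = test_fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, {"seconds": elapsed, "peak_mib": peak_bytes / 2**20}

# Stand-in workload in place of a real model call.
result, metrics = run_with_metrics(lambda: sum(range(1_000_000)))
```

Recording the metrics dict alongside each test's pass/fail result would give the per-test memory and performance data called for above.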

Phases

  • automate existing tests
  • be able to evaluate more/new models more quickly
  • continuously add tests over time to match what we're adding/testing in the extension
  • run once on multiple laptops/OSes/GPUs (using what the team has) to set a performance baseline, confirm the spec cutoff, and determine where we run which model (or multiple models)
  • automate pipeline with fixed set of hardware
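The first two phases (automating existing tests, then adding tests over time) could be served by a simple registry that new tests plug into as the extension grows. A minimal sketch; the decorator, registry, and test name below are illustrative assumptions, not the project's actual suite:

```python
# Hypothetical test registry; names and structure are illustrative only.
TESTS = {}

def benchmark_test(name):
    """Register a test function under `name` so the runner can discover it."""
    def register(fn):
        TESTS[name] = fn
        return fn
    return register

@benchmark_test("code-completion/fibonacci")
def complete_fibonacci():
    # Stand-in: a real test would prompt the model through the extension
    # and score its completion against an expected answer.
    return True

def run_all():
    """Run every registered test, returning name -> pass/fail."""
    return {name: bool(fn()) for name, fn in TESTS.items()}

results = run_all()  # {"code-completion/fibonacci": True}
```

Because tests self-register, adding one to match new extension behavior is a single decorated function, and the same `run_all` entry point can later be driven by a pipeline on fixed hardware.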