Actions: mlfoundations/evalchemy

Lint

165 workflow runs

even higher max_tokens for math
Lint #92: Commit 6d9d0cc pushed by RyanMarten
January 20, 2025 00:45 27s main
fix broken imports
Lint #91: Commit a57a772 pushed by RyanMarten
January 20, 2025 00:33 25s main
set max_new_tokens higher for AIME24 and AMC23
Lint #90: Commit 4451346 pushed by RyanMarten
January 20, 2025 00:21 22s main
Merge pull request #53 from marianna13/marianna/bigcodebench
Lint #89: Commit 7bdedd2 pushed by RyanMarten
January 20, 2025 00:19 25s main
Support BigCodeBench
Lint #88: Pull request #53 synchronize by RyanMarten
January 20, 2025 00:18 23s marianna13:marianna/bigcodebench
add additional reproduction numbers
Lint #87: Commit cee5592 pushed by RyanMarten
January 20, 2025 00:16 22s main
Merge pull request #52 from mlfoundations/amc_aime
Lint #86: Commit 4c2d957 pushed by RyanMarten
January 19, 2025 22:59 22s main
Update README.md
Lint #85: Commit 1437cf3 pushed by RyanMarten
January 19, 2025 22:55 25s main
amc23 eval
Lint #84: Pull request #52 synchronize by RyanMarten
January 19, 2025 22:51 26s amc_aime
add AIME24 ryan reproduced number
Lint #83: Commit 369c783 pushed by RyanMarten
January 19, 2025 22:51 25s amc_aime
amc23 eval
Lint #82: Pull request #52 synchronize by RyanMarten
January 19, 2025 22:44 21s amc_aime
lower max_new_tokens, generation takes too long
Lint #81: Commit 850402c pushed by RyanMarten
January 19, 2025 22:44 21s amc_aime
amc23 eval
Lint #80: Pull request #52 synchronize by RyanMarten
January 19, 2025 22:04 25s amc_aime
linting
Lint #79: Commit 23cad3a pushed by RyanMarten
January 19, 2025 22:04 27s amc_aime
amc23 eval
Lint #78: Pull request #52 synchronize by RyanMarten
January 19, 2025 22:01 24s amc_aime
amc23 eval
Lint #76: Pull request #52 synchronize by RyanMarten
January 19, 2025 20:56 23s amc_aime
table ref
Lint #75: Commit 3f183bd pushed by RyanMarten
January 19, 2025 20:56 24s amc_aime
amc23 eval
Lint #74: Pull request #52 synchronize by RyanMarten
January 19, 2025 15:26 21s amc_aime
update README
Lint #73: Commit 125bd97 pushed by RyanMarten
January 19, 2025 15:26 21s amc_aime
amc23 eval
Lint #72: Pull request #52 synchronize by RyanMarten
January 19, 2025 15:24 21s amc_aime
resolve merge conflict
Lint #71: Commit 1c9752e pushed by RyanMarten
January 19, 2025 15:23 22s amc_aime
Support BigCodeBench
Lint #70: Pull request #53 synchronize by marianna13
January 19, 2025 10:36 27s marianna13:marianna/bigcodebench
Update reproduced_benchmarks for mixeval
Lint #69: Commit 6a286b1 pushed by neginraoof
January 19, 2025 06:23 26s main
amc23 eval
Lint #68: Pull request #52 synchronize by RyanMarten
January 19, 2025 00:21 24s amc_aime