Actions: huggingface/trl

Secret Leaks

2,315 workflow runs

🥞 Fix CPO gradient accumulation loss scaling (#2645)
Secret Leaks #2165: Commit 8e65825 pushed by qgallouedec
January 24, 2025 15:02 13s 2626-kto
Fix grpo
Secret Leaks #2164: Commit 6130c96 pushed by qgallouedec
January 24, 2025 14:48 18s 2625-grpo
🥞 Fix CPO gradient accumulation loss scaling (#2645)
Secret Leaks #2163: Commit 8e65825 pushed by qgallouedec
January 24, 2025 14:47 15s 2625-grpo
initial PRIME
Secret Leaks #2162: Commit bac0cd3 pushed by kashif
January 24, 2025 12:58 15s prime
🥞 Fix CPO gradient accumulation loss scaling (#2645)
Secret Leaks #2161: Commit 8e65825 pushed by qgallouedec
January 24, 2025 11:22 21s main
Fix CPO
Secret Leaks #2160: Commit 6567321 pushed by qgallouedec
January 24, 2025 11:01 17s 2620-cpo
Update grpo_trainer.md
Secret Leaks #2159: Commit 5e4d7be pushed by qgallouedec
January 24, 2025 10:59 15s 2620-cpo
initial
Secret Leaks #2158: Commit 4201299 pushed by kashif
January 24, 2025 09:04 17s prime
Update grpo_trainer.md
Secret Leaks #2157: Commit 5e4d7be pushed by qgallouedec
January 24, 2025 08:06 17s main
Merge branch 'main' into mpo
Secret Leaks #2156: Commit c622a29 pushed by ariG23498
January 24, 2025 07:24 18s mpo
🌯 Fix context manager runtime error when gather is disabled (#2639)
Secret Leaks #2155: Commit f34b70a pushed by qgallouedec
January 23, 2025 20:23 17s main
🍭 Custom reward function for RLOO (#2612)
Secret Leaks #2154: Commit 0e216f7 pushed by August-murr
January 23, 2025 19:16 20s main
🥞 Fix BCO gradient accumulation loss scaling (#2638)
Secret Leaks #2153: Commit 59c2014 pushed by qgallouedec
January 23, 2025 17:57 16s main
Fix grad accum
Secret Leaks #2152: Commit 7025c26 pushed by qgallouedec
January 23, 2025 17:32 15s 2619-bco
🥞 Fix DPO gradient accumulation loss scaling (#2615)
Secret Leaks #2151: Commit 40c2383 pushed by qgallouedec
January 23, 2025 17:31 12s 2619-bco
🥞 Fix DPO gradient accumulation loss scaling (#2615)
Secret Leaks #2150: Commit 40c2383 pushed by qgallouedec
January 23, 2025 17:12 19s main
chore: adding loss weights to the trainer
Secret Leaks #2149: Commit 408c1df pushed by ariG23498
January 23, 2025 16:38 15s mpo
remove redundant code
Secret Leaks #2148: Commit 35db48e pushed by ariG23498
January 23, 2025 16:36 16s mpo
Secret Leaks
Secret Leaks #2147: by qgallouedec
January 23, 2025 16:30 17s main
doc ci
Secret Leaks #2146: Commit 0b0b9bd pushed by qgallouedec
January 23, 2025 16:24 20s rename-var-for-clarity
rename advatages to per_token_loss for clarity
Secret Leaks #2145: Commit 091211a pushed by qgallouedec
January 23, 2025 16:21 17s rename-var-for-clarity
fix mutable field issue
Secret Leaks #2144: Commit 7638607 pushed by ariG23498
January 23, 2025 16:18 17s mpo
Update docs/source/grpo_trainer.md
Secret Leaks #2143: Commit be78d9f pushed by qgallouedec
January 23, 2025 16:16 16s grpo-custom-rew-func
clearer
Secret Leaks #2142: Commit 49e02eb pushed by qgallouedec
January 23, 2025 15:07 21s grpo-custom-rew-func
fix script
Secret Leaks #2141: Commit d6e3fbf pushed by qgallouedec
January 23, 2025 15:03 18s grpo-custom-rew-func