📉 Optimize GRPO memory usage by redefining per_device_batch_size as generations per device
#7299
| Job | Run time |
|---|---|
| | 33m 53s |
| | 8s |
| | 25m 54s |
| | 33m 43s |
| | 36m 14s |
| | 23m 19s |
| | 22m 34s |
| | 27m 15s |
| | 23m 52s |
| | 29m 14s |
| | 24m 22s |
| | 29m 34s |
| Total | 5h 10m 2s |