📉 Optimize GRPO memory usage by redefining per_device_batch_size as generations per device
#7291
| Job | Run time |
|---|---|
| | 34m 32s |
| | 24m 46s |
| | 33m 16s |
| | 9s |
| | 27m 11s |
| | 20m 7s |
| | 34m 10s |
| | 27m 1s |
| | 23m 6s |
| | 28m 26s |
| | 31m 44s |
| | 27m 29s |
| Total | 5h 11m 57s |