📉 Optimize GRPO memory usage by redefining per_device_batch_size as generations per device #7291
