📉 Optimize GRPO memory usage by redefining per_device_batch_size as generations per device #7292
