GRPO config: what's the reason that the global batch size must be divisible by the number of generations? #2858

Closed
MrDoghead opened this issue Feb 14, 2025 · 2 comments
Labels
🏋 GRPO Related to GRPO ❓ question Seeking clarification or more information

The documentation for the num_generations parameter says: "Number of generations per prompt to sample. The global batch size (num_processes * per_device_batch_size) must be divisible by this value."

Does anyone know why?

AIR-hl commented Feb 14, 2025

Same problem! @qgallouedec

My settings: num_generations=8, per_device_train_batch_size=1, gradient_accumulation_steps=8

ValueError: The global train batch size (1 x 1) must be evenly divisible by the number of generations per prompt (8). Given the current train batch size, the valid values for the number of generations are: [].
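
For anyone hitting the same error, the check it describes can be sketched as below. This is a minimal illustration reconstructed from the docstring and the error message quoted in this thread, not the library's actual code; the function names `valid_num_generations` and `check_num_generations` are hypothetical. Starting the candidate range at 2 is an assumption, chosen because the message above lists no valid values for a global batch size of 1.

```python
def valid_num_generations(num_processes: int, per_device_batch_size: int) -> list[int]:
    """Return every generation count that evenly divides the global batch size.

    Assumption: candidates start at 2, which matches the empty list reported
    in the error above for a global batch size of 1 x 1.
    """
    global_batch_size = num_processes * per_device_batch_size
    return [n for n in range(2, global_batch_size + 1) if global_batch_size % n == 0]


def check_num_generations(num_processes: int, per_device_batch_size: int, num_generations: int) -> None:
    """Raise ValueError when num_generations does not divide the global batch size."""
    global_batch_size = num_processes * per_device_batch_size
    if global_batch_size % num_generations != 0:
        raise ValueError(
            f"The global train batch size ({num_processes} x {per_device_batch_size}) "
            f"must be evenly divisible by the number of generations per prompt "
            f"({num_generations}). Valid values: "
            f"{valid_num_generations(num_processes, per_device_batch_size)}."
        )


# The settings reported above: one process, per-device batch size 1.
print(valid_num_generations(1, 1))
# Eight processes with per-device batch size 1 would accept num_generations=8.
print(valid_num_generations(8, 1))
```

Note that the product shown in the error is (1 x 1) even though gradient_accumulation_steps=8 was set, so in the reported setup gradient accumulation does not appear in the global batch size being checked.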

AIR-hl commented Feb 14, 2025

> I found that the parameter num_generations: Number of generations per prompt to sample. The global batch size (num_processes * per_device_batch_size) must be divisible by this value.
>
> Does anyone know why?

@MrDoghead Here is the reason: #2776 (comment)
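
The linked comment is not reproduced in this thread. As a general illustration only, here is a minimal sketch of why a group-relative scheme would need the batch to split evenly: if rewards are normalized within each group of num_generations completions for the same prompt, a flat batch of rewards can only be cut into whole groups when its length is a multiple of num_generations. The function name `group_advantages`, the use of the population standard deviation, and the epsilon value are all assumptions made for this example, not the library's implementation.

```python
from statistics import mean, pstdev


def group_advantages(rewards: list[float], num_generations: int, eps: float = 1e-4) -> list[float]:
    """Normalize rewards within consecutive groups of num_generations items.

    Assumption: completions for the same prompt sit next to each other in the
    flat batch, so each consecutive slice of num_generations is one group.
    """
    if len(rewards) % num_generations != 0:
        raise ValueError(
            f"Batch of {len(rewards)} rewards cannot be split into whole groups "
            f"of {num_generations}."
        )
    advantages = []
    for start in range(0, len(rewards), num_generations):
        group = rewards[start:start + num_generations]
        mu = mean(group)
        sigma = pstdev(group)
        # Each completion is scored relative to the other completions in its group.
        advantages.extend((r - mu) / (sigma + eps) for r in group)
    return advantages


# Two prompts, two completions each: the batch splits into two whole groups.
print(group_advantages([1.0, 0.0, 3.0, 1.0], num_generations=2))
```

With a batch of 3 rewards and num_generations=2, the last prompt would have an incomplete group, which is the situation a divisibility check rules out up front.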
