ValueError: The global train batch size (3 x 1) must be evenly divisible by the number of generations per prompt (8). #236
Comments
Set the num_generations parameter in GRPOConfig; the default is 8, change it to 3.
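A minimal sketch of that change, assuming the GRPOConfig field names mentioned later in this thread (num_generations, per_device_train_batch_size); the output_dir and other values are only illustrative:

```python
from trl import GRPOConfig

# With 3 training GPUs and per_device_train_batch_size=1, the global train batch
# is 3 x 1 = 3, so num_generations must divide 3.
training_args = GRPOConfig(
    output_dir="qwen-1.5b-grpo",    # illustrative path
    per_device_train_batch_size=1,
    num_generations=3,              # default is 8; 3 divides the global batch of 3
)
```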
Thank you. If I set it to 3, does that mean the size of each group is 3? If I set it to a larger number, will it converge faster? Setting it to 3 does work, but why can't it be a multiple of 3, like 6 or 9? That would just make the model generate a few more completions, which takes longer. Why is it limited to factors rather than multiples?
It is related to a recent PR in the trl library.
Ref:
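For reference, the constraint behind the error can be sketched like this (not trl's actual code, just the arithmetic it enforces): the global train batch size, i.e. the number of training processes times per_device_train_batch_size, must be a multiple of num_generations, which is why 6 or 9 are rejected when the global batch is only 3.

```python
num_processes = 3                  # training GPUs (the 4th runs vLLM)
per_device_train_batch_size = 1
num_generations = 8                # trl default

global_batch_size = num_processes * per_device_train_batch_size  # 3 x 1 = 3
if global_batch_size % num_generations != 0:
    raise ValueError(
        f"The global train batch size ({num_processes} x "
        f"{per_device_train_batch_size}) must be evenly divisible by the "
        f"number of generations per prompt ({num_generations})."
    )
# num_generations=3 passes (3 % 3 == 0); 6 or 9 fail because 3 % 6 and 3 % 9 are nonzero.
```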
Thank you, I understand now. The reason I could only set it to 3 before is that I set per_device_batch_size=1 and only had 3 GPUs for training. So if I increase per_device_batch_size, I can also increase num_generations accordingly.
I am using the latest training code, and both transformers and trl are the latest versions from the main branch.
I only have 4 L20 GPUs and want to try GRPO training of the qwen-1.5b model, but I hit the error above: 8 completions are sampled per prompt, which cannot be trained on 3 GPUs. The problem is that with 4 GPUs, one must be reserved for vLLM to run generation, so only 3 GPUs actually train. Likewise, with 8 GPUs only 7 actually train, so do I need to set num_generations=7? Isn't the default of 8 unreasonable, given that an 8-GPU machine cannot use all 8 GPUs for training? Also, if I need 64 GPUs, what should I do? How do I set up multi-node training?
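One way to reason about the GPU counts asked about above, assuming one GPU is held back for vLLM generation and the rest run training (a sketch of the arithmetic, not code from this repo):

```python
def usable_num_generations(total_gpus: int, per_device_train_batch_size: int) -> list[int]:
    """Divisors of the global train batch size, i.e. the values num_generations can take."""
    training_gpus = total_gpus - 1  # one GPU reserved for vLLM generation
    global_batch = training_gpus * per_device_train_batch_size
    return [n for n in range(1, global_batch + 1) if global_batch % n == 0]

print(usable_num_generations(4, 1))   # [1, 3]      -> num_generations=3 on 4 GPUs
print(usable_num_generations(8, 1))   # [1, 7]      -> num_generations=7 on 8 GPUs
print(usable_num_generations(64, 1))  # 63 trainers -> divisors of 63: 1, 3, 7, 9, 21, 63
```

Raising per_device_train_batch_size multiplies the global batch and widens the set of valid values, e.g. on 8 GPUs (7 training) a per-device batch of 8 gives a global batch of 56, which the default num_generations=8 divides.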