I found this in the description of the num_generations parameter: "Number of generations per prompt to sample. The global batch size (num_processes * per_device_batch_size) must be divisible by this value."
Does anyone know why?
My settings: num_generations=8, per_device_train_batch_size=1, gradient_accumulation_steps=8
ValueError: The global train batch size (1 x 1) must be evenly divisible by the number of generations per prompt (8). Given the current train batch size, the valid values for the number of generations are: [].
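To make the constraint in that message concrete, here is a minimal sketch of the divisibility check, assuming the global batch size is num_processes * per_device_train_batch_size exactly as the error reports it (1 x 1), so gradient_accumulation_steps does not enter the product. The function names are hypothetical and this is not the library's actual code. The lower bound of 2 for valid values is inferred from the empty list in the error above, since a global batch size of 1 would otherwise list 1 as valid.

```python
def valid_num_generations(num_processes: int, per_device_batch_size: int) -> list[int]:
    """Hypothetical helper: list the generation counts that divide the global batch size."""
    global_batch_size = num_processes * per_device_batch_size
    # Assumption: at least 2 generations per prompt are required,
    # which matches the empty list shown for a 1 x 1 batch.
    return [n for n in range(2, global_batch_size + 1) if global_batch_size % n == 0]


def check_num_generations(num_processes: int, per_device_batch_size: int,
                          num_generations: int) -> None:
    """Hypothetical check: raise if num_generations does not divide the global batch size."""
    global_batch_size = num_processes * per_device_batch_size
    if global_batch_size % num_generations != 0:
        raise ValueError(
            f"Global batch size ({num_processes} x {per_device_batch_size}) is not "
            f"divisible by num_generations ({num_generations}); valid values: "
            f"{valid_num_generations(num_processes, per_device_batch_size)}"
        )


# Settings from this thread: 1 process, per-device batch size 1.
print(valid_num_generations(1, 1))   # []
# Raising the per-device batch size to 8 makes num_generations=8 valid.
print(valid_num_generations(1, 8))   # [2, 4, 8]
```

Under these assumptions, the settings above fail because 1 x 1 = 1 is not divisible by 8, and increasing gradient_accumulation_steps does not change that product. A per-device batch size of 8 on one process, or 8 processes with a per-device batch size of 1, would pass the check.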