🧬 Fix typo in grpo_trainer.py (#2988)
congchan authored Feb 28, 2025
1 parent ac327d5 commit 1a303cc
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion trl/trainer/grpo_trainer.py
@@ -574,7 +574,7 @@ def _get_train_sampler(self) -> Sampler:
# distributed to different GPUs, allowing rewards to be computed and normalized correctly within each prompt
# group. Using the same seed across processes ensures consistent prompt assignment, preventing discrepancies
# in group formation.
- # 2. repeats the batch multiple times to allow reusing generaations across multiple updates. Refer to
+ # 2. repeats the batch multiple times to allow reusing generations across multiple updates. Refer to
# _prepare_inputs to see how the generations are stored and reused.

# In the following figure, the values are the prompt indices. The first row shows the first sampled batch, the
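
The two behaviours described in the comment above (repeating each prompt index so a group of generations shares one prompt, and repeating the whole batch so generations can be reused across updates) can be illustrated with a small standalone sketch. This is not the code from grpo_trainer.py: the function name `repeated_prompt_indices` and the parameters `group_size` and `reuse_count` are hypothetical names chosen for this illustration, and dropping the incomplete trailing batch is an assumption.

```python
import random
from typing import Iterator


def repeated_prompt_indices(
    num_prompts: int,
    batch_size: int,
    group_size: int,
    reuse_count: int,
    seed: int = 0,
) -> Iterator[int]:
    """Yield prompt indices with per-prompt and per-batch repetition.

    Hypothetical illustration only; names and behaviour are assumptions,
    not the sampler used by the trainer.
    """
    # A fixed seed gives every process the same shuffled order, so all
    # processes agree on which prompts belong to which batch.
    rng = random.Random(seed)
    order = list(range(num_prompts))
    rng.shuffle(order)

    # Walk over full batches of unique prompts (assumption: the incomplete
    # trailing batch is dropped).
    for start in range(0, num_prompts - batch_size + 1, batch_size):
        batch = order[start:start + batch_size]
        # Repeat the whole batch so the same generations can serve
        # several consecutive updates.
        for _ in range(reuse_count):
            for idx in batch:
                # Repeat each prompt index so one group of generations
                # shares a single prompt.
                for _ in range(group_size):
                    yield idx


if __name__ == "__main__":
    indices = list(
        repeated_prompt_indices(
            num_prompts=4, batch_size=2, group_size=3, reuse_count=2, seed=0
        )
    )
    print(indices)
```

With these arguments the output has 24 entries: each batch of 2 prompts expands to 6 indices (3 per prompt), and each such block appears twice in a row before the next batch starts.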
