Pull requests: huggingface/trl

Pull requests list

[SFT] add token accuracy metric
#2597 opened Jan 21, 2025 by kashif
5 tasks
🐍 Support Python 3.13
#2593 opened Jan 20, 2025 by qgallouedec Draft
5 tasks
[WIP] [Liger] liger JSD support
#2573 opened Jan 16, 2025 by Mecoli1219 Draft
5 tasks
Reduce memory consumption when training with PPO
#2571 opened Jan 15, 2025 by summerspringwei
5 tasks
[Liger] liger DPO support
#2568 opened Jan 14, 2025 by kashif
WIP: Base Online Trainer
#2567 opened Jan 13, 2025 by mnoukhov Draft
4 tasks
Add _compute_score method to PPOTrainer
#2560 opened Jan 11, 2025 by oliveiraeliel Draft
2 of 5 tasks
Reintroduce truncation_mode in DPOTrainer
#2551 opened Jan 8, 2025 by anakin87
4 of 5 tasks
[Judges] rlhflow pairwise judges
#2548 opened Jan 7, 2025 by kashif
MPO
#2544 opened Jan 6, 2025 by qgallouedec Draft
5 tasks
add "_prepare_fsdp" for DPOTrainer
#2539 opened Jan 3, 2025 by faaany
PPOTrainer: fix progress bar for num_mini_batches > 1
#2531 opened Dec 29, 2024 by dawidm
4 tasks done
Include stop token in policy model's generation_config
#2528 opened Dec 28, 2024 by dawidm
2 of 5 tasks
KTO refactor
#2507 opened Dec 20, 2024 by qgallouedec Draft
5 tasks
[Liger] Integrate Liger CPO & SimPO
#2506 opened Dec 20, 2024 by Mecoli1219
1 of 6 tasks
[Liger] add native liger-kernel orpo loss
#2482 opened Dec 15, 2024 by kashif
Allow eval in Online DPO
#2476 opened Dec 13, 2024 by qgallouedec Draft
5 tasks
Add length-normalized DPO
#2458 opened Dec 10, 2024 by hugoabonizio
1 of 5 tasks
[Reward] initial CLoud Reward trainer
#2432 opened Dec 3, 2024 by kashif
added eos token for ppotrainer
#2420 opened Nov 30, 2024 by dame-cell
3 of 5 tasks