fix tokenize_row in xPOTrainer #1683
Very nice refactoring. Thanks. CC @kashif for a check.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Tests failing. @AIR-hl could you check?
I guess it’s because the default
By the way, the TRL maintainers might object to the argument being renamed, because of backward compatibility... I believe there is a mechanism for deprecating things.
Do you have any good ideas? I am just a student who lacks practical experience.
For now, perhaps let's not rename the arguments?
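For reference, a renamed argument can usually be kept backward compatible by accepting the old name for a while and emitting a warning. The sketch below is only an illustration of that general pattern, not TRL's actual deprecation mechanism: the class `HypotheticalORPOConfig` is made up, and treating `max_completion_length` as the old name and `max_target_length` as the new one is an assumption taken from the rename described in this PR.

```python
import warnings
from dataclasses import dataclass
from typing import Optional


@dataclass
class HypotheticalORPOConfig:
    """Illustrative config only; not the real TRL config class."""

    # New argument name (assumed from the rename proposed in this PR).
    max_target_length: Optional[int] = None
    # Old argument name, kept temporarily as a deprecated alias.
    max_completion_length: Optional[int] = None

    def __post_init__(self):
        if self.max_completion_length is not None:
            warnings.warn(
                "`max_completion_length` is deprecated; "
                "use `max_target_length` instead.",
                FutureWarning,
                stacklevel=2,
            )
            # Only fall back to the old value if the new one was not given.
            if self.max_target_length is None:
                self.max_target_length = self.max_completion_length


# Old call sites keep working, but now emit a FutureWarning.
config = HypotheticalORPOConfig(max_completion_length=128)
print(config.max_target_length)
```

With this pattern the old name can be removed in a later release once users have had time to migrate.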
I'm not sure this is a great change. This makes it near impossible to extend the functionality of
tokenize_row and build_tokenized_answer from DPOTrainer.py, ORPOTrainer.py and CPOTrainer.py;
tokenize_row and build_tokenized_answer to trainer/utils.py;
max_completion_length to max_target_length in ORPO, in order to be consistent with DPO and CPO
Sorry for the multiple PRs. I've never done this before; this is my first time contributing to a project. :(