-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create proposal for training repo #85
base: main
Are you sure you want to change the base?
Conversation
b17eac1
to
2c0c19f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM, one comment
docs/backend/training-repo.md
Outdated
The maintainers should be the folks who currently work on training. | ||
Namely: | ||
|
||
- [Aldo Pareja](https://github.com/aldopareja) | ||
- [James Kunstle](https://github.com/orgs/instructlab/people/JamesKunstle) | ||
- [Oleg Silkin](https://github.com/orgs/instructlab/people/RobotSail) | ||
- [Mustafa Eyceoz](https://github.com/orgs/instructlab/people/Maxusmusti) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we create a new GitHub Team for this, something like Training Maintainers
? And these folks can be defined as the initial members
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, that makes sense to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this made it into the doc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This lgtm, though I'll hold off on approval until more folks weigh in.
Please take a look at the lining errors when you have some time.
I'd also suggest moving the file to a new docs/training/
directory, similar to the docs/cli/
directory. I want to create a similar docs/sdg/
directory. I think this organization will help.
1bef61d
to
03ec16b
Compare
LGTM |
Signed-off-by: Oleg S <[email protected]>
03ec16b
to
f719c6d
Compare
I made a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for submitting a proposal! The PR title/Commit title are a bit confusing at first. "create proposal for training repo" is too generic, please rephrase to highlight the proposal.
@@ -2,3 +2,5 @@ | |||
|
|||
# Spelling | |||
dictionary.dic | |||
|
|||
venv |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was merged in #90. Please rebase.
@@ -0,0 +1,77 @@ | |||
# Modify Repository Proposal: Training |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about simply docs/training/repo.md
? We are in the training directory already so no need to repeat it.
|
||
Today, we have training implemented for different use-cases | ||
existing in different repos. LoRA/QLoRA training currently lives | ||
in the [`instructlab`](https://github.com/instructlab/instructlab) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, this logic will have to be extracted to instructlab/training
once we agree on this proposal, right?
- [James Kunstle](https://github.com/orgs/instructlab/people/JamesKunstle) | ||
- [Oleg Silkin](https://github.com/orgs/instructlab/people/RobotSail) | ||
- [Mustafa Eyceoz](https://github.com/orgs/instructlab/people/Maxusmusti) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These 3 links return 404 for me.
Modulo feedback from @leseb, this LGTM. When feedback addressed I will take another look. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just some initial thoughts
The maintainers should be the folks who currently work on training. | ||
Namely: | ||
|
||
- [Aldo Pareja](https://github.com/aldopareja) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to be added to this as the community train rep currently. Or @alimaredia. If we are planning on moving that code to the library, I would appreciate a part in that 🙏
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The sdg
and eval
repos bootstrapped access by copying the existing Backend Maintainers
team. That's an alternative that could be used here.
Rather than consolidating the logic, we can keep the existing | ||
logic as-is. | ||
|
||
This would mean that the [CLI repo](https://github.com/instructlab/instructlab) and the [training repo](https://github.com/instructlab/training) would both maintain their own implementations of training. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is more what I was thinking, and this overlaps with #52 alot. Something we need to reconcile
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this comment mean that this alternative is what you had in mind?
This shouldn't be merged until it is reconciled with #52 which pre-exists this |
@RobotSail would you mind rebasing this? The main edit I see a need for is just reflecting that "Backend Maintainers" was used to create the initial maintainers list. Otherwise, I think we should just merge this. This work has already moved on and is well in progress. cc @cdoern since you felt this should block on #52 , though this doc doesn't really talk about CLI integration details, so I think it's OK to merge it. Let me know if there's some other conflicting info you'd like to comment on more specifically. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @RobotSail for pushing the PR. Its looks ok if it is just covering making the existing repo into a library.
I am bit confused by "Move everything into one repo but don't design an interface" section. Can you explain what you mean here.
As @russellb said, the PR should probably be merged as the library is now in situ.
This happened but the doc was never merged? |
Creates a proposal for a new training repo.
Signed-off-by: Oleg S [email protected]