Introduce InstructLab Profiles Managed via ilab profile... to Run Key Commands at Different Fidelity Levels
#52
base: main
Conversation
Added a
578dc70 to 3de2f2c (Compare)
There's an open PR that's related: instructlab/instructlab#1008 from @derekhiggins This PR lets you arbitrarily override some internal training arguments. It's an interesting idea to provide a powerful override option to let people experiment with changes. You can see it in use here: instructlab/instructlab#1111 I think this is worth considering as inspiration in this design. We can try to provide nice interfaces, but having a way to override some internal details is valuable as well while we continue to evolve and figure out what works best in different environments.
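Conceptually, that kind of override could be a repeated `key=value` option merged over internal defaults. A minimal sketch of the parsing side (the flag name and the numeric-casting rules here are assumptions for illustration, not what #1008 actually implements):

```python
def parse_overrides(pairs):
    """Parse repeated key=value override strings (e.g. from a
    hypothetical --training-arg flag) into a dict, casting numeric
    values so they can be merged over internal training defaults."""
    overrides = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {pair!r}")
        for cast in (int, float):
            try:
                value = cast(value)
                break
            except ValueError:
                continue
        overrides[key.strip()] = value
    return overrides

# Internal defaults stay authoritative; user overrides win on conflict.
defaults = {"num_epochs": 10, "learning_rate": 1e-5}
defaults.update(parse_overrides(["num_epochs=3", "accelerator=deepspeed"]))
```

Non-numeric values pass through as strings, so the same escape hatch covers both hyperparameters and backend selection.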
docs/lofi-hifi-backends.md
Outdated
### ilab model train integrated
@cdoern what is the rationale behind `integrated` vs `qlora`? I think integrated introduces more ambiguity than just calling out the exact algorithm being used
@RobotSail had some opinions over qlora vs phased. I think the gist is: one is a training technique the other is an algorithm. The names need to both be training techniques
I think using the algorithms is better
I think it would be better to phrase this as extending the ilab commands to be configurable to use different technologies. The title doesn't clearly define what the design doc is about.
docs/lofi-hifi-backends.md
Outdated
This document describes adding different data generation, mixing, and training backends to ilab to enable higher fidelity training using the backend code.

Currently all training is done via qlora or the like. Adding the following commands will enable higher fidelity training and introduce commands such as data mixing.
Suggested change:
- Currently all training is done via qlora or the like. Adding the following commands will enable higher fidelity training and introduce commands such as data mixing.
+ Currently all training is done via qlora or the like.
nit: can we be a bit more specific here
docs/lofi-hifi-backends.md
Outdated
### Reasoning

Plugging into hardware acceleration and multi-phase training is the logical next step for ilab. Ensuring we do this in a clean way that does not overload our current commands is also crucial. Many of the processes in the backend are confusing so we want to abstract some of the steps away from users while also giving them a reasonable amount of choice in configuring these new processes. However, maintaining the current laptop story is important to users without hardware access. Splitting these two paths into separate commands maintains the integrity of each.
Suggested change:
- Plugging into hardware acceleration and multi-phase training is the logical next step for ilab. Ensuring we do this in a clean way that does not overload our current commands is also crucial. Many of the processes in the backend are confusing so we want to abstract some of the steps away from users while also giving them a reasonable amount of choice in configuring these new processes. However, maintaining the current laptop story is important to users without hardware access. Splitting these two paths into separate commands maintains the integrity of each.
+ Plugging into hardware acceleration and multi-phase training is the logical next step for `ilab`. Ensuring we do this in a clean way that does not overload our current commands is also crucial. Many of the processes in the backend are confusing so we want to abstract some of the steps away from users while also giving them a reasonable amount of choice in configuring these new processes. However, maintaining the current laptop story is important to users without hardware access. Splitting these two paths into separate commands maintains the integrity of each.
dropping
My initial impression was concern about using sub-sub-commands instead of flags. I think that is explained well. So... it makes sense to introduce a new level of commands when they affect what sets of flags are relevant. ✅
But my concern now (given some doubts about click) is do we have some PoC to reassure that we have good ability to implement this w/ usability. E.g. click support, usable default help. It would just be nice to hear we have that confidence.
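On the click question: click does support arbitrarily nested command groups with usable auto-generated `--help` at every level. A minimal PoC sketch (the command and option names mirror this proposal but the wiring is illustrative, not the actual ilab implementation):

```python
import click


@click.group()
def ilab():
    """InstructLab CLI (illustrative skeleton only)."""


@ilab.group()
def model():
    """Commands operating on models."""


@model.group()
def train():
    """Training at different fidelity levels."""


@train.command()
@click.option("--gpus", default="0", help="GPU range to use, e.g. 0-1 or 8.")
def phased(gpus):
    """Multi-phase, higher-fidelity training."""
    click.echo(f"phased training on GPUs {gpus}")
```

`ilab model train --help` would then list `phased` alongside its siblings, so each fidelity level gets its own flag set without polluting the others.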
docs/lofi-hifi-backends.md
Outdated
--model-dir=path , dir where the model to be used in this phase is located
--data-dir=path , dir where the data for this phase is located

Note there is no Lora in this command and there is no quantization. The `ilab model train integrated` command will use the transformers library with pytorch. This is because those have awesome plugins for deepspeed and fsdp WITH lora and qlora. Those are absolute necessities for community usecase. However, they will be mostly unused in the "High Fidelity" usecase.
similar comment about the use of "community" here -- that's not a use case differentiator
docs/lofi-hifi-backends.md
Outdated
--knowledge-recipes=[]str (path to yaml)
--skill-recipes=[]str (path to yaml)

* Do we need an `ilab recipe` cmd? *
What does this mean?
removed, this was an old thought
Thanks for working on this @cdoern. It is looking good.
Some comments inline.
I think it would also help to provide links and context to the AI capabilities like deepspeed, qlora, fsdp etc.
docs/lofi-hifi-backends.md
Outdated
| | |______lofi * (name pending)
| | |______hifi * (name pending)
| |
| |____mix *
I agree
docs/lofi-hifi-backends.md
Outdated
@@ -0,0 +1,212 @@
# Introduce Commands that Run Jobs with Different Fidelity levels for Key ilab functions |
I think the title of this doc to me is something like:
Add additional implementations for ilab capabilities
docs/lofi-hifi-backends.md
Outdated
@@ -0,0 +1,212 @@
# Introduce Commands that Run Jobs with Different Fidelity levels for Key ilab functions

This document describes adding different data generation, mixing, and training backends to ilab to enable higher fidelity training using the backend code.
How about something like: This document describes adding different implementations for ilab capabilities like data generation, and tuning to provide better AI performance.
docs/lofi-hifi-backends.md
Outdated
The Higher Fidelity versions would validate the existence of hardware that can properly run the generation, mixing, and training backends. At least for training, the existing infrastructure simply shells out to various python scripts, libraries, etc. So, as long as we combine this backend code into a place that can be imported into ilab without breaking other dependencies, this should be more of a structural change than a functional one. We know the backend code works on an isolated system, we just need to make it pluggable.

High Fidelity can run locally on someone's laptop or desktop and even utilize deepspeed if they have GPUs. In a more powerful system, the user can also run it in a container, utilize deepspeed and potentially even distribute the workload across machines using torch.distributed.
Do we know if using different methods for generation and training which produce better fidelity in the model can run on laptops? If so, can you state the h/w specifications?
docs/lofi-hifi-backends.md
Outdated
### ilab model train integrated
I think using the algorithms is better
docs/lofi-hifi-backends.md
Outdated
* Transformers+Pytorch support Qlora&&FSDP. While deepspeed might be a more "server-rack" use-case, having multi-phase training in the CLI for anyone with a consumer GPU makes sense.
2. Someone interested in ML, has a Homelab, or *anything with 2 GPUs*
What about the basic consumer laptop with GPU like Mac M-Series?
ilab train peft, ilab train phased, and other commands for increased model fidelity
docs/lofi-hifi-backends.md
Outdated
--gpus=str , describes the amount of GPUs (of what is available) to use for this process. This comes in the form of: 0-1, 8, etc.
--quantize=bool, enabled Qlora which basically loads the model in a quantized form so it can fit on a consumer GPU
--accelerator=str (deepspeed, fsdp) describes the optimizer framework to use during training
name is overloaded and ambiguous, consider calling it something else like optimizer
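It may also be worth pinning down the `--gpus` syntax quoted above ("0-1, 8, etc."). One hypothetical reading (a range like "0-1" is inclusive device indices, a bare "8" is a single index; this interpretation is an assumption, since the proposal only sketches the syntax) could be parsed as:

```python
def parse_gpus(spec):
    """Expand a --gpus spec string into explicit device indices.

    "0-1" -> [0, 1] (inclusive range), "8" -> [8] (single index),
    "0,2,4" -> [0, 2, 4]. Hypothetical helper for illustration.
    """
    indices = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            lo, hi = part.split("-")
            indices.extend(range(int(lo), int(hi) + 1))
        else:
            indices.append(int(part))
    return sorted(set(indices))
```

Spelling out whether "8" means "index 8" or "eight GPUs" in the help text would avoid exactly the ambiguity this thread raises.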
docs/lofi-hifi-backends.md
Outdated
--gpus=str , describes the amount of GPUs (of what is available) to use for this process. This comes in the form of: 0-1, 8, etc.
--quantize=bool, enabled Qlora which basically loads the model in a quantized form so it can fit on a consumer GPU
--accelerator=str (deepspeed, fsdp) describes the optimizer framework to use during training
--learning-rate, int (?)
what does learning-rate mean? no helpful text
docs/lofi-hifi-backends.md
Outdated
--quantize=bool, enabled Qlora which basically loads the model in a quantized form so it can fit on a consumer GPU
--accelerator=str (deepspeed, fsdp) describes the optimizer framework to use during training
--learning-rate, int (?)
--batch-len=int
what does batch-len mean? no helpful text
How does this proposal fit with #85? Should the new code go into the new training lib? If yes, this should be mentioned. Same question for the newly added sdg repo?
docs/lofi-hifi-backends.md
Outdated
### ilab model train phased

This command would take roughly the following arguments
Suggested change:
- This command would take roughly the following arguments
+ This command would take the following arguments

It's the design phase, so let's be more precise.
docs/lofi-hifi-backends.md
Outdated
4. give you a model in safetensors format or GGUF format since this is not a bitsnbytes model.

The big advantage here is faster and higher fidelity training than currently exists in the CLI because of deepspeed (or fsdp). The user could even set this up for multi GPU or multi system support with future ilab enhancements.
The benefits from this approach look similar to the ones when using `--quantize`, can we add more on the gains?
010da7c to d10018e (Compare)
Title changed from "ilab train peft, ilab train phased, and other commands for increased model fidelity" to "ilab profile... to run different key commands at different Fidelity levels"
Ok, I resolved a bunch of outdated comments since this EP just got a major overhaul. After talking with teams designing new instructlab libraries the only way I can see to reconcile ALL of them while also maintaining the current CLI usecases is to provide top level "profiles". The team has kicked this idea around before from having a "prius" profile to a "F1 racecar" profile. For immediate purposes, I would create a way to set a profile out of three-ish hardcoded ones the training team (@RobotSail @aldopareja @JamesKunstle @Maxusmusti) will provide me and then in the future we can setup an interactive way to make a custom profile based off of these defaults.
Title changed from "ilab profile... to run different key commands at different Fidelity levels" to "ilab profile... to Run Key Commands at Different Fidelity Levels"
This may seem like a pretty 180 degree shift. However, since the main point of this doc was ilab train commands, the training team is designing the library as a single entrypoint which takes a config file of options. With this being the case, the majority of this doc became outdated.
My biggest request after a first look is to focus the proposal on training. That is going to make it easier to get to a proposal that can get consensus.
docs/lofi-hifi-backends.md
Outdated
@@ -0,0 +1,207 @@
# Introduce Instructlab profiles managed via `ilab profile...` to run different key commands at different Fidelity levels

This document describes adding different data generation, mixing, and training backends to `ilab` to enable higher fidelity training using the backend code. By higher fidelity we mean models that preform better, were trained on better hardware, off of larger data sets and using more intensive training techniques.
Suggested change:
- This document describes adding different data generation, mixing, and training backends to `ilab` to enable higher fidelity training using the backend code. By higher fidelity we mean models that preform better, were trained on better hardware, off of larger data sets and using more intensive training techniques.
+ This document describes adding different data generation, mixing, and training backends to `ilab` to enable higher-fidelity training using the backend code. By higher fidelity, we mean models that perform better, were trained on better hardware, off of larger data sets, and used more intensive training techniques.
Can we scope this down to training only?
docs/lofi-hifi-backends.md
Outdated
@@ -0,0 +1,207 @@
# Introduce Instructlab profiles managed via `ilab profile...` to run different key commands at different Fidelity levels |
I think it's worth changing the filename to reflect the latest approach.
This first line reads more like a summary than a title. I would shorten it to something like:
Suggested change:
- # Introduce Instructlab profiles managed via `ilab profile...` to run different key commands at different Fidelity levels
+ # InstructLab Training Profiles
docs/lofi-hifi-backends.md
Outdated
Currently all training is done via QLoRA or the like. Adding the following commands will enable higher fidelity training and introduce commands such as data mixing.

This document focuses primarily on training, specifically different configuration types or "profiles" for `ilab model train`.
Ah ha - here you clarify the doc really is intended to be focused on training. I think removing earlier references to other topics would help.
docs/lofi-hifi-backends.md
Outdated
Eventually, we will want to enable a `ilab profile init` command that will allow users to initialize a custom profile based off of one of the pre-baked ones. This is not targeted for an upcoming release though.

### Immediate Goals
Could you add a new section called something like "UX Overview" that walks through what the experience would look like to a user?
I know the commands are scattered throughout the doc, but there's a lot of discussion around each one. A focused look at the UX (a tl;dr of the commands) would help.
docs/lofi-hifi-backends.md
Outdated
Eventually this profile would have settings for generation, eval, etc. But for immediate goals, hardcoded training settings is the MVP. Rather than having a `--config` option at the `ilab model train` level, storing the profile at the global level allows us to expand this idea to other ilab commands in the future. We need to be careful about how we introduce new concepts like this.

For immediate releases I would introduce the idea of a profile, a command to set a specific profile, and hardcoded profiles that plug into key ilab commands namely training.
This idea that a profile is not specific to training seems like a critical detail of this proposal.
My expectation was something very focused on training.
A global profile for everything is just the existing configuration file. You can have multiple configuration files if needed. I'm having a hard time deciding how this would end up being significantly different.
I think you're proposing a set of options that are supported in the config file but NOT via command line arguments. Is that right?
Honestly I think I'd rather just have all the command line arguments PLUS an extra one `--profile` that's shorthand for a set of defaults.
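That "--profile as shorthand for defaults" idea could be sketched roughly as follows; the profile names and option values here are invented for illustration, not taken from the proposal. Explicitly passed flags win over the profile's defaults:

```python
# Hypothetical built-in profiles; names and values are placeholders.
PROFILES = {
    "laptop": {"gpus": "0", "quantize": True, "accelerator": None},
    "server": {"gpus": "0-7", "quantize": False, "accelerator": "deepspeed"},
}


def resolve_train_options(profile=None, **explicit):
    """Merge profile defaults with explicitly passed flags.

    Flags the user actually set (non-None) override the profile,
    so --profile stays a shorthand rather than a hard constraint."""
    options = dict(PROFILES.get(profile, {}))
    options.update({k: v for k, v in explicit.items() if v is not None})
    return options
```

This keeps the flag surface small while still letting power users override any single default, which seems to be the middle ground both sides of this thread want.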
I need to make this all more clear, but the idea of having a profile is related to but distinct from a config.
I think we can iterate on where profiles are stored (could be in the config.yaml for all I care really) but having a way to say "ok enable a set of defaults for commands x, y, z that I don't need to think about for my specific usecase is something that we will need.
The other approach is a bunch of flags on cmds that will bloat the UX, I am very very anti adding dozens of flags.
The training folks are going down the config.json route, so I am trying to find a way to make that work throughout the CLI without just having a random `--config` flag JUST for training.
Let me take another pass at this just for training for now.
also, this is the design doc for friday's design freeze so it needs to consider all commands being added
docs/lofi-hifi-backends.md
Outdated
- accelerator=str (deepspeed, fsdp) describes the optimizer framework to use during training
- gpus=str describes the amount of GPUs (of what is available) to use for this process. This comes in the form of: 0-1, 8, etc.
- ds_config=str desceibes a path to a .json file configuring deepspeed.
desceibes -> describes
Should this be deepspeed_config to avoid any potential confusion with future options
sure!
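For context on what such a `deepspeed_config` file holds, a minimal example might look like the following. The keys shown (`train_batch_size`, `gradient_accumulation_steps`, `fp16`, `zero_optimization`) are standard DeepSpeed configuration fields, but the values are illustrative placeholders, not recommendations:

```json
{
  "train_batch_size": 16,
  "gradient_accumulation_steps": 1,
  "fp16": { "enabled": true },
  "zero_optimization": { "stage": 2 }
}
```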
docs/lofi-hifi-backends.md
Outdated
- you can only run certain benchmarks depending on what type of evaluation you are doing.
Note: We could have `ilab model evaluate` as a single command and take flags that depend on each other like `--checkpoint-dir` and `--benchmarks` but in general, with the new CLI design we are trying to get out of the habit of flags that depend on each other.
--output-dir: str, determines where the best checkpoint is put for the next phase of training
--input-dir: str, takes the directory of the model/checkpoint to evaluate.
How would I pass a path to a single model?
unsure, @alimaredia @alinaryan @nathan-weinberg what is the format of passing a model vs checkpoint dir in the evaluation library? I have never run eval
The current design:
https://github.com/instructlab/eval/pull/6/files
is that eval would take a single model. So for a checkpoint dir the caller would need to look through the checkpoints and pass each model one by one.
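Since eval takes a single model, the caller-side loop might look something like this sketch; `evaluate_model` is a hypothetical stand-in for the eval library's single-model entrypoint, not its actual API:

```python
from pathlib import Path


def best_checkpoint(checkpoint_dir, evaluate_model):
    """Pass each checkpoint to a single-model eval entrypoint one
    by one and return the path of the highest-scoring checkpoint.

    `evaluate_model` is a hypothetical callable standing in for
    the eval library's API; it takes one model path and returns
    a score where higher is better."""
    scores = {}
    for ckpt in sorted(Path(checkpoint_dir).iterdir()):
        if ckpt.is_dir():
            scores[ckpt] = evaluate_model(ckpt)
    if not scores:
        raise ValueError(f"no checkpoints found in {checkpoint_dir}")
    return max(scores, key=scores.get)
```

That loop is exactly the caller-side glue `ilab checkpoint evaluate` would need if the library keeps its one-model-at-a-time interface.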
docs/lofi-hifi-backends.md
Outdated
```yaml
profile:
  train:
    gpus: 0-4
```
Note: In order to get these right, the user would need to know whether we do any parallelization of train and eval steps.
docs/lofi-hifi-backends.md
Outdated
taxonomy_path: /path/to/large/taxonomy
num_grounded_questions: 10
num_samples: 10
evaluate:
gpus?
docs/lofi-hifi-backends.md
Outdated
The underlying training, eval, and generation libraries will handle the specifics based off of the config provided via the profile. For example if a user passes CPU training, no Deepspeed/FSDP etc then the training library will run the equivalent of "linux_train" that currently exists, outputting a model ready to be used. If the user has 4 GPUS, Deepspeed enabled and 15 epochs, the training library might give you a series of checkpoints.

`ilab checkpoint evaluate` will be used in conjunction with `ilab model train` when the user is running multi-phase training. This command will run full scale inter-checkpoint evaluation on the given directory. An output dir will then hold the best checkpoint and all necessary data to run another `ilab train phased` command on.
Is this going to do a copy of the winning model or a symlink or ?
most likely a copy, I know the training folks expect a format like --data-dir=somephase/data and --model-dir=somephase/model so after eval we need to have a dir that just points to the checkpoint to pick up from. I mean it could just be the existing dir from the last phase of training but I think creating a new dir with just the chosen checkpoint makes sense
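The "new dir with just the chosen checkpoint" approach could be as simple as a copy into the layout mentioned above. A sketch, assuming the `<out>/model` and `<out>/data` layout from the `--model-dir`/`--data-dir` example (that layout is an assumption of this comment thread, not a fixed interface):

```python
import shutil
from pathlib import Path


def stage_next_phase(best_checkpoint, data_dir, out_dir):
    """Copy the winning checkpoint and the data for the next phase
    into a fresh directory laid out as <out>/model and <out>/data.

    The layout mirrors the --model-dir/--data-dir example in this
    thread and is an assumption, not a committed interface."""
    out = Path(out_dir)
    shutil.copytree(best_checkpoint, out / "model")
    shutil.copytree(data_dir, out / "data")
    return out
```

A symlink would avoid the copy cost for large checkpoints, but a copy keeps each phase's inputs immutable, which matters if the previous phase's directory is later cleaned up.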
docs/lofi-hifi-backends.md
Outdated
The underlying training, eval, and generation libraries will handle the specifics based off of the config provided via the profile. For example if a user passes CPU training, no Deepspeed/FSDP etc then the training library will run the equivalent of "linux_train" that currently exists, outputting a model ready to be used. If the user has 4 GPUS, Deepspeed enabled and 15 epochs, the training library might give you a series of checkpoints.

`ilab checkpoint evaluate` will be used in conjunction with `ilab model train` when the user is running multi-phase training. This command will run full scale inter-checkpoint evaluation on the given directory. An output dir will then hold the best checkpoint and all necessary data to run another `ilab train phased` command on.
As a user, what would I want to do between a training phase and running checkpoint evaluate? Or would running evaluate as part of train be more straightforward?
I think we agreed on no orchestration, but I could be convinced to add a `--eval` flag to training.
I am not sure what would make that be considered orchestration. I wouldn't have said it's simply something that takes multiple steps. Training is already multiple steps per phase. Eval of a checkpoint dir is also multiple steps. Generally I think of orchestration as involving more complex workflows or coordinating parallel processes. The question I have here is what's the desired input/output flow from a user perspective. Specifically, what's the reason(s) a user would like a separation between the train and eval of each phase?
6a192b4 to cd49fa6 (Compare)
docs/ilab-profile.md
Outdated
### Immediate Goals and Core Principles

For the near future, there will be a single upper level profile that can be initialized via `ilab profile set <profile_name>`
Do we know how often we expect people to switch between profiles?
MMLU bench needs the following options:
- --model: str, default is granite (?)
- --tasks: []str, default is {"mmlu_pr"}. This is the list of MMLU tasks to run.
- --few-shots (int)
Should this be mmlu-few-shots? Or would we reuse this param for other potential future benchmarks?
…ey commands at different Fidelity levels This enhancement discusses "more intensive" training and data generation techniques as well as a new Data Mixing command. This is all built off of the command redesign. The goal here is to produce higher fidelity models using the CLI. Signed-off-by: Charlie Doern <[email protected]>
Thanks @cdoern for working hard on this and the improvements. I like the direction it is going in now.
I have some feedback inline. It is mostly around:
- What are we trying to solve rather than how
- It is about the capability of the workflow rather than just getting things from the backend (even if the implementation comes from the backend)
- Mention how you can add customized profile
@@ -0,0 +1,392 @@
# Introduce Functionality to Utilize the InstructLab backend: `ilab` Profiles, and Command Adaptations |
Suggested change:
- # Introduce Functionality to Utilize the InstructLab backend: `ilab` Profiles, and Command Adaptations
+ # Extend the CLI to be Configurable for Workflow Capabilities
@@ -0,0 +1,392 @@
# Introduce Functionality to Utilize the InstructLab backend: `ilab` Profiles, and Command Adaptations

This document describes adding different data generation, mixing, and training configurations to the `ilab` CLI to enable higher-fidelity training using the backend code. By higher-fidelity, we mean models that perform better, were trained on better hardware, off of larger data sets, and used more intensive training techniques.
Suggested change:
- This document describes adding different data generation, mixing, and training configurations to the `ilab` CLI to enable higher-fidelity training using the backend code. By higher-fidelity, we mean models that perform better, were trained on better hardware, off of larger data sets, and used more intensive training techniques.
+ InstructLab is a workflow for a model alignment technique for adding contributions directly to Large language Models (LLMs) using multi-phased alignment training. The Instruct Lab CLI (`ilab`) currently hard codes the different capabilities (e.g. data generation, training etc.) of the workflow.
+ This document describes extending the CLI to make the different capabilities of the workflow to be configurable. This would enable users to configure the CLI based on:
+ - Hardware specification of the user. For example, numbers of GPUs, memory etc.
+ - Using capabilities which are distributed for the workflow.
Users will be able to set a pre-defined profile type in their config.yaml that would enable sane defaults for higher fidelity training, generation, and evaluation. These defaults will funnel into existing and new flags for the CLI. |
Should this be in the "what are we trying to do" section? This is implementation, and more about "how" you want to do it.
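For concreteness, the pre-defined profile selection in config.yaml described above might look something like the following. The key and profile names here are purely illustrative, not a schema defined by this PR:

```yaml
# Illustrative only: actual config.yaml key names are not specified in this design.
general:
  profile: single_gpu   # one of the pre-defined profiles, e.g. laptop, single_gpu, multi_gpu
```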
## `ilab` Profiles |
Suggested change, from:

> ## `ilab` Profiles

to:

> ## Proposed Design
Now introduce the profiles concept here, giving an overview like the one above.
Profiles should at first be static and not exposed to users. Internally, a profile would look something like:
```yaml
profile:
```
A profile should have a `name` attribute to identify it.
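A static internal profile carrying the suggested `name` attribute might look something like this sketch. All field names and values are illustrative, not the actual schema:

```yaml
# Illustrative sketch of an internal profile with a name attribute.
profile:
  name: multi_gpu
  generate:
    num_instructions: 500
  train:
    gpus: 8
    deepspeed: true
```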
## Proposal for Default Profiles

1. **CPU Only, Laptop Profile**
   - low instruction number for generation
   - gpus: -1 for training, eval, generation
   - no deepspeed config or accelerator config for training
   - low epoch number for training
   - single-phase training, no eval support given that there are no GPUs
2. **Single GPU, Generate, Train, Eval profile -- CUDA**
   - mid-level instruction number for generation, higher than the default 10 (100-500)
   - eventual deepspeed support with LoRA and QLoRA in training
   - 10+ epochs for training
   - eval on checkpoints if hardware permits (depends on vRAM)
3. **Multi GPU, Generate, Train, Eval profile -- CUDA**
   - high instruction count, 500 or so, for generation
   - deepspeed support without quantization or LoRA
   - evaluation on checkpoints after a phase of training

**Profile 3 would most likely have sub-profiles for different GPU support.**
Where is the Mac Metal profile covered?
### Reasoning
The profile settings will be used as arguments for most if not all libraries being introduced to the `ilab` backend. |
Suggested change, from:

> The profile settings will be used as arguments for most if not all libraries being introduced to the `ilab` backend.

to:

> The profile settings will be used as arguments for most if not all capabilities that are part of the workflow.
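As a rough sketch of how static profile settings could funnel into command arguments — profile names, keys, and the resolver function here are all hypothetical, not the actual `ilab` implementation:

```python
# Hypothetical sketch: profile defaults funneling into CLI flag values.
# Profile names and keys are illustrative, not the actual ilab schema.
PROFILES = {
    "laptop": {"num_instructions": 10, "gpus": -1, "num_epochs": 1},
    "single_gpu": {"num_instructions": 100, "gpus": 1, "num_epochs": 10},
    "multi_gpu": {"num_instructions": 500, "gpus": 8, "num_epochs": 10},
}

def resolve_flags(profile_name, overrides=None):
    """Start from a profile's defaults, then let explicit CLI flags win."""
    flags = dict(PROFILES[profile_name])
    flags.update(overrides or {})
    return flags
```

Under this sketch, an explicit flag on the command line overrides the profile default, so a user on the laptop profile could still opt into a GPU run without editing their config.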
@cdoern what's the status on this doc?
This enhancement discusses "more intensive" training and data generation techniques as well as a new Data Mixing command. This is all built off of the command redesign. The goal here is to produce higher fidelity models using the CLI.