Introduce InstructLab Profiles Managed via ilab profile... to Run Key Commands at Different Fidelity Levels
#52
base: main
Conversation
Added a
578dc70 to 3de2f2c (Compare)
There's an open PR that's related: instructlab/instructlab#1008 from @derekhiggins This PR lets you arbitrarily override some internal training arguments. It's an interesting idea to provide a powerful override option to let people experiment with changes. You can see it in use here: instructlab/instructlab#1111 I think this is worth considering as inspiration in this design. We can try to provide nice interfaces, but having a way to override some internal details is valuable as well while we continue to evolve and figure out what works best in different environments.
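Conceptually, that kind of override could be a repeated `key=value` option merged over internal defaults. A minimal sketch of the parsing side (the flag name and the numeric-casting rules here are assumptions for illustration, not what #1008 actually implements):

```python
def parse_overrides(pairs):
    """Parse repeated key=value override strings (e.g. from a
    hypothetical --training-arg flag) into a dict, casting numeric
    values so they can be merged over internal training defaults."""
    overrides = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {pair!r}")
        for cast in (int, float):
            try:
                value = cast(value)
                break
            except ValueError:
                continue
        overrides[key.strip()] = value
    return overrides

# Internal defaults stay authoritative; user overrides win on conflict.
defaults = {"num_epochs": 10, "learning_rate": 1e-5}
defaults.update(parse_overrides(["num_epochs=3", "accelerator=deepspeed"]))
```

Non-numeric values pass through as strings, so the same escape hatch covers both hyperparameters and backend selection.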
docs/lofi-hifi-backends.md
Outdated
### ilab model train integrated
@cdoern what is the rationale behind `integrated` vs `qlora`? I think integrated introduces more ambiguity than just calling out the exact algorithm being used
@RobotSail had some opinions over qlora vs phased. I think the gist is: one is a training technique the other is an algorithm. The names need to both be training techniques
I think using the algorithms is better
I think it would be better to phrase this as extending the ilab commands to be configurable to use different technologies. The title doesn't clearly define what the design doc is about.
docs/lofi-hifi-backends.md
Outdated
This document describes adding different data generation, mixing, and training backends to ilab to enable higher fidelity training using the backend code.

Currently all training is done via qlora or the like. Adding the following commands will enable higher fidelity training and introduce commands such as data mixing.
Suggested change:
- Currently all training is done via qlora or the like. Adding the following commands will enable higher fidelity training and introduce commands such as data mixing.
+ Currently all training is done via qlora or the like.
nit: can we be a bit more specific here
docs/lofi-hifi-backends.md
Outdated
### Reasoning

Plugging into hardware acceleration and multi-phase training is the logical next step for ilab. Ensuring we do this in a clean way that does not overload our current commands is also crucial. Many of the processes in the backend are confusing so we want to abstract some of the steps away from users while also giving them a reasonable amount of choice in configuring these new processes. However, maintaining the current laptop story is important to users without hardware access. Splitting these two paths into separate commands maintains the integrity of each.
Suggested change:
- Plugging into hardware acceleration and multi-phase training is the logical next step for ilab. Ensuring we do this in a clean way that does not overload our current commands is also crucial. Many of the processes in the backend are confusing so we want to abstract some of the steps away from users while also giving them a reasonable amount of choice in configuring these new processes. However, maintaining the current laptop story is important to users without hardware access. Splitting these two paths into separate commands maintains the integrity of each.
+ Plugging into hardware acceleration and multi-phase training is the logical next step for `ilab`. Ensuring we do this in a clean way that does not overload our current commands is also crucial. Many of the processes in the backend are confusing so we want to abstract some of the steps away from users while also giving them a reasonable amount of choice in configuring these new processes. However, maintaining the current laptop story is important to users without hardware access. Splitting these two paths into separate commands maintains the integrity of each.
dropping
My initial impression was concern about using sub-sub-commands instead of flags. I think that is explained well. So... it makes sense to introduce a new level of commands when they affect what sets of flags are relevant. ✅
But my concern now (given some doubts about click) is do we have some PoC to reassure that we have good ability to implement this w/ usability. E.g. click support, usable default help. It would just be nice to hear we have that confidence.
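On the click question: click does support arbitrarily nested command groups with usable auto-generated `--help` at every level. A minimal PoC sketch (the command and option names mirror this proposal but the wiring is illustrative, not the actual ilab implementation):

```python
import click


@click.group()
def ilab():
    """InstructLab CLI (illustrative skeleton only)."""


@ilab.group()
def model():
    """Commands operating on models."""


@model.group()
def train():
    """Training at different fidelity levels."""


@train.command()
@click.option("--gpus", default="0", help="GPU range to use, e.g. 0-1 or 8.")
def phased(gpus):
    """Multi-phase, higher-fidelity training."""
    click.echo(f"phased training on GPUs {gpus}")
```

`ilab model train --help` would then list `phased` alongside its siblings, so each fidelity level gets its own flag set without polluting the others.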
docs/lofi-hifi-backends.md
Outdated
--model-dir=path , dir where the model to be used in this phase is located
--data-dir=path , dir where the data for this phase is located

Note there is no Lora in this command and there is no quantization. The `ilab model train integrated` command will use the transformers library with pytorch. This is because those have awesome plugins for deepspeed and fsdp WITH lora and qlora. Those are absolute necessities for community usecase. However, they will be mostly unused in the "High Fidelity" usecase.
similar comment about the use of "community" here -- that's not a use case differentiator
docs/lofi-hifi-backends.md
Outdated
--knowledge-recipes=[]str (path to yaml)
--skill-recipes=[]str (path to yaml)

* Do we need an `ilab recipe` cmd? *
What does this mean?
removed, this was an old thought
Thanks for working on this @cdoern. It is looking good.
Some comments inline.
I think it would also help to provide links and context to the AI capabilities like deepspeed, qlora, fsdp etc.
docs/lofi-hifi-backends.md
Outdated
| | |______lofi * (name pending)
| | |______hifi * (name pending)
| |
| |____mix *
I agree
docs/lofi-hifi-backends.md
Outdated
@@ -0,0 +1,212 @@
# Introduce Commands that Run Jobs with Different Fidelity levels for Key ilab functions |
I think the title of this doc to me is something like:
Add additional implementations for ilab capabilities
docs/lofi-hifi-backends.md
Outdated
@@ -0,0 +1,212 @@
# Introduce Commands that Run Jobs with Different Fidelity levels for Key ilab functions

This document describes adding different data generation, mixing, and training backends to ilab to enable higher fidelity training using the backend code.
How about something like: This document describes adding different implementations for ilab capabilities like data generation, and tuning to provide better AI performance.
docs/lofi-hifi-backends.md
Outdated
The Higher Fidelity versions would validate the existence of hardware that can properly run the generation, mixing, and training backends. At least for training, the existing infrastructure simply shells out to various python scripts, libraries, etc. So, as long as we combine this backend code into a place that can be imported into ilab without breaking other dependencies, this should be more of a structural change than a functional one. We know the backend code works on an isolated system, we just need to make it pluggable.

High Fidelity can run locally on someone's laptop or desktop and even utilize deepspeed if they have GPUs. In a more powerful system, the user can also run it in a container, utilize deepspeed and potentially even distribute the workload across machines using torch.distributed.
Do we know if using different methods for generation and training which produce better fidelity in the model can run on laptops? If so, can you state the h/w specifications?
docs/lofi-hifi-backends.md
Outdated
### ilab model train integrated
I think using the algorithms is better
docs/lofi-hifi-backends.md
Outdated
* Transformers+Pytorch support Qlora&&FSDP. While deepspeed might be a more "server-rack" use-case, having multi-phase training in the CLI for anyone with a consumer GPU makes sense.
2. Someone interested in ML, has a Homelab, or *anything with 2 GPUs*
What about the basic consumer laptop with GPU like Mac M-Series?
ilab train peft, ilab train phased, and other commands for increased model fidelity
docs/lofi-hifi-backends.md
Outdated
--gpus=str , describes the amount of GPUs (of what is available) to use for this process. This comes in the form of: 0-1, 8, etc.
--quantize=bool, enabled Qlora which basically loads the model in a quantized form so it can fit on a consumer GPU
--accelerator=str (deepspeed, fsdp) describes the optimizer framework to use during training
name is overloaded and ambiguous, consider calling it something else like optimizer
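It may also be worth pinning down the `--gpus` syntax quoted above ("0-1, 8, etc."). One hypothetical reading (a range like "0-1" is inclusive device indices, a bare "8" is a single index; this interpretation is an assumption, since the proposal only sketches the syntax) could be parsed as:

```python
def parse_gpus(spec):
    """Expand a --gpus spec string into explicit device indices.

    "0-1" -> [0, 1] (inclusive range), "8" -> [8] (single index),
    "0,2,4" -> [0, 2, 4]. Hypothetical helper for illustration.
    """
    indices = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            lo, hi = part.split("-")
            indices.extend(range(int(lo), int(hi) + 1))
        else:
            indices.append(int(part))
    return sorted(set(indices))
```

Spelling out whether "8" means "index 8" or "eight GPUs" in the help text would avoid exactly the ambiguity this thread raises.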
docs/lofi-hifi-backends.md
Outdated
--gpus=str , describes the amount of GPUs (of what is available) to use for this process. This comes in the form of: 0-1, 8, etc.
--quantize=bool, enabled Qlora which basically loads the model in a quantized form so it can fit on a consumer GPU
--accelerator=str (deepspeed, fsdp) describes the optimizer framework to use during training
--learning-rate, int (?)
what does learning-rate mean? no helpful text
docs/lofi-hifi-backends.md
Outdated
--quantize=bool, enabled Qlora which basically loads the model in a quantized form so it can fit on a consumer GPU
--accelerator=str (deepspeed, fsdp) describes the optimizer framework to use during training
--learning-rate, int (?)
--batch-len=int
what does batch-len mean? no helpful text
How does this proposal fit with #85? Should the new code go into the new training lib? If yes, this should be mentioned. Same question for the newly added sdg repo?
docs/lofi-hifi-backends.md
Outdated
### ilab model train phased

This command would take roughly the following arguments
Suggested change:
- This command would take roughly the following arguments
+ This command would take the following arguments

It's the design phase, so let's be more precise.
docs/lofi-hifi-backends.md
Outdated
4. give you a model in safetensors format or GGUF format since this is not a bitsnbytes model.

The big advantage here is faster and higher fidelity training than currently exists in the CLI because of deepspeed (or fsdp). The user could even set this up for multi GPU or multi system support with future ilab enhancements.
The benefits from this approach look similar to the ones when using `--quantize`, can we add more on the gains?
010da7c to d10018e (Compare)
Title changed from "ilab train peft, ilab train phased, and other commands for increased model fidelity" to "ilab profile... to run different key commands at different Fidelity levels"
Ok, I resolved a bunch of outdated comments since this EP just got a major overhaul. After talking with teams designing new instructlab libraries the only way I can see to reconcile ALL of them while also maintaining the current CLI usecases is to provide top level "profiles". The team has kicked this idea around before from having a "prius" profile to a "F1 racecar" profile. For immediate purposes, I would create a way to set a profile out of three-ish hardcoded ones the training team (@RobotSail @aldopareja @JamesKunstle @Maxusmusti) will provide me and then in the future we can setup an interactive way to make a custom profile based off of these defaults.
Title changed from "ilab profile... to run different key commands at different Fidelity levels" to "ilab profile... to Run Key Commands at Different Fidelity Levels"
This may seem like a pretty 180 degree shift. However, since the main point of this doc was ilab train commands, the training team is designing the library as a single entrypoint which takes a config file of options. With this being the case, the majority of this doc became outdated.
My biggest request after a first look is to focus the proposal on training. That is going to make it easier to get to a proposal that can get consensus.
docs/lofi-hifi-backends.md
Outdated
@@ -0,0 +1,207 @@
# Introduce Instructlab profiles managed via `ilab profile...` to run different key commands at different Fidelity levels

This document describes adding different data generation, mixing, and training backends to `ilab` to enable higher fidelity training using the backend code. By higher fidelity we mean models that preform better, were trained on better hardware, off of larger data sets and using more intensive training techniques.
Suggested change:
- This document describes adding different data generation, mixing, and training backends to `ilab` to enable higher fidelity training using the backend code. By higher fidelity we mean models that preform better, were trained on better hardware, off of larger data sets and using more intensive training techniques.
+ This document describes adding different data generation, mixing, and training backends to `ilab` to enable higher-fidelity training using the backend code. By higher fidelity, we mean models that perform better, were trained on better hardware, off of larger data sets, and used more intensive training techniques.
Can we scope this down to training only?
docs/lofi-hifi-backends.md
Outdated
@@ -0,0 +1,207 @@
# Introduce Instructlab profiles managed via `ilab profile...` to run different key commands at different Fidelity levels |
I think it's worth changing the filename to reflect the latest approach.
This first line reads more like a summary than a title. I would shorten it to something like:
Suggested change:
- # Introduce Instructlab profiles managed via `ilab profile...` to run different key commands at different Fidelity levels
+ # InstructLab Training Profiles
docs/lofi-hifi-backends.md
Outdated
Currently all training is done via QLoRA or the like. Adding the following commands will enable higher fidelity training and introduce commands such as data mixing.

This document focuses primarily on training, specifically different configuration types or "profiles" for `ilab model train`.
Ah ha - here you clarify the doc really is intended to be focused on training. I think removing earlier references to other topics would help.
docs/lofi-hifi-backends.md
Outdated
Eventually, we will want to enable a `ilab profile init` command that will allow users to initialize a custom profile based off of one of the pre-baked ones. This is not targeted for an upcoming release though.

### Immediate Goals
Could you add a new section called something like "UX Overview" that walks through what the experience would look like to a user?
I know the commands are scattered throughout the doc, but there's a lot of discussion around each one. A focused look at the UX (a tl;dr of the commands) would help.
docs/lofi-hifi-backends.md
Outdated
Eventually this profile would have settings for generation, eval, etc. But for immediate goals, hardcoded training settings is the MVP. Rather than having a `--config` option at the `ilab model train` level, storing the profile at the global level allows us to expand this idea to other ilab commands in the future. We need to be careful about how we introduce new concepts like this.

For immediate releases I would introduce the idea of a profile, a command to set a specific profile, and hardcoded profiles that plug into key ilab commands namely training.
This idea that a profile is not specific to training seems like a critical detail of this proposal.
My expectation was something very focused on training.
A global profile for everything is just the existing configuration file. You can have multiple configuration files if needed. I'm having a hard time deciding how this would end up being significantly different.
I think you're proposing a set of options that are supported in the config file but NOT via command line arguments. Is that right?
Honestly I think I'd rather just have all the command line arguments PLUS an extra one `--profile` that's shorthand for a set of defaults.
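That "--profile as shorthand for defaults" idea could be sketched roughly as follows; the profile names and option values here are invented for illustration, not taken from the proposal. Explicitly passed flags win over the profile's defaults:

```python
# Hypothetical built-in profiles; names and values are placeholders.
PROFILES = {
    "laptop": {"gpus": "0", "quantize": True, "accelerator": None},
    "server": {"gpus": "0-7", "quantize": False, "accelerator": "deepspeed"},
}


def resolve_train_options(profile=None, **explicit):
    """Merge profile defaults with explicitly passed flags.

    Flags the user actually set (non-None) override the profile,
    so --profile stays a shorthand rather than a hard constraint."""
    options = dict(PROFILES.get(profile, {}))
    options.update({k: v for k, v in explicit.items() if v is not None})
    return options
```

This keeps the flag surface small while still letting power users override any single default, which seems to be the middle ground both sides of this thread want.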
I need to make this all more clear, but the idea of having a profile is related to but distinct from a config.
I think we can iterate on where profiles are stored (could be in the config.yaml for all I care really) but having a way to say "ok enable a set of defaults for commands x, y, z that I don't need to think about for my specific usecase is something that we will need.
The other approach is a bunch of flags on cmds that will bloat the UX, I am very very anti adding dozens of flags.
The training folks are going down the config.json route, so I am trying to find a way to make that work throughout the CLI without just having a random `--config` flag JUST for training.
Let me take another pass at this just for training for now.
also, this is the design doc for friday's design freeze so it needs to consider all commands being added
docs/lofi-hifi-backends.md
Outdated
- accelerator=str (deepspeed, fsdp) describes the optimizer framework to use during training
- gpus=str describes the amount of GPUs (of what is available) to use for this process. This comes in the form of: 0-1, 8, etc.
- ds_config=str desceibes a path to a .json file configuring deepspeed.
desceibes -> describes
Should this be deepspeed_config to avoid any potential confusion with future options
sure!
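For context on what such a `deepspeed_config` file holds, a minimal example might look like the following. The keys shown (`train_batch_size`, `gradient_accumulation_steps`, `fp16`, `zero_optimization`) are standard DeepSpeed configuration fields, but the values are illustrative placeholders, not recommendations:

```json
{
  "train_batch_size": 16,
  "gradient_accumulation_steps": 1,
  "fp16": { "enabled": true },
  "zero_optimization": { "stage": 2 }
}
```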
docs/lofi-hifi-backends.md
Outdated
- you can only run certain benchmarks depending on what type of evaluation you are doing.
Note: We could have `ilab model evaluate` as a single command and take flags that depend on each other like `--checkpoint-dir` and `--benchmarks` but in general, with the new CLI design we are trying to get out of the habit of flags that depend on each other.
--output-dir: str, determines where the best checkpoint is put for the next phase of training
--input-dir: str, takes the directory of the model/checkpoint to evaluate.
How would I pass a path to a single model?
unsure, @alimaredia @alinaryan @nathan-weinberg what is the format of passing a model vs checkpoint dir in the evaluation library? I have never run eval
The current design:
https://github.com/instructlab/eval/pull/6/files
is that eval would take a single model. So for a checkpoint dir the caller would need to look through the checkpoints and pass each model one by one.
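Since eval takes a single model, the caller-side loop might look something like this sketch; `evaluate_model` is a hypothetical stand-in for the eval library's single-model entrypoint, not its actual API:

```python
from pathlib import Path


def best_checkpoint(checkpoint_dir, evaluate_model):
    """Pass each checkpoint to a single-model eval entrypoint one
    by one and return the path of the highest-scoring checkpoint.

    `evaluate_model` is a hypothetical callable standing in for
    the eval library's API; it takes one model path and returns
    a score where higher is better."""
    scores = {}
    for ckpt in sorted(Path(checkpoint_dir).iterdir()):
        if ckpt.is_dir():
            scores[ckpt] = evaluate_model(ckpt)
    if not scores:
        raise ValueError(f"no checkpoints found in {checkpoint_dir}")
    return max(scores, key=scores.get)
```

That loop is exactly the caller-side glue `ilab checkpoint evaluate` would need if the library keeps its one-model-at-a-time interface.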
docs/lofi-hifi-backends.md
Outdated
```yaml
profile:
  train:
    gpus: 0-4
```
Note: In order to get these right, the user would need to know whether we do any parallelization of train and eval steps.
docs/lofi-hifi-backends.md
Outdated
taxonomy_path: /path/to/large/taxonomy
num_grounded_questions: 10
num_samples: 10
evaluate:
gpus?
docs/lofi-hifi-backends.md
Outdated
The underlying training, eval, and generation libraries will handle the specifics based off of the config provided via the profile. For example if a user passes CPU training, no Deepspeed/FSDP etc then the training library will run the equivalent of "linux_train" that currently exists, outputting a model ready to be used. If the user has 4 GPUS, Deepspeed enabled and 15 epochs, the training library might give you a series of checkpoints.

`ilab checkpoint evaluate` will be used in conjunction with `ilab model train` when the user is running multi-phase training. This command will run full scale inter-checkpoint evaluation on the given directory. An output dir will then hold the best checkpoint and all necessary data to run another `ilab train phased` command on.
Is this going to do a copy of the winning model or a symlink or ?
most likely a copy, I know the training folks expect a format like --data-dir=somephase/data and --model-dir=somephase/model so after eval we need to have a dir that just points to the checkpoint to pick up from. I mean it could just be the existing dir from the last phase of training but I think creating a new dir with just the chosen checkpoint makes sense
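The "new dir with just the chosen checkpoint" approach could be as simple as a copy into the layout mentioned above. A sketch, assuming the `<out>/model` and `<out>/data` layout from the `--model-dir`/`--data-dir` example (that layout is an assumption of this comment thread, not a fixed interface):

```python
import shutil
from pathlib import Path


def stage_next_phase(best_checkpoint, data_dir, out_dir):
    """Copy the winning checkpoint and the data for the next phase
    into a fresh directory laid out as <out>/model and <out>/data.

    The layout mirrors the --model-dir/--data-dir example in this
    thread and is an assumption, not a committed interface."""
    out = Path(out_dir)
    shutil.copytree(best_checkpoint, out / "model")
    shutil.copytree(data_dir, out / "data")
    return out
```

A symlink would avoid the copy cost for large checkpoints, but a copy keeps each phase's inputs immutable, which matters if the previous phase's directory is later cleaned up.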
docs/lofi-hifi-backends.md
Outdated
The underlying training, eval, and generation libraries will handle the specifics based off of the config provided via the profile. For example if a user passes CPU training, no Deepspeed/FSDP etc then the training library will run the equivalent of "linux_train" that currently exists, outputting a model ready to be used. If the user has 4 GPUS, Deepspeed enabled and 15 epochs, the training library might give you a series of checkpoints.

`ilab checkpoint evaluate` will be used in conjunction with `ilab model train` when the user is running multi-phase training. This command will run full scale inter-checkpoint evaluation on the given directory. An output dir will then hold the best checkpoint and all necessary data to run another `ilab train phased` command on.
As a user, what would I want to do between a training phase and running checkpoint evaluate? Or would running evaluate as part of train be more straightforward?
I think we agreed on no orchestration, but I could be convinced to add a `--eval` flag to training.
I am not sure what would make that be considered orchestration. I wouldn't have said it's simply something that takes multiple steps. Training is already multiple steps per phase. Eval of a checkpoint dir is also multiple steps. Generally I think of orchestration as involving more complex workflows or coordinating parallel processes. The question I have here is what's the desired input/output flow from a user perspective. Specifically, what's the reason(s) a user would like a separation between the train and eval of each phase?
6a192b4 to cd49fa6 (Compare)
docs/ilab-profile.md
Outdated
### Immediate Goals and Core Principles

For the near future, there will be a single upper level profile that can be initialized via `ilab profile set <profile_name>`
Do we know how often we expect people to switch between profiles?
MMLU bench needs the following options:
- --model: str, default is granite (?)
- --tasks: []str, default is {"mmlu_pr"}. This is the list of MMLU tasks to run.
- --few-shots (int)
Should this be mmlu-few-shots? Or would we reuse this param for other potential future benchmarks?
…ey commands at different Fidelity levels This enhancement discusses "more intensive" training and data generation techniques as well as a new Data Mixing command. This is all built off of the command redesign. The goal here is to produce higher fidelity models using the CLI. Signed-off-by: Charlie Doern <[email protected]>
Thanks @cdoern for working hard on this and the improvements. I like the direction it is going in now.
I have some feedback inline. It is mostly around:
- What are we trying to solve rather than how
- It is about the capability of the workflow rather than just getting things from the backend (even if the implementation comes from the backend)
- Mention how you can add customized profile
@@ -0,0 +1,392 @@
# Introduce Functionality to Utilize the InstructLab backend: `ilab` Profiles, and Command Adaptations |
Suggested change:
- # Introduce Functionality to Utilize the InstructLab backend: `ilab` Profiles, and Command Adaptations
+ # Extend the CLI to be Configurable for Workflow Capabilities
@@ -0,0 +1,392 @@
# Introduce Functionality to Utilize the InstructLab backend: `ilab` Profiles, and Command Adaptations

This document describes adding different data generation, mixing, and training configurations to the `ilab` CLI to enable higher-fidelity training using the backend code. By higher-fidelity, we mean models that perform better, were trained on better hardware, off of larger data sets, and used more intensive training techniques.
Suggested change:
- This document describes adding different data generation, mixing, and training configurations to the `ilab` CLI to enable higher-fidelity training using the backend code. By higher-fidelity, we mean models that perform better, were trained on better hardware, off of larger data sets, and used more intensive training techniques.
+ InstructLab is a workflow for a model alignment technique for adding contributions directly to Large language Models (LLMs) using multi-phased alignment training. The Instruct Lab CLI (`ilab`) currently hard codes the different capabilities (e.g. data generation, training etc.) of the workflow.
+ This document describes extending the CLI to make the different capabilities of the workflow to be configurable. This would enable users to configure the CLI based on:
+ - Hardware specification of the user. For example, numbers of GPUs, memory etc.
+ - Using capabilities which are distributed for the workflow.
Users will be able to set a pre-defined profile type in their config.yaml that would enable sane defaults for higher fidelity training, generation, and evaluation. These defaults will funnel into existing and new flags for the CLI. |
Should this be in the "what are we trying to do" section? This is implementation, and more about "how" you want to do it.
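For concreteness, the pre-defined profile selection in config.yaml described above might look something like the following. The key and profile names here are purely illustrative, not a schema defined by this PR:

```yaml
# Illustrative only: actual config.yaml key names are not specified in this design.
general:
  profile: single_gpu   # one of the pre-defined profiles, e.g. laptop, single_gpu, multi_gpu
```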
## `ilab` Profiles |
Suggested change, from:

> ## `ilab` Profiles

to:

> ## Proposed Design
Now introduce the profiles concept here, giving an overview like the one above.
Profiles should at first be static and not exposed to users. Internally, a profile would look something like:
```yaml
profile:
```
A profile should have a `name` attribute to identify it.
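A static internal profile carrying the suggested `name` attribute might look something like this sketch. All field names and values are illustrative, not the actual schema:

```yaml
# Illustrative sketch of an internal profile with a name attribute.
profile:
  name: multi_gpu
  generate:
    num_instructions: 500
  train:
    gpus: 8
    deepspeed: true
```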
## Proposal for Default Profiles

1. **CPU Only, Laptop Profile**
   - low instruction number for generation
   - gpus: -1 for training, eval, generation
   - no deepspeed config or accelerator config for training
   - low epoch number for training
   - single-phase training, no eval support given that there are no GPUs
2. **Single GPU, Generate, Train, Eval profile -- CUDA**
   - mid-level instruction number for generation, higher than the default 10 (100-500)
   - eventual deepspeed support with LoRA and QLoRA in training
   - 10+ epochs for training
   - eval on checkpoints if hardware permits (depends on vRAM)
3. **Multi GPU, Generate, Train, Eval profile -- CUDA**
   - high instruction count, 500 or so, for generation
   - deepspeed support without quantization or LoRA
   - evaluation on checkpoints after a phase of training

**Profile 3 would most likely have sub-profiles for different GPU support.**
Where is the Mac Metal profile covered?
### Reasoning
The profile settings will be used as arguments for most if not all libraries being introduced to the `ilab` backend. |
Suggested change, from:

> The profile settings will be used as arguments for most if not all libraries being introduced to the `ilab` backend.

to:

> The profile settings will be used as arguments for most if not all capabilities that are part of the workflow.
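As a rough sketch of how static profile settings could funnel into command arguments — profile names, keys, and the resolver function here are all hypothetical, not the actual `ilab` implementation:

```python
# Hypothetical sketch: profile defaults funneling into CLI flag values.
# Profile names and keys are illustrative, not the actual ilab schema.
PROFILES = {
    "laptop": {"num_instructions": 10, "gpus": -1, "num_epochs": 1},
    "single_gpu": {"num_instructions": 100, "gpus": 1, "num_epochs": 10},
    "multi_gpu": {"num_instructions": 500, "gpus": 8, "num_epochs": 10},
}

def resolve_flags(profile_name, overrides=None):
    """Start from a profile's defaults, then let explicit CLI flags win."""
    flags = dict(PROFILES[profile_name])
    flags.update(overrides or {})
    return flags
```

Under this sketch, an explicit flag on the command line overrides the profile default, so a user on the laptop profile could still opt into a GPU run without editing their config.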
@cdoern what's the status on this doc?
This enhancement discusses "more intensive" training and data generation techniques as well as a new Data Mixing command. This is all built off of the command redesign. The goal here is to produce higher fidelity models using the CLI.