Commit 5b690eb (1 parent: 351e8c3)
Showing 1 changed file with 10 additions and 2 deletions.
@@ -1,11 +1,19 @@
# k2_training
# Training Code for LLM360 K2-65B

This repository contains the code for training K2-65B, a 65 billion parameter large
language model from LLM360.

> [!NOTE]
> This repository is under active development. If you have suggestions or find bugs, please open a GitHub issue or reach out. | ||
### Launch Training
To launch training, run:
```
bash scripts/pretrain_65b.sh
```
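As a minimal sketch, the launch command above can be wrapped so that a missing script fails early and the run's output is kept in a timestamped log. The function name `launch_training`, the default `logs` directory, and the log-file naming are illustrative assumptions and are not part of this repository.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the launch command from this README.
# The log directory and file naming are assumptions, not repository features.

launch_training() {
    local script="${1:-scripts/pretrain_65b.sh}"
    local log_dir="${2:-logs}"

    # Fail early with a clear message if the launch script is missing.
    if [ ! -f "$script" ]; then
        echo "launch script not found: $script" >&2
        return 1
    fi

    mkdir -p "$log_dir"
    local log_file
    log_file="$log_dir/pretrain_$(date +%Y%m%d_%H%M%S).log"

    # Report the log location up front so it can be followed while training runs.
    echo "logging to $log_file" >&2

    # Capture stdout and stderr in the log; the function returns the script's exit status.
    bash "$script" >"$log_file" 2>&1
}
```

For example, `launch_training scripts/pretrain_65b.sh logs` runs the script and returns its exit status, and `tail -f` on the reported log file follows progress.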

### Converting Megatron Checkpoints to HuggingFace Format
To convert model checkpoints from Megatron to HuggingFace format, run:
```
python convert_ckpt_to_hf.py --load_path <megatron_ckpt_dir> --save_path <huggingface_ckpt_dir>
```
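When several checkpoints need converting, the command above can be run in a loop. The sketch below assumes a hypothetical layout in which each checkpoint lives in its own `iter_*` subdirectory of a root directory, and that each such subdirectory is what the converter expects as `--load_path`; neither assumption is confirmed by this repository, so adjust the glob to your setup. The function name `convert_all_checkpoints` and the `DRY_RUN` variable are illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical batch wrapper around convert_ckpt_to_hf.py.
# Assumption: every iter_* subdirectory of the root is one Megatron checkpoint
# that the converter accepts as --load_path.

convert_all_checkpoints() {
    local ckpt_root="$1"
    local hf_root="$2"
    local ckpt name

    for ckpt in "$ckpt_root"/iter_*; do
        # Skip an unexpanded glob (no matches) and anything that is not a directory.
        [ -d "$ckpt" ] || continue
        name=$(basename "$ckpt")

        # Skip checkpoints whose output directory already exists.
        if [ -d "$hf_root/$name" ]; then
            echo "skip $name (already converted)"
            continue
        fi

        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Print the command instead of running it.
            echo "python convert_ckpt_to_hf.py --load_path $ckpt --save_path $hf_root/$name"
        else
            python convert_ckpt_to_hf.py --load_path "$ckpt" --save_path "$hf_root/$name" || return 1
        fi
    done
}
```

Running `DRY_RUN=1 convert_all_checkpoints <megatron_ckpt_root> <huggingface_ckpt_root>` first shows which conversions would be launched. The "already converted" check only tests whether the output directory exists, so a partially written directory from a failed run should be deleted before retrying.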