From 5b690ebe4b23c1a16225623c3265c32a30860086 Mon Sep 17 00:00:00 2001
From: willieneis
Date: Wed, 29 May 2024 04:05:11 -0700
Subject: [PATCH] update README

---
 README.md | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index e886e1f..dc7f30e 100644
--- a/README.md
+++ b/README.md
@@ -1,11 +1,19 @@
-# k2_training
+# Training Code for LLM360 K2-65B
+
+This repository contains the code for training K2-65B, a 65 billion parameter large
+language model from LLM360.
+
+> [!NOTE]
+> This repository is under active development. If you have suggestions or find bugs, please open a GitHub issue or reach out.
 
 ### Launch Training
+To launch training, run:
 ```
 bash scripts/pretrain_65b.sh
 ```
 
 ### Converting Megatron Checkpoints to HuggingFace Format
+To convert model checkpoints from Megatron to HuggingFace format, run:
 ```
 python convert_ckpt_to_hf.py --load_path --save_path
-```
\ No newline at end of file
+```