Releases: hiyouga/LLaMA-Factory

v0.6.1: Patch release

29 Mar 04:07

This patch mainly fixes #2983

In commit 9bec3c9, we moved the optimizer and scheduler construction into the trainers, which inadvertently introduced a bug: when DeepSpeed was enabled, the transformers trainers built an optimizer and scheduler before calling the create_optimizer_and_scheduler method [1]. The optimizer created by our method then overwrote the original one, while the scheduler did not, so the scheduler no longer affected the learning rate of the optimizer actually in use, causing a regression in training results. We have fixed this bug in 3bcd41b and 8c77b10. Thanks to @HideLord for helping us identify this critical bug.

[1] https://github.com/huggingface/transformers/blob/v4.39.1/src/transformers/trainer.py#L1877-L1881
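The failure mode is easy to reproduce in plain PyTorch (a minimal sketch, not the trainer code): a scheduler keeps a reference to the optimizer it was built with, so replacing the optimizer afterwards silently disconnects it from scheduling.

```python
import torch

param = torch.nn.Parameter(torch.zeros(1))

# The scheduler is bound to the optimizer it was created with.
old_opt = torch.optim.AdamW([param], lr=1e-3)
scheduler = torch.optim.lr_scheduler.LambdaLR(old_opt, lambda step: 1.0 / (step + 1))

# Overwriting the optimizer afterwards (as happened when DeepSpeed was enabled)
# leaves the scheduler pointing at the old, unused optimizer.
new_opt = torch.optim.AdamW([param], lr=1e-3)

scheduler.step()
print(old_opt.param_groups[0]["lr"])  # 0.0005 -- the scheduler still drives the old optimizer
print(new_opt.param_groups[0]["lr"])  # 0.001  -- the optimizer in use never changes
```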

We have also fixed #2961 #2981 #2982 #2983 #2991 #3010

v0.6.0: Paper Release, GaLore and FSDP+QLoRA

25 Mar 15:50

We released our paper on arXiv! Thanks to all co-authors and to AK for the recommendation.

New features

  • Support the GaLore algorithm, allowing full-parameter learning of a 7B model with less than 24GB of VRAM
  • Support FSDP+QLoRA, allowing QLoRA fine-tuning of a 70B model on 2x 24GB GPUs
  • Support the LoRA+ algorithm for better LoRA fine-tuning by @qibaoyuan in #2830
  • LLaMA Factory 🤝 vLLM, enjoy 270% inference speed with --infer_backend vllm
  • Add a Colab notebook for getting started easily
  • Support pushing fine-tuned models to the Hugging Face Hub in the web UI
  • Support apply_chat_template by adding the chat template to the tokenizer after fine-tuning
  • Add dockerize support by @S3Studio in #2743 #2849
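GaLore's memory saving comes from keeping optimizer state in a low-rank projection of the gradient instead of at full parameter shape. A rough sketch of that projection step (illustrative only; the actual algorithm refreshes the projector periodically and is wired into the optimizer):

```python
import torch

def galore_project(grad: torch.Tensor, rank: int):
    # Build a rank-r projector from the gradient's top left singular vectors;
    # the optimizer then stores state of shape (rank, n) instead of (m, n).
    U, S, Vh = torch.linalg.svd(grad, full_matrices=False)
    P = U[:, :rank]            # (m, rank) projection matrix
    return P, P.T @ grad       # low-rank gradient seen by the optimizer

def galore_unproject(P: torch.Tensor, low_rank_update: torch.Tensor):
    # Map the low-rank update back to the full parameter shape.
    return P @ low_rank_update

grad = torch.outer(torch.randn(64), torch.randn(32))  # a rank-1 "gradient"
P, g_lr = galore_project(grad, rank=2)
recovered = galore_unproject(P, g_lr)
```

Because the toy gradient above is rank-1, projecting onto the top-2 singular vectors and back recovers it exactly; real gradients lose some information, which the periodic projector refresh compensates for.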

New models

  • Base models
    • OLMo (1B/7B)
    • StarCoder2 (3B/7B/15B)
    • Yi-9B
  • Instruct/Chat models
    • OLMo-7B-Instruct

New datasets

  • Supervised fine-tuning datasets
    • Cosmopedia (en)
  • Preference datasets
    • Orca DPO (en)

Bug fix

v0.5.3: DoRA and AWQ/AQLM QLoRA

28 Feb 17:01

New features

New models

  • Base models
    • Gemma (2B/7B)
  • Instruct/Chat models
    • Gemma-it (2B/7B)

Bug fix

v0.5.2: Block Expansion, Qwen1.5 Models

20 Feb 07:32

New features

  • Support block expansion in LLaMA Pro, see tests/llama_pro.py for usage
  • Add use_rslora option for the LoRA method
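Block expansion in LLaMA Pro interleaves zero-initialized copies of existing transformer blocks, so the expanded model is functionally identical at initialization and only the new blocks need training. A toy sketch of the idea (the block class and expansion period here are illustrative, not the repo's code):

```python
import copy
import torch
import torch.nn as nn

class Block(nn.Module):
    """Toy residual block standing in for a transformer layer."""
    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):
        return x + self.proj(x)

def expand_blocks(blocks, period: int):
    # After every `period` original blocks, insert a copy whose projection is
    # zero-initialized: its forward reduces to the identity, so the expanded
    # stack computes exactly what the original did until the copies are trained.
    out = []
    for i, blk in enumerate(blocks, start=1):
        out.append(blk)
        if i % period == 0:
            new = copy.deepcopy(blk)
            nn.init.zeros_(new.proj.weight)
            nn.init.zeros_(new.proj.bias)
            out.append(new)
    return nn.ModuleList(out)

blocks = nn.ModuleList(Block(8) for _ in range(4))
expanded = expand_blocks(blocks, period=2)  # 4 blocks -> 6 blocks
```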

New models

  • Base models
    • Qwen1.5 (0.5B/1.8B/4B/7B/14B/72B)
    • DeepSeekMath-7B-Base
    • DeepSeekCoder-7B-Base-v1.5
    • Orion-14B-Base
  • Instruct/Chat models
    • Qwen1.5-Chat (0.5B/1.8B/4B/7B/14B/72B)
    • MiniCPM-2B-SFT/DPO
    • DeepSeekMath-7B-Instruct
    • DeepSeekCoder-7B-Instruct-v1.5
    • Orion-14B-Chat
    • Orion-14B-Long-Chat
    • Orion-14B-RAG-Chat
    • Orion-14B-Plugin-Chat

New datasets

  • Supervised fine-tuning datasets
    • SlimOrca (en)
    • Dolly (de)
    • Dolphin (de)
    • Airoboros (de)
  • Preference datasets
    • Orca DPO (de)

Bug fix

v0.5.0: Agent Tuning, Unsloth Integration

20 Jan 18:37

Congratulations on 10k stars 🎉 Make LLM fine-tuning easier and faster together with LLaMA-Factory ✨

New features

  • Support agent tuning for most models; fine-tune any LLM with --dataset glaive_toolcall for tool use #2226
  • Support function calling in both API and web mode with fine-tuned models, following OpenAI's format
  • LLaMA Factory 🤝 Unsloth, enjoy 170% LoRA training speed with --use_unsloth, see the benchmarks here
  • Support fine-tuning models on MPS devices #2090
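Since fine-tuned models follow OpenAI's function-calling schema, one round trip looks roughly like this (a sketch; the function name and arguments below are made up):

```python
import json

messages = [
    {"role": "user", "content": "What's the weather in Paris?"},
    # The model emits a structured call instead of free text.
    {"role": "assistant", "content": None, "function_call": {
        "name": "get_weather",
        "arguments": json.dumps({"location": "Paris"}),
    }},
    # The caller executes the function and feeds the result back
    # so the model can compose its final answer.
    {"role": "function", "name": "get_weather",
     "content": json.dumps({"temperature_c": 18})},
]

args = json.loads(messages[1]["function_call"]["arguments"])
```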

New models

  • Base models
    • Phi-2 (2.7B)
    • InternLM2 (7B/20B)
    • SOLAR-10.7B
    • DeepseekMoE-16B-Base
    • XVERSE-65B-2
  • Instruct/Chat models
    • InternLM2-Chat (7B/20B)
    • SOLAR-10.7B-Instruct
    • DeepseekMoE-16B-Chat
    • Yuan (2B/51B/102B)

New datasets

  • Supervised fine-tuning datasets
    • deepctrl dataset
    • Glaive function calling dataset v2

Core updates

  • Refactor data engine: clearer dataset alignment, easier templating and tool formatting
  • Refactor saving logic for models with value head #1789
  • Use ruff code formatter for stylish code

Bug fix

v0.4.0: Mixtral-8x7B, DPO-ftx, AutoGPTQ Integration

16 Dec 13:48

🚨🚨 Core refactor

  • Deprecate checkpoint_dir and use adapter_name_or_path instead
  • Replace resume_lora_training with create_new_adapter
  • Move the patches in model loading to llmtuner.model.patcher
  • Bump to Transformers 4.36.1 to adapt to the Mixtral models
  • Wide adaptation for FlashAttention2 (LLaMA, Falcon, Mistral)
  • Temporarily disable LongLoRA due to breaking changes, which will be supported later

The above changes were made by @hiyouga in #1864
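For scripts written against earlier versions, the renamed arguments map roughly as follows (a hedged sketch; check the documentation for the exact semantics of create_new_adapter):

```
# before v0.4.0                      # from v0.4.0 on
--checkpoint_dir path/to/adapter  -> --adapter_name_or_path path/to/adapter
--resume_lora_training False      -> --create_new_adapter True
```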

New features

  • Add DPO-ftx: mixing fine-tuning gradients into DPO via the dpo_ftx argument, suggested by @lylcst in #1347 (comment)
  • Integrate AutoGPTQ into the model export via the export_quantization_bit and export_quantization_dataset arguments
  • Support loading datasets from ModelScope Hub by @tastelikefeet and @wangxingjun778 in #1802
  • Support resizing token embeddings with the noisy mean initialization by @hiyouga in a66186b
  • Support system column in both alpaca and sharegpt dataset formats
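Conceptually, dpo_ftx mixes a supervised term over the chosen responses into the preference loss. A hedged sketch of such a combined objective (the function and argument names are illustrative, not the repo's implementation):

```python
import torch
import torch.nn.functional as F

def dpo_with_ftx(policy_chosen_logps, policy_rejected_logps,
                 ref_chosen_logps, ref_rejected_logps,
                 chosen_nll, beta: float = 0.1, dpo_ftx: float = 1.0):
    # Standard DPO term: negative log-sigmoid of the scaled difference
    # between chosen and rejected policy/reference log-ratios.
    logits = beta * ((policy_chosen_logps - ref_chosen_logps)
                     - (policy_rejected_logps - ref_rejected_logps))
    dpo_loss = -F.logsigmoid(logits).mean()
    # Mixed-in fine-tuning term: NLL of the chosen responses, weighted by dpo_ftx.
    return dpo_loss + dpo_ftx * chosen_nll.mean()
```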

New models

  • Base models
    • Mixtral-8x7B-v0.1
  • Instruct/Chat models
    • Mixtral-8x7B-Instruct-v0.1
    • Mistral-7B-Instruct-v0.2
    • XVERSE-65B-Chat
    • Yi-6B-Chat

Bug fix

v0.3.3: ModelScope Integration, Reward Server

03 Dec 14:17

New features

  • Support loading pre-trained models from ModelScope Hub by @tastelikefeet in #1700
  • Support launching a reward model server in the demo API by specifying --stage=rm in api_demo.py
  • Support using a reward model server in PPO training by specifying --reward_model_type api
  • Support adjusting the shard size of exported models via the export_size argument
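The packing that export_size controls can be sketched as a greedy pass over the state dict, starting a new shard whenever the next tensor would exceed the size cap (illustrative, not the actual implementation):

```python
import torch

def shard_state_dict(state_dict, max_shard_bytes: int):
    # Greedily pack tensors into shards no larger than max_shard_bytes
    # (export_size expresses the same cap in GB).
    shards, current, size = [], {}, 0
    for name, tensor in state_dict.items():
        nbytes = tensor.numel() * tensor.element_size()
        if current and size + nbytes > max_shard_bytes:
            shards.append(current)
            current, size = {}, 0
        current[name] = tensor
        size += nbytes
    if current:
        shards.append(current)
    return shards

# Four 1 KiB tensors with a 2 KiB cap pack into two shards of two tensors each.
sd = {f"w{i}": torch.zeros(256, dtype=torch.float32) for i in range(4)}
shards = shard_state_dict(sd, max_shard_bytes=2048)
```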

New models

  • Base models
    • DeepseekLLM-Base (7B/67B)
    • Qwen (1.8B/72B)
  • Instruct/Chat models
    • DeepseekLLM-Chat (7B/67B)
    • Qwen-Chat (1.8B/72B)
    • Yi-34B-Chat

New datasets

  • Supervised fine-tuning datasets
  • Preference datasets

Bug fix

v0.3.2: Patch release

21 Nov 05:41

New features

  • Support training GPTQ quantized models #729 #1481 #1545
  • Support resuming reward model training #1567

Bug fix

v0.3.0: Full-Parameter RLHF

16 Nov 08:24

New features

  • Support full-parameter RLHF training (RM & PPO)
  • Refactor llmtuner core in #1525 by @hiyouga
  • Better LLaMA Board: full-parameter RLHF and demo mode

New models

  • Base models
    • ChineseLLaMA-1.3B
    • LingoWhale-8B
  • Instruct/Chat models
    • ChineseAlpaca-1.3B
    • Zephyr-7B-Alpha/Beta

Bug fix

v0.2.2: Patch release

13 Nov 15:16

Bug fix