LlamaFactory: Introduction to and Practice with a One-Click LLM Training and Fine-Tuning Tool

Last modified: July 19, 2024

Author: LeonYi

1. Introduction to LlamaFactory

LlamaFactory is a well-packaged LLM fine-tuning tool that lets users quickly train and fine-tune most mainstream LLMs.

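1.1 Overview

LlamaFactory implements its training workflow mainly through the Trainer class. A training run is launched by specifying the dataset, the model, the training type, the fine-tuning hyperparameters, how checkpoints are saved, and how training status is monitored.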
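To make the list of settings above concrete, here is a minimal sketch of what such a configuration might look like, written as a Python dictionary and saved to a JSON file. The key names follow the style of LlamaFactory's published example configs and the Hugging Face training arguments, but they are assumptions that should be checked against the installed version. The model path, dataset name, output directory, and the helper `dump_config` are hypothetical placeholders.

```python
import json

# Hypothetical training configuration. Key names are assumptions modeled on
# LlamaFactory's example configs; verify them against your installed version.
train_config = {
    # Model selection (placeholder path)
    "model_name_or_path": "path/to/base-model",
    # Training type
    "stage": "sft",
    "do_train": True,
    "finetuning_type": "lora",
    # Dataset (placeholder name) and prompt template
    "dataset": "my_dataset",
    "template": "default",
    # Fine-tuning hyperparameters
    "learning_rate": 1e-4,
    "num_train_epochs": 3.0,
    "per_device_train_batch_size": 1,
    "gradient_accumulation_steps": 8,
    # Model saving
    "output_dir": "saves/demo-lora-sft",
    "save_steps": 500,
    # Training status monitoring
    "logging_steps": 10,
    "report_to": "tensorboard",
}


def dump_config(config, path):
    """Write the configuration to a JSON file and return the path."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=2, ensure_ascii=False)
    return path


if __name__ == "__main__":
    print(dump_config(train_config, "demo_sft.json"))
```

Each group of keys corresponds to one of the settings named above (dataset, model, training type, hyperparameters, saving, monitoring). The project's own examples keep such settings in a config file that is passed to the training entry point.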

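Supported training methods (here, Pre-Training refers to continued pre-training)

LlamaFactory is a secondary wrapper around PEFT and TRL, which makes it quick to get started with SFT and RLHF fine-tuning. It also integrates techniques such as GaLore and Unsloth, which can reduce GPU memory usage during training.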

1.2 Features

• Various models: LLaMA, LLaVA, Mistral, Mixtral-MoE, Qwen, Yi, Gemma, Baichuan, ChatGLM, Phi, etc.

• Integrated training methods: (Continuous) pre-training, (multimodal) supervised fine-tuning, reward modeling, PPO, DPO and ORPO.

• Scalable resources: 32-bit full-tuning, 16-bit freeze-tuning, 16-bit LoRA and 2/4/8-bit QLoRA via AQLM/AWQ/GPTQ/LLM.int8.

• Advanced algorithms: GaLore, BAdam, DoRA, LongLoRA, LLaMA Pro, Mixture-of-Depths, LoRA+, LoftQ and Agent tuning.

• Practical tricks: FlashAttention-2, Unsloth, RoPE scaling, NEFTune and rsLoRA.

• Experiment monitoring: LlamaBoard, TensorBoard, Wandb, MLflow, etc.

• Inference integration: OpenAI-style API, Gradio UI and CLI with vLLM worker.
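To give a feel for why the precision options in the "Scalable resources" item matter, the following back-of-the-envelope sketch estimates the memory needed just to hold model weights at different bit widths, and counts the trainable parameters LoRA adds to a single weight matrix. It is a rough illustration only: the 7-billion-parameter model size, the 4096×4096 layer shape, and the rank of 8 are assumed values, and gradients, optimizer states, and activations are deliberately ignored.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {
    "32-bit": 4.0,
    "16-bit": 2.0,
    "8-bit": 1.0,
    "4-bit": 0.5,
    "2-bit": 0.25,
}


def weight_memory_gb(num_params, precision):
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


def lora_trainable_params(d_in, d_out, rank):
    """LoRA replaces the update of a d_out x d_in matrix with two low-rank
    factors (d_out x rank and rank x d_in), so it trains rank * (d_in + d_out)
    parameters for that matrix."""
    return rank * (d_in + d_out)


if __name__ == "__main__":
    num_params = 7e9  # assumed model size
    for precision in BYTES_PER_PARAM:
        print(f"{precision}: {weight_memory_gb(num_params, precision):.2f} GB")

    d = 4096  # assumed hidden size of one projection matrix
    full = d * d
    lora = lora_trainable_params(d, d, rank=8)
    print(f"full matrix: {full} params, LoRA (r=8): {lora} params "
          f"({100 * lora / full:.2f}% of full)")
```

Under these assumptions, the weights of a 7B model take about 28 GB at 32-bit, 14 GB at 16-bit, and 3.5 GB at 4-bit, which is why quantized LoRA variants fit on much smaller GPUs than full fine-tuning.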

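LlamaFactory supports single-machine, single-GPU training, and it also integrates Accelerate and DeepSpeed for single-node multi-GPU and multi-node multi-GPU distributed training.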
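One practical consequence of moving between these setups is that the effective (global) batch size changes with the number of devices. The sketch below shows the usual arithmetic, assuming plain data-parallel training in which every GPU processes its own micro-batch. The function name and the example numbers are illustrative, not taken from LlamaFactory.

```python
def global_batch_size(per_device_batch, grad_accum_steps, gpus_per_node, num_nodes=1):
    """Samples consumed per optimizer step under data parallelism."""
    return per_device_batch * grad_accum_steps * gpus_per_node * num_nodes


if __name__ == "__main__":
    # Same per-device settings in three hypothetical hardware setups.
    setups = {
        "single node, single GPU": (1, 1),
        "single node, 4 GPUs": (4, 1),
        "2 nodes, 4 GPUs each": (4, 2),
    }
    for name, (gpus, nodes) in setups.items():
        size = global_batch_size(per_device_batch=1, grad_accum_steps=8,
                                 gpus_per_node=gpus, num_nodes=nodes)
        print(f"{name}: global batch size = {size}")
```

When scaling from one GPU to many, the learning rate or the gradient accumulation steps often need adjusting so that the global batch size stays where it was tuned.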

Supported models
