Fine-Tune Llama 3 with Llama Factory: A Step-by-Step Guide
Ready to fine-tune the powerful Llama 3 model? This guide provides a practical walkthrough using Llama Factory, a user-friendly tool that makes model optimization accessible to everyone. We'll explore how to leverage Llama Factory to tailor Llama 3 for your specific needs, even without extensive coding knowledge.
What is Llama Factory? A User-Friendly Fine-Tuning Tool
Llama Factory simplifies the complex process of fine-tuning large language models (LLMs) like Llama 3. It offers an intuitive interface and efficient algorithms, making model optimization accessible and cost-effective. With Llama Factory, you can fine-tune over 100 models.
- Accessibility: Streamlines fine-tuning, making it user-friendly.
- Efficiency: Offers LoRA and GaLore configurations to minimize GPU memory usage.
- Flexibility: Supports various models, including Llama, Mistral, and Falcon.
- Advanced Algorithms: Integrates GaLore, BAdam, and LoRA.
- Monitoring: Integrates tools like TensorBoard, Weights & Biases (WandB), and MLflow.
Why Fine-Tune Llama 3? Enhance Performance for Specific Tasks
Fine-tuning adapts a pre-trained model like Llama 3 to a specific task or dataset, improving its performance and accuracy. This involves adjusting the model's parameters using new data, allowing it to perform well on specialized tasks without starting from scratch.
- Improved Accuracy: Tailor the model's responses for specific use cases.
- Reduced Harmful Content: Fine-tuning can mitigate toxic output.
- Resource Efficiency: Save time and resources by adapting an existing model.
Llama Board: Your No-Code Interface for Llama Factory
Llama Board provides a user-friendly interface for Llama Factory, allowing you to adjust and improve LLM performance without coding. It offers a comprehensive dashboard to customize how the language model learns and processes information.
Key Features of Llama Board
- Easy Customization: Adjust settings on a webpage to control model learning.
- Progress Monitoring: Track updates and graphs to assess model improvement.
- Flexible Testing: Compare model outputs to known answers or interact with it directly.
- Multilingual Support: Works in English, Russian, and Chinese, with plans for more languages.
Fine-Tuning Llama 3 with Llama Factory: A Practical Guide
Let’s dive into actually fine-tuning Llama 3. Here's a hands-on walkthrough to demonstrate how to leverage Llama Factory for model optimization.
Step 1: Setting Up the Environment
- Clone the Repository: Start by cloning the Llama Factory repository from GitHub.
- Install Dependencies: Install the necessary libraries, including Unsloth for efficient fine-tuning, along with xformers and bitsandbytes.
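The setup steps above can be sketched as shell commands. The repository URL is the official LLaMA-Factory GitHub project; the exact extras and package versions may vary between releases:

```shell
# Clone the LLaMA Factory repository
git clone https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory

# Install the library with optional extras, plus packages for efficient fine-tuning
pip install -e ".[torch,bitsandbytes]"
pip install unsloth xformers
```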
Step 2: Verify GPU Availability
Ensure your GPU is properly set up by running the following code:
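A minimal check, assuming PyTorch is installed:

```python
import torch

# Report whether a CUDA-capable GPU is visible to PyTorch
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    # Name of the first visible device, e.g. an A100 or T4 on Colab
    print("Device:", torch.cuda.get_device_name(0))
```

If this prints `CUDA available: False`, fix your driver/runtime setup before training.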
Step 3: Import and Prepare the Dataset
Import a dataset from the cloned GitHub repository or use your own custom dataset.
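For a custom dataset, LLaMA Factory expects instruction-tuning data in a format such as Alpaca (instruction/input/output fields), registered in the repository's data/dataset_info.json. A small illustrative sketch, with hypothetical file and dataset names:

```python
import json

# A tiny instruction-tuning dataset in the Alpaca-style format
# (instruction / input / output fields).
examples = [
    {
        "instruction": "Summarize the following sentence.",
        "input": "Llama Factory makes fine-tuning large language models easier.",
        "output": "Llama Factory simplifies LLM fine-tuning.",
    },
]

# Write the dataset to disk ("my_dataset.json" is an illustrative name)
with open("my_dataset.json", "w") as f:
    json.dump(examples, f, indent=2)

# Entry to add to data/dataset_info.json so the GUI can list the dataset
dataset_info = {"my_dataset": {"file_name": "my_dataset.json"}}
print(json.dumps(dataset_info, indent=2))
```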
Step 4: Launch the Gradio Web App
Generate the Gradio web app link for Llama Factory to access the GUI.
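Assuming the package is installed, the web UI is launched with the bundled CLI; in a hosted notebook you typically need a public Gradio share link:

```shell
# Launch the Llama Board GUI locally (serves a Gradio app)
llamafactory-cli webui

# In a notebook/Colab environment, request a shareable public link instead
GRADIO_SHARE=1 llamafactory-cli webui
```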
Step 5: Configure the Fine-Tuning Parameters via GUI
- Model Selection: Choose Llama 3 (8B).
- Adapter Configuration: Select LoRA or another adapter type.
- Training Options: Choose supervised fine-tuning (SFT).
- Dataset Selection: Pick from the provided datasets or upload your own.
- Hyperparameter Configuration: Adjust epochs, batch size, and learning rate.
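The GUI choices above correspond to a training configuration file like the one used later from the CLI. A sketch of what such a file might contain, with illustrative values; the key names follow LLaMA Factory's training arguments, but exact fields and sensible defaults may differ across versions:

```json
{
  "model_name_or_path": "meta-llama/Meta-Llama-3-8B-Instruct",
  "stage": "sft",
  "do_train": true,
  "finetuning_type": "lora",
  "dataset": "identity",
  "template": "llama3",
  "output_dir": "saves/llama3-8b-lora",
  "per_device_train_batch_size": 2,
  "num_train_epochs": 3.0,
  "learning_rate": 5e-5
}
```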
Step 6: Starting the Fine-Tuning Process via the GUI
Once all configurations are set, initiate training by clicking the “Start” button. Alternatively, you can fine-tune from the command line.
Step 7: Starting the Fine-Tuning Process via the CLI
Alternatively, start fine-tuning with the following CLI commands:
Step 8: Run Training
Open a terminal and run the following command (the leading “!” is only needed when running inside a notebook cell):
llamafactory-cli train train_llama3.json
This will start the training process.
Step 9: Run Inference Command
Run the following command in your terminal (again, prefix it with “!” only inside a notebook cell):
llamafactory-cli chat infer_llama3.json
This opens an interactive chat session with the fine-tuned model.
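The inference configuration file referenced above would typically point at the base model plus the trained LoRA adapter. An illustrative sketch, using the same key-name convention as the training config; exact fields may vary by version:

```json
{
  "model_name_or_path": "meta-llama/Meta-Llama-3-8B-Instruct",
  "adapter_name_or_path": "saves/llama3-8b-lora",
  "template": "llama3",
  "finetuning_type": "lora"
}
```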
Conclusion: Empowering LLM Customization with Llama Factory
Llama Factory simplifies the fine-tuning process, making it accessible for customizing LLMs like Llama 3. Its intuitive interface and efficient techniques empower developers to tailor models for specific applications. By encouraging experimentation and community growth, Llama Factory plays a crucial role in advancing the field of large language models.