Types of Fine Tuning

Let us explore various types of fine-tuning methods.

Supervised fine-tuning

  • Supervised fine-tuning takes a pre-trained model and trains it further on a task-specific dataset with labeled examples. This task-specific dataset includes input-output pairs, where the model learns to map inputs to corresponding outputs.
  • Process:
    • Take a pre-trained model.
    • Prepare the dataset as input-output pairs in the format expected by the model.
    • Train the model – The pre-trained weights are adjusted during fine-tuning to adapt the model to the specific task.
  • Use Cases:
    • Supervised fine-tuning is typically used for task-specific objectives such as text classification, sentiment analysis, or named entity recognition. A minimal code sketch of the overall workflow follows this list.
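A minimal sketch of this workflow using the Hugging Face Trainer API is shown below. The model name (distilbert-base-uncased), the IMDB sentiment dataset, and all hyperparameters are illustrative assumptions, not a prescribed recipe.

```python
# Supervised fine-tuning sketch: pre-trained model + labeled input-output pairs.
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

# 1. Take a pre-trained model (a small encoder with a fresh classification head).
model_name = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# 2. Prepare the dataset as input-output (text, label) pairs.
dataset = load_dataset("imdb")
def tokenize(batch):
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=256)
dataset = dataset.map(tokenize, batched=True)

# 3. Train: the pre-trained weights are adjusted on the task-specific data.
args = TrainingArguments(output_dir="sft-out", num_train_epochs=1,
                         per_device_train_batch_size=8)
trainer = Trainer(model=model, args=args,
                  train_dataset=dataset["train"].shuffle(seed=42).select(range(2000)),
                  eval_dataset=dataset["test"].select(range(500)))
trainer.train()
```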

Instruction fine-tuning

  • Instruction fine-tuning is a type of fine-tuning in which the input-output examples are further augmented with instructions in the prompt template, which enables instruction-tuned models to generalize more easily to new tasks.
  • Process:
    • Take a pre-trained model.
    • Prepare the dataset. For instruction fine-tuning, the data must be in the form of instruction-response pairs (see the template sketch after this list).
    • Train the model on the instruction-response pairs. The training process itself is the same as for any other neural-network fine-tuning.
  • Use cases:
    • Instruction fine-tuning is generally used where we need the model to behave like a chatbot, for example for question answering.
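As a sketch, the snippet below shows one way raw input-output pairs can be wrapped in an instruction prompt template; the exact template wording and field names are assumptions, since many formats are in common use.

```python
# Wrapping plain input-output pairs in an instruction prompt template.
PROMPT_TEMPLATE = (
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n{output}"
)

def to_instruction_example(instruction: str, input_text: str, output_text: str) -> str:
    """Build a single instruction-response training example."""
    return PROMPT_TEMPLATE.format(instruction=instruction,
                                  input=input_text, output=output_text)

example = to_instruction_example(
    instruction="Summarize the following dialogue.",
    input_text="#Person1#: The meeting moved to 3 pm. #Person2#: Thanks, I'll update the invite.",
    output_text="Person1 tells Person2 the meeting now starts at 3 pm.",
)
print(example)
```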

PEFT methods

Training a full model is generally challenging. Storing the weights of a 1 B-parameter model in 32-bit precision takes about 4 GB, and during training we need extra memory for gradients, optimizer states, activations, and temporary variables, which typically adds another 12 bytes or more per parameter. Hence a model of roughly 1 billion parameters is about the largest that can be fully trained on 16 GB of GPU memory; models beyond this size need more memory, resulting in high compute cost and other training challenges.
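A back-of-the-envelope estimate of these requirements, assuming 32-bit precision and the Adam optimizer and ignoring activation and temporary memory, might look like this:

```python
# Rough training-memory estimate for full fine-tuning (activations ignored).
def full_training_memory_gb(num_params: float) -> float:
    bytes_per_param = 4 + 4 + 8   # weights + gradients + Adam moment estimates
    return num_params * bytes_per_param / 1e9

print(f"1B params, weights only : {1e9 * 4 / 1e9:.1f} GB")       # ~4 GB
print(f"1B params, full training: {full_training_memory_gb(1e9):.1f} GB + activations")
```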

To train large models efficiently on limited compute resources, we use PEFT (Parameter-Efficient Fine-Tuning) methods. These methods do not update all of the model's weights, which reduces memory requirements significantly. PEFT can further be classified as follows:

1. Selective Method

In the selective method, we freeze most of the model's layers and unfreeze only a few selected layers. We train and modify the weights of just these layers to adapt the model to our specific task (a minimal sketch follows). This method is generally not used in practice.
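A minimal PyTorch sketch of the selective method is shown below; the model name and the choice of layers to unfreeze (the last encoder block and the classification head) are assumptions for illustration.

```python
# Selective method: freeze everything, unfreeze only a few chosen layers.
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased",
                                                           num_labels=2)
for param in model.parameters():               # freeze the whole model
    param.requires_grad = False

for name, param in model.named_parameters():   # unfreeze selected layers only
    if name.startswith("bert.encoder.layer.11") or name.startswith("classifier"):
        param.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable parameters: {trainable:,} of {total:,}")
```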

2. Reparameterization Method

This is the most common method. It reparameterizes the model's weight updates using low-rank matrices, a technique known as LoRA (Low-Rank Adaptation). We keep the original model weights frozen and instead inject small new trainable low-rank matrices.

Example

  • Let's say a weight matrix in the model has dimensions d × k = 512 × 64.
  • If we use standard fine-tuning, we would be updating all 512 × 64 = 32,768 parameters of that matrix.
  • With LoRA, we instead take two matrices of low rank. Let that rank be r = 8. We create matrices A and B such that A is 8 × 64 and B is 512 × 8, so the product B × A has size 512 × 64, the same as the original weight matrix.
  • We train the weights of A and B instead of the model weights, and the product B × A is added to the frozen model weights.
  • The total number of trainable parameters is 8 × 64 + 512 × 8 = 512 + 4,096 = 4,608, which is far fewer than the 32,768 required for full fine-tuning (a numeric sketch follows).
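The same arithmetic can be verified with a small PyTorch sketch. The random values are purely illustrative; in LoRA, B is typically initialized to zero so that training starts from the original weights.

```python
# LoRA sketch: frozen W (512 x 64) plus trainable low-rank factors A and B.
import torch

d, k, r = 512, 64, 8
W = torch.randn(d, k)                      # frozen pre-trained weight matrix
A = torch.randn(r, k, requires_grad=True)  # trainable, 8 x 64
B = torch.zeros(d, r, requires_grad=True)  # trainable, 512 x 8 (zero init)

W_adapted = W + B @ A                      # effective weights in the forward pass
print(W_adapted.shape)                     # torch.Size([512, 64])
print(A.numel() + B.numel(), "trainable vs", W.numel(), "full")   # 4608 vs 32768
```

In practice, libraries such as Hugging Face PEFT wrap this pattern, so you only specify the rank and which weight matrices to adapt.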

QLoRA – It's a further extension of the LoRA method. Here we further reduce memory requirements by quantizing the weights. Normally 32 bits (4 bytes) are used to store each model weight and parameter during training; with quantization we can use 16, 8, or even 4 bits per weight. This results in some loss of precision but considerably reduces memory.
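A sketch of how QLoRA-style loading might look with the Transformers, bitsandbytes, and PEFT libraries: the frozen base weights are quantized to 4 bits and LoRA adapters are added on top. The model name, target modules, and hyperparameters are assumptions.

```python
# QLoRA sketch: 4-bit quantized frozen base model + trainable LoRA adapters.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store frozen weights in 4-bit NF4
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in higher precision
)
model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m",
                                             quantization_config=bnb_config)

lora_config = LoraConfig(task_type="CAUSAL_LM", r=8, lora_alpha=16,
                         lora_dropout=0.05, target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, lora_config)  # only the LoRA matrices are trainable
model.print_trainable_parameters()
```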

3. Additive Method

Adapters – In the adapter method, we add new layers either on the encoder or decoder side of the model and train only these new layers for our specific task (a minimal sketch follows).
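A minimal sketch of such an added layer, in the spirit of a bottleneck adapter: a small down-projection and up-projection with a residual connection that would be inserted inside a transformer block. The hidden and bottleneck sizes are assumptions.

```python
# Bottleneck adapter layer: only these small new weights are trained.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    def __init__(self, hidden_size: int = 768, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)  # project down
        self.up = nn.Linear(bottleneck, hidden_size)    # project back up
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))      # residual connection

adapter = Adapter()
print(adapter(torch.randn(2, 16, 768)).shape)           # (batch, seq, hidden)
```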

Soft prompting – In soft prompting (also called prompt tuning), we add new trainable tokens to the model's prompt. These new token embeddings are trained while all other tokens and the model weights are kept frozen (see the sketch below).
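A conceptual sketch of soft prompting: a small matrix of trainable virtual-token embeddings is prepended to the frozen input embeddings. The sizes are assumptions; libraries such as Hugging Face PEFT offer this through their prompt-tuning configuration.

```python
# Soft prompting sketch: trainable virtual tokens prepended to frozen embeddings.
import torch
import torch.nn as nn

hidden_size, num_virtual_tokens = 768, 20
soft_prompt = nn.Parameter(torch.randn(num_virtual_tokens, hidden_size))  # trainable

def prepend_soft_prompt(input_embeds: torch.Tensor) -> torch.Tensor:
    """input_embeds: (batch, seq_len, hidden) from the frozen embedding layer."""
    prompt = soft_prompt.unsqueeze(0).expand(input_embeds.size(0), -1, -1)
    return torch.cat([prompt, input_embeds], dim=1)  # (batch, 20 + seq_len, hidden)

print(prepend_soft_prompt(torch.randn(4, 32, hidden_size)).shape)
```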

RLHF

RLHF stands for Reinforcement Learning from Human Feedback. It is used to align a model so that it generates output that humans prefer.

RLHF is generally applied after fine-tuning: it takes a fine-tuned model and aligns its output with human preferences, using the machinery of reinforcement learning to do so.

RLHF involves the following steps:

  • Prepare the dataset – We prompt the fine-tuned model to generate several different completions for each prompt. These prompt-completion pairs are then ranked by human evaluators against the alignment criteria. This is the most critical and time-consuming step in RLHF.
  • Train the reward model – Using the ranked dataset, we train a reward model that outputs a high score for preferred completions and a low score for rejected ones (a sketch of the typical ranking loss follows this list).
  • Update the model – Once the reward model is ready, we use a reinforcement learning algorithm to further update the weights of the fine-tuned model. The PPO algorithm is generally used, as it has been shown to perform well.
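As a sketch, the reward model in the second step is commonly trained with a pairwise ranking loss that pushes the score of the human-preferred completion above that of the rejected one; the scoring model itself is left abstract here.

```python
# Pairwise ranking loss typically used for reward-model training.
import torch
import torch.nn.functional as F

def reward_ranking_loss(chosen_scores: torch.Tensor,
                        rejected_scores: torch.Tensor) -> torch.Tensor:
    """chosen_scores / rejected_scores: (batch,) scalar rewards per completion."""
    return -F.logsigmoid(chosen_scores - rejected_scores).mean()

chosen = torch.tensor([1.2, 0.7, 2.0])     # toy scores for preferred completions
rejected = torch.tensor([0.3, 0.9, -0.5])  # toy scores for rejected completions
print(reward_ranking_loss(chosen, rejected))
```

The trained reward model then supplies the scalar reward that the PPO step maximizes, usually together with a KL penalty that keeps the updated model close to the original fine-tuned model.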

Fine Tuning Large Language Model (LLM)

Large Language Models (LLMs) have revolutionized natural language processing by excelling at tasks such as text generation, translation, summarization, and question answering. Despite their impressive capabilities, these models may not always be suitable for a specific task or domain out of the box. To overcome this, fine-tuning is performed. Fine-tuning allows users to customize a pre-trained language model for a specialized task by refining the model on a limited dataset of task-specific data, enhancing its performance on that task while retaining its overall language proficiency.

Table of Content

  • What is Fine Tuning?
  • Why Fine-tune?
  • Types of Fine Tuning
  • Prompt Engineering vs RAG vs Fine tuning
  • When to use fine-tuning?
  • How is fine-tuning performed?
  • Fine Tuning Large Language Model Implementation
