Different Algorithms Supported by LightGBM

LightGBM supports several boosting algorithms and training optimizations, each with its own characteristics. Let’s look at some of the most commonly used ones:

1. Gradient Boosting Decision Tree (GBDT)

Gradient Boosting Decision Trees (GBDT) is an ensemble technique that combines the predictions of many decision trees into a single accurate model. GBDT builds the trees sequentially, with each new tree correcting the errors of the ones before it. Concretely, each tree is fit to the negative gradient of the loss function with respect to the current predictions, so adding it to the ensemble acts as a gradient descent step on the prediction error.

GBDT can model complex, non-linear relationships in data and is effective for both regression and classification tasks. Each iteration improves the model by concentrating on the examples where the current ensemble’s error is largest. Regularization techniques, such as a small learning rate and subsampling, help prevent overfitting.

Key characteristics:

  • Sequential tree building.
  • It’s prone to overfitting if not carefully tuned.
  • Suitable for a wide range of regression and classification problems.

When to use: Gradient Boosting is a good choice when you have a sufficient amount of data and can spend time tuning hyperparameters for optimal performance.
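
To make the "each tree corrects the previous ones" idea concrete, here is a minimal sketch of gradient boosting for squared-error regression on a single feature, using depth-1 trees (stumps). It is an illustration written for this article, not LightGBM's implementation; the helper names `fit_stump` and `boost` and the toy data are assumptions.

```python
# Minimal gradient boosting with decision stumps (illustrative sketch only).

def fit_stump(x, residuals):
    """Find the single threshold on x that best splits the residuals."""
    best = None
    # Exclude the largest value so the right-hand side is never empty.
    for threshold in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda xi: left_mean if xi <= threshold else right_mean

def boost(x, y, n_rounds=50, learning_rate=0.1):
    """Sequentially fit stumps to the residuals of the current ensemble."""
    base = sum(y) / len(y)            # initial prediction: the mean target
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # For squared error, the negative gradient is simply the residual.
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        preds = [pi + learning_rate * stump(xi) for xi, pi in zip(x, preds)]
    return base, stumps, preds

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.1, 3.9, 5.2, 5.8]
base, stumps, preds = boost(x, y)
mse_start = sum((yi - base) ** 2 for yi in y) / len(y)
mse_end = sum((yi - pi) ** 2 for yi, pi in zip(y, preds)) / len(y)
print(f"MSE before boosting: {mse_start:.4f}, after: {mse_end:.4f}")
```

The learning rate shrinks each tree's contribution, which is why many small steps are needed and why it is one of the first hyperparameters to tune.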

Python Implementation


#importing Libraries
import lightgbm as lgb
import numpy as np
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Load the breast cancer dataset
data = load_breast_cancer()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create a LightGBM dataset
train_data = lgb.Dataset(X_train, label=y_train)
# Define parameters for GBDT
params = {
    'objective': 'binary',
    'boosting_type': 'gbdt',
    'metric': 'binary_logloss',
    'num_leaves': 11,
    'learning_rate': 0.05,
    'feature_fraction': 0.9
}
# Train the GBDT model
gbm = lgb.train(params, train_data, num_boost_round=100)
# Make predictions on the test set
y_pred = gbm.predict(X_test)
# Evaluate the model
accuracy = accuracy_score(y_test, (y_pred > 0.5).astype(int))
print("Accuracy:", accuracy)


[LightGBM] [Warning] Found whitespace in feature_names, replace with underlines
[LightGBM] [Info] Number of positive: 286, number of negative: 169
[LightGBM] [Warning] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000311 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 4548
[LightGBM] [Info] Number of data points in the train set: 455, number of used features: 30
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.628571 -> initscore=0.526093
[LightGBM] [Info] Start training from score 0.526093
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
Accuracy: 0.9736842105263158

This code demonstrates binary classification with LightGBM on the breast cancer dataset. It loads and splits the dataset, builds a LightGBM training dataset, defines the GBDT model’s parameters, trains the model, and evaluates it on the test set using accuracy.

2. LightGBM’s Gradient-Based One-Side Sampling (GOSS)

Gradient-Based One-Side Sampling (GOSS) is an optimization technique in LightGBM that speeds up training while largely preserving predictive accuracy. GOSS separates the training data into two subsets: instances with large gradients (data points the model currently fits poorly, where updates matter most) and instances with small gradients. Rather than subsampling the entire dataset uniformly, GOSS keeps all large-gradient instances and randomly samples only from the small-gradient group, up-weighting the sampled instances so that the overall data distribution is not distorted too much.

GOSS is particularly useful for large datasets since it prioritizes the informative samples, which lowers the computational overhead associated with evaluating and updating the model during training. This dynamic sampling technique aids in achieving a balance between model performance and training speed.

  • GOSS focuses on instances with larger gradients while keeping only a random sample of the less informative instances, reducing memory and time requirements.
  • This technique enhances LightGBM’s speed and memory efficiency.
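
The sampling step described above can be sketched in a few lines of plain Python. This is a simplified illustration, not LightGBM's internal code: the function name `goss_sample` is made up for this example, while `top_rate` and `other_rate` mirror the names of the corresponding LightGBM parameters.

```python
import random

def goss_sample(gradients, top_rate=0.2, other_rate=0.1, seed=0):
    """Return (indices, weights) chosen by a GOSS-style sampling step."""
    rng = random.Random(seed)
    n = len(gradients)
    # Sort instance indices by absolute gradient, largest first.
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    n_top = int(top_rate * n)
    n_other = int(other_rate * n)
    top = order[:n_top]                           # keep every large-gradient instance
    sampled = rng.sample(order[n_top:], n_other)  # random sample of the rest
    # Up-weight the sampled small-gradient instances to compensate for sampling.
    amplify = (1.0 - top_rate) / other_rate
    indices = top + sampled
    weights = [1.0] * n_top + [amplify] * n_other
    return indices, weights

rng = random.Random(42)
gradients = [rng.gauss(0, 1) for _ in range(1000)]
indices, weights = goss_sample(gradients)
print(len(indices), "of", len(gradients), "instances kept")
```

With these rates, only 30% of the instances are used to build each tree, yet the instances the model fits worst are always among them.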

Python Implementation


#importing Libraries
import lightgbm as lgb
import numpy as np
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Load the breast cancer dataset
data = load_breast_cancer()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create a LightGBM dataset
train_data = lgb.Dataset(X_train, label=y_train)
# Define parameters for GOSS
params = {
    'objective': 'binary',
    'boosting_type': 'goss',
    'metric': 'binary_logloss',
    'num_leaves': 3,
    'learning_rate': 0.05,
    'feature_fraction': 0.9
}
# Train the GOSS model
gbm_goss = lgb.train(params, train_data, num_boost_round=100)
# Make predictions on the test set
y_pred_goss = gbm_goss.predict(X_test)
# Evaluate the model
accuracy_goss = accuracy_score(y_test, (y_pred_goss > 0.5).astype(int))
print("Accuracy (GOSS):", accuracy_goss)


[LightGBM] [Warning] Found boosting=goss. For backwards compatibility reasons, LightGBM interprets this as boosting=gbdt, data_sample_strategy=goss.To suppress this warning, set data_sample_strategy=goss instead.
[LightGBM] [Warning] Found whitespace in feature_names, replace with underlines
[LightGBM] [Warning] Found boosting=goss. For backwards compatibility reasons, LightGBM interprets this as boosting=gbdt, data_sample_strategy=goss.To suppress this warning, set data_sample_strategy=goss instead.
[LightGBM] [Info] Number of positive: 286, number of negative: 169
[LightGBM] [Warning] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000176 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 4548
[LightGBM] [Info] Number of data points in the train set: 455, number of used features: 30
[LightGBM] [Info] Using GOSS
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.628571 -> initscore=0.526093
[LightGBM] [Info] Start training from score 0.526093
Accuracy (GOSS): 0.9649122807017544

This code trains a LightGBM model with the GOSS (Gradient-based One-Side Sampling) strategy for binary classification on the breast cancer dataset. It loads and splits the dataset, builds a LightGBM dataset, and defines the parameters for the GOSS model. After training, the model is evaluated for accuracy, which you can compare with the earlier GBDT result. Keep in mind that this example also uses a smaller num_leaves (3 instead of 11), so the difference in accuracy is not due to GOSS alone. As the warning in the output indicates, recent LightGBM versions interpret boosting=goss as boosting=gbdt with data_sample_strategy=goss; setting data_sample_strategy=goss directly suppresses the warning.

3. LightGBM’s Exclusive Feature Bundling (EFB)

Exclusive Feature Bundling (EFB) is a feature-reduction technique introduced in LightGBM. In high-dimensional data, many features are sparse and mutually exclusive, meaning they rarely take non-zero values at the same time (one-hot encoded columns are a typical example). EFB groups such features into a single bundled feature, offsetting their value ranges so that the original features remain distinguishable within the bundle. Because histograms are then built per bundle rather than per feature, EFB speeds up training and reduces memory use on high-dimensional, sparse datasets, with little or no loss in accuracy. On dense datasets with few exclusive features, EFB has little effect.
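
As a rough illustration of the bundling idea, the sketch below merges two already-binned, mutually exclusive features into one by shifting the bins of the second feature. The function `bundle_exclusive` and its `max_bin_a` argument are hypothetical names used only for this example; LightGBM's own bundling logic is more involved (for instance, it tolerates a small number of conflicts).

```python
def bundle_exclusive(feature_a, feature_b, max_bin_a):
    """Merge two mutually exclusive binned features into one bundled feature.

    Bin 0 means "zero" in both inputs. Non-zero bins of feature_b are shifted
    by max_bin_a (the largest bin index used by feature_a) so that they do not
    collide with feature_a's bins.
    """
    bundled = []
    for a, b in zip(feature_a, feature_b):
        if a != 0 and b != 0:
            raise ValueError("features conflict: both non-zero on the same row")
        if a != 0:
            bundled.append(a)
        elif b != 0:
            bundled.append(b + max_bin_a)
        else:
            bundled.append(0)
    return bundled

# Two sparse, already-binned features that are never non-zero on the same row.
feature_a = [0, 2, 0, 1, 0, 0]   # non-zero bins: 1..2
feature_b = [3, 0, 0, 0, 1, 0]   # non-zero bins: 1..3
bundled = bundle_exclusive(feature_a, feature_b, max_bin_a=2)
print(bundled)  # [5, 2, 0, 1, 3, 0]
```

In the bundled column, values 1 and 2 still identify feature_a and values 3 to 5 identify feature_b, so a tree can split on either original feature while only one histogram has to be built.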

Python Implementation


#importing Libraries
import lightgbm as lgb
import numpy as np
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Load the breast cancer dataset
data = load_breast_cancer()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create a LightGBM dataset
train_data = lgb.Dataset(X_train, label=y_train)
# Define parameters for EFB
params = {
    'objective': 'binary',
    'boosting_type': 'gbdt',
    'metric': 'binary_logloss',
    'num_leaves': 5,
    'learning_rate': 0.05,
    'feature_fraction': 0.9,
    'enable_bundle': True  # EFB is enabled by default; set explicitly for clarity
}
# Train the EFB model
gbm_efb = lgb.train(params, train_data, num_boost_round=100)
# Make predictions on the test set
y_pred_efb = gbm_efb.predict(X_test)
# Evaluate the model
accuracy_efb = accuracy_score(y_test, (y_pred_efb > 0.5).astype(int))
print("Accuracy (EFB):", accuracy_efb)


[LightGBM] [Warning] Found whitespace in feature_names, replace with underlines
[LightGBM] [Info] Number of positive: 286, number of negative: 169
[LightGBM] [Warning] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000183 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 4548
[LightGBM] [Info] Number of data points in the train set: 455, number of used features: 30
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.628571 -> initscore=0.526093
[LightGBM] [Info] Start training from score 0.526093
Accuracy (EFB): 0.9649122807017544

This code applies LightGBM with EFB (Exclusive Feature Bundling) to binary classification on the breast cancer dataset. It covers dataset preparation, parameter definition with enable_bundle set, model training, and accuracy assessment. Note that enable_bundle is already true by default in LightGBM and that this dataset is dense, so the setting changes little here; any difference in accuracy relative to the GBDT example is more likely due to the smaller num_leaves (5 instead of 11) than to EFB.

4. LightGBM’s Histogram-Based Learning

Histogram-based learning is a key optimization in LightGBM. It discretizes each continuous feature into a fixed number of bins and builds a histogram of gradient statistics per feature, so that candidate split points only need to be evaluated at bin boundaries. This is very effective, especially on large datasets, because it avoids sorting feature values and evaluating every distinct value as a candidate split. LightGBM combines histogram-based learning with a leaf-wise growth strategy, and users can adjust several parameters (such as the number of bins) to tailor the process. Overall, histogram-based learning substantially reduces memory use and training time, which makes LightGBM well suited to large-scale and high-dimensional datasets.

Key characteristics:

  • Faster training and lower memory usage compared to traditional gradient boosting.
  • Supports various boosting types, making it versatile for different use cases.

When to use: LightGBM is a fantastic choice when dealing with large datasets, real-time prediction requirements, or when you want a speed boost in model training without sacrificing accuracy.
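
The following sketch shows the histogram idea for a single feature: values are bucketed into equal-width bins, gradient sums are accumulated per bin, and only the bin boundaries are scanned for the best split. It assumes squared-error loss (so each instance has a hessian of 1) and uses made-up names and toy data; LightGBM's actual binning and gain formula are more refined.

```python
def best_histogram_split(values, gradients, n_bins=4):
    """Find the best split threshold by scanning histogram bin boundaries."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    # Step 1: build the histogram (gradient sum and count per bin).
    grad_sum = [0.0] * n_bins
    count = [0] * n_bins
    for v, g in zip(values, gradients):
        b = min(int((v - lo) / width), n_bins - 1)
        grad_sum[b] += g
        count[b] += 1
    # Step 2: scan the n_bins - 1 boundaries instead of every data point.
    total_g, total_n = sum(grad_sum), sum(count)
    best_gain, best_threshold = 0.0, None
    left_g, left_n = 0.0, 0
    for b in range(n_bins - 1):
        left_g += grad_sum[b]
        left_n += count[b]
        right_g, right_n = total_g - left_g, total_n - left_n
        if left_n == 0 or right_n == 0:
            continue
        # Variance-reduction style gain for squared error.
        gain = left_g ** 2 / left_n + right_g ** 2 / right_n - total_g ** 2 / total_n
        if gain > best_gain:
            best_gain, best_threshold = gain, lo + (b + 1) * width
    return best_threshold, best_gain

values = [0.1, 0.4, 0.5, 0.9, 1.2, 1.5, 1.8, 2.0]
gradients = [-1.0, -1.0, -1.0, -1.0, 1.0, 1.0, 1.0, 1.0]
threshold, gain = best_histogram_split(values, gradients)
print(f"Best threshold: {threshold:.3f}, gain: {gain:.3f}")
```

The cost of the scan depends on the number of bins rather than the number of rows, which is where most of the speed-up comes from.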

Python Implementation


# (Data loading and splitting are the same as in the GBDT example)
# Define parameters for histogram-based learning
params = {
    'objective': 'binary',
    'boosting_type': 'gbdt',
    'metric': 'binary_logloss',
    'num_leaves': 11,
    'learning_rate': 0.05,
    'histogram_pool_size': 1024  # Max histogram cache size in MB; adjust as needed
}
# Train the histogram-based model
gbm_hist = lgb.train(params, train_data, num_boost_round=100)
# Make predictions on the test set
y_pred_hist = gbm_hist.predict(X_test)
# Evaluate the model
accuracy_hist = accuracy_score(y_test, (y_pred_hist > 0.5).astype(int))
print("Accuracy (Histogram-Based):", accuracy_hist)


[LightGBM] [Info] Number of positive: 286, number of negative: 169
[LightGBM] [Warning] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000184 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 4548
[LightGBM] [Info] Number of data points in the train set: 455, number of used features: 30
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.628571 -> initscore=0.526093
[LightGBM] [Info] Start training from score 0.526093
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
Accuracy (Histogram-Based): 0.9736842105263158

This code trains a LightGBM model for binary classification on the breast cancer dataset with an explicit histogram setting. Histogram-based learning is always used by LightGBM and does not need to be switched on; the histogram_pool_size parameter only caps the size of the histogram cache (in MB), while the number of bins per feature is controlled by max_bin. The code then trains the GBDT model (gbm_hist) on the training data, makes predictions on the test set, and reports accuracy as the performance metric.

5. LightGBM’s DART (Dropouts meet Multiple Additive Regression Trees)

DART (Dropouts meet Multiple Additive Regression Trees) is a regularization method supported by LightGBM that improves the accuracy and robustness of gradient boosting models. It applies the idea of dropout regularization from neural networks to ensembles of decision trees. At each training iteration, DART randomly drops a subset of the existing trees, fits the new tree against the predictions of the remaining ones, and then rescales the new and dropped trees so that the combined prediction is not overshot. This reduces the tendency of later trees to matter only for a few instances and encourages the model to rely on a diverse set of weak learners. As a result, DART can reduce overfitting and improve generalization, at the cost of slower training.
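
The drop-and-rescale bookkeeping can be sketched on a plain list of per-tree weights. This follows the normalization described in the original DART paper (a new tree scaled by 1/(k+1) and each of the k dropped trees by k/(k+1)); it is an assumption that this matches LightGBM's default behavior exactly, since the library also takes the learning rate and other options into account. The function name `dart_round` is made up; `drop_rate` mirrors the LightGBM parameter of the same name.

```python
import random

def dart_round(tree_weights, drop_rate=0.1, rng=None):
    """One DART bookkeeping step on a list of per-tree weights.

    Returns the updated weights (including the new tree) and the dropped indices.
    """
    rng = rng or random.Random(0)
    # Each existing tree is dropped independently with probability drop_rate.
    dropped = [i for i in range(len(tree_weights)) if rng.random() < drop_rate]
    k = len(dropped)
    # (A real implementation would fit the new tree here, against the
    #  predictions of the ensemble without the dropped trees.)
    new_weights = list(tree_weights)
    for i in dropped:
        new_weights[i] *= k / (k + 1)    # shrink each dropped tree
    new_weights.append(1.0 / (k + 1))    # add the new tree, scaled down
    return new_weights, dropped

weights = [1.0] * 10
weights_after, dropped = dart_round(weights, drop_rate=0.3, rng=random.Random(1))
print("Dropped trees:", dropped)
print("Weights after the round:", [round(w, 3) for w in weights_after])
```

When no tree happens to be dropped (k = 0), the new tree is added with full weight, which is the same as an ordinary boosting step.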

Python Implementation


# (Data loading and splitting are the same as in the GBDT example)
# Define parameters for DART
params = {
    'objective': 'binary',
    'boosting_type': 'dart',
    'metric': 'binary_logloss',
    'num_leaves': 31,
    'learning_rate': 0.05
}
# Train the DART model
gbm_dart = lgb.train(params, train_data, num_boost_round=100)
# Make predictions on the test set
y_pred_dart = gbm_dart.predict(X_test)
# Evaluate the model
accuracy_dart = accuracy_score(y_test, (y_pred_dart > 0.5).astype(int))
print("Accuracy (DART):", accuracy_dart)


[LightGBM] [Info] Number of positive: 286, number of negative: 169
[LightGBM] [Warning] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000190 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 4548
[LightGBM] [Info] Number of data points in the train set: 455, number of used features: 30
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.628571 -> initscore=0.526093
[LightGBM] [Info] Start training from score 0.526093
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
Accuracy (DART): 0.9736842105263158

This code uses LightGBM with the DART (Dropouts meet Multiple Additive Regression Trees) boosting type for binary classification on the breast cancer dataset. It first defines the parameters, including the objective, the boosting type, and the learning rate. The DART model is then trained on the training set, used to make predictions on the test set, and evaluated by computing and printing its accuracy.

Efficiency and Speed Advantages of LightGBM

LightGBM’s efficiency and speed advantages stem from its unique features:

  • Histogram-Based Splitting: LightGBM constructs histograms of features during tree building, which reduces the number of data scans required. This results in faster training times.
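  • Leaf-Wise Tree Growth: Instead of the level-wise growth used in traditional gradient boosting, LightGBM adopts a leaf-wise growth strategy that always splits the leaf with the largest loss reduction. For the same number of leaves, this usually lowers the loss more than level-wise growth, although it can overfit on small datasets unless the number of leaves or the tree depth is constrained.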
  • Parallel and GPU Learning: LightGBM can leverage multi-core CPUs and GPUs for parallel processing, making it even faster on modern hardware.
  • Sparse Data Handling: It handles sparse data efficiently, which is often a challenge for other boosting methods.

Fine-tuning in LightGBM

In LightGBM, fine-tuning means adjusting the model’s parameters to improve its performance on a particular task or dataset. It is helpful when the model has to adapt to new or changing data, or when it was trained on a different or more general domain than the target domain. Fine-tuning can also help prevent overfitting and underfitting, which are common problems in machine learning.

There are different ways to fine-tune a model in LightGBM, depending on the type and complexity of the model, the size and quality of the data, and the objective and metric of the task. Some common methods are:

1. Transfer learning (continued training): Transfer learning applies a model trained on one task or dataset to a new one, reusing what was learned from a larger, more general dataset on a smaller, more focused one. This can improve performance and generalization while saving time and resources. In neural networks, transfer learning typically involves freezing or fine-tuning some layers of the pre-trained model. LightGBM models have no layers; instead, the init_model parameter lets users load an existing model as the starting point, so that additional trees are boosted on top of it using the new data.

2. Hyperparameter optimization: Hyperparameter optimization is the search for the best values of the model’s hyperparameters, such as the learning rate, number of trees, and number of leaves. Hyperparameters are settings chosen by the user before training; the model does not learn them. They can significantly affect the model’s accuracy and efficiency, yet they are often difficult to tune by hand. Search techniques such as grid search, random search, and Bayesian optimization can be used. In LightGBM, the lightgbm.cv function performs cross-validation for a given set of parameters and returns the evaluation history; it does not search on its own, so it is typically called once per candidate configuration and the scores are then compared.

3. Regularization: Regularization applies constraints or penalties to the model to avoid overfitting or to reduce complexity. Overfitting occurs when a model learns the training data too closely and fails to generalize to new, unseen data. Regularization improves the model’s stability and robustness by lowering variance and sensitivity to noise. Common methods include dropout, weight decay, and early stopping. In LightGBM, regularization can be applied by adjusting model complexity and shrinkage parameters such as lambda_l1, lambda_l2, min_split_gain, and min_child_weight.


With an emphasis on LightGBM and its characteristics, this post has discussed the idea of boosting and how it works. We also provided several examples of how to apply LightGBM to classification and regression problems in Python. As we’ve seen, Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB) are two distinctive methods that LightGBM uses to perform gradient boosting for tree-based models quickly and efficiently. These methods reduce the memory and computational requirements of the histogram-based algorithms that are common in gradient boosting frameworks. LightGBM also supports categorical features, parallel and distributed learning, GPU learning, sparse data optimization, and custom objective and metric functions. It can handle many types of data and problems while achieving high accuracy and good generalization.

LightGBM is a powerful tool in machine learning thanks to its variety of boosting methods, its efficiency, and its speed. We’ve examined several methods in this post, including Python implementations and explanations of their results. Whether you’re working with huge datasets, high-dimensional data, or noisy data, LightGBM’s flexibility gives you suitable tools for building accurate and efficient models. Experimenting with these methods and learning their strengths will leave you better equipped to handle a variety of data science problems.
