Bootstrapping Approach in XGBoost

1. Importing Necessary Libraries And Generating Synthetic Data

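We import XGBoost, NumPy, and Matplotlib, then generate a small synthetic regression dataset: 100 training samples and 20 test samples with 10 random features each, plus a random training target.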

Python

import xgboost as xgb
import numpy as np
import matplotlib.pyplot as plt

# Generate synthetic data
np.random.seed(42)
X_train = np.random.rand(100, 10)
y_train = np.random.rand(100)
X_test = np.random.rand(20, 10)
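
Note that the target in the snippet above is drawn independently of the features, so any model fit to it can only learn noise. If you want the intervals to reflect a learnable relationship, one option is to generate a target that depends on the features. The sketch below assumes a hypothetical linear-plus-noise relationship; the coefficients, noise level, and the `_signal` variable names are illustrative and not part of the original example.

```python
import numpy as np

rng = np.random.default_rng(42)

n_train, n_test, n_features = 100, 20, 10
X_train_signal = rng.random((n_train, n_features))
X_test_signal = rng.random((n_test, n_features))

# Hypothetical ground truth: only the first two features influence the target.
true_coef = np.zeros(n_features)
true_coef[:2] = [3.0, -2.0]
noise_std = 0.1  # assumed noise level

# Target = linear signal + Gaussian noise.
y_train_signal = X_train_signal @ true_coef + rng.normal(0.0, noise_std, n_train)
y_test_signal = X_test_signal @ true_coef + rng.normal(0.0, noise_std, n_test)
```

These arrays can be passed to the same training loop in place of `X_train`, `y_train`, and `X_test`.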

2. Bootstrapping

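We train 100 XGBoost models, each on a bootstrap sample of the training data (drawn with replacement), and collect every model's predictions on the test set. The mean and standard deviation of those predictions across models give an approximate 95% interval for each test point.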

Python

n_iterations = 100  # Number of bootstrapped models
predictions = []

for i in range(n_iterations):
    # Create a bootstrapped dataset
    indices = np.random.choice(len(X_train), len(X_train), replace=True)
    X_resampled, y_resampled = X_train[indices], y_train[indices]
    
    # Train an XGBoost model
    model = xgb.XGBRegressor()
    model.fit(X_resampled, y_resampled)
    
    # Predict on test data
    preds = model.predict(X_test)
    predictions.append(preds)

# Convert predictions to a NumPy array
predictions = np.array(predictions)

# Calculate the mean and standard deviation of the predictions
mean_preds = np.mean(predictions, axis=0)
std_preds = np.std(predictions, axis=0)

# Approximate 95% confidence intervals (normal approximation: mean ± 1.96 standard deviations)
lower_bound = mean_preds - 1.96 * std_preds
upper_bound = mean_preds + 1.96 * std_preds
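
The bounds above rely on a normal approximation. A common alternative is the percentile method, which reads the interval directly from the empirical distribution of the bootstrap predictions. The sketch below is a minimal illustration: `percentile_interval` is a hypothetical helper, and `demo_predictions` is a random stand-in with the same shape as `predictions` (100 models by 20 test points), used only so the snippet runs on its own.

```python
import numpy as np

def percentile_interval(predictions, alpha=0.05):
    """Return (lower, upper) percentile bounds across bootstrap replicates.

    predictions: array of shape (n_iterations, n_test_points).
    alpha: 1 - coverage level, so alpha=0.05 gives a 95% interval.
    """
    predictions = np.asarray(predictions)
    lower = np.percentile(predictions, 100 * alpha / 2, axis=0)
    upper = np.percentile(predictions, 100 * (1 - alpha / 2), axis=0)
    return lower, upper

# Stand-in for the bootstrap output (not real model predictions).
rng = np.random.default_rng(0)
demo_predictions = rng.normal(loc=0.5, scale=0.1, size=(100, 20))

pct_lower, pct_upper = percentile_interval(demo_predictions)
```

With the real bootstrap output, calling `percentile_interval(predictions)` would return bounds that can be plotted in the same way as `lower_bound` and `upper_bound`. Percentile bounds need not be symmetric around the mean, which can matter when the bootstrap distribution is skewed.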

3. Visualizing the Results

We plot the mean prediction for each test point and shade the band between the lower and upper bounds, which shows the approximate 95% confidence interval around the mean.

Python

# Visualization
plt.figure(figsize=(10, 6))
plt.plot(mean_preds, label='Mean Prediction', color='blue')
plt.fill_between(range(len(mean_preds)), lower_bound, upper_bound, color='gray', alpha=0.5, label='95% Confidence Interval')
plt.title('Bootstrap Confidence Interval')
plt.xlabel('Test Data Points')
plt.ylabel('Predictions')
plt.legend()
plt.show()

Quantile regression and bootstrapping both let us estimate the uncertainty of predictions made by an XGBoost model. The resulting intervals give a range within which the true value is likely to lie, which makes the model's outputs easier to interpret and to trust.

Confidence Intervals for XGBoost

Confidence intervals provide a range within which we expect the true value of a parameter to lie, with a certain level of confidence. In the context of XGBoost, confidence intervals can be used to quantify the uncertainty of predictions. In this article we explain how to compute confidence intervals for predictions made by an XGBoost model.
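
As a concrete illustration of the definition, the sketch below computes a 95% confidence interval for the mean of a sample using the standard normal approximation. The sample is hypothetical (200 draws from a normal distribution with mean 5.0 and standard deviation 2.0) and is not related to the XGBoost example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sample; in practice this would be observed data.
sample = rng.normal(loc=5.0, scale=2.0, size=200)

# Point estimate and its standard error.
sample_mean = sample.mean()
std_error = sample.std(ddof=1) / np.sqrt(len(sample))

# 95% interval under the normal approximation (z = 1.96).
z = 1.96
ci_lower = sample_mean - z * std_error
ci_upper = sample_mean + z * std_error

print(f"mean = {sample_mean:.3f}, 95% CI = ({ci_lower:.3f}, {ci_upper:.3f})")
```

The interval narrows as the sample grows, because the standard error shrinks in proportion to the square root of the sample size.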
