Code Implementation of ARIMAX Model in Python

The following example generates synthetic data and fits an ARIMAX model to it using the statsmodels library.

Prerequisite:

pip install statsmodels matplotlib

Note that statsmodels does not provide a separate ARIMAX class. Its SARIMAX class covers a family of state space models that includes ARIMAX: passing exogenous variables through the exog argument and leaving the seasonal order at its default of (0, 0, 0, 0) gives an ARIMAX model. SARIMAX is therefore the appropriate choice in statsmodels even when the model is described as ARIMAX.

Step 1: Data Generation Function

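This function generates synthetic time series data: a random walk Y as the target series and two independent standard-normal series, X1 and X2, as exogenous variables, all indexed by daily dates starting on 2020-01-01.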

Python

import numpy as np
import pandas as pd


def generate_data(n=100, seed=42):
    """
    Generate synthetic time series data with exogenous variables.

    Parameters:
    n (int): Number of data points.
    seed (int): Seed for reproducibility.

    Returns:
    pd.DataFrame: DataFrame containing the endogenous and exogenous variables.
    """
    np.random.seed(seed)
    Y = np.cumsum(np.random.randn(n))  # Random walk
    X1 = np.random.randn(n)
    X2 = np.random.randn(n)
    data = pd.DataFrame({'Y': Y, 'X1': X1, 'X2': X2},
                        index=pd.date_range(start='2020-01-01', periods=n))
    return data
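Because Y is built as a cumulative sum of white noise, one round of differencing should recover that noise, which is why a differencing order of d = 1 suits this series. The short check below is a sketch of that reasoning; it repeats the seed and the first random draw from generate_data, and the names noise and diffed are illustrative.

```python
import numpy as np

# Same seed and first draw as generate_data(), so Y matches its 'Y' column.
np.random.seed(42)
noise = np.random.randn(100)
Y = np.cumsum(noise)  # random walk: Y[t] = Y[t-1] + noise[t]

# First differencing undoes the cumulative sum (the first value is lost).
diffed = np.diff(Y)

print(np.allclose(diffed, noise[1:]))
```

Note also that X1 and X2 are drawn independently of Y, so on this synthetic data their fitted coefficients would be expected to sit close to zero.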


Step 2: ARIMAX Model Fitting and Forecasting Function

This function fits an ARIMAX model to the provided time series data and makes future forecasts.

  • data: The DataFrame containing the time series and exogenous variables.
  • order: A tuple (p, d, q) specifying the orders of the ARIMA model components.
  • exog_cols: A list of column names representing the exogenous variables.
  • forecast_steps: The number of future time steps to forecast.
Python

import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX


def fit_arimax(data, order=(1, 1, 1), exog_cols=['X1', 'X2'], forecast_steps=10):
    """
    Fit an ARIMAX model to the data and make forecasts.

    Parameters:
    data (pd.DataFrame): DataFrame containing the time series and exogenous variables.
    order (tuple): The (p,d,q) order of the ARIMA model.
    exog_cols (list): List of column names for the exogenous variables.
    forecast_steps (int): Number of steps to forecast.

    Returns:
    pd.DataFrame: DataFrame containing the forecasted values and confidence intervals.
    """
    # Fit the model on the observed series and its exogenous variables
    exog = data[exog_cols]
    model = SARIMAX(data['Y'], exog=exog, order=order)
    results = model.fit()
    print(results.summary())

    # Forecasting needs future exogenous values; random placeholders are
    # used here only because the data is synthetic
    forecast = results.get_forecast(
        steps=forecast_steps,
        exog=np.random.randn(forecast_steps, len(exog_cols)))
    forecast_index = pd.date_range(start=data.index[-1] + pd.Timedelta(days=1),
                                   periods=forecast_steps)
    forecast_mean = forecast.predicted_mean
    forecast_ci = forecast.conf_int()

    # Collect the point forecast and confidence bounds in one DataFrame
    forecast_df = pd.DataFrame({'Forecast': forecast_mean}, index=forecast_index)
    forecast_df['Lower CI'] = forecast_ci.iloc[:, 0]
    forecast_df['Upper CI'] = forecast_ci.iloc[:, 1]
    return forecast_df

Step 3: Plotting Function

This function plots the observed data along with the forecasted values and their confidence intervals.

Python

import matplotlib.pyplot as plt


def plot_results(data, forecast_df):
    """
    Plot the observed data and forecasted values.

    Parameters:
    data (pd.DataFrame): DataFrame containing the observed time series.
    forecast_df (pd.DataFrame): DataFrame containing the forecasted values and confidence intervals.
    """
    plt.figure(figsize=(12, 6))
    plt.plot(data.index, data['Y'], label='Observed')
    plt.plot(forecast_df.index, forecast_df['Forecast'], label='Forecast')
    plt.fill_between(forecast_df.index, forecast_df['Lower CI'],
                     forecast_df['Upper CI'], color='pink', alpha=0.3)
    plt.legend()
    plt.title('ARIMAX Model Forecast')
    plt.show()

Step 4: Running the Complete Workflow

The main script coordinates the execution of the functions to generate data, fit the ARIMAX model, and plot the results.

Python

# Generate synthetic data
data = generate_data()

# Fit ARIMAX model and forecast
forecast_df = fit_arimax(data, order=(1, 1, 1), exog_cols=['X1', 'X2'], forecast_steps=10)

# Plot the results
plot_results(data, forecast_df)

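Output: the model summary printed to the console, followed by a plot of the observed series, the 10-step forecast, and its confidence interval band.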

