Steps to Implement VAR on Time Series Model
The code conducts Vector Autoregression (VAR) analysis on randomly generated time series data, including stationarity testing, VAR modeling, forecasting, and visualization of the forecasted outcomes.
Step 1: Importing necessary libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.api import VAR
from statsmodels.tsa.stattools import adfuller
Step 2: Generate Sample Data
# Sample data generation
np.random.seed(0)
dates = pd.date_range(start='2024-01-01', periods=100)
data = pd.DataFrame(np.random.randn(100, 3), index=dates, columns=['A', 'B', 'C'])
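The random draws above are independent white noise, so there is no cross-variable structure for a VAR model to pick up. As an optional alternative, the following minimal sketch simulates data from a VAR(1) process instead. The helper name `simulate_var1`, the coefficient matrix `coef`, and the variable `dependent_data` are illustrative assumptions and not part of the original walkthrough.

```python
import numpy as np
import pandas as pd

def simulate_var1(coef, n_obs=100, seed=0):
    """Simulate y_t = coef @ y_{t-1} + e_t with standard normal noise e_t."""
    rng = np.random.default_rng(seed)
    k = coef.shape[0]
    values = np.zeros((n_obs, k))
    for t in range(1, n_obs):
        values[t] = coef @ values[t - 1] + rng.standard_normal(k)
    return values

# Hypothetical coefficient matrix (an assumption for illustration).
# Every row sums to less than 1 in absolute value, so all eigenvalues
# lie inside the unit circle and the simulated process is stable.
coef = np.array([[0.5, 0.2, 0.0],
                 [0.0, 0.4, 0.3],
                 [0.1, 0.0, 0.3]])

dates = pd.date_range(start='2024-01-01', periods=100)
dependent_data = pd.DataFrame(simulate_var1(coef), index=dates,
                              columns=['A', 'B', 'C'])
print(dependent_data.head())
```

Passing `dependent_data` instead of `data` to the later steps should give the model genuine lagged relationships to estimate.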
Step 3: Function to plot time series
# Function to plot each series in its own subplot
def plot_series(data):
    fig, axes = plt.subplots(nrows=3, ncols=1, figsize=(10, 8))
    for i, col in enumerate(data.columns):
        data[col].plot(ax=axes[i], title=col)
        axes[i].set_ylabel('Values')
        axes[i].set_xlabel('Date')
    plt.tight_layout()
    plt.show()

plot_series(data)
Output: a figure with three stacked line plots, one for each of the series A, B, and C.
Step 4: Function to check stationarity
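Checking for stationarity is crucial for VAR (Vector Autoregression) modeling because a standard VAR assumes that every series in the system is stationary. A series is stationary when its statistical properties, such as mean, variance, and autocorrelation, remain constant over time. The Augmented Dickey-Fuller (ADF) test used below takes non-stationarity (the presence of a unit root) as its null hypothesis, so a small p-value is evidence that the series is stationary.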
# Check stationarity of time series using ADF test
def check_stationarity(timeseries):
    result = adfuller(timeseries)
    print('ADF Statistic:', result[0])
    print('p-value:', result[1])
    print('Critical Values:')
    for key, value in result[4].items():
        print('\t%s: %.3f' % (key, value))
Step 5: VAR analysis
This part defines a function var_analysis(data) that conducts Vector Autoregression (VAR) analysis on the given dataset. It consists of four steps: checking stationarity, fitting the VAR model, forecasting future values, and visualizing the forecast. Finally, var_analysis() is called with the generated data to execute the analysis.
In the third step, the code reads the lag order of the fitted model (lag_order) and passes the last lag_order observations to the model to generate forecasts for the next 10 steps (steps=10). In the fourth step, the forecasted values are visualized: a new date index (forecast_index) of 10 periods is created, starting on 2024-04-10, the day after the last observation, and the forecast is plotted together with the original data.
# Section for VAR analysis
def var_analysis(data):
    # Step 1: Check stationarity of each series
    print("Step 1: Checking stationarity")
    for col in data.columns:
        print('Stationarity test for', col)
        check_stationarity(data[col])

    # Step 2: Applying VAR model
    print("\nStep 2: Applying VAR model")
    model = VAR(data)
    results = model.fit()

    # Step 3: Forecasting from the last lag_order observations
    print("\nStep 3: Forecasting")
    lag_order = results.k_ar
    forecast = results.forecast(data.values[-lag_order:], steps=10)

    # Step 4: Visualizing forecast
    print("\nStep 4: Visualizing forecast")
    # Forecast dates begin the day after the last observed date
    forecast_start = data.index[-1] + pd.Timedelta(days=1)
    forecast_index = pd.date_range(start=forecast_start, periods=10)
    forecast_data = pd.DataFrame(forecast, index=forecast_index, columns=data.columns)
    plot_series(pd.concat([data, forecast_data]))

# Perform VAR analysis
var_analysis(data)
Output:
Step 1: Checking stationarity
Stationarity test for A
ADF Statistic: -8.43759993424834
p-value: 1.7990274249398063e-13
Critical Values:
	1%: -3.498
	5%: -2.891
	10%: -2.583
Stationarity test for B
ADF Statistic: -11.229664527662438
p-value: 1.9214648218450937e-20
Critical Values:
	1%: -3.498
	5%: -2.891
	10%: -2.583
Stationarity test for C
ADF Statistic: -9.028783852793346
p-value: 5.516998045646418e-15
Critical Values:
	1%: -3.498
	5%: -2.891
	10%: -2.583

Step 2: Applying VAR model

Step 3: Forecasting

Step 4: Visualizing forecast
Output Explanation
The output reports the results of the Augmented Dickey-Fuller (ADF) test for each variable in the dataset:
- Stationarity test for A: The ADF statistic is -8.438, and the p-value is approximately 1.799e-13. Since the p-value is much smaller than 0.05 (a common significance level), we reject the null hypothesis of non-stationarity. The critical values at 1%, 5%, and 10% significance levels are also provided for reference.
- Stationarity test for B: The ADF statistic is -11.230, and the p-value is approximately 1.921e-20. Again, since the p-value is much smaller than 0.05, we reject the null hypothesis of non-stationarity. The critical values at different significance levels are also provided.
- Stationarity test for C: The ADF statistic is -9.029, and the p-value is approximately 5.517e-15. Similar to variables A and B, the small p-value indicates that we reject the null hypothesis of non-stationarity for variable C. Critical values at different significance levels are also provided.
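All three variables (A, B, and C) in the dataset are stationary based on the results of the Augmented Dickey-Fuller test. This is expected, since each series consists of independent draws from a standard normal distribution, so the data can be passed to the VAR model without differencing.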
Vector Autoregression (VAR) for Multivariate Time Series
Vector Autoregression (VAR) is a statistical tool used to investigate the dynamic relationships between multiple time series variables. Unlike univariate autoregressive models, which only forecast a single variable based on its previous values, VAR models investigate the interconnectivity of many variables. They accomplish this by modeling each variable as a function of not only its previous values but also of the past values of other variables in the system. In this article, we are going to explore the fundamentals of Vector Autoregression.
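To make the idea concrete, the sketch below simulates a two-variable system in which each variable depends on the previous values of both variables, then recovers the coefficient matrix by ordinary least squares using only NumPy. The coefficient values and sample size are assumptions chosen for illustration, and the intercept is omitted because the simulated process has zero mean.

```python
import numpy as np

# Simulate y_t = A @ y_{t-1} + e_t with a hypothetical coefficient matrix A
rng = np.random.default_rng(0)
true_coef = np.array([[0.5, 0.3],
                      [0.2, 0.4]])
n_obs = 2000
y = np.zeros((n_obs, 2))
for t in range(1, n_obs):
    y[t] = true_coef @ y[t - 1] + rng.standard_normal(2)

# Regress each y_t on y_{t-1}: in matrix form Y = X @ B, where B = A transposed
X, Y = y[:-1], y[1:]
B, *_ = np.linalg.lstsq(X, Y, rcond=None)
est_coef = B.T

print('True coefficients:\n', true_coef)
print('Estimated coefficients:\n', np.round(est_coef, 2))
```

The off-diagonal entries of the estimated matrix capture how the past of one variable feeds into the other, which is exactly what a univariate autoregressive model cannot represent.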
Table of Contents
- What is Vector Autoregression?
- Mathematical Intuition of VAR Equations
- Assumptions underlying the VAR model
- Steps to Implement VAR on Time Series Model
- Step 1: Importing necessary libraries
- Step 2: Generate Sample Data
- Step 3: Function to plot time series
- Step 4: Function to check stationarity
- Step 5: VAR analysis
- Output Explanation
- Applications of VAR Models