SPSA (Simultaneous Perturbation Stochastic Approximation) Algorithm using Python

SPSA is an optimization algorithm invented by James C. Spall. It is especially useful for noisy cost functions and for cost functions whose exact gradient is not available. The general idea is to iteratively improve the answer by estimating the gradient of the cost function and using that estimate to update the parameters so that the value of the cost function decreases.

Implementation of SPSA
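The code below follows Spall's standard formulation of the method, stated here once for reference. At iteration k the parameters are updated as

$$ w_{k+1} = w_k - a_k \, \hat{g}_k(w_k), \qquad \hat{g}_{k,i}(w_k) = \frac{L(w_k + c_k \Delta_k) - L(w_k - c_k \Delta_k)}{2 \, c_k \, \Delta_{k,i}}, $$

where $\Delta_k$ is a random vector of independent $\pm 1$ entries, and $a_k = a/(A + k)^{\alpha}$ and $c_k = c/k^{\gamma}$ are decaying gain sequences. We start by importing the required libraries and fixing the random seed so that the results are reproducible.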
Python3
import numpy as np
import matplotlib.pyplot as plt

# Fix the seed so that the results are reproducible
np.random.seed(42)
Below is a helper function that evaluates a polynomial, given its coefficients in the array a and the point x at which to evaluate it.
Python3
def polynomial(a, x):
    N = len(a)
    S = 0
    for k in range(N):
        S += a[k] * x**k
    return S
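As a quick sanity check (the values below are illustrative and not part of the original listing), we can evaluate a known polynomial by hand and compare against NumPy's np.polyval, which expects the coefficients in the opposite (highest-degree-first) order.

Python3

# 3 - 4x + x^2 at x = 2 should give 3 - 8 + 4 = -1
print(polynomial(np.array([3, -4, 1]), 2))  # -1
# np.polyval takes the highest-degree coefficient first
print(np.polyval([1, -4, 3], 2))            # -1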
The function below calculates the mean squared error between the predicted values and the true values, and then returns it with some random noise added, simulating the measurement of a noisy cost function.
Python3
def Loss(parameters, X, Y):
    # Predictions of our model
    Y_pred = polynomial(parameters, X)
    # mse (mean squared error)
    L = ((Y_pred - Y)**2).mean()
    # Additive noise in the range [0, 5)
    noise = 5 * np.random.random()
    return L + noise
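Because of the noise term, two calls with identical arguments return different values. The short sketch below (with made-up inputs, not from the original listing) makes this visible: the predictions match the targets exactly, so whatever is returned is pure noise.

Python3

# Illustration only: the same parameters give different loss
# values on each call because of the random noise
X_demo = np.linspace(0, 10, 100)
Y_demo = polynomial(np.array([3, -4, 1]), X_demo)
w_demo = np.array([3.0, -4.0, 1.0])
print(Loss(w_demo, X_demo, Y_demo))  # noise only, since predictions are exact
print(Loss(w_demo, X_demo, Y_demo))  # a different value on each call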
The function below approximates the gradient, given the loss function L, the current weights (coefficients) w, and the perturbation size ck. Instead of perturbing one parameter at a time, SPSA perturbs all parameters simultaneously with a random ±1 vector, so each gradient estimate costs only two evaluations of the loss, regardless of the number of parameters.
Python3
def grad(L, w, ck):
    # number of parameters
    p = len(w)
    # bernoulli-like (+1/-1) distribution
    deltak = np.random.choice([-1, 1], size=p)
    # simultaneous perturbations
    ck_deltak = ck * deltak
    # gradient approximation
    DELTA_L = L(w + ck_deltak) - L(w - ck_deltak)
    return DELTA_L / (2 * ck_deltak)
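We can check the estimator on a function whose gradient is known in closed form (an illustrative test, not part of the original listing): for f(w) = sum(w**2) the true gradient is 2*w, and averaging many SPSA estimates should come close to it.

Python3

# Sanity check on a function with a known gradient
f = lambda w: np.sum(w**2)
w_test = np.array([1.0, -2.0, 0.5])
g_avg = np.mean([grad(f, w_test, 1e-3) for _ in range(2000)], axis=0)
print(g_avg)       # close to [2., -4., 1.]
print(2 * w_test)  # exact gradient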
The helper function below chooses the constants a, A and c that the algorithm needs in order to work well.
Python3
def initialize_hyperparameters(alpha, lossFunction, w0, N_iterations):
    c = 1e-2  # a small number

    # A is <= 10% of the number of iterations
    A = N_iterations * 0.1

    # order of magnitude of the first gradients
    magnitude_g0 = np.abs(grad(lossFunction, w0, c).mean())

    # the factor 2 is an estimate of the desired
    # initial change in the parameters; different
    # problems might need other choices
    a = 2 * ((A + 1)**alpha) / magnitude_g0

    return a, A, c
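To get a feel for how the gain sequences behave, the sketch below (with made-up values of a0, A0 and c0, not produced by the function above) prints ak and ck for a few iterations using Spall's recommended exponents alpha = 0.602 and gamma = 0.101: ak decays much faster than ck.

Python3

# Illustrative only: decay of the gain sequences
a0, A0, c0 = 1.0, 100, 1e-2
for k in [1, 10, 100, 1000, 10000]:
    ak = a0 / ((k + A0)**0.602)
    ck = c0 / (k**0.101)
    print(k, round(ak, 5), round(ck, 5))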
The function below is the main optimization loop; it makes use of all the helper functions we have defined above.
Python3
# optimization algorithm
def SPSA(LossFunction, parameters, alpha=0.602,
         gamma=0.101, N_iterations=int(1e5)):
    # model's parameters
    w = parameters

    a, A, c = initialize_hyperparameters(
        alpha, LossFunction, w, N_iterations)

    for k in range(1, N_iterations):
        # update ak and ck
        ak = a / ((k + A)**alpha)
        ck = c / (k**gamma)

        # estimate gradient
        gk = grad(LossFunction, w, ck)

        # update parameters
        w -= ak * gk

    return w
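Before moving on to the main example, here is a minimal usage sketch on a toy problem (this toy loss is our own illustration, not from the original article): a noisy quadratic whose minimum is at w = [3, 3, 3].

Python3

# Toy problem: minimize a noisy quadratic
toy_loss = lambda w: np.sum((w - 3)**2) + np.random.random()
w_found = SPSA(toy_loss, np.zeros(3))
# should end up roughly near [3, 3, 3]; results vary
# from run to run because the loss is noisy
print(w_found)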
Now we will demonstrate the algorithm implemented above on an example: recovering the coefficients of a quadratic polynomial from noisy observations.
Python3
# Y is the polynomial to be approximated
X = np.linspace(0, 10, 100)
Y = 1 * X**2 - 4 * X + 3

# add some noise to the observations
noise = 3 * np.random.normal(size=len(X))
Y += noise

# plot the noisy polynomial
plt.title("polynomial with noise")
plt.plot(X, Y, 'go')
plt.show()
Output:
Before training
Python3
# Initial parameters are randomly
# chosen in the range [-10, 10]
parameters = (2 * np.random.random(3) - 1) * 10

plt.title("Before training")

# Compare true and predicted values before training
plt.plot(X, polynomial(parameters, X), "bo")
plt.plot(X, Y, 'go')
plt.legend(["predicted value", "true value"])
plt.show()
Output:
After training
Python3
# Training with SPSA
parameters = SPSA(
    LossFunction=lambda parameters: Loss(parameters, X, Y),
    parameters=parameters)

plt.title("After training")
plt.plot(X, polynomial(parameters, X), "bo")
plt.plot(X, Y, 'go')
plt.legend(["predicted value", "true value"])
plt.show()
Output:
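As an optional final check (not part of the original listing), we can print the learned coefficients; since the data was generated from y = x^2 - 4x + 3, they should come out roughly close to [3, -4, 1].

Python3

# the learned coefficients, in the order [a0, a1, a2]
print(parameters)  # roughly [3, -4, 1], up to noise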