Model Training

Now we will separate the features from the target variable and split the data into training and validation sets. The validation set is what we will use to pick the best-performing model.

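Python3

from sklearn.model_selection import train_test_split

# Separate the input features from the target column.
features = df.drop(['User_ID', 'Calories'], axis=1)
target = df['Calories'].values

# Hold out 10% of the rows for validation.
X_train, X_val, Y_train, Y_val = train_test_split(
    features, target, test_size=0.1, random_state=22)
X_train.shape, X_val.shape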


Output:

((13500, 5), (1500, 5))
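To make the 90/10 split concrete, here is a minimal pure-Python sketch of a shuffled hold-out split. It is not how train_test_split is implemented, and it will not reproduce the same row assignment; the helper name holdout_split and the use of bare row indices are assumptions made for illustration only.

```python
import random


def holdout_split(n_rows, val_fraction=0.1, seed=22):
    """Return (train_idx, val_idx) after shuffling row indices with a fixed seed."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_val = round(n_rows * val_fraction)
    # The first n_val shuffled indices are held out; the rest are used for training.
    return indices[n_val:], indices[:n_val]


# 15,000 rows in total, matching the 13,500 + 1,500 shapes shown above.
train_idx, val_idx = holdout_split(15000)
print(len(train_idx), len(val_idx))
```

Fixing the seed plays the same role as random_state=22 in the code above: the split stays the same from run to run.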

Now, let’s standardize the features (zero mean, unit variance) so that training is stable and fast.

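Python3

from sklearn.preprocessing import StandardScaler

# Standardize the features for stable and fast training.
# Fit the scaler on the training data only, then reuse it for validation.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_val = scaler.transform(X_val)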
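The reason for calling fit_transform on the training set but only transform on the validation set can be seen in a small sketch. It standardizes a single column by hand using the training mean and the population standard deviation, which is the statistic StandardScaler uses by default. The sample values and the helper name standardize are made up for illustration.

```python
from statistics import fmean, pstdev

# Hypothetical values for one feature (for example, heart rate).
train_col = [90.0, 100.0, 110.0, 120.0, 130.0]
val_col = [110.0, 140.0]

# The statistics are learned from the training data only.
mu = fmean(train_col)
sigma = pstdev(train_col)


def standardize(values):
    """Scale values with the training mean and standard deviation."""
    return [(v - mu) / sigma for v in values]


train_scaled = standardize(train_col)
val_scaled = standardize(val_col)  # reuses the training mean and std

print([round(v, 3) for v in train_scaled])
print([round(v, 3) for v in val_scaled])
```

Because the validation values are scaled with the training statistics, no information from the validation set leaks into preprocessing.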


Now let’s train several widely used regression models and compare how well each one fits our data.

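Python3

from sklearn.linear_model import LinearRegression, Lasso, Ridge
from sklearn.ensemble import RandomForestRegressor
from xgboost import XGBRegressor
from sklearn.metrics import mean_absolute_error as mae

models = [LinearRegression(), XGBRegressor(),
          Lasso(), RandomForestRegressor(), Ridge()]

for model in models:
    # Fit each model on the standardized training data.
    model.fit(X_train, Y_train)

    print(f'{model} : ')

    # Mean absolute error on the data the model was trained on.
    train_preds = model.predict(X_train)
    print('Training Error : ', mae(Y_train, train_preds))

    # Mean absolute error on the held-out validation data.
    val_preds = model.predict(X_val)
    print('Validation Error : ', mae(Y_val, val_preds))
    print()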


Output:

LinearRegression() : 
Training Error :  17.893463692619434
Validation Error :  18.007896272831253

XGBRegressor() : 
Training Error :  10.110870876925963
Validation Error :  10.16210130894184

Lasso() : 
Training Error :  17.915089584958036
Validation Error :  17.995033362288662

RandomForestRegressor() : 
Training Error :  3.982735208112875
Validation Error :  10.472395222222223

Ridge() : 
Training Error :  17.893530494767777
Validation Error :  18.00781790803129
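The errors printed above are mean absolute errors: the average of |actual - predicted| over all rows. A minimal hand-rolled version, using made-up calorie values, is sketched below. It is meant only to show what the metric measures, not to replace sklearn's mean_absolute_error.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical calories burnt (actual) and a model's predictions.
actual = [100.0, 150.0, 200.0, 50.0]
predicted = [110.0, 140.0, 230.0, 50.0]

# Absolute errors are 10, 10, 30 and 0, so the mean is 50 / 4.
print(mean_absolute_error(actual, predicted))  # 12.5
```

The error is expressed in the units of the Calories column, so a validation MAE of about 10 means predictions are off by roughly 10 of those units on average.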

Of the five models trained above, XGBRegressor and RandomForestRegressor perform best: their validation MAEs are nearly identical (about 10.16 and 10.47), well below the roughly 18 obtained by the three linear models.
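To compare the models at a glance, the sketch below ranks them by validation MAE and shows the gap between validation and training error. The numbers are copied from the output above and rounded to two decimals; the results dictionary is just an illustrative container, not part of the original code.

```python
# (training MAE, validation MAE), copied from the output above and rounded.
results = {
    "LinearRegression": (17.89, 18.01),
    "XGBRegressor": (10.11, 10.16),
    "Lasso": (17.92, 18.00),
    "RandomForestRegressor": (3.98, 10.47),
    "Ridge": (17.89, 18.01),
}

# Gap between validation and training error for each model.
gaps = {name: val - train for name, (train, val) in results.items()}

# Rank by validation MAE; lower is better.
ranked = sorted(results, key=lambda name: results[name][1])

for name in ranked:
    print(f"{name:<22} val MAE = {results[name][1]:5.2f}   gap = {gaps[name]:5.2f}")
```

RandomForestRegressor has by far the largest gap (training MAE near 4 against validation MAE near 10.5), which suggests it fits the training data much more closely than unseen data, while XGBRegressor's training and validation errors are almost equal.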


