Prediction

We are given a test dataset on which to make predictions. To predict, we pass the test data (without the PassengerId column) to our trained model and save the result to a CSV file with two columns, PassengerId and Survived. PassengerId holds the ID of each passenger in the test data, and Survived is either 0 (did not survive) or 1 (survived).

Python3




# Keep the passenger IDs for the output file
ids = test['PassengerId']

# Predict on every column except PassengerId
predictions = randomforest.predict(test.drop('PassengerId', axis=1))

# Build the output DataFrame and write it to resultfile.csv
output = pd.DataFrame({'PassengerId': ids, 'Survived': predictions})
output.to_csv('resultfile.csv', index=False)


This creates resultfile.csv, which contains one row per test passenger with the columns PassengerId and Survived.
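
Before using the file, it can help to confirm that it has the expected shape. The following is a minimal sketch and not part of the original walkthrough: `check_submission` is a hypothetical helper, and the three-row sample is made up to stand in for resultfile.csv so that the snippet runs on its own.

```python
import io
import pandas as pd


def check_submission(csv_source, expected_ids):
    """Return a list of problems found in a submission file (empty if none)."""
    df = pd.read_csv(csv_source)
    problems = []

    # The file should have exactly these two columns, in this order
    if list(df.columns) != ['PassengerId', 'Survived']:
        problems.append(f'unexpected columns: {list(df.columns)}')
        return problems

    # Survived must be a binary label
    if not df['Survived'].isin([0, 1]).all():
        problems.append('Survived contains values other than 0 and 1')

    # Every test passenger should appear once, in the original order
    if df['PassengerId'].tolist() != list(expected_ids):
        problems.append('PassengerId values do not match the test data')

    return problems


# Made-up sample standing in for resultfile.csv
sample = io.StringIO('PassengerId,Survived\n892,0\n893,1\n894,0\n')
print(check_submission(sample, [892, 893, 894]))  # prints []
```

In the notebook itself, the equivalent call would be `check_submission('resultfile.csv', ids)`, assuming `ids` still holds the PassengerId column of the test data.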

 



Titanic Survival Prediction Using Machine Learning

In this article, we will learn to predict the survival chances of the Titanic passengers from the available information about them, such as sex and age. Because this is a classification task, we will use a random forest.

There will be three main steps in this experiment:

  • Feature Engineering
  • Imputation
  • Training and Prediction
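
The three steps listed above can be sketched end to end on a few made-up rows. This is only an illustration under stated assumptions: the rows are invented, the `FamilySize` feature and median age imputation are common choices rather than necessarily the ones used in this article, `prepare` is a hypothetical helper, and scikit-learn's `RandomForestClassifier` is assumed as the random forest implementation.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Made-up rows in the shape of the Titanic data (not real passengers)
train = pd.DataFrame({
    'Sex': ['male', 'female', 'female', 'male', 'female', 'male'],
    'Age': [22.0, 38.0, np.nan, 35.0, 27.0, np.nan],
    'SibSp': [1, 1, 0, 0, 0, 0],
    'Parch': [0, 0, 0, 0, 2, 0],
    'Survived': [0, 1, 1, 0, 1, 0],
})


def prepare(df, age_fill):
    """Hypothetical helper: build the model's input columns from raw columns."""
    out = pd.DataFrame(index=df.index)
    # Feature engineering: encode sex as a number and derive family size
    out['Sex'] = df['Sex'].map({'male': 0, 'female': 1})
    out['FamilySize'] = df['SibSp'] + df['Parch'] + 1
    # Imputation: fill missing ages with a value computed on the training data
    out['Age'] = df['Age'].fillna(age_fill)
    return out


# Training: fit the forest on the prepared training rows
age_median = train['Age'].median()
X = prepare(train, age_median)
y = train['Survived']
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X, y)

# Prediction: apply the same preparation to unseen rows
new_passengers = pd.DataFrame({
    'Sex': ['female', 'male'],
    'Age': [30.0, np.nan],
    'SibSp': [0, 1],
    'Parch': [0, 0],
})
predictions = model.predict(prepare(new_passengers, age_median))
print(predictions)
```

Computing the fill value once on the training data and reusing it for new rows keeps the preparation identical for training and prediction.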


Dataset

The dataset for this experiment is freely available on the Kaggle website. Download it from https://www.kaggle.com/competitions/titanic/data?select=train.csv. The download contains three CSV files: gender_submission.csv, train.csv, and test.csv...

Importing Libraries and Initial setup

Python3

import warnings
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

plt.style.use('fivethirtyeight')
%matplotlib inline
warnings.filterwarnings('ignore')
...

Visualization

...

Feature Engineering

...

Model Training

...

Prediction

...