Visualization
Let us now visualize the data with some pie charts and histograms to get a better understanding of it.
We start with the number of passengers who survived and the number who died.
Python3
import matplotlib.pyplot as plt
import seaborn as sns

# train is the Titanic training DataFrame with a 'Survived' column.
f, ax = plt.subplots(1, 2, figsize=(12, 4))

# Left: share of each class as a pie chart.
train['Survived'].value_counts().plot.pie(
    explode=[0, 0.1], autopct='%1.1f%%', ax=ax[0], shadow=False)
ax[0].set_title('Survivors (1) and the dead (0)')
ax[0].set_ylabel('')

# Right: absolute counts. Recent seaborn versions require x= as a keyword.
sns.countplot(x='Survived', data=train, ax=ax[1])
ax[1].set_ylabel('Quantity')
ax[1].set_title('Survivors (1) and the dead (0)')
plt.show()
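The percentages printed on the pie chart are simply normalized value counts. A minimal sketch of that calculation, assuming a small made-up `toy` DataFrame (hypothetical values, not the real Titanic data) with a `Survived` column:

```python
import pandas as pd

# Hypothetical toy data, NOT the real Titanic training set.
toy = pd.DataFrame({"Survived": [0, 0, 0, 1, 1]})

# Absolute number of rows per class (what the count plot draws).
counts = toy["Survived"].value_counts()

# Fraction of rows per class (what autopct renders as percentages).
shares = toy["Survived"].value_counts(normalize=True)

print(counts.to_dict())
print((shares * 100).round(1).to_dict())
```

Running the same two calls on `train['Survived']` should give the numbers behind both panels of the figure.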
Sex feature
Python3
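# Uses plt (matplotlib.pyplot), sns (seaborn) and the train DataFrame.
f, ax = plt.subplots(1, 2, figsize=(12, 4))

# Left: mean of 'Survived' per sex, i.e. the survival rate.
train[['Sex', 'Survived']].groupby(['Sex']).mean().plot.bar(ax=ax[0])
ax[0].set_title('Survivors by sex')

# Right: counts per sex, split by outcome. Recent seaborn versions require x= as a keyword.
sns.countplot(x='Sex', hue='Survived', data=train, ax=ax[1])
ax[1].set_ylabel('Quantity')
ax[1].set_title('Survived (1) and deceased (0): men and women')
plt.show()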
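Because `Survived` is coded as 0 or 1, its mean within each group is the survival rate of that group, and a cross-tabulation gives the counts that the grouped count plot draws. A small sketch, assuming a made-up `toy` DataFrame (hypothetical values, not the real Titanic data):

```python
import pandas as pd

# Hypothetical toy data, NOT the real Titanic training set.
toy = pd.DataFrame({
    "Sex": ["female", "female", "male", "male", "male", "male"],
    "Survived": [1, 1, 0, 0, 0, 1],
})

# Mean of a 0/1 column per group = survival rate per sex (left-hand bar chart).
rate = toy.groupby("Sex")["Survived"].mean()

# Rows: sex, columns: outcome, cells: passenger counts (right-hand count plot).
table = pd.crosstab(toy["Sex"], toy["Survived"])

print(rate)
print(table)
```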
Titanic Survival Prediction Using Machine Learning
In this article, we will learn to predict the survival chances of Titanic passengers from information such as their sex and age. Since this is a classification task, we will use a random forest classifier.
There will be three main steps in this experiment:
- Feature Engineering
- Imputation
- Training and Prediction
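The three steps can be sketched end to end as follows. This is only an illustration under stated assumptions: the `toy` DataFrame is made up, the `IsFemale` feature and the median fill for `Age` are assumed choices rather than the article's exact method, and the model is evaluated on its own training rows purely to show the calls.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Hypothetical toy data, NOT the real Titanic training set.
toy = pd.DataFrame({
    "Sex": ["female", "male", "female", "male", "male", "female"],
    "Age": [29.0, None, 4.0, 40.0, 35.0, None],
    "Survived": [1, 0, 1, 0, 0, 1],
})

# 1. Feature engineering (assumed feature): encode Sex as 0/1.
toy["IsFemale"] = (toy["Sex"] == "female").astype(int)

# 2. Imputation (assumed strategy): fill missing ages with the median age.
toy["Age"] = toy["Age"].fillna(toy["Age"].median())

# 3. Training and prediction with a random forest classifier.
features = ["IsFemale", "Age"]
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(toy[features], toy["Survived"])
predictions = model.predict(toy[features])

print(predictions)
```

In practice the predictions would be made on a separate test set, not on the rows used for training.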