Exploratory data analysis and visualization
To find the correlation between the features, let's plot a heatmap.
Python3
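# Assumes matplotlib.pyplot as plt, seaborn as sns, and a pandas DataFrame named dataset
plt.figure(figsize=(12, 6))
# numeric_only=True (pandas 1.5+) skips text columns such as Geography and Gender,
# which would otherwise raise an error in recent pandas versions
sns.heatmap(dataset.corr(numeric_only=True),
            cmap='BrBG',
            fmt='.2f',
            linewidths=2,
            annot=True)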
Output :
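A heatmap is easy to scan by eye, but it can also help to list the strongest correlations as numbers. The following is a minimal sketch of one way to do that, not part of the original walkthrough. The helper name top_correlations and the randomly generated toy DataFrame are assumptions made for illustration; only the column names are borrowed from the dataset used here.

```python
import numpy as np
import pandas as pd


def top_correlations(df, n=3):
    """Return the n feature pairs with the largest absolute correlation."""
    corr = df.corr(numeric_only=True)
    # Keep only the upper triangle (k=1 drops the diagonal) so that each
    # pair appears once and self-correlations are excluded.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack().dropna()
    # Sort by absolute value so strong negative correlations rank high too.
    return pairs.sort_values(key=lambda s: s.abs(), ascending=False).head(n)


# Toy data for illustration only: Balance is built to depend on Age.
rng = np.random.default_rng(0)
age = rng.integers(18, 80, size=200)
toy = pd.DataFrame({
    "Age": age,
    "Balance": age * 1000.0 + rng.normal(0, 5000, size=200),
    "CreditScore": rng.integers(350, 850, size=200),
    "Gender": rng.choice(["Male", "Female"], size=200),
})

top = top_correlations(toy)
print(top)
```

On the real data, the same helper could be called as top_correlations(dataset), assuming dataset is the DataFrame plotted above.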
We can also explore the distributions of CreditScore, Age, Balance, and EstimatedSalary using distribution plots.
Python3
lis = ['CreditScore', 'Age', 'Balance', 'EstimatedSalary']
plt.figure(figsize=(15, 8))
index = 1

for i in lis:
    # One panel per feature in a 2 x 2 grid
    plt.subplot(2, 2, index)
    # histplot with kde=True replaces the deprecated sns.distplot
    sns.histplot(dataset[i], kde=True)
    index += 1
Output :
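The shape of each distribution can also be summarized numerically, which is useful when deciding whether a feature might need a transformation later. Here is a small sketch under stated assumptions: the toy DataFrame is randomly generated for illustration, with a roughly symmetric Age column and a right-skewed Balance column, and is not the real dataset.

```python
import numpy as np
import pandas as pd

# Toy data for illustration only.
rng = np.random.default_rng(42)
toy = pd.DataFrame({
    "Age": rng.normal(40, 10, size=500),          # roughly symmetric
    "Balance": rng.exponential(50000, size=500),  # long right tail
})

# describe() gives the mean, median (50%) and standard deviation;
# skew() is close to 0 for symmetric data and positive for a right tail.
summary = toy.describe().T[["mean", "50%", "std"]].assign(skew=toy.skew())
print(summary.round(2))
```

A mean well above the median together with a large positive skew points to a long right tail, which is the same pattern a distribution plot shows visually.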
We can also check the count of each category in Geography and Gender.
Python3
lis2 = ['Geography', 'Gender']
plt.figure(figsize=(10, 5))
index = 1

for col in lis2:
    # Number of rows in each category of the column
    y = dataset[col].value_counts()
    plt.subplot(1, 2, index)
    plt.xticks(rotation=90)
    sns.barplot(x=list(y.index), y=y)
    index += 1
Output :
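Raw counts can be hard to compare across columns, so it is sometimes handy to look at proportions as well. The sketch below uses value_counts(normalize=True) for this. The six-row toy DataFrame is made up for illustration and only reuses the Geography and Gender column names.

```python
import pandas as pd

# Toy data for illustration only.
toy = pd.DataFrame({
    "Geography": ["France", "Spain", "France", "Germany", "France", "Spain"],
    "Gender": ["Male", "Female", "Female", "Male", "Male", "Female"],
})

# normalize=True returns the share of rows per category instead of counts.
shares = {
    col: toy[col].value_counts(normalize=True)
    for col in ["Geography", "Gender"]
}

for col, share in shares.items():
    print(col)
    print(share.round(2))
```

A strongly unbalanced category would show up here as a share close to 0 or 1, which may matter later when the data is split or encoded.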
Data Preprocessing, Analysis, and Visualization for building a Machine learning model
In this article, we walk through data preprocessing, analysis, and visualization as preparation for building a machine learning model. Business owners and organizations use machine learning models to predict business growth, but a dataset needs to be preprocessed before a model can be applied to it.
So, let’s import the data and start exploring it.
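As a rough sketch of what a first look at a dataset can involve, the snippet below reads a CSV into pandas and checks its shape, column types, and missing values. The three rows of inline CSV text are invented for illustration and stand in for a real file; with real data, pd.read_csv would be given the path to that file instead.

```python
import io

import pandas as pd

# Made-up sample rows standing in for a real CSV file (illustration only).
sample_csv = io.StringIO(
    "CreditScore,Geography,Gender,Age,Balance,EstimatedSalary\n"
    "600,France,Female,40,0.0,100000.0\n"
    "650,Spain,Male,35,80000.0,\n"
    "700,Germany,Male,50,120000.0,90000.0\n"
)

dataset = pd.read_csv(sample_csv)

print(dataset.shape)           # (rows, columns)
print(dataset.dtypes)          # which columns are numeric and which are text
print(dataset.isnull().sum())  # missing values per column
```

Checking types and missing values first shows which columns need encoding or imputation before any model is trained.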