Scatter Plot
A scatter plot (or scatter graph) is a bivariate plot that closely resembles a line graph in the way it is built. A line graph uses a line on X-Y axes to plot a continuous function, while a scatter plot uses dots to represent individual data points. Scatter plots are very useful for checking whether two variables are correlated. A scatter plot can be two-dimensional or three-dimensional.
Syntax: seaborn.scatterplot(x=None, y=None, hue=None, style=None, size=None, data=None, palette=None, hue_order=None, hue_norm=None, sizes=None, size_order=None, size_norm=None, markers=True, style_order=None, x_bins=None, y_bins=None, units=None, estimator=None, ci=95, n_boot=1000, alpha='auto', x_jitter=None, y_jitter=None, legend='brief', ax=None, **kwargs)
Parameters:
x, y: Input data variables that should be numeric.
data: DataFrame where each column is a variable and each row is an observation.
size: Grouping variable that will produce points with different sizes.
style: Grouping variable that will produce points with different markers.
palette: Colors to use for the different levels of the hue variable.
markers: Object determining how to draw the markers for different levels.
alpha: Proportional opacity of the points.
Returns: This method returns the Axes object with the plot drawn onto it.
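As a minimal sketch of how the grouping parameters above fit together, the snippet below calls seaborn.scatterplot on a small, made-up DataFrame. The column names (height_cm, weight_kg, group) and all values are hypothetical and chosen purely for illustration; they are not taken from the diabetes data used elsewhere in this article.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere;
                       # remove this line and call plt.show() to view the figure
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy data, invented for illustration only.
df = pd.DataFrame({
    "height_cm": [150, 160, 165, 172, 180, 185],
    "weight_kg": [52, 58, 63, 70, 79, 88],
    "group": ["a", "a", "b", "b", "c", "c"],
})

ax = sns.scatterplot(
    data=df,
    x="height_cm",
    y="weight_kg",
    hue="group",       # one color per group
    style="group",     # one marker shape per group
    size="weight_kg",  # point size scales with this numeric column
    sizes=(40, 200),   # smallest and largest point sizes
    alpha=0.8,         # slightly transparent points
)

# scatterplot returns the Axes it drew on, so it can be customized further.
ax.set_title("Hypothetical height vs. weight")
```

Because the return value is a regular Matplotlib Axes, the usual methods such as set_title or set_xlim can be applied to it afterwards.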
Advantages of a scatter plot
- Displays correlation between variables
- Suitable for large data sets
- Easier to find data clusters
- Better representation of each data point
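The first advantage can also be checked numerically. The sketch below pairs a scatter plot with the Pearson correlation coefficient computed by numpy.corrcoef; using Pearson's r here is an assumption made for illustration, not something the article prescribes, and the data values are invented.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere;
                       # remove this line and call plt.show() to view the figure
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical, roughly linear data invented for illustration.
x = np.array([1, 2, 3, 4, 5, 6], dtype=float)
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.2])

# corrcoef returns a 2x2 correlation matrix; the off-diagonal entry is r.
r = np.corrcoef(x, y)[0, 1]

fig, ax = plt.subplots()
ax.scatter(x, y)
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title(f"Pearson r = {r:.3f}")
```

For these points r comes out close to 1, which matches the nearly straight upward trend the dots show.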
Python3
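# import module
import matplotlib.pyplot as plt

# scatter plot illustration
# (assumes `diabetes` is a pandas DataFrame that has already been loaded
# and contains these two columns)
plt.scatter(diabetes['DiabetesPedigreeFunction'], diabetes['BMI'])

# display illustration
plt.show()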
Python3
# import required modules
import matplotlib.pyplot as plt
import seaborn as sns
from mpl_toolkits.mplot3d import Axes3D  # makes the '3d' projection available on older Matplotlib

# assign axis values
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [5, 6, 2, 3, 13, 4, 1, 2, 4, 8]
z = [2, 3, 3, 3, 5, 7, 9, 11, 9, 10]

# adjust size of plot
sns.set(rc={'figure.figsize': (8, 5)})
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(x, y, z, c='r', marker='o')

# assign labels
ax.set_xlabel('X Label')
ax.set_ylabel('Y Label')
ax.set_zlabel('Z Label')

# display illustration
plt.show()
Data Visualisation in Python using Matplotlib and Seaborn
It may sometimes seem easier to read through a set of data points and draw insights from them directly, but this process usually does not yield good results. A lot can be left undiscovered, and most data sets used in real life are too big to analyze manually. This is where data visualization steps in.
Data visualization is an easier way of presenting data, however complex it is, so that trends and relationships among variables can be analyzed with the help of pictorial representation.
The following are the advantages of data visualization:
- Easier representation of complex data
- Highlights good and bad performing areas
- Explores relationships between data points
- Identifies data patterns even in larger data sets
While building a visualization, it is good practice to keep the following points in mind:
- Ensure appropriate use of shapes, colors, and sizes
- Plots/graphs that use a co-ordinate system make the data stand out more clearly
- Knowing which plot suits which data type brings more clarity to the information
- Labels, titles, legends, and pointers convey information seamlessly to a wider audience
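The points above can be sketched with plain Matplotlib: distinct colors and marker shapes for each series, plus a title, axis labels, and a legend. The series names and values below are hypothetical and exist only to make the example runnable.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere;
                       # remove this line and call plt.show() to view the figure
import matplotlib.pyplot as plt

# Hypothetical measurements for two series, invented for illustration.
x = [1, 2, 3, 4, 5]
series_a = [2, 4, 5, 7, 9]
series_b = [9, 7, 6, 3, 1]

fig, ax = plt.subplots(figsize=(8, 5))

# Different colors and marker shapes keep the two series easy to tell apart.
ax.scatter(x, series_a, color="tab:blue", marker="o", label="Series A")
ax.scatter(x, series_b, color="tab:orange", marker="^", label="Series B")

# Title, axis labels, and a legend tell the reader what is being shown.
ax.set_title("Two hypothetical series")
ax.set_xlabel("Step")
ax.set_ylabel("Value")
ax.legend(title="Series")
```

Passing label to each scatter call is what lets ax.legend build its entries automatically.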