Plotting Data using Plotnine and ggplot in Python
Here we will plot our data using the three main components: data, aesthetics, and geometric objects. Let’s go through each component in detail.
Data
The data is the dataset that we want to plot. We specify it by passing the dataset to the ggplot constructor.
Example: Specifying dataset for the ggplot
We will use the Iris dataset and will read it using Pandas.
Python3
import pandas as pd
from plotnine import ggplot

# reading dataset
df = pd.read_csv("Iris.csv")

# passing the data to the ggplot constructor
ggplot(df)
Output:
This gives us a blank output because we have not yet specified the other two main components.
Aesthetics
Now let’s define the variable that we want to use for each axis in the plot. Aesthetics maps data variables to graphical attributes, like 2D position and color.
Example: Defining aesthetics of the plotnine and ggplot in Python
Python3
import pandas as pd
from plotnine import ggplot, aes

# reading dataset
df = pd.read_csv("Iris.csv")

# mapping columns to the x and y axes
ggplot(df) + aes(x="Species", y="SepalLengthCm")
Output:
In the above example, Species is shown on the x-axis and sepal length on the y-axis, but nothing is drawn inside the plot yet. The drawing itself is added using geometric objects.
Geometric Objects
After defining the data and the aesthetics, we need to define the type of plot that we want for the visualization. This tells plotnine how the data points should be drawn. It provides a variety of geometric objects (geoms) for scatter plots, line charts, bar charts, box plots, etc. Let’s see a variety of them and how to use them.
Note: For the list of all the geoms refer to the plotnine’s geom API reference.
Example 1: Adding geometric objects to the plotnine and ggplot in Python
Python3
import pandas as pd
from plotnine import ggplot, aes, geom_col

# reading dataset
df = pd.read_csv("Iris.csv")

# drawing the data as bars
ggplot(df) + aes(x="Species", y="SepalLengthCm") + geom_col()
Output:
In the above example, we have used the geom_col() geom, which draws a bar plot with the base on the x-axis. We can swap it for any other geom that suits our plot better.
Example 2: Plotting Histogram with plotnine and ggplot in Python
Python3
import pandas as pd
from plotnine import ggplot, aes, geom_histogram

# reading dataset
df = pd.read_csv("Iris.csv")

# a histogram needs only the x aesthetic
ggplot(df) + aes(x="SepalLengthCm") + geom_histogram()
Output:
Example 3: Plotting Scatter plot with plotnine and ggplot in Python
Python3
import pandas as pd
from plotnine import ggplot, aes, geom_point

# reading dataset
df = pd.read_csv("Iris.csv")

# drawing each row as a point
ggplot(df) + aes(x="Species", y="SepalLengthCm") + geom_point()
Output:
Example 4: Plotting Box plot with plotnine and ggplot in Python
Python3
import pandas as pd
from plotnine import ggplot, aes, geom_boxplot

# reading dataset
df = pd.read_csv("Iris.csv")

# passing the data to the ggplot constructor
# and drawing one box per species
ggplot(df) + aes(x="Species", y="SepalLengthCm") + geom_boxplot()
Output:
Example 5: Plotting Line chart with plotnine and ggplot in Python
Python3
import pandas as pd
from plotnine import ggplot, aes, geom_line

# reading dataset
df = pd.read_csv("Iris.csv")

# connecting the data points with lines
ggplot(df) + aes(x="Species", y="SepalLengthCm") + geom_line()
Output:
So far we have learnt how to create a basic chart using the grammar of graphics and its three main components. Now let’s learn how to customize these charts using the other optional components.
Data Visualization using Plotnine and ggplot2 in Python
Data visualization is the technique of presenting data in the form of graphs, charts, or plots. Visualizing data makes it easier for data analysts to spot trends or patterns that may be present in the data, because it summarizes a large amount of data in a simple, easy-to-understand format.
In this article, we will discuss how to visualize data using plotnine, a Python library that is a strict implementation of the grammar of graphics. Before starting, let’s briefly look at what the grammar of graphics is.