Enhancing Data Visualizations using plotnine and ggplot

Here we will learn about the remaining optional components of a plot. These components are:

  • Facets
  • Statistical transformations
  • Coordinates
  • Themes 

Facets

Facets are used to plot subsets of data. They draw an individual plot (a panel) for each group of the data within the same image.

For example, let’s consider the tips dataset, which contains information about people who ate at a restaurant: the total bill, the tip they left, their gender, the day and time of the meal, and so on. Let’s have a look at it.


Now let’s suppose we want to plot the total bill for each day, separately for each gender. Facets are very useful in such cases. Let’s see how.

Example: Facets with plotnine and ggplot in Python 

Python3




import pandas as pd
from plotnine import ggplot, aes, facet_grid, labs, geom_col
  
# reading dataset
df = pd.read_csv("tips.csv")
  
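(
    ggplot(df)
    # draw one panel per value of the "sex" column
    + facet_grid(facets="~sex")
    + aes(x="day", y="total_bill")
    + labs(
        x="day",
        y="total_bill",
    )
    # bars for the same day are stacked, so each bar shows the summed total_bill
    + geom_col()
)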


Output:
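
To make it concrete what each panel contains, here is a minimal sketch of the numbers behind the bars, computed with pandas alone. The rows below are made-up values that only reuse the column names of tips.csv; they are an assumption for illustration, not the real dataset.

```python
import pandas as pd

# Hypothetical rows standing in for tips.csv (made-up values, same column names)
df = pd.DataFrame(
    {
        "sex": ["Female", "Male", "Male", "Female", "Male", "Female"],
        "day": ["Sun", "Sun", "Sun", "Sat", "Sat", "Sat"],
        "total_bill": [10.0, 20.0, 5.0, 8.0, 12.0, 4.0],
    }
)

# One group per facet panel (sex) and one bar per day inside that panel.
# Stacked geom_col bars reach the sum of total_bill for that day.
bar_heights = df.groupby(["sex", "day"])["total_bill"].sum()
print(bar_heights)
```

Each value of sex corresponds to one panel, and each (sex, day) pair corresponds to one bar inside it.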

Statistical transformations

A statistical transformation computes new values from the data before plotting it. A histogram is a typical case. Now let’s consider the Iris dataset, where we want to take the measurements in the sepal length column and distribute them into 15 bins. The geom_histogram() function of plotnine computes the count for each bin and plots it automatically.

Example: Statistical transformations using plotnine and ggplot in Python

Python3




import pandas as pd
from plotnine import ggplot, aes, geom_histogram
  
# reading dataset
df = pd.read_csv("Iris.csv")
  
ggplot(df) + aes(x="SepalLengthCm") + geom_histogram(bins=15)


Output:
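
The kind of computation that happens before a histogram is drawn can be sketched with NumPy. This is only an illustration of binning: the sepal lengths below are made-up values rather than the contents of Iris.csv, and plotnine may place its own bin edges differently from numpy.histogram.

```python
import numpy as np

# Hypothetical sepal lengths in cm (made-up values, not read from Iris.csv)
sepal_length = np.array([4.3, 4.9, 5.0, 5.1, 5.1, 5.8, 6.0, 6.3, 6.4, 7.0, 7.9])

# Split the observed range into 15 equal-width bins and count the values in each
counts, edges = np.histogram(sepal_length, bins=15)

print(counts)  # number of observations falling in each of the 15 bins
print(edges)   # the 16 boundaries that delimit those bins
```

The plot then only has to draw one bar per bin, with the bar height given by the corresponding count.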

Coordinates

The coordinate system defines the mapping of the data points to 2D graphical locations on the plot. Let’s take the histogram example above and suppose we want to plot this histogram horizontally. We can do this simply by using the coord_flip() function.

Example: Coordinate system in plotnine and ggplot in Python

Python3




import pandas as pd
from plotnine import ggplot, aes, geom_histogram, coord_flip
  
# reading dataset
df = pd.read_csv("Iris.csv")
  
(
    ggplot(df)
    + aes(x="SepalLengthCm")
    + geom_histogram(bins=15)
    + coord_flip()
)


Output:

Themes

Themes are used to improve the look of a data visualization. Plotnine includes many themes, which are listed in plotnine’s themes API. Let’s take the facets example above and try to make the visualization more appealing.

Example: Themes in plotnine and ggplot in Python

Python3




import pandas as pd
from plotnine import ggplot, aes, facet_grid, labs, geom_col, theme_xkcd
  
# reading dataset
df = pd.read_csv("tips.csv")
  
(
    ggplot(df)
    + facet_grid(facets="~sex")
    + aes(x="day", y="total_bill")
    + labs(
        x="day",
        y="total_bill",
    )
    + geom_col()
    # apply the hand-drawn, comic-style xkcd theme to the whole plot
    + theme_xkcd()
)


Output:

We can also use color to add more information to this graph. For example, we can color the bars according to the time variable by using the fill parameter of the aes function.
