If you are familiar with ggplot2 in R, you know that it offers one of the best-structured ways to build plots. This post shows how to create plots in Python with ggplot2 syntax, using the plotnine library.
Installation
# Using pip
$ pip install plotnine
# Or using conda
$ conda install -c conda-forge plotnine
First, let’s import the libraries and create some dummy data.
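import pandas as pd
import numpy as np
import plotnine as p9
import random

# 300 random integers from 1 to 9
data = np.random.randint(1, 10, size=300)
df = pd.DataFrame(data, columns=['variable'])
# Random category label for each row
df['category'] = random.choices(['A', 'B', 'C'], k=300)
# 300 distinct integers between 10 and 999
df['variable2'] = random.sample(range(10, 1000), 300)
# Scale variable2 by a random factor in [0, 1)
df['variable3'] = df['variable2'].apply(lambda x: x * random.random())
# Show the first five rows
df.head()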
   variable category  variable2   variable3
0         3        A        747  356.282975
1         6        A        837  432.941801
2         2        A        941  195.533003
3         4        A        679  131.990057
4         7        A        912  696.910478
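Because the data above comes from unseeded random calls, every run produces different numbers. Here is a minimal sketch of a reproducible variant, assuming a single seeded NumPy generator in place of the `random` module. The seed value 42 and the names `rng` and `n` are illustrative and not part of the original code.

```python
import numpy as np
import pandas as pd

# Assumption: one seeded generator replaces the unseeded `random` calls.
rng = np.random.default_rng(42)
n = 300

df = pd.DataFrame({
    # Integers from 1 to 9 (the upper bound is exclusive)
    "variable": rng.integers(1, 10, size=n),
    # One of three labels per row, drawn with replacement
    "category": rng.choice(["A", "B", "C"], size=n),
    # 300 distinct integers from 10 to 999, drawn without replacement
    "variable2": rng.choice(np.arange(10, 1000), size=n, replace=False),
})
# Scale each variable2 value by a uniform random factor in [0, 1)
df["variable3"] = df["variable2"] * rng.random(n)

print(df.head())
```

With a fixed seed, rerunning the script should give the same DataFrame, which makes the plots easier to compare between runs.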
Now, let’s create some basic plots using plotnine.
Histogram
p9.ggplot(df) + p9.aes(x='variable') + p9.geom_histogram(binwidth=2)
As you can see, it’s almost identical to ggplot2: the main differences are the p9. prefix and the quoted column names. Let’s see some other basic examples.
Density Plot
p9.ggplot(df) + p9.aes(x='variable') + p9.geom_density(fill='darkgrey')
Boxplot
p9.ggplot(df) + p9.aes(x='category', y='variable') + p9.geom_boxplot() + p9.coord_flip()
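To get a feel for what each box encodes, the per-category quartiles can be computed directly with pandas. This is a rough sketch on a stand-in DataFrame with the same column names, and the `summary` table is an illustrative helper. The values plotnine actually draws may differ slightly depending on how it computes quantiles.

```python
import numpy as np
import pandas as pd

# Stand-in data with the same column names as the article's df
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "variable": rng.integers(1, 10, size=300),
    "category": rng.choice(["A", "B", "C"], size=300),
})

# Lower quartile, median and upper quartile per category
summary = (
    df.groupby("category")["variable"]
    .quantile([0.25, 0.5, 0.75])
    .unstack()
)
summary.columns = ["q1", "median", "q3"]
# The interquartile range corresponds to the length of each box
summary["iqr"] = summary["q3"] - summary["q1"]

print(summary)
```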
Barchart
p9.ggplot(df) + p9.aes(x='category') + p9.geom_bar()
Scatterplot
p9.ggplot(df) + p9.aes(x='variable2', y='variable3') + p9.geom_point(size=4)
p9.ggplot(df) + p9.aes(x='variable2', y='variable3', color='category') + p9.geom_point(size=4)
Violin Plot
p9.ggplot(df) + p9.aes(x='category', y='variable2', fill='category') + p9.geom_violin(scale='width')
Across all of these examples, the syntax stays almost identical to ggplot2 in R. Be sure to check out dplyr pipes in Python as well.