Matplotlib: Horizontal Bar Chart

In this tutorial, we’ll create a static horizontal bar chart from a data frame with the help of three Python libraries: Pandas, Matplotlib, and Seaborn.

Contents

  1. Prerequisites
  2. Getting Started
  3. Data Preparation
  4. Plotting

Prerequisites

To create a Matplotlib bar chart, we’ll need the following:

  • Python installed on your machine
  • Pip: package management system (it comes with Python)
  • Jupyter Notebook: a browser-based editor for running code and visualizing data
  • Pandas: a library to create data frames from data sets and prepare data for plotting
  • NumPy: a library for multi-dimensional arrays
  • Matplotlib: a plotting library
  • Seaborn: a plotting library (we’ll only use part of its functionality to add a gray grid to the plot and get rid of borders)

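You can download the latest version of Python for Windows from the official website.

To get the other tools, install them with pip. The command below installs a general scientific Python stack; this tutorial itself only uses Jupyter, Pandas, NumPy, Matplotlib, and Seaborn. Type this in your terminal: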

    
        
pip install numpy scipy matplotlib ipython jupyter pandas sympy nose seaborn
    

Getting Started

Create a folder that will contain your notebook (e.g. “matplotlib-bar-chart”), then open Jupyter Notebook by typing these commands in your terminal (change the path to match your folder):

    
        
cd C:\Users\Shark\Documents\code\matplotlib-bar-chart
py -m notebook
    

This will automatically open the Jupyter home page at http://localhost:8888/tree. Click on the “New” button in the top right corner, select the Python version installed on your machine, and a notebook will open in a new browser window.

In the first cell of the notebook, import all the necessary libraries:

    
        
import matplotlib.pyplot as plt
import matplotlib as mpl
import numpy as np
import pandas as pd
import seaborn as sns
sns.set()
%matplotlib notebook
    

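You’ll need the last line (%matplotlib notebook) to display interactive charts below the input cells. On recent Jupyter versions this magic may not be available; in that case %matplotlib inline, which renders static images, is a common fallback.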

Data Preparation

Let’s create a bar plot that will show the top 10 movies with the highest revenue. You can download the data set from Kaggle. We’ll need the file named movies_metadata.csv. Put it in your “matplotlib-bar-chart” folder.

In the second cell of your Jupyter notebook, type this code to read the file and display the first 5 rows:

    
        
df = pd.read_csv('movies_metadata.csv')
df.head()
    
[Image: Pandas reading file]

Next, create a data frame with only the revenue and title columns, sort it by revenue, convert the revenue to USD million, and keep the top 10 rows:

    
        
data = pd.DataFrame(df, columns=['revenue', 'title'])
data_sorted = data.sort_values(by='revenue', ascending=False)
data_sorted['revenue'] = data_sorted['revenue'] / 1000000
pd.options.display.float_format = '{:,.0f}'.format
data_sorted.set_index('title', inplace=True)
ranking = data_sorted.head(10)
ranking
    

The output will look like this:

[Image: Pandas data output]

We’ll use this slice of the data frame to create our chart.
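
If you’d like to try the preparation steps before downloading the Kaggle file, here is a minimal sketch on a tiny made-up table. The film titles, the numbers, and the extra budget column are hypothetical stand-ins for movies_metadata.csv; the operations are the same ones used above, chained into a single expression.

```python
import pandas as pd

# Hypothetical stand-in for movies_metadata.csv (made-up titles and revenues in USD)
df = pd.DataFrame({
    'title': ['Film A', 'Film B', 'Film C', 'Film D'],
    'revenue': [250_000_000.0, 1_200_000_000.0, 0.0, 730_000_000.0],
    'budget': [100, 200, 50, 150],  # extra column that is dropped below
})

ranking = (
    df[['title', 'revenue']]                                # keep two columns
    .assign(revenue=lambda d: d['revenue'] / 1_000_000)     # USD -> USD million
    .sort_values(by='revenue', ascending=False)             # highest revenue first
    .set_index('title')                                     # titles become row labels
    .head(3)                                                # top 3 here, top 10 in the tutorial
)
print(ranking)
```

Chaining avoids modifying an intermediate data frame in place, which is one way to keep Pandas from warning about assignments on a copy.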

Plotting

We’ll create a horizontal descending bar chart in 7 steps. All the code snippets below should be placed inside one cell in your Jupyter Notebook.

Here’s the list of variables that will be used in our code. You can insert your values or names if you like.

    
        
# Variables
index = ranking.index
values = ranking['revenue']
plot_title = 'Top 10 movies by revenue, USD million'
title_size = 18
subtitle = 'Source: Kaggle / The Movies Dataset'
x_label = 'Revenue, USD million'
filename = 'barh-plot'
    

1. Create subplots and set a colormap

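First, sort the data in ascending order. barh() draws the first row at the bottom, so this puts the largest bar on top and gives a descending chart. Refresh index and values afterwards so that they follow the new order:

    
        
ranking = ranking.sort_values(by='revenue', ascending=True)
index, values = ranking.index, ranking['revenue']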
    

Next, draw a figure with a subplot. We’re using the viridis color scheme to create gradients later.

    
        
fig, ax = plt.subplots(figsize=(10,6), facecolor=(.94, .94, .94))
mpl.pyplot.viridis()
    

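figsize=(10,6) sets the figure size in inches (width, height). At 100 dpi, which is Matplotlib’s default, that is 1000 × 600 px; other dpi settings scale it accordingly.

facecolor is the background color of the figure.

mpl.pyplot.viridis() sets the default colormap to viridis (a gradient from purple through blue and green to yellow). Other colormaps include plasma, inferno, magma, and cividis. See more examples in Matplotlib’s official documentation.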
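
As a quick check of the figure size and the colormap setting described above, here is a small sketch. It passes dpi=100 explicitly (an assumption, so the result does not depend on your notebook settings) and uses plt.set_cmap(), the explicit equivalent of the viridis() shortcut, to switch to plasma. The Agg backend is selected only so the snippet runs without a display; leave that line out in a notebook.

```python
import matplotlib
matplotlib.use('Agg')  # headless backend for this sketch only; omit in a notebook
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(10, 6), dpi=100, facecolor=(.94, .94, .94))

# Size in pixels = size in inches * dots per inch
width_px, height_px = fig.get_size_inches() * fig.dpi
print(width_px, height_px)

# Switch the default colormap by name
plt.set_cmap('plasma')
current_cmap = plt.rcParams['image.cmap']
print(current_cmap)

plt.close(fig)
```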

2. Create bars

    
        
bar = ax.barh(index, values)
plt.tight_layout()
ax.xaxis.set_major_formatter(mpl.ticker.StrMethodFormatter('{x:,.0f}'))
    

ax.bar() creates vertical bars, while ax.barh() draws horizontal ones, which is what we need here.

plt.tight_layout() adjusts the subplot parameters so that the subplot fits nicely within the figure.

We’re also using set_major_formatter() to format the x-axis tick labels with thousands separators (like 1,500 or 2,000).
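
To see how barh() orders the bars and what the tick formatter produces, here is a minimal sketch with three made-up films (names and values are hypothetical). The Agg backend is used only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use('Agg')  # headless backend for this sketch only; omit in a notebook
import matplotlib.pyplot as plt
from matplotlib.ticker import StrMethodFormatter

labels = ['Film A', 'Film B', 'Film C']  # hypothetical titles
revenues = [250, 730, 1200]              # ascending order

fig, ax = plt.subplots()
bars = ax.barh(labels, revenues)

# The first row sits at y = 0 (bottom of the chart), the last row at the top
centers = [bar.get_y() + bar.get_height() / 2 for bar in bars]
print(centers)

# The same formatter as in the tutorial; calling it directly shows the tick text
formatter = StrMethodFormatter('{x:,.0f}')
ax.xaxis.set_major_formatter(formatter)
tick_text = formatter(1500)
print(tick_text)

plt.close(fig)
```

Because the first row lands at the bottom, data sorted in ascending order ends up with the largest bar on top.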

3. Set title, its font size, and position

    
        
title = plt.title(plot_title, pad=20, fontsize=title_size)
title.set_position([.33, 1])
plt.subplots_adjust(top=0.9, bottom=0.1)
    

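pad=20 sets the gap between the title and the top of the axes in points, and .33 sets the title’s horizontal position as a fraction of the axes width (0 is the left edge, 1 is the right edge).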

subplots_adjust(top=0.9) allows us to keep the title from being cropped, and subplots_adjust(bottom=0.1) prevents the x-axis label from being cropped.

4. Create gradient background

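Set the grid z-order to 0 and the bar z-order to 1 so that the grid is hidden behind the bars. The gradientbars() function below makes each bar transparent and draws a gradient image in its place.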

    
        
ax.grid(zorder=0)

def gradientbars(bars):
    grad = np.atleast_2d(np.linspace(0,1,256))
    ax = bars[0].axes
    lim = ax.get_xlim()+ax.get_ylim()
    for bar in bars:
        bar.set_zorder(1)
        bar.set_facecolor('none')
        x,y = bar.get_xy()
        w, h = bar.get_width(), bar.get_height()
        ax.imshow(grad, extent=[x+w, x, y, y+h], aspect='auto', zorder=1)
    ax.axis(lim)
gradientbars(bar)
    

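extent=[x+w, x, y, y+h] gives the image edges in the following order: left, right, bottom, top. Because the left value (x+w) is larger than the right value (x), the gradient is mirrored horizontally. Swap the first two values to reverse the order of colors in the gradient, if needed.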
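
The effect of the extent order can be sketched on two made-up bars: the first keeps the natural order (left, right), the second swaps them as in the tutorial. The film names and values are hypothetical, and the Agg backend is used only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use('Agg')  # headless backend for this sketch only; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
bars = ax.barh(['Film A', 'Film B'], [250, 730])

# imshow() may rescale the axes, so remember the limits and restore them later
limits = ax.get_xlim() + ax.get_ylim()

grad = np.atleast_2d(np.linspace(0, 1, 256))  # one row, values rising from 0 to 1
for i, bar in enumerate(bars):
    bar.set_facecolor('none')
    x, y = bar.get_xy()
    w, h = bar.get_width(), bar.get_height()
    if i == 0:
        extent = [x, x + w, y, y + h]   # value 0 at the base of the bar
    else:
        extent = [x + w, x, y, y + h]   # swapped: value 0 at the tip of the bar
    ax.imshow(grad, extent=extent, aspect='auto', cmap='viridis', zorder=1)

ax.axis(limits)  # restore the original view
extents = [list(im.get_extent()) for im in ax.images]
print(extents)

plt.close(fig)
```

With viridis, value 0 is purple and value 1 is yellow, so the first bar runs from purple at its base to yellow at its tip, and the second bar runs the other way.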

5. Create bar labels/annotations

Since we’re creating a Python bar graph with labels, we need to define label values and label position.

    
        
rects = ax.patches
# Place a label for each bar
for rect in rects:
    # Get X and Y placement of label from rect
    x_value = rect.get_width()
    y_value = rect.get_y() + rect.get_height() / 2

    # Offset in points between bar end and label; negative moves the label inside the bar
    space = -30
    # Horizontal alignment for positive values
    ha = 'left'

    # If value of bar is negative: mirror the offset and the alignment
    if x_value < 0:
        # Invert space so the label stays inside the bar
        space *= -1
        # Horizontally align label to the right
        ha = 'right'

    # Use X value as label and format number
    label = '{:,.0f}'.format(x_value)

    # Create annotation
    plt.annotate(
        label,                      # Use `label` as label
        (x_value, y_value),         # Place label at bar end
        xytext=(space, 0),          # Horizontally shift label by `space`
        textcoords='offset points', # Interpret `xytext` as offset in points
        va='center',                # Vertically center label
        ha=ha,                      # Horizontally align label differently for positive and negative values
        color = 'white')            # Change label color to white
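
As an alternative to the manual loop, newer Matplotlib versions (3.4 and later, to my knowledge) provide ax.bar_label(), which places one label per bar. The sketch below assumes such a version and uses made-up films; padding=-30 plays the role of space above. The Agg backend is used only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use('Agg')  # headless backend for this sketch only; omit in a notebook
import matplotlib.pyplot as plt

revenues = [250, 730, 1200]  # hypothetical values in USD million
fig, ax = plt.subplots()
bars = ax.barh(['Film A', 'Film B', 'Film C'], revenues)

# Pre-format the label text, then let bar_label() handle the placement
texts = ax.bar_label(
    bars,
    labels=[f'{v:,.0f}' for v in revenues],
    padding=-30,    # offset in points; negative moves labels inside the bars
    color='white',
)
label_strings = [t.get_text() for t in texts]
print(label_strings)

plt.close(fig)
```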
    

6. Set a subtitle and labels, if needed

    
        
# Set subtitle: x is in data units (5 = USD 5 million), y is a fraction of the axes height (1 = top)
tfrom = ax.get_xaxis_transform()
ann = ax.annotate(subtitle, xy=(5, 1), xycoords=tfrom, bbox=dict(boxstyle='square,pad=1.3', fc='#f0f0f0', ec='none'))

# Set x-label
ax.set_xlabel(x_label, color='#525252')
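
The transform returned by get_xaxis_transform() mixes two coordinate systems: x is read in data units and y as a fraction of the axes height. A minimal sketch that checks this on a made-up chart (film names and values are hypothetical; the Agg backend is used only so the snippet runs without a display):

```python
import matplotlib
matplotlib.use('Agg')  # headless backend for this sketch only; omit in a notebook
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.barh(['Film A', 'Film B'], [250, 730])

tfrom = ax.get_xaxis_transform()
# Display (pixel) coordinates of the subtitle anchor (5, 1)
px_x, px_y = tfrom.transform((5, 1))

# x matches the data coordinate 5, y matches the top edge of the axes
data_x = ax.transData.transform((5, 0))[0]
top_y = ax.transAxes.transform((0, 1))[1]
print(px_x, data_x, px_y, top_y)

plt.close(fig)
```

This is why the subtitle stays at the top of the plot regardless of the y-axis range, while its horizontal position moves with the x-axis scale.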
    

7. Save the chart as a picture

    
        
plt.savefig(filename+'.png', facecolor=(.94, .94, .94))
    

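You might need to repeat facecolor in savefig(). In older Matplotlib versions, savefig() used its own default background (white) instead of the figure’s facecolor; passing it explicitly works in any version.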
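
To confirm that the background survives saving, one option is to write the PNG to memory and read back a corner pixel. This is a sketch with made-up data: the in-memory buffer stands in for the barh-plot.png file, and the Agg backend is used only so the snippet runs without a display.

```python
import io

import matplotlib
matplotlib.use('Agg')  # headless backend for this sketch only; omit in a notebook
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(4, 3), facecolor=(.94, .94, .94))
ax.barh(['Film A', 'Film B'], [250, 730])

# Save to an in-memory buffer, reusing the figure's own background color
buffer = io.BytesIO()
fig.savefig(buffer, format='png', facecolor=fig.get_facecolor())

# Read the image back; the top-left pixel lies outside the axes, on the background
buffer.seek(0)
pixels = plt.imread(buffer, format='png')
corner = pixels[0, 0, :3]
print(corner)

plt.close(fig)
```

Passing fig.get_facecolor() instead of retyping the color keeps the saved file in sync with the figure if you change the background later.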

That’s it, your horizontal bar chart is ready. You can download the notebook on GitHub to get the full code.

