PyNarrative: An Excellent Python Library for Data Storytelling

Image by Author | Canva

 

Let’s say there are two people, person A and person B. You give them the same dataset to analyze, but somehow A’s story comes out better than B’s. Why? Because it’s not just the data itself that matters, but how well you can turn that data into a story that people can actually understand. And let’s be real: most of us developers struggle with that part. We’re logical. We’re straight to the point. But storytelling? Not always our strong suit.

You’ve probably heard of libraries like Matplotlib, Seaborn, and Altair, which are widely used for data visualization. But they mostly focus on drawing charts, and adding narrative elements such as captions and callouts usually takes extra time and extra lines of code. That makes them better suited to technical analysis than to storytelling. Here’s the good news: there’s a new Python library called PyNarrative that makes storytelling much easier. It can add captions, highlight key points, and guide your audience through the data. This makes your reports and dashboards more engaging, because the results actually speak to the reader. In this article, I’ll walk you through how to use PyNarrative. We’ll cover installation and how to build narratives, and I’ll share some useful resources at the end. So, let’s get started.

 

Getting Started with PyNarrative

 

Installation & Imports

To start, you’ll need Python (version 3.7 or later) and some common libraries. Open your terminal and run the following command:

pip install pynarrative pandas altair

 

This will install PyNarrative along with its required dependencies (Pandas and Altair). You can also create a virtual environment first to keep things tidy. After installing, import the following libraries:

import pandas as pd
import pynarrative as pn
import altair as alt  # Optional if you want to customize charts
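If an import fails, it can help to check which of the three packages actually made it into the active environment. The sketch below is one way to do that with the standard library only; the helper name installed_versions is made up for this example, and importlib.metadata needs Python 3.8 or later, which is slightly newer than the minimum mentioned above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in the active environment.
            found[name] = None
    return found


# Package names taken from the pip command above.
report = installed_versions(["pynarrative", "pandas", "altair"])
for name, ver in report.items():
    print(f"{name}: {ver or 'not installed'}")
```

A package reported as "not installed" usually means pip ran against a different interpreter or virtual environment than the one running your notebook or script.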

 

Using PyNarrative to Build a Story

 
Once you have the data, it’s easy to create the narrative chart. PyNarrative provides a class called Story that wraps around an Altair chart. Here’s the basic flow to build the story:

  1. Create a PyNarrative Story: Pass your DataFrame to pn.Story, and define the chart with Altair’s chart methods (like mark_line(), encode(), etc.).
  2. Add Narrative Elements: Chain methods like .add_title(), .add_context(), .add_annotation(), and .add_next_steps() to include text components.
  3. Render the Story: Finally, call .render() to display the complete narrative visualization.

Suppose you have a DataFrame df with columns Year and Value. Here’s how to tell a story around it:

chart = (pn.Story(df, width=600, height=400)
           .mark_line(color="steelblue")
           .encode(x='Year:O', y='Value:Q')
           .add_title("Yearly Trend", "2000-2020", title_color="#333")
           .add_context("Values have increased over time", position='top')
           .render())
chart

 

Here’s what each part does:

  • .add_title("Yearly Trend", "2000-2020"): Places a main title and a subtitle on the plot.
  • .add_context("Values have increased..."): Adds a descriptive note at the top of the chart.
  • .render(): Shows the final combined chart with all narrative elements.
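The “Yearly Trend” snippet above assumes that df already exists. If you just want something to run it against, a placeholder DataFrame along these lines would do. The numbers are invented purely for illustration (a steady upward trend covering 2000 to 2020, to match the subtitle) and do not come from any real dataset.

```python
import pandas as pd

# Placeholder data: one row per year, 2000 through 2020 inclusive.
years = list(range(2000, 2021))

df = pd.DataFrame({
    "Year": years,
    # Made-up values that rise by 5 each year, so the "increased over time"
    # context line in the snippet above is actually true of the data.
    "Value": [100 + 5 * (year - 2000) for year in years],
})

print(df.head())
```

Define df before running the pn.Story(df, ...) call so the chart has data to draw.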

You can also use .add_annotation() to point out a specific data point, or .add_next_steps() to suggest actions (e.g. “Review Q4” or link to more info).

 

First Example: COVID-19 Data

 
Let’s try a small example using made-up COVID-19 case counts:

covid_df = pd.DataFrame({
    'Month': ['Jan', 'Feb', 'Mar', 'Apr', 'May'],
    'Cases': [1000, 3000, 7000, 5000, 2000]
})

# Create a narrative chart
covid_story = (pn.Story(covid_df)
                 .mark_line(color="firebrick")
                 .encode(x='Month:O', y='Cases:Q')
                 .add_title("COVID-19 Cases Over Time",
                            "Monthly trend",
                            title_color="#b22222")
                 .add_context("Cases peaked in March and declined in April/May", position='top')
                 .add_annotation('Mar', 7000, "Peak in March", arrow_color="gray", label_color="black")
                 .render())
covid_story

 

Output:

 
[Image: Output of Example 1]
 

This code produces a line chart of cases by month. The add_context call writes a sentence at the top explaining the trend (March peak, then decline). The add_annotation call puts a label at the March point (“Peak in March”) with an arrow pointing to that data point. Instead of just seeing numbers on a graph, your audience now knows what happened and why it matters. If you had to do the same thing using plain Altair or Matplotlib, you would have to manually figure out the coordinates and text placements, which can take many more lines of code.

 

Second Example: Unemployment Data

 
PyNarrative works just as well with other kinds of numeric data. For a second example, let’s use public unemployment data:

unemp_df = pd.DataFrame({
    'Year': [2018, 2019, 2020, 2021, 2022],
    'UnemploymentRate': [4.5, 3.9, 8.1, 6.2, 5.3]
})

unemp_story = (pn.Story(unemp_df, width=600)
                 .mark_bar(color="teal")
                 .encode(x='Year:O', y='UnemploymentRate:Q')
                 .add_title("State Unemployment Rate", "2018-2022",
                            title_color="#333")
                 .add_context("Sharp increase in 2020 due to the pandemic", position='top')
                 .add_annotation(2020, 8.1, "Pandemic impact", arrow_color="red", label_color="darkred")
                 .render())
unemp_story

 

Output:
 
[Image: Output of Example 2]
 

In this case, we use a bar chart to show unemployment rates over time. The 2020 spike is called out directly, making the message clear even to someone unfamiliar with the data.
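The coordinates in the add_annotation call above (2020 and 8.1) were typed in by hand. With a longer series, you might prefer to let pandas find the point worth calling out. The sketch below picks the year with the largest year-over-year increase; the names spike, spike_year, spike_rate, and label are made up for this example.

```python
import pandas as pd

unemp_df = pd.DataFrame({
    "Year": [2018, 2019, 2020, 2021, 2022],
    "UnemploymentRate": [4.5, 3.9, 8.1, 6.2, 5.3],
})

# Year-over-year change in percentage points. The first row has no
# previous year to compare against, so its change is NaN.
unemp_df["Change"] = unemp_df["UnemploymentRate"].diff()

# idxmax skips NaN values and returns the row with the largest increase.
spike = unemp_df.loc[unemp_df["Change"].idxmax()]

spike_year = int(spike["Year"])
spike_rate = float(spike["UnemploymentRate"])
label = f"+{spike['Change']:.1f} pts vs. {spike_year - 1}"

print(spike_year, spike_rate, label)
```

These values could then stand in for the literals in the annotation call, so the callout keeps pointing at the right bar if the data changes.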

 

Wrapping Up and Next Steps

 
You can use PyNarrative almost anywhere you want to present data and make sure the audience “gets it.” As you explore, check out the official PyNarrative documentation and examples. Start by installing and importing the library, then load your favorite public dataset with pandas (for example, CSVs from Kaggle or data.gov). If you are new to programming, refer to the Python.org beginner’s guide or the “10 minutes to pandas” tutorial. With a little practice, you’ll be adding clear, engaging narratives to your data in no time.
 
 

Kanwal Mehreen is a machine learning engineer and a technical writer with a profound passion for data science and the intersection of AI with medicine. She co-authored the ebook “Maximizing Productivity with ChatGPT”. As a Google Generation Scholar 2022 for APAC, she champions diversity and academic excellence. She’s also recognized as a Teradata Diversity in Tech Scholar, Mitacs Globalink Research Scholar, and Harvard WeCode Scholar. Kanwal is an ardent advocate for change, having founded FEMCodes to empower women in STEM fields.
