Data storytelling is everywhere. There are countless books, articles, tutorials, and videos, some of which I have written or created.
In my experience, most of these resources tend to present data storytelling in an overwhelmingly positive light. But lately, one concern has been on my mind:
What if our stories, instead of clarifying, mislead?
The image above shows one of the apartment buildings in my neighborhood. Now, take a look at the photo on the left and imagine one of the apartments in the white building is up for sale. You are considering buying it. You would likely focus on the immediate surroundings, especially as presented in the seller’s photos. Notice anything unusual? Probably not, at least not right away.
Should the immediate setting be a dealbreaker? In my opinion, not necessarily. It’s not the most picturesque or charming spot—just a typical block in an average neighborhood in Warsaw. Or is it?
Let’s take a short walk around to the back of the building. And… surprise: there’s a public lavatory right there. Still feel good about the location? Maybe yes, maybe no. One thing is clear: you would want to know that a public toilet sits just below your future balcony.
Additionally, the apartment is located in the lower part of the building, while the towers of the complex rise above it. This is another factor that may matter. Both of these “issues” can certainly be brought up in price negotiations.
This simple example illustrates how easily stories (in this case, using photos) can be misinterpreted. From one angle, everything looks fine, even inviting. Take a few steps to the right, and… whoops.
The same situation can happen in our “professional” lives. What if audiences, convinced they’re making informed, data-backed decisions, are being subtly steered in the wrong direction—not by false data, but by the way it’s presented?
This post builds on an article I wrote in 2024 about misleading visualizations [1]. Here, I want to take a somewhat broader perspective, exploring how the structure and flow of a story itself can unintentionally (or deliberately) lead people to incorrect conclusions, and how we can avoid that.
Data storytelling is subjective
We often like to believe that “data speaks for itself.” But in reality, it rarely does. Every chart, dashboard, or headline built around a dataset is shaped by human choices:
- what to include,
- what to leave out,
- how to frame the message.
This highlights a core challenge of data-driven storytelling: it’s inherently subjective. That subjectivity comes from the discretion we have in proving the point we want to make:
- choosing which data to present,
- selecting the appropriate analysis techniques,
- deciding which arguments to emphasize,
- and even which words and visuals to use.
Subjectivity also lies in interpretation — both ours and our audience’s — and in their willingness to act on the information. This opens the door to biases. If we are not careful, we can easily cross the line from subjectivity into unethical storytelling.
This article examines the hidden biases embedded in data storytelling and how we can transition from manipulation to meaningful insights.
We need stories
Subjective or not, we need stories. Stories are essential to us because they help make sense of the world. They carry our values, preserve our history, and spark our imagination. Through stories, we connect with others, learn from past experiences, and explore what it means to be human. No matter your nationality, culture, or religion, we have all heard countless stories that have shaped us, told to us by our grandparents, parents, teachers, friends, and colleagues at work. Stories evoke emotion, inspire action, and shape our identity, both individually and collectively. In every culture and across every age, storytelling has been a powerful means of understanding life, sharing knowledge, and building community.
But while stories can enlighten, they can also mislead. A compelling narrative has the power to shape perception, even when it distorts facts or oversimplifies complex issues. Stories often rely on emotion, selective detail, and a clear message, which can make them persuasive, but also dangerously reductive. When used carelessly or manipulatively, storytelling can reinforce biases, obscure truth, or drive decisions based more on feeling than reason.
In the next part of this article, I’ll explore the potential problems with stories — especially in data-driven contexts — and how their power can unintentionally (or intentionally) misguide our understanding.

Narrative biases in data-driven storytelling
Bias 1: Data is far, far away from interpretation
Here’s an example of a visual from a report titled “Kentucky Juvenile Justice Reform Evaluation: Assessing the Effects of SB 200 on Youth Dispositional Outcomes and Racial and Ethnic Disparities.”

The graph shows that young offenders in Kentucky are less likely to reoffend if, after their first offense, they are routed through a diversion program. This program connects them with community support, such as social workers and therapists, to address deeper life challenges. That’s a powerful narrative with real-world implications: it supports reducing our reliance on an expensive criminal justice system, justifies increased funding for non-profits, and points toward meaningful ways to improve lives.
But here’s the problem: unless you already have strong data literacy and subject knowledge, those conclusions are not immediately obvious from the graph. While the report does make this point, it doesn’t do so until nearly 20 pages later. This is a classic example of how the structure of academic reporting can mute a story’s impact: the data is presented visually in one section and interpreted textually in another, sometimes distant, section of the document.
Bias 2: The Tale of the Missing Map: Selection Bias

Choosing which data points (cherries 😊) to include (and which to ignore) is one of the strongest — and often most overlooked — acts of bias. And perhaps no industry illustrated this better than Big Tobacco.
The now-famous summary of their legal strategy says it all:
Yes, smoking causes lung cancer, but not in people who sue us.
That quote perfectly captures the tone of tobacco litigation in the late 20th century, where companies faced a wave of lawsuits from customers suffering from diseases linked to smoking. Despite overwhelming medical and scientific consensus, tobacco firms routinely deflected responsibility using a series of arguments that, while sometimes legally strategic, were scientifically absurd.
Here are four of the most egregious cherry-picking tactics they used in court, based on this article [2].
Cherry-pick tactic 1: Exploit the “exception fallacy”: treat rare cases as if they disproved general epidemiological evidence.
Yes, smoking causes cancer — but not this one.
- The plaintiff had a rare form of cancer, like bronchioloalveolar carcinoma (BAC) or mucoepidermoid carcinoma, which they claimed weren’t conclusively linked to smoking.
- In one case, they argued the cancer was from the thymus, not the lungs, despite overwhelming medical evidence.
Cherry-pick tactic 2: Question whether the plaintiff was ever exposed to your specific product.
It wasn’t our brand.
- “Sure, tobacco may have caused the disease — but not our cigarettes.”
- In Ierardi v. Lorillard, the company argued that the plaintiff’s exposure to asbestos-laced cigarette filters (Micronite) occurred outside the narrow 4-year window when they were used, even though 585 million packs were sold during that time.
Cherry-pick tactic 3: Focus on brand or product variation as a way to shift blame.
In several cases, such as Ierardi v. Lorillard and Lacy v. Lorillard, the defense admitted that cigarettes can cause cancer but argued that the plaintiff:
- Didn’t use their brand at the time of exposure,
- Or didn’t use the specific version of the product that was most dangerous (e.g., Kent cigarettes with the asbestos-containing Micronite filter),
- Or that the dangerous version was sold only during a narrow window years ago, making it unlikely the plaintiff was exposed.
This tactic shifts the narrative from
Our product caused harm.
to
Maybe smoking caused harm—but not ours.
Cherry-pick tactic 4: Emphasize every other possible risk factor — regardless of plausibility — to deflect from tobacco’s role.
There were other risk factors.
- In many lawsuits, companies pointed to alternative causes of illness: asbestos, diesel fumes, alcohol, genetics, diet, obesity, and even spicy food.
- In Allgood v. RJ Reynolds, the defense blamed the plaintiff’s condition partly on his fondness for “Tex-Mex food.”
Cherry-picking isn’t always obvious. It can hide in legal defenses, marketing copy, dashboards, or even academic reports. But when only the data that serves the story gets told, it stops being insight and starts becoming manipulation.
Bias 3: The Mirror in the Forest: How the Same Data Tells Different Tales
How we phrase results can skew interpretation. Should we say “Unemployment drops to 4.9%” or “Millions still jobless despite gains”? Both can be accurate. The difference lies in emotional framing.
In essence, framing is a strategic storytelling technique that can significantly impact how a story is received, understood, and remembered. By understanding the power of framing, storytellers can craft narratives that resonate deeply with their audience and achieve their desired goals. I present some examples in Table 1.
| Topic | Frame A | Frame B | Objective description |
| --- | --- | --- | --- |
| Unemployment | “Unemployment hits 5-year low.” Suggests progress, recovery, and strong leadership. | “Millions still without jobs despite slight drop.” Highlights the persistent problem and unmet needs. | A modest drop in the unemployment rate. |
| Vaccine effectiveness | “COVID vaccine reduces risk by 95%.” Emphasizes protection, encourages uptake. | “1 in 20 still gets infected even after the jab.” Focuses on vulnerability and doubt. | A clinical trial showed a 95% relative risk reduction. |
| Climate data | “2023 was the hottest year on record.” Calls attention to the global crisis. | “Earth has always gone through natural cycles.” Implies nothing unusual is happening. | Long-term temperature records. |
| Company financial reports | “Revenue grows 10% in Q2.” Celebrates short-term gain. | “Still below pre-pandemic levels.” Signals underperformance in the long run. | Quarterly earnings report. |
| Election polls | “Candidate A leads by 3 points!” Creates a sense of momentum. | “Within margin of error: race too close to call.” Emphasizes uncertainty. | A poll with a ±3% margin of error. |
| Health warnings | “This drink has 25 grams of sugar.” Sounds scientific, neutral. | “This drink contains over six teaspoons of sugar.” Sounds excessive and dangerous. | 25 grams of sugar. |
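The vaccine row shows how easily relative and absolute risk get conflated. Here is a minimal numeric sketch; the trial figures below are assumed for illustration, not taken from any real study:

```python
# Illustrative (assumed) trial outcomes, not real study data:
risk_unvaccinated = 0.010   # 1.0% of the placebo group infected
risk_vaccinated = 0.0005    # 0.05% of the vaccinated group infected

# "95% effective" is a *relative* risk reduction:
relative_risk_reduction = 1 - risk_vaccinated / risk_unvaccinated
print(f"Relative risk reduction: {relative_risk_reduction:.0%}")  # 95%

# ...which is not the same as "1 in 20 vaccinated people get infected".
# The absolute infection risk for the vaccinated group here is 1 in 2000:
print(f"Absolute risk, vaccinated: 1 in {round(1 / risk_vaccinated)}")
```

The “1 in 20” frame reads a 5% residual *relative* risk as if it were an absolute infection rate, which is exactly the kind of framing slip the table warns about.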
Bias 4: The Dragon of Design: How Beauty Beguiles the Truth
Visuals simplify data, but they can also manipulate perception. In my older article [1], I listed 14 deceptive visualization tactics. Here is a summary of them.
- Using the wrong chart type: Choosing charts that confuse rather than clarify — like 3D pie charts or inappropriate comparisons — makes it harder to see the story the data tells.
- Adding distracting elements: Stuffing visuals with logos, decorations, dark gridlines, or clutter hides the important insights behind noise and visual overload.
- Overusing colors: Using too many colors can distract from the focus. Without a clear color hierarchy, nothing stands out, and the viewer is overwhelmed.
- Random data ordering: Scrambling categories or time series data obscures patterns and prevents clear comparisons.
- Manipulating axis scales: Truncating the y-axis exaggerates differences. Extending it minimizes meaningful variation. Both distort perception.
- Creating trend illusions: Using inconsistent time frames, selective data points, or poorly spaced axes to make non-trends look significant.
- Cherry-picking data: Only showing the parts of the data that support your point, ignoring the full story or contradicting evidence.
- Omitting visual cues: Removing labels, legends, gridlines, or axis scales to make data hard to interpret, or hard to challenge.
- Overloading charts: Packing too much data into one chart can be distracting and confusing, especially when critical data is buried in visual chaos.
- Showing only cumulative values: Using cumulative plots to imply smooth progress while hiding volatility or declines in individual periods.
- Using 3D effects: 3D charts skew perception and make comparisons more difficult, often leading to misleading information about size or proportion.
- Applying gradients and shading: Fancy textures or gradients shift focus and add visual weight to areas that might not deserve it.
- Misleading or vague titles: A neutral or technical title can downplay the urgency of findings. A dramatic one can exaggerate a minor change.
- Using junk charts: Visually overdesigned, complex, or overly artistic charts that are hard to interpret and easy to misread.
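To see why axis manipulation is so effective, here is a back-of-the-envelope sketch of how a truncated y-axis changes the drawn height of two bars; the values 50 and 52 are made up for illustration:

```python
def apparent_ratio(a, b, baseline=0.0):
    """How many times taller bar `b` is drawn relative to bar `a`
    when the y-axis starts at `baseline` instead of zero."""
    return (b - baseline) / (a - baseline)

# Honest, zero-based axis: 52 vs. 50 looks like a 4% difference.
print(apparent_ratio(50, 52))               # 1.04
# Axis truncated at 49: the same data is drawn three times taller.
print(apparent_ratio(50, 52, baseline=49))  # 3.0
```

The numbers never change, only the baseline does, yet the visual impression goes from “roughly equal” to “tripled.”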
Bias 5: The Story-Spinning Machine: But Who Holds the Thread?
Modern tools like Power BI Copilot and Tableau Pulse increasingly generate summaries and “insights” on your behalf, not to mention the narratives and whole presentations crafted by LLMs like ChatGPT or Gemini.
But here’s the catch:
These tools are trained on patterns, not ethics.
AI can’t tell when it’s creating a misleading story. If your prompt or dataset is biased, the output will likely be biased as well, and at a machine scale.
This raises a critical question: Are we using AI to democratize insight, or to mass-produce narrative spin?

A recent BBC investigation found that leading AI chatbots frequently distort or misrepresent current events, even when using BBC articles as their source. Over half of the tested responses contained significant issues, including outdated facts, fabricated or altered quotes, and confusion between opinion and reporting. Examples ranged from incorrectly stating that Rishi Sunak was still the UK prime minister to omitting key legal context in high-profile criminal cases. BBC executives warned that these inaccuracies threaten public trust in news and urged AI companies to collaborate with publishers to improve transparency and accountability [3].
Feeling overwhelmed? You’ve only seen the beginning. Data storytelling can fall prey to numerous cognitive biases, each subtly distorting the narrative.
Take confirmation bias, where the storyteller highlights only data that supports their assumptions—proclaiming, “Our campaign was a success!”—while ignoring contradictory evidence. Then there’s outcome bias, which credits success to sound strategy: “We launched the product and it thrived, so our approach was perfect,”—even if luck played a major role.
Survivorship bias focuses only on the winners—startups that scaled or campaigns that went viral—while neglecting the many that failed using the same methods. Narrative bias oversimplifies complexity, shaping messy realities into tidy conclusions, such as “Vaping is always safer,” without sufficient context.
Anchoring bias causes people to fixate on the first number presented—like a 20% forecast—distorting how subsequent information is interpreted. Omission bias arises when important data is left out, for instance, only highlighting top-performing regions while ignoring underperforming ones.
Projection bias assumes that others interpret data the same way the analyst does: “This dashboard speaks for itself,”—yet it may not, especially for stakeholders unfamiliar with the context. Scale bias misleads with disproportionate framing—“A 200% increase!” sounds impressive until you learn it went from just one to three users.
Finally, causality bias draws unfounded conclusions from correlations: “Users stayed longer after we added popups—they must love them!”—without testing whether popups were the actual cause.
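The scale-bias example above is worth a quick sanity check, because percent increases and multiples are routinely conflated. A small sketch, using the illustrative one-to-three-users case:

```python
def pct_increase(old, new):
    """Percent increase from `old` to `new`."""
    return (new - old) / old * 100

# Going from 1 user to 3 users:
print(pct_increase(1, 3))  # 200.0
# That is a 200% increase (i.e., tripling), though it is
# often misreported as "300%" because 3 is 300% *of* 1.
```

Whenever a story leads with a large percentage, asking “from what base, to what?” defuses most of the drama.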
How to “Unbias” Data Storytelling
Every data story is a choice. In a world where attention spans are short and AI writes faster than humans, those choices are more powerful — and dangerous — than ever.
As data scientists, analysts, and storytellers, we must approach narrative choices with the same level of rigor and thoughtfulness that we apply to statistical models. Crafting a story from data is not just about clarity or engagement—it’s about responsibility. Every choice we make in framing, emphasis, and interpretation shapes how others perceive the truth. And at the end of the day, the most dangerous stories are not the false ones—they’re the ones that feel like facts.
In this part of the article, I’ll share several practical strategies to help you strengthen your data storytelling. These ideas will focus on how to be both compelling and credible—how to craft narratives that engage your audience without oversimplifying or misleading them. Because when done well, data storytelling doesn’t just communicate insight—it builds trust.
Strategy 1: The Wise Wizard’s Rule: Ask, Don’t Enchant
In the world of data and analysis, the most insightful storytellers don’t announce their conclusions with dramatic flair—they lead with thoughtful questions. Instead of presenting bold declarations, they invite reflection by asking, “What do you see?” This approach encourages others to discover insights on their own, fostering understanding rather than passive acceptance.
Consider a graph showing a decline in test scores. A surface-level interpretation might immediately claim, “Our schools are failing,” sparking concern or blame. But a more careful, analytical response would be, “What factors could explain this change? Could it be a new testing format, changes in student demographics, or something else?” Similarly, when sales rise following the launch of a new feature, it’s tempting to attribute the increase solely to the feature. Yet a more rigorous approach would ask, “What other variables changed during this period?”
By leading with questions, we create space for interpretation, dialogue, and deeper thinking. This method guards against false certainty and encourages a more collaborative, thoughtful exploration of data. A strong narrative should guide the audience, rather than forcing them toward a predetermined conclusion.
Strategy 2: The Mirror of Many Truths: Offer Counter-Narratives
Good data storytelling doesn’t stop at a single interpretation. Complex datasets often allow for multiple valid perspectives, and it’s the storyteller’s responsibility to acknowledge them. Presenting a counter-narrative—“here’s another way to look at this”—invites critical thinking and builds credibility.
For example, a chart may show that heart disease rates are declining overall. That seems like a success. But a closer look may reveal that the improvement is concentrated in higher-income areas, while rates in rural or underserved communities remain high. Presenting both views—progress and disparity—provides a more comprehensive and honest picture of the issue.
By offering counter-narratives, we guard against oversimplification and help our audience understand the nuance behind the numbers.

Strategy 3: The Curse of Crooked Charts: Avoid Deceptive Visuals
Visuals are powerful, but that power must be used responsibly. Misleading charts can distort perception through subtle tricks, such as truncated axes that exaggerate differences, unlabeled units that obscure the scale, or decorative clutter that distracts from the message. To avoid these pitfalls, always clearly label axes, start scales from zero when appropriate, and choose chart types that best fit the data, not just their aesthetic appeal. Deception doesn’t always come from malice—sometimes it’s just careless design. But either way, it erodes trust. A clean, honest visual is far more persuasive than a flashy one that hides the details.

Take, for example, the two charts shown in Image 7. The one on the left is cluttered and hard to interpret. Its title is vague, the excessive use of color is distracting, and unnecessary elements—like heavy borders, gridlines, and shading—only add to the confusion. There are no visual cues to guide the viewer, leaving the audience to guess what the author is trying to say.
In contrast, the chart on the right is far more effective. It strips away the noise, using just three colors: grey for context, blue to highlight key information, and a clean white background. Most importantly, the title conveys the main message, allowing the audience to grasp the point at a glance.
Strategy 4: Speak Honestly of Shadows: The Wisdom of Embracing Uncertainty
Uncertainty is an inherent part of working with data, and acknowledging it doesn’t weaken your story—it strengthens your credibility. Transparency around uncertainty is a hallmark of responsible data communication. When you communicate elements like confidence intervals, margins of error, or the assumptions behind a model, you’re not just being technically accurate—you’re demonstrating honesty and humility. It shows that you respect your audience’s ability to engage with complexity, rather than oversimplifying to maintain a clean narrative.
Uncertainty can arise from various sources, including limited sample sizes, noisy or incomplete data, changing conditions, or the assumptions inherent in predictive models. Instead of ignoring or smoothing over these limitations, good storytellers bring them to the forefront—visually and verbally. Doing so encourages critical thinking and opens the door for discussion. It also protects your work from misinterpretation, misuse, or overconfidence in results. In short, by being open about what the data can’t tell us, we give more weight to what it can. Below, I present several examples of how you could include information on uncertainty in your data story.
- Update on confidence intervals
Instead of: “Revenue will grow by 15% next quarter.”
Use: “We project a 15% growth, with a 95% confidence interval of 12%–18%.” - Leave a margin of error.
Instead of: “Customer satisfaction is at 82%.”
Use: “Customer satisfaction is 82%, ±3% margin of error.” - Missing data indicators
Use visual cues, such as faded bars, dashed lines, or shaded areas, on charts to indicate gaps.
Add footnotes: “Data for Q2 is incomplete due to reporting delays.” - Model assumptions
Example: “This forecast assumes no significant change in user behavior or market conditions.” - Multiple scenarios
Present best-case, worst-case, and most-likely scenarios to reflect a range of possible outcomes. - Probabilistic language
Instead of: “This will happen.”
Use: “There’s a 70% chance this outcome occurs under current conditions.” - Data quality notes
Highlight issues like small sample sizes or self-reported data:
“Results are based on a survey of 100 respondents and may not reflect the broader population.” - Error bars on charts
Visually show uncertainty by including error bars or shaded confidence bands in graphs. - Transparency in limitations
Example: “This analysis does not account for seasonal variation or external economic factors.” - Qualitative clarification
Use captions or callouts in presentations or dashboards:
“Data trends are indicative, but further validation is needed.”
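As a sketch of where a figure like “82%, ±3%” can come from, here is the standard normal-approximation margin of error for a proportion. The sample size of 630 below is an assumed, illustrative value, not from any real survey:

```python
import math

def margin_of_error(p, n, z=1.96):
    """Half-width of a normal-approximation confidence interval
    for a proportion; z=1.96 gives roughly 95% confidence."""
    return z * math.sqrt(p * (1 - p) / n)

# 82% satisfaction from an assumed survey of 630 respondents:
moe = margin_of_error(0.82, 630)
print(f"82% +/- {moe:.1%}")  # 82% +/- 3.0%
```

Notice how quickly the margin grows as the sample shrinks: the same 82% from only 100 respondents carries a margin of roughly ±7.5 percentage points, which is exactly why the “data quality notes” item above matters.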
You might wonder, “But won’t highlighting these uncertainties weaken my story or make me seem unsure of the results?” On the contrary, acknowledging uncertainty doesn’t signal a lack of confidence; it shows depth, professionalism, and integrity. It conveys to your audience that you understand the complexity of the data and are not trying to oversell a simplistic conclusion. Sharing what you do know, alongside what you don’t, creates a more balanced and credible narrative. People are far more likely to trust your insights when they see that you’re being honest about the limitations. It’s not about dampening your story—it’s about grounding it in reality.
Strategy 5: Reveal the Roots of the Tale: Let Truth Travel with Its Sources
Every story needs roots, and in the world of data storytelling, those roots are your sources. A beautiful chart or striking number means little if your audience can’t see where it came from. Was it a randomized survey? Administrative data? Social media scraping? Just like a traveler trusts a guide who knows the path, readers are more likely to trust your insights when they can trace them back to their origins. Transparency about data sources, collection methods, assumptions, and even limitations is not a sign of weakness—it’s a mark of integrity. When we reveal the roots of the tale, we give our story depth, credibility, and resilience. Informed decisions can only grow in well-tended soil.

Closing remarks
Data-driven storytelling is both an art and a responsibility. It gives us the power to make information meaningful—but also the power to mislead, even unintentionally. In this article, we’ve explored a forest of biases, design traps, and narrative temptations that can subtly shape perception and distort the truth. Whether you’re a data scientist, communicator, or decision-maker, your stories carry weight—not just for what they show, but for how they are told.
So let us tell stories that illuminate, not obscure. Let us lead with questions, not conclusions. Let us reveal uncertainty, not hide behind false clarity. And above all, let us anchor our insights in transparent sources and humble interpretation. The goal isn’t perfection—it’s integrity. Because in a world filled with noise and narrative spin, the most powerful story you can tell is one that’s both clear and honest.
In the end, storytelling is not about controlling the message—it’s about earning trust. And trust, once lost, is not easily won back. So choose your stories carefully. Shape them with care. And remember: the truth may not always be flashy, but it always finds its way to the light.
And one more thing: if you’ve ever spotted (or unintentionally created) a biased data story, share your experience in the comments. The more we surface these narratives, the better we all get at telling data truths, not just data tales.
References
[1] Michal Szudejko, “How not to Cheat with Data Visualizations,” Towards Data Science
[2] Multiple authors, “Tobacco manufacturers’ defence against plaintiffs’ claims of cancer causation: throwing mud at the wall and hoping some of it will stick,” National Library of Medicine
[3] Matthew Weaver, “AI chatbots distort and mislead when asked about current affairs, BBC finds”
Disclaimer
This post was originally written using Microsoft Word, and the spelling and grammar were checked with Grammarly. I reviewed and adjusted any modifications to ensure that my intended message was accurately reflected. All other uses of AI (for instance image and sample data generation) were disclosed directly in the text.