Introduction
Statistics and Data Science have always walked side-by-side, holding hands.
I remember hearing “Learn Statistics to know what’s behind the algorithms” when I started studying Data Science. While all of that was fascinating to me, it was also really overwhelming.
The fact is that there are too many statistical concepts, tests, and distributions to keep track of. If you don’t know what I am talking about, just visit the Scipy.stats page, and you will understand.
If you have been in the Data Science field long enough, you probably bookmarked (or even printed) one of those statistical test cheatsheets. They were popular for a while. But now, Large Language Models are becoming a kind of “second brain” for us, helping us quickly consult virtually any information we like, with the extra benefit of getting it summarized and adapted to our needs.
With that in mind, I realized that choosing the right statistical test can be confusing, because it depends on variable types, assumptions about the data, and more.
So I thought I could build an assistant to help with that. Then, my project took form.
- I used LangGraph to build a multi-step agent.
- The front-end was built with Streamlit.
- The agent can quickly consult the Scipy Stats documentation and find the right test for each specific situation.
- It then returns a sample Python code snippet.
- It is deployed as a Streamlit app, in case you want to try it.
- App link: https://ai-statistical-advisor.streamlit.app/
Amazing!
Let’s dive in and learn how to build this agent.
LangGraph
LangGraph is a library that helps build complex, multi-step applications with Large Language Models (LLMs) by representing them as a graph. This graph architecture lets developers create conditions and loops, which makes it useful for building sophisticated agents and chatbots that can decide what to do next based on the results of a previous step.
It essentially turns a rigid sequence of actions into a flexible, dynamic decision-making process. In LangGraph, each node is a function or tool.
Next, let’s learn more about the agent we are going to create in this post.
Statistical Advisor Agent
This agent is a Statistical Advisor. The main idea is that:
- The bot receives a statistics-related question, such as “How do I compare the means of two groups?”
- It checks the question and determines whether it needs to consult Scipy’s documentation or can give a direct answer.
- If needed, the agent uses a RAG tool over the embedded SciPy documentation.
- It returns an answer.
- If applicable, it also returns sample Python code showing how to perform the statistical test.
Let’s quickly look at the graph generated by LangGraph for this agent.
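As an aside, if you want to reproduce a diagram like this yourself, LangGraph can render any compiled graph as a Mermaid chart. This small snippet assumes the build_graph() function we will write later in this post:

# Render the compiled agent as a Mermaid diagram.
# build_graph() is defined later in this post.
from langgraph_agent.graph import build_graph

graph = build_graph()
print(graph.get_graph().draw_mermaid())  # paste the output into any Mermaid viewer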
Great. Now, let’s cut to the chase and start coding!
Code
To make things easier, I will break the development down into modules. First, let’s install the packages we will need.
pip install chromadb langchain-chroma langchain-community langchain-openai langchain langgraph openai streamlit
Chunk and Embed
Next, we will create the script that takes our documentation, creates chunks of text, and embeds those chunks. We do that to make it easier for vector databases like ChromaDB to search and retrieve information.
So, I created the function embed_docs(), which you can see in the GitHub repository linked at the end of this post (a minimal sketch follows below).
- The function takes Scipy’s documentation (which is open source under the BSD license).
- It splits the text into chunks of 500 tokens with an overlap of 50 tokens.
- It creates the embeddings (transforming text into numerical vectors for optimized vector database search) using OpenAIEmbeddings.
- It saves the embeddings in an instance of ChromaDB.
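Since the full implementation lives in the repository, here is a minimal sketch of what embed_docs() can look like. The file path, persist directory, and loader choice are illustrative assumptions, not necessarily what the repo uses.

# Minimal sketch of embed_docs(); paths and names are assumptions.
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma

def embed_docs(doc_path: str = "docs/scipy_stats.txt",
               persist_dir: str = "chroma_db"):
    # Load the raw SciPy documentation text
    docs = TextLoader(doc_path).load()

    # Split into ~500-token chunks with 50 tokens of overlap
    splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
        chunk_size=500, chunk_overlap=50
    )
    chunks = splitter.split_documents(docs)

    # Embed each chunk with OpenAIEmbeddings and persist in ChromaDB
    Chroma.from_documents(
        documents=chunks,
        embedding=OpenAIEmbeddings(),
        persist_directory=persist_dir,
    )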
Now the data is ready to serve as a knowledge base for Retrieval-Augmented Generation (RAG). But it needs a retriever that can search and find the data. That is what the retriever does.
Retriever
The get_doc_answer() function will:
- Load the ChromaDB instance previously created.
- Create an instance of OpenAI GPT-4o.
- Create a retriever object.
- Glue everything together in a retrieval_chain that takes a question from the user and sends it to the LLM.
- The model uses the retriever to access the ChromaDB instance, get relevant data about statistical tests, and return the answer to the user.
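Again, the exact code is in the repository; a hedged sketch of how get_doc_answer() can be wired together with LangChain's retrieval helpers looks like this (the persist directory and prompt wording are my assumptions):

# Sketch of get_doc_answer(); persist_dir and prompt text are assumptions.
from langchain_chroma import Chroma
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate

def get_doc_answer(question: str, persist_dir: str = "chroma_db") -> str:
    # Reload the persisted ChromaDB instance
    vectordb = Chroma(persist_directory=persist_dir,
                      embedding_function=OpenAIEmbeddings())
    retriever = vectordb.as_retriever()

    # LLM that writes the answer from the retrieved chunks
    llm = ChatOpenAI(model="gpt-4o")
    prompt = ChatPromptTemplate.from_messages([
        ("system", "Answer the question using the SciPy docs below:\n\n{context}"),
        ("human", "{input}"),
    ])

    # Stuff the retrieved chunks into the prompt and run the chain
    docs_chain = create_stuff_documents_chain(llm, prompt)
    retrieval_chain = create_retrieval_chain(retriever, docs_chain)
    return retrieval_chain.invoke({"input": question})["answer"]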
Now we have the RAG completed with the documents embedded and the retriever ready. Let’s move on to the Agent nodes.
Agent Nodes
LangGraph has this interesting architecture that considers each node as a function. Therefore, now we must create the functions to handle each part of the agent.
We will follow the flow and start with the classify_intent node. Since some nodes need to interact with an LLM, we need to generate a client.
from rag.retriever import get_doc_answer
from openai import OpenAI
import os
from dotenv import load_dotenv
load_dotenv()
# Instance of OpenAI
client = OpenAI()
Once we start the agent, it will receive a query from the user. So, this node will check the question and decide if the next node will be a simple response or if it needs to search Scipy’s documentation.
def classify_intent(state):
    """Check if the user question needs a doc search or can be answered directly."""
    question = state["question"]
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are an assistant that decides if a question about statistical tests needs document lookup or not. If it is about definitions or choosing the right test, return 'search'. Otherwise return 'simple'."},
            {"role": "user", "content": f"Question: {question}"}
        ]
    )
    decision = response.choices[0].message.content.strip().lower()
    return {"intent": decision}  # "search" or "simple"
If a question about statistical concepts or tests is asked, then the retrieve_info() node is activated. It performs the RAG on the documentation.
def retrieve_info(state):
    """Use the RAG tool to answer from embedded docs."""
    question = state["question"]
    answer = get_doc_answer(question=question)
    return {"rag_answer": answer}
Once the proper chunk of text is retrieved from ChromaDB, the agent goes to the next node to generate an answer.
def respond(state):
    """Build the final answer."""
    if state.get("rag_answer"):
        return {"final_answer": state["rag_answer"]}
    else:
        return {"final_answer": "I'm not sure how to help with that yet."}
Finally, the last node generates code, if applicable. That is, if the answer describes a test that can be performed with Scipy, the agent will produce a sample code snippet.
def generate_code(state):
    """Generate Python code to perform the recommended statistical test."""
    question = state["question"]
    suggested_test = state.get("rag_answer") or "a statistical test"
    prompt = f"""
    You are a Python tutor.
    Based on the following user question, generate a short Python code snippet using scipy.stats that performs the appropriate statistical test.

    User question:
    {question}

    Answer given:
    {suggested_test}

    Only output code. Don't include explanations.
    """
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}]
    )
    return {"code_snippet": response.choices[0].message.content.strip()}
Notice something important here: all functions in our nodes always have state as an argument, because the state is the single source of truth for the entire workflow. Each function, or “node,” in the graph reads from and writes to this central state object.
For example:
- The classify_intent function reads the question from the state and adds an intent key.
- The retrieve_info function can read the same question and add a rag_answer, which the respond function finally reads to construct the final_answer.

This shared state dictionary is how the different steps in the agent’s reasoning and action-taking process stay connected.
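To make this concrete, here is a rough manual simulation of what LangGraph does with the state under the hood. This is only an illustration of how the partial updates merge into the shared dictionary; once the graph is compiled, LangGraph handles all of this for us.

# Illustration only: each node returns a partial update that is
# merged into the shared state dictionary.
state = {"question": "Which test compares the means of two groups?"}
state.update(classify_intent(state))    # adds "intent"
if state["intent"] == "search":
    state.update(retrieve_info(state))  # adds "rag_answer"
state.update(respond(state))            # adds "final_answer"
state.update(generate_code(state))      # adds "code_snippet"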
Next, let’s put everything together and build our graph!
Building the Graph
The graph is the agent itself. What we are doing here is telling LangGraph which nodes we have and how they connect to each other, so the framework can route the information according to that flow.
Let’s import the modules.
from langgraph.graph import StateGraph, END
from typing_extensions import TypedDict
from langgraph_agent.nodes import classify_intent, retrieve_info, respond, generate_code
Define our state schema. Remember that dictionary that the agent uses to connect the steps of the process? That is it.
# Define the state schema (just a dictionary for now)
class TypedDictState(TypedDict):
    question: str
    intent: str
    rag_answer: str
    code_snippet: str
    final_answer: str
Here, we will create a function that builds the graph.
- To tell LangGraph what the steps (functions) in the process are, we use add_node.
- Once we have listed all the functions, we start creating the edges, which are the connections between the nodes.
- We start the process with set_entry_point. This is the first function to be used.
- We use add_edge to connect one node to another: the first argument is the node the information comes from, and the second is where it goes.
- If we have a condition to follow, we use add_conditional_edges.
- We use END to finish the graph and compile to build it.
def build_graph():
    # Build the LangGraph flow
    builder = StateGraph(TypedDictState)

    # Add nodes
    builder.add_node("classify_intent", classify_intent)
    builder.add_node("retrieve_info", retrieve_info)
    builder.add_node("respond", respond)
    builder.add_node("generate_code", generate_code)

    # Define flow
    builder.set_entry_point("classify_intent")
    builder.add_conditional_edges(
        "classify_intent",
        lambda state: state["intent"],
        {
            "search": "retrieve_info",
            "simple": "respond"
        }
    )
    builder.add_edge("retrieve_info", "respond")
    builder.add_edge("respond", "generate_code")
    builder.add_edge("generate_code", END)

    return builder.compile()
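Before wiring up the UI, you can sanity-check the compiled graph from a plain Python session. This assumes OPENAI_API_KEY is set in your environment:

# Quick local test of the compiled graph
from langgraph_agent.graph import build_graph

graph = build_graph()
result = graph.invoke({"question": "How do I compare the means of two groups?"})
print(result["final_answer"])
print(result.get("code_snippet", ""))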
With our graph builder function ready, all we have to do now is create a beautiful front-end where we can interact with this agent.
Let’s do that now.
Streamlit Front-End
The front-end is the final piece of the puzzle, where we create a User Interface that allows us to easily enter a question in a proper text box and see the answer properly formatted.
I chose Streamlit because it is very easy to prototype and deploy. Let’s begin with the imports.
import os
import time
import streamlit as st
Then, we configure the page’s look.
# Config page
st.set_page_config(page_title="Stats Advisor Agent",
                   page_icon='🤖',
                   layout="wide",
                   initial_sidebar_state="expanded")
Create a sidebar, where the user can enter their OpenAI API key, along with a “Clear” session button.
# Add a place to enter the API key
with st.sidebar:
    api_key = st.text_input("OPENAI_API_KEY", type="password")

    # Save the API key to the environment variable
    if api_key:
        os.environ["OPENAI_API_KEY"] = api_key

    # Clear the session
    if st.button('Clear'):
        st.rerun()
Next, we set up the page title and instructions and add a text box for the user to enter a question.
# Title and Instructions
if not api_key:
    st.warning("Please enter your OpenAI API key in the sidebar.")

st.title('Statistical Advisor Agent | 🤖')
st.caption('This AI Agent is trained to answer questions about statistical tests from the [Scipy](https://docs.scipy.org/doc/scipy/reference/stats.html) package.')
st.caption('Ask questions like: "What is the best statistical test to compare two means?"')
st.divider()

# User question
question = st.text_input(label="Ask me something:",
                         placeholder="e.g. What is the best test to compare 3 groups' means?")
Finally, we can run the graph builder and display the answer on screen.
# Run the graph
if st.button('Search'):
    # Progress bar
    progress_bar = st.progress(0)

    with st.spinner("Thinking..", show_time=True):
        from langgraph_agent.graph import build_graph
        progress_bar.progress(10)

        # Build the graph and run it with the user's question
        graph = build_graph()
        result = graph.invoke({"question": question})
        progress_bar.progress(50)

        # Print the result
        st.subheader("📖 Answer:")
        progress_bar.progress(100)
        st.write(result["final_answer"])

        if "code_snippet" in result:
            st.subheader("💻 Suggested Python Code:")
            st.write(result["code_snippet"])
Let’s see the result now.

Wow, the result is impressive!
- I asked: What is the best test to compare two groups means?
- Answer: To compare the means of two groups, the most appropriate test is typically the independent two-sample t-test if the groups are independent and the data is normally distributed. If the data is not normally distributed, a non-parametric test like the Mann-Whitney U test might be more suitable. If the groups are paired or related, a paired sample t-test would be appropriate.
Mission accomplished: the agent does exactly what we set out to build.
Try It Yourself
Do you want to give this agent a try?
Go ahead and test the deployed version now!
https://ai-statistical-advisor.streamlit.app
Before You Go
This is a long post, I know. But I hope it was worth reading to the end. We learned a lot about LangGraph, and it makes us think differently about creating AI agents.
The framework forces us to think about every step of the information flow, from the moment a question is sent to the LLM until the answer is displayed. Questions like these start to pop into your mind during the development process:
- What happens after the user asks the question?
- Does the agent need to verify something before moving on?
- Are there conditions to consider during the interaction?
This architecture becomes an advantage because it makes the whole process cleaner and more scalable, since adding a new feature can be as simple as adding a new function (node).
On the other hand, LangGraph is not as user-friendly as frameworks like Agno or CrewAI, which encapsulate many of these abstractions in simpler methods, making the process much easier to learn and develop, but also less flexible.
In the end, it is all a matter of what problem is being solved and how flexible you need it to be.
GitHub Repository
https://github.com/gurezende/AI-Statistical-Advisor
About Me
If you liked this content and want to learn more about my work, here is my website, where you can also find all my contacts.
References
[1] LangGraph Docs: https://langchain-ai.github.io/langgraph/concepts/why-langgraph/
[2] Scipy Stats: https://docs.scipy.org/doc/scipy/reference/stats.html
[3] Streamlit Docs: https://docs.streamlit.io/
[4] Statistical Advisor App: https://ai-statistical-advisor.streamlit.app/