Introduction
Statistics and Data Science have always walked side-by-side, holding hands.
I remember hearing “Learn Statistics to know what’s behind the algorithms” when I started studying Data Science. While all of that was fascinating to me, it was also really overwhelming.
The fact is that there are too many statistical concepts, tests, and distributions to keep track of. If you don’t know what I am talking about, just visit the Scipy.stats page, and you will understand.
If you have been in the Data Science field long enough, you probably bookmarked (or even printed) one of those statistical test cheatsheets. They were popular for a while. But now, Large Language Models are becoming a kind of “second brain” for us, helping us quickly consult virtually any information we like, with the extra benefit of getting it summarized and adapted to our needs.
With that in mind, I realized that choosing the right statistical test can be confusing, because it depends on variable types, assumptions about the data, and more.
So I thought I could build an assistant to help with that. Then, my project took form.
- I used LangGraph to build a multi-step agent.
- The front-end was built with Streamlit.
- The agent can quickly consult the Scipy Stats documentation and find the right test for each specific situation.
- It then returns a sample Python code snippet.
- It is deployed as a Streamlit app, in case you want to try it.
- App link: https://ai-statistical-advisor.streamlit.app/
Amazing!
Let’s dive in and learn how to build this agent.
LangGraph
LangGraph is a library that helps build complex, multi-step applications with Large Language Models (LLMs) by representing them as a graph. This graph architecture lets developers create conditions and loops, which makes it useful for building sophisticated agents and chatbots that can decide what to do next based on the results of a previous step.
It essentially turns a rigid sequence of actions into a flexible, dynamic decision-making process. In LangGraph, each node is a function or tool.
Next, let’s learn more about the agent we are going to create in this post.
Statistical Advisor Agent
This agent is a Statistical Advisor. The main idea is that:
- The bot receives a statistics-related question, such as “How do I compare the means of two groups?”
- It checks the question and determines whether it needs to consult Scipy’s documentation or can give a direct answer.
- If needed, the agent uses a RAG tool over the embedded SciPy documentation.
- It returns an answer.
- If applicable, it also returns sample Python code showing how to perform the statistical test.
Let’s quickly look at the graph generated by LangGraph for this agent.
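As an aside, if you want to reproduce a diagram like this yourself, LangGraph can render any compiled graph as a Mermaid chart. This small snippet assumes the build_graph() function we will write later in this post:

# Render the compiled agent as a Mermaid diagram.
# build_graph() is defined later in this post.
from langgraph_agent.graph import build_graph

graph = build_graph()
print(graph.get_graph().draw_mermaid())  # paste the output into any Mermaid viewer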
Great. Now, let’s cut to the chase and start coding!
Code
To make things easier, I will break the development down into modules. First, let’s install the packages we will need.
pip install chromadb langchain-chroma langchain-community langchain-openai langchain langgraph openai streamlit
Chunk and Embed
Next, we will create the script that takes our documentation, creates chunks of text, and embeds those chunks. We do that to make it easier for vector databases like ChromaDB to search and retrieve information.
So, I created the function embed_docs(), which you can see in the GitHub repository linked at the end of this post (a minimal sketch follows below).
- The function takes Scipy’s documentation (which is open source under the BSD license).
- It splits the text into chunks of 500 tokens with an overlap of 50 tokens.
- It creates the embeddings (transforming text into numerical vectors for optimized vector database search) using OpenAIEmbeddings.
- It saves the embeddings in an instance of ChromaDB.
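Since the full implementation lives in the repository, here is a minimal sketch of what embed_docs() can look like. The file path, persist directory, and loader choice are illustrative assumptions, not necessarily what the repo uses.

# Minimal sketch of embed_docs(); paths and names are assumptions.
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma

def embed_docs(doc_path: str = "docs/scipy_stats.txt",
               persist_dir: str = "chroma_db"):
    # Load the raw SciPy documentation text
    docs = TextLoader(doc_path).load()

    # Split into ~500-token chunks with 50 tokens of overlap
    splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
        chunk_size=500, chunk_overlap=50
    )
    chunks = splitter.split_documents(docs)

    # Embed each chunk with OpenAIEmbeddings and persist in ChromaDB
    Chroma.from_documents(
        documents=chunks,
        embedding=OpenAIEmbeddings(),
        persist_directory=persist_dir,
    )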
Now the data is ready to serve as a knowledge base for Retrieval-Augmented Generation (RAG). But it needs a retriever that can search and find the data. That is what the retriever does.
Retriever
The get_doc_answer() function will:
- Load the ChromaDB instance previously created.
- Create an instance of OpenAI GPT-4o.
- Create a retriever object.
- Glue everything together in a retrieval_chain that takes a question from the user and sends it to the LLM.
- The model uses the retriever to access the ChromaDB instance, get relevant data about statistical tests, and return the answer to the user.
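Again, the exact code is in the repository; a hedged sketch of how get_doc_answer() can be wired together with LangChain's retrieval helpers looks like this (the persist directory and prompt wording are my assumptions):

# Sketch of get_doc_answer(); persist_dir and prompt text are assumptions.
from langchain_chroma import Chroma
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate

def get_doc_answer(question: str, persist_dir: str = "chroma_db") -> str:
    # Reload the persisted ChromaDB instance
    vectordb = Chroma(persist_directory=persist_dir,
                      embedding_function=OpenAIEmbeddings())
    retriever = vectordb.as_retriever()

    # LLM that writes the answer from the retrieved chunks
    llm = ChatOpenAI(model="gpt-4o")
    prompt = ChatPromptTemplate.from_messages([
        ("system", "Answer the question using the SciPy docs below:\n\n{context}"),
        ("human", "{input}"),
    ])

    # Stuff the retrieved chunks into the prompt and run the chain
    docs_chain = create_stuff_documents_chain(llm, prompt)
    retrieval_chain = create_retrieval_chain(retriever, docs_chain)
    return retrieval_chain.invoke({"input": question})["answer"]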
Now we have the RAG completed with the documents embedded and the retriever ready. Let’s move on to the Agent nodes.
Agent Nodes
LangGraph has this interesting architecture that considers each node as a function. Therefore, now we must create the functions to handle each part of the agent.
We will follow the flow and start with the classify_intent node. Since some nodes need to interact with an LLM, we need to generate a client.
from rag.retriever import get_doc_answer
from openai import OpenAI
import os
from dotenv import load_dotenv
load_dotenv()
# Instance of OpenAI
client = OpenAI()
Once we start the agent, it will receive a query from the user. So, this node will check the question and decide if the next node will be a simple response or if it needs to search Scipy’s documentation.
def classify_intent(state):
    """Check if the user question needs a doc search or can be answered directly."""
    question = state["question"]
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are an assistant that decides if a question about statistical tests needs document lookup or not. If it is about definitions or choosing the right test, return 'search'. Otherwise return 'simple'."},
            {"role": "user", "content": f"Question: {question}"}
        ]
    )
    decision = response.choices[0].message.content.strip().lower()
    return {"intent": decision}  # "search" or "simple"
If a question about statistical concepts or tests is asked, then the retrieve_info() node is activated. It performs the RAG on the documentation.
def retrieve_info(state):
    """Use the RAG tool to answer from embedded docs."""
    question = state["question"]
    answer = get_doc_answer(question=question)
    return {"rag_answer": answer}
Once the proper chunk of text is retrieved from ChromaDB, the agent goes to the next node to generate an answer.
def respond(state):
    """Build the final answer."""
    if state.get("rag_answer"):
        return {"final_answer": state["rag_answer"]}
    else:
        return {"final_answer": "I'm not sure how to help with that yet."}
Finally, the last node generates code, if applicable. That is, if the answer describes a test that can be performed with Scipy, the agent will produce a sample code snippet.
def generate_code(state):
    """Generate Python code to perform the recommended statistical test."""
    question = state["question"]
    suggested_test = state.get("rag_answer") or "a statistical test"
    prompt = f"""
    You are a Python tutor.
    Based on the following user question, generate a short Python code snippet using scipy.stats that performs the appropriate statistical test.

    User question:
    {question}

    Answer given:
    {suggested_test}

    Only output code. Don't include explanations.
    """
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}]
    )
    return {"code_snippet": response.choices[0].message.content.strip()}
Notice something important here: all functions in our nodes always have state as an argument, because the state is the single source of truth for the entire workflow. Each function, or “node,” in the graph reads from and writes to this central state object.
For example:
- The classify_intent function reads the question from the state and adds an intent key.
- The retrieve_info function can read the same question and add a rag_answer, which the respond function finally reads to construct the final_answer.

This shared state dictionary is how the different steps in the agent’s reasoning and action-taking process stay connected.
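To make this concrete, here is a rough manual simulation of what LangGraph does with the state under the hood. This is only an illustration of how the partial updates merge into the shared dictionary; once the graph is compiled, LangGraph handles all of this for us.

# Illustration only: each node returns a partial update that is
# merged into the shared state dictionary.
state = {"question": "Which test compares the means of two groups?"}
state.update(classify_intent(state))    # adds "intent"
if state["intent"] == "search":
    state.update(retrieve_info(state))  # adds "rag_answer"
state.update(respond(state))            # adds "final_answer"
state.update(generate_code(state))      # adds "code_snippet"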
Next, let’s put everything together and build our graph!
Building the Graph
The graph is the agent itself. What we are doing here is telling LangGraph which nodes we have and how they connect to each other, so the framework can route the information according to that flow.
Let’s import the modules.
from langgraph.graph import StateGraph, END
from typing_extensions import TypedDict
from langgraph_agent.nodes import classify_intent, retrieve_info, respond, generate_code
Define our state schema. Remember that dictionary that the agent uses to connect the steps of the process? That is it.
# Define the state schema (just a dictionary for now)
class TypedDictState(TypedDict):
    question: str
    intent: str
    rag_answer: str
    code_snippet: str
    final_answer: str
Here, we will create a function that builds the graph.
- To tell LangGraph what the steps (functions) in the process are, we use add_node.
- Once we have listed all the functions, we start creating the edges, which are the connections between the nodes.
- We start the process with set_entry_point. This is the first function to be used.
- We use add_edge to connect one node to another: the first argument is the node the information comes from, and the second is where it goes.
- If we have a condition to follow, we use add_conditional_edges.
- We use END to finish the graph and compile to build it.
def build_graph():
    # Build the LangGraph flow
    builder = StateGraph(TypedDictState)

    # Add nodes
    builder.add_node("classify_intent", classify_intent)
    builder.add_node("retrieve_info", retrieve_info)
    builder.add_node("respond", respond)
    builder.add_node("generate_code", generate_code)

    # Define flow
    builder.set_entry_point("classify_intent")
    builder.add_conditional_edges(
        "classify_intent",
        lambda state: state["intent"],
        {
            "search": "retrieve_info",
            "simple": "respond"
        }
    )
    builder.add_edge("retrieve_info", "respond")
    builder.add_edge("respond", "generate_code")
    builder.add_edge("generate_code", END)

    return builder.compile()
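Before wiring up the UI, you can sanity-check the compiled graph from a plain Python session. This assumes OPENAI_API_KEY is set in your environment:

# Quick local test of the compiled graph
from langgraph_agent.graph import build_graph

graph = build_graph()
result = graph.invoke({"question": "How do I compare the means of two groups?"})
print(result["final_answer"])
print(result.get("code_snippet", ""))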
With our graph builder function ready, all we have to do now is create a beautiful front-end where we can interact with this agent.
Let’s do that now.
Streamlit Front-End
The front-end is the final piece of the puzzle, where we create a User Interface that allows us to easily enter a question in a proper text box and see the answer properly formatted.
I chose Streamlit because it is very easy to prototype and deploy. Let’s begin with the imports.
import os
import time
import streamlit as st
Then, we configure the page’s look.
# Config page
st.set_page_config(page_title="Stats Advisor Agent",
                   page_icon='🤖',
                   layout="wide",
                   initial_sidebar_state="expanded")
Create a sidebar, where the user can enter their OpenAI API key, along with a “Clear” session button.
# Add a place to enter the API key
with st.sidebar:
    api_key = st.text_input("OPENAI_API_KEY", type="password")

    # Save the API key to the environment variable
    if api_key:
        os.environ["OPENAI_API_KEY"] = api_key

    # Clear the session
    if st.button('Clear'):
        st.rerun()
Next, we set up the page title and instructions and add a text box for the user to enter a question.
# Title and Instructions
if not api_key:
    st.warning("Please enter your OpenAI API key in the sidebar.")

st.title('Statistical Advisor Agent | 🤖')
st.caption('This AI Agent is trained to answer questions about statistical tests from the [Scipy](https://docs.scipy.org/doc/scipy/reference/stats.html) package.')
st.caption('Ask questions like: "What is the best statistical test to compare two means?"')
st.divider()

# User question
question = st.text_input(label="Ask me something:",
                         placeholder="e.g. What is the best test to compare 3 groups' means?")
Finally, we can run the graph builder and display the answer on screen.
# Run the graph
if st.button('Search'):
    # Progress bar
    progress_bar = st.progress(0)

    with st.spinner("Thinking..", show_time=True):
        from langgraph_agent.graph import build_graph
        progress_bar.progress(10)

        # Build the graph and run it with the user's question
        graph = build_graph()
        result = graph.invoke({"question": question})
        progress_bar.progress(50)

        # Print the result
        st.subheader("📖 Answer:")
        progress_bar.progress(100)
        st.write(result["final_answer"])

        if "code_snippet" in result:
            st.subheader("💻 Suggested Python Code:")
            st.write(result["code_snippet"])
Let’s see the result now.

Wow, the result is impressive!
- I asked: What is the best test to compare two groups means?
- Answer: To compare the means of two groups, the most appropriate test is typically the independent two-sample t-test if the groups are independent and the data is normally distributed. If the data is not normally distributed, a non-parametric test like the Mann-Whitney U test might be more suitable. If the groups are paired or related, a paired sample t-test would be appropriate.
Mission accomplished: the agent does exactly what we set out to build.
Try It Yourself
Do you want to give this agent a try?
Go ahead and test the deployed version now!
https://ai-statistical-advisor.streamlit.app
Before You Go
This is a long post, I know. But I hope it was worth reading to the end. We learned a lot about LangGraph, and it makes us think differently about creating AI agents.
The framework forces us to think about every step of the information flow, from the moment a question is sent to the LLM until the answer is displayed. Questions like these start to pop into your mind during the development process:
- What happens after the user asks the question?
- Does the agent need to verify something before moving on?
- Are there conditions to consider during the interaction?
This architecture becomes an advantage because it makes the whole process cleaner and more scalable, since adding a new feature can be as simple as adding a new function (node).
On the other hand, LangGraph is not as user-friendly as frameworks like Agno or CrewAI, which encapsulate many of these abstractions in simpler methods, making the process much easier to learn and develop, but also less flexible.
In the end, it is all a matter of what problem is being solved and how flexible you need it to be.
GitHub Repository
https://github.com/gurezende/AI-Statistical-Advisor
About Me
If you liked this content and want to learn more about my work, here is my website, where you can also find all my contacts.
References
[1] LangGraph Docs: https://langchain-ai.github.io/langgraph/concepts/why-langgraph/
[2] Scipy Stats: https://docs.scipy.org/doc/scipy/reference/stats.html
[3] Streamlit Docs: https://docs.streamlit.io/
[4] Statistical Advisor App: https://ai-statistical-advisor.streamlit.app/