
Talk to my Agent

Over the past several months, I’ve had the opportunity to immerse myself in the task of adapting APIs and backend systems for consumption by LLMs, specifically agents using the MCP protocol. Initially, I expected the experience to be no different from other similar development projects I’ve done in the past. I was fascinated to discover, however, that these autonomous clients are a new type of creature. Consequently, evolving APIs to yield the most value from agent interaction required more than simply making them accessible.

This post is the result of my experimentation and field testing; hopefully, it can be useful to other practitioners.

The power and curse of autonomy 

Image generated by author (Midjourney)

We developers are used to third-party tools and automation processes interacting with our application APIs. Our interfaces have therefore evolved around best practices that support these use cases: transactional, versioned, contract-driven APIs, designed to enforce forward/backward compatibility and built for efficiency. These are all important concerns, yet they become secondary in priority, and often simply irrelevant, when considering the autonomous user.

With agents as clients, there is no need to worry about backward/forward compatibility as each session is stateless and unique. The model will study how to use tools each time it discovers them, arriving at the right combination of API calls to achieve its objective. As enthusiastic as this agent may be, however, it will also give up after a few failed attempts unless given proper incentive and guidelines. 

More importantly, without such clues it could succeed in the API call yet still fail to meet its objectives. Unlike scripted automations or experienced developers, it only has the API documentation and responses to go on when planning how to meet its goals. The dynamic nature of its behavior is both a blessing and a curse, as these two sources are also the sum of the knowledge it can draw upon to be effective.

Conversation-Driven APIs

I first realized that the agent would require a different type of design while troubleshooting some cases in which it was not able to get to the desired results. I had provided MCP tool access to an API that returns usage information for any code function based on tracing data, and at times the agent seemed to simply not be using it correctly. Looking more closely at the interaction, it turned out the model was calling the tool correctly but, for various reasons, received an empty array as a response. This behavior would be 100% correct for any similar operation in our API.

The agent, however, had trouble comprehending why this was happening. After trying a few simple variations, it gave up and decided to move on to other avenues of exploration. To me, that interaction spelled out a missed opportunity. No one was at fault; transactionally, the behavior was correct. All of the relevant tests would pass, but when we measured the effectiveness of using this API, the ‘success rate’ turned out to be ridiculously low.

The solution turned out to be a simple one: instead of returning an empty response, I decided to provide a more detailed set of instructions and ideas:

var emptyResult = new NoDataFoundResponse()
{
    Message = @"There was no info found based on the criteria sent.
        This could mean that the code is not called, or that it is not manually instrumented 
        using OTEL annotations.",
    SuggestedNextSteps = @"Suggested steps: 
    1. Search for endpoints (http, consumers, jobs etc.) that use this function. 
       Endpoints are usually automatically instrumented with OTEL spans by the 
       libraries using them.
    2. Try calling this tool using the method and class of the endpoint 
       itself or use the GetTraceForEndpoint tool with the endpoint route. 
    3. Suggest manual instrumentation for the specific method depending on the language used in the project
       and the current style of instrumentation used (annotations, code etc.)"
};
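For context, here is roughly where such a response might sit in the tool handler. This is only a sketch; the handler and helper names (GetFunctionUsages, _tracingStore, BuildNoDataFoundResponse) are illustrative, not the actual implementation:

// Sketch only: when no tracing data matches, answer with guidance instead of an empty array.
public object GetFunctionUsages(string className, string methodName)
{
    var usages = _tracingStore.FindUsages(className, methodName); // hypothetical data access

    if (usages.Count == 0)
    {
        // Return the NoDataFoundResponse shown above instead of []
        // so the agent has something to reason about and act upon.
        return BuildNoDataFoundResponse();
    }

    return usages;
}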

Instead of just returning the results to the agent, I was trying to do something agents will often attempt as well: keep the conversation going. My perception of API responses, therefore, changed. When consumed by LLMs, beyond serving functional purposes, they are, in essence, a reverse prompt. An ended interaction is a dead end; any data we return to the agent, however, gives it a chance to pull on another thread in its investigative process.

HATEOAS, the ‘choose your own adventure’ APIs

Image generated by author (Midjourney)

Thinking about the philosophy of this approach, I realized there was something vaguely familiar about it. A long time ago, when I was taking my first steps crafting modern REST APIs, I was introduced to the concept of hypermedia APIs and HATEOAS: Hypermedia as the Engine of Application State. The concept was outlined by Fielding in his seminal 2008 blog post REST APIs must be hypertext-driven. One sentence in that post completely blew my mind at the time:

“Application state transitions must be driven by client selection of server-provided choices that are present in the received representations”

In other words, the server can teach the client what to do next instead of simply sending back the requested data. The canonical example is a simple GET request for a specific resource, where the response also provides information on the actions the client can take next on that resource: a self-documenting API in which the client is not required to know anything ahead of time except a single entry point, from which a branch of choices emerges. Here is a good example from the Wikipedia page:

HTTP/1.1 200 OK

{
    "account": {
        "account_number": 12345,
        "balance": {
            "currency": "usd",
            "value": 100.00
        },
        "links": {
            "deposits": "/accounts/12345/deposits",
            "withdrawals": "/accounts/12345/withdrawals",
            "transfers": "/accounts/12345/transfers",
            "close-requests": "/accounts/12345/close-requests"
        }
    }
}
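To make the idea concrete, a hypermedia-aware client needs no precompiled knowledge of these URLs; it simply reads them out of the representation it just received. A minimal C# sketch of that pattern (the base address and link-following logic here are illustrative, not taken from any real system):

using System.Net.Http.Json;
using System.Text.Json.Nodes;

var httpClient = new HttpClient { BaseAddress = new Uri("https://bank.example.com") };

// Single known entry point; everything else is discovered from the response.
var root = await httpClient.GetFromJsonAsync<JsonObject>("/accounts/12345");
var links = root?["account"]?["links"]?.AsObject();

// Follow a server-provided choice instead of a hard-coded URL.
if (links != null && links.ContainsKey("deposits"))
{
    var deposits = await httpClient.GetAsync(links["deposits"]!.GetValue<string>());
    // ...inspect the deposits representation for the next set of choices.
}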

At the time, I was fascinated by this concept, which reminded me of what is commonly referred to as ‘choose your own adventure’ books, or ‘gamebooks’. This genre, an inseparable part of my childhood, did not simply relay a story (or provide an API response, by this metaphor), but also gave the reader a key to the next set of options available to them. Hypermedia REST APIs were self-documenting and offered users a way to understand the state of the application and the operations available for each entity or process resource without having to read through extensive documentation.

Hypermedia on steroids

One way to look at hypermedia APIs is that they provide more context to the user as a part of the response. Context, to agents, is everything, and it certainly does not need to stop at available options or operations. An API is an interaction point where context can be relayed in a form that encourages further interaction. Let’s take a look at another example!

Another tool I was working on allows the model to retrieve runtime issues found in the deployment environment, again based on observability data. The specific result I was testing the prompt response for was an anomaly found in the performance of a specific endpoint: at times, responses were EXTREMELY slow, around 70X slower than the median. Providing that piece of information to the LLM was helpful, but ultimately did not accomplish more than a simple repetition of the data provided.

For reference, here is the response provided, as well as the agent output:

{
      "Name": "Performance Anomaly",
      "Category": "Performance",
      "P50":
      {
          "Value": 12.33,
          "Unit": "ms",
          "Raw": 12331700.0
      },
      "P95":
      {
          "Value": 909.62,
          "Unit": "ms",
          "Raw": 909625000.0
      },
      "SlowerByPercentage": 7376.314701136097,

      "SpanInfo":
      {
          ....
      },
      
      #more data  
      ....

        
}
Image by author

There is nothing functionally wrong with the API response or the way the information was mediated to the user by the agent. The only problem is that a lot of context and ideas are missing from it that could leverage the agent’s ability to take the conversation forward. In other words, this is a traditional API request/response interaction, but agents, through reasoning, are capable of so much more. Let’s see what happens if we modify our API to inject additional state and suggestions to try to carry the conversation forward:

{

  "_recommendation": 
      "This asset's P95 (slowest 5%) duration is disproportionately slow 
       compared to the median, to an excessive degree.
       Here are some suggested investigative next steps to get to the 
       root cause or correct the issue: 
       1. The issue includes example traces for both the P95 and median 
          duration; get both traces and compare them to find out which asset 
          or assets are the ones that are abnormally slow sometimes
       2. Check the performance graphs for this asset's P95 and see if there 
          has been a change recently; if so, check for pull requests 
          merged around that time that may be relevant to this area 
       3. Check for further clues in the slow traces, for example maybe 
          ALL spans of the same type are slow at that time period, indicating
          a systematic issue",

    "Name": "Performance Anomaly",
    "Category": "Performance",
    "P50":
    {
        ...
    },
      #more data
All we’ve done is give the AI model a little more to go on. Instead of simply returning the result, we feed the model ideas on how to use the information provided to it. Sure enough, these suggestions are immediately put to use. This time, the agent continues to investigate the problem by calling other tools to inspect the behavior, compare the traces and understand the problem lineage:

Image by author

With the new information in place, the agent is happy to continue the exploration, examine the timeline, and synthesize the results from the various tools until it comes up with new insights that were in no way part of the original response scope:

Image by author

Wait… Shouldn’t all APIs be designed like that?

Absolutely! I definitely believe that this approach could benefit users, automation developers, and everyone else, even if they use brains for reasoning rather than LLM models. In essence, a conversation-driven API expands the context beyond the realm of data and into the realm of possibilities, opening up more branches of exploration for agents and users alike and improving the effectiveness of APIs in solving the underlying use case.

There is definitely more room for evolution. For example, the hints and ideas provided to the client by the API in our example were static; what if they were AI-generated as well? There are many different A2A models out there, but at some point, it could just be a backend system and a client brainstorming about what the data means and what could be done to understand it better. As for the user? Forget about him; talk to his agent.
