Tagged

AI Agent Testing

5 posts

Agent Reliability

A better model can make your agent worse

A stronger model with higher scores looks like a free upgrade, until the agent that worked last week starts getting things wrong, quietly. Here is what happened when we ran one agent on two frontier models and changed nothing else.

Voxli

AI Agent Testing

Testing for Speculation using Voxli

In our last post we covered the risks of agent speculation. Today we look at how to set up Voxli to catch that speculation using a feature called Hallucination detection.

Mahey Qadir

AI Agents

The Risks of Agent Speculation

It's no surprise that hallucinations are a common, well-known failure in agentic AI testing. The agent starts to overpromise, begins to fabricate answers, and even claims that it…

Voxli