
Basata scales healthcare voice agent testing 6x faster with Hamming
- Location
- United States
- Industry
- Healthcare AI
- Stage
- Private
Use Cases:
- Pre-deployment voice agent testing
- Automated test case generation via API
- Accessibility and persona testing
- Agentic prompt optimization
Meet Basata
Basata uses AI to eliminate administrative overhead for healthcare providers, automating patient communication so clinicians can focus entirely on care. Their voice AI agents handle scheduling, patient outreach, and practice communication across specialties like cardiology.
Healthcare practices are old institutions with older technologies and processes. Patients rely on the phone as their central way to interact with their practice. Basata's AI agents bridge that gap, but the stakes are high: in cardiology, patients are dealing with heart problems, and missed communications have real-world consequences.
With every new customer bringing unique configurations, conversation flows, and guardrails, Basata needed a way to validate agent behavior programmatically. Manual testing simply could not keep pace with the growing complexity.
With Hamming, Basata cut full test-suite runs from 90 minutes of manual phone calls to 15 minutes of automated, concurrent execution.
The Challenge: Manual Testing Doesn't Scale in High-Stakes Healthcare
Basata's voice AI agents handle patient communication for healthcare practices where urgency is the norm. A missed appointment in cardiology isn't an inconvenience; it means a patient isn't getting care. Every agent deployment requires rigorous validation to ensure it behaves correctly, follows guardrails, and doesn't make promises the practice can't keep.
But each healthcare practice requires deep customization. Older institutions with established processes need AI that fits their workflows, not the other way around. Every customer agent has its own conversation flows, scheduling rules, and edge cases. Testing every path manually meant making individual phone calls, one at a time, for an hour and a half per cycle. And with LLMs producing nondeterministic responses, the same test needed to run multiple times for confidence.
The core challenge in summary:
- ✕ 1.5 hours of manual phone calls per test cycle, growing with every new customer
- ✕ One person, one call at a time, no parallelism possible
- ✕ No way to simulate diverse patient accents, demographics, or speech patterns
- ✕ Socially exhausting, with cognitive energy spent on calling instead of building
- ✕ High-stakes healthcare context where failures mean patients don't get care
The Impact
| Metric | Result |
|---|---|
| Faster testing cycles vs manual calls | 6x |
| Reduction in manual testing time | 83% |
| Full test suite execution, down from 90 min | 15 min |
Solution: API-First Testing That Scales with Every Customer
Programmatic Test Case Generation via API
Basata's AI agents are highly configurable per healthcare practice, each with its own conversation flows, scheduling rules, and guardrails. Manually writing and maintaining test cases for every configuration was not sustainable.
With Hamming's API, the team defines agent behavior in YAML and programmatically converts it into test cases. This approach enables them to:
- Auto-generate test cases from configured agent functionality
- Manage state across multi-step conversations (e.g., verifying appointments were actually booked)
- Inject scenario-specific test data into each run
- Keep tests in sync with agent configuration changes
The programmatic layer eliminates the need to manually create or update test cases through a UI.
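A minimal sketch of what this YAML-to-test-case conversion could look like. The config fields, naming scheme, and payload shape below are illustrative assumptions, not Hamming's actual API schema; in practice the config would be loaded from the agent's YAML file (e.g. with `yaml.safe_load`) and the generated cases pushed through the Hamming API.

```python
# Hypothetical sketch: expanding a per-practice agent definition into test
# cases. All field names here are assumptions for illustration.

def generate_test_cases(agent_config: dict) -> list[dict]:
    """Expand each configured conversation flow into one test case per scenario."""
    cases = []
    for flow in agent_config["flows"]:
        for scenario in flow.get("scenarios", [{}]):
            cases.append({
                "name": f"{agent_config['practice']}::{flow['name']}"
                        f"::{scenario.get('name', 'default')}",
                "instructions": flow["goal"],
                # Scenario-specific data injected into each run
                "variables": {**agent_config.get("defaults", {}),
                              **scenario.get("data", {})},
                # Guardrails become assertions for the evaluator
                "evaluations": flow.get("guardrails", []),
            })
    return cases

# Stand-in for a parsed YAML agent definition
config = {
    "practice": "cardiology-clinic",
    "defaults": {"clinic_phone": "+1-555-0100"},
    "flows": [
        {
            "name": "book-appointment",
            "goal": "Caller books a follow-up appointment",
            "guardrails": ["Never promise same-day appointments"],
            "scenarios": [
                {"name": "existing-patient", "data": {"patient_id": "P123"}},
                {"name": "new-patient", "data": {}},
            ],
        }
    ],
}

cases = generate_test_cases(config)
print(len(cases))        # 2
print(cases[0]["name"])  # cardiology-clinic::book-appointment::existing-patient
```

Because the cases are derived from the same config that drives the agent, regenerating them after any configuration change keeps tests in sync automatically.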
Concurrent Test Execution Across All Customers
Every healthcare practice Basata serves has its own agent with unique conversation flows and conditionals. Manually testing every path required individual phone calls, one at a time, with a single person on each.
With Hamming, the entire test suite runs concurrently. Instead of spending 90 minutes on serial calls, the team fires off all tests at once and gets results in 15 minutes. This concurrent execution is critical because:
- Every agent has its own conversation flow with unique conditionals
- LLMs are stochastic, so tests must run multiple times for confidence
- The number of required tests multiplies with each new customer
What previously required an hour and a half of focused manual effort now completes in a fraction of the time, freeing the team to focus on building rather than testing.
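The arithmetic behind that speedup can be modeled in a few lines. This is not Hamming's implementation (the concurrency happens on their platform); it is just a sketch of why firing off every test at once collapses wall-clock time to roughly the duration of the slowest single call.

```python
import asyncio

async def run_test(name: str) -> str:
    # Stands in for a multi-minute phone call; each test "call" takes
    # the same fixed time regardless of how many run alongside it.
    await asyncio.sleep(0.01)
    return f"{name}: pass"

async def run_suite(names: list[str]) -> list[str]:
    # Fire off every test at once and collect results as they finish;
    # total time stays near one call's duration instead of the sum.
    return await asyncio.gather(*(run_test(n) for n in names))

results = asyncio.run(run_suite([f"test-{i}" for i in range(30)]))
print(len(results))  # 30
```

Serially, 30 calls at 3 minutes each is the 90-minute cycle described above; concurrently, the same suite is bounded by the longest individual call plus review time.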
Diverse Persona Testing for Patient Accessibility
Basata serves a diverse patient population, many of whom rely on the phone as their primary way to interact with their healthcare practice. The team has observed that regional differences in voice and communication style significantly affect how patients respond to the AI agent, with some combinations causing immediate hang-ups.
With Hamming's persona library, the team tests across demographics they simply cannot replicate manually. Switching between genders, accents, and speaking speeds doesn't scale with a human team, but it's built into the platform:
- Different accents and regional speech patterns
- Varying speaking speeds and communication styles
- Voice and gender combinations that trigger different patient responses
- Demographics representative of real patient populations
The team plans to take this further by A/B testing different voice and persona combinations in production, using Hamming's data to validate which configurations work best for different patient demographics and regions. That shifts them from reactive fixes to proactive optimization based on real data.
Agentic Feedback Loops for Prompt Optimization
One of the most sophisticated uses of Hamming at Basata goes beyond traditional testing. The team feeds Hamming's LLM judge evaluator descriptions back into a local agent that automatically tunes system prompts, model selection, and agent configuration.
This creates a closed-loop optimization cycle:
- Run test suite via Hamming API
- Collect evaluator descriptions and failure analysis
- Feed results to a local agent that updates the system prompt
- Re-run tests to validate improvements
For the team, manual prompt engineering felt less like engineering and more like trial and error. This closed-loop system replaces that guesswork with data-driven optimization, letting the agents improve themselves.
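The cycle above can be sketched as a bounded loop. Both helpers here are stand-ins: in Basata's setup the suite runs through the Hamming API and an LLM judge produces the evaluator notes, while a local agent rewrites the system prompt; this toy version fakes both so the control flow is visible.

```python
# Hedged sketch of the closed-loop prompt optimization cycle.
# `run_evals` and `revise_prompt` are stand-ins for the Hamming API
# call and the local prompt-rewriting agent, respectively.

def run_evals(prompt: str) -> list[dict]:
    # Toy judge: the check passes only once the prompt covers transfers
    ok = "transfer to staff" in prompt
    return [{"check": "escalates unclear requests", "passed": ok,
             "judge_notes": "Agent should offer a human transfer."}]

def revise_prompt(prompt: str, failures: list[dict]) -> str:
    # A local agent would rewrite the prompt from the judge's notes;
    # here we simply append a fix derived from the feedback.
    return prompt + " When unsure, offer to transfer to staff."

prompt = "You are a scheduling assistant for a cardiology practice."
for _ in range(3):  # bound the loop so it can't tune forever
    failures = [r for r in run_evals(prompt) if not r["passed"]]
    if not failures:
        break
    prompt = revise_prompt(prompt, failures)

print(all(r["passed"] for r in run_evals(prompt)))  # True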
Why Basata Chose Hamming
Basata evaluated their options and chose Hamming specifically for its programmatic, API-first approach. The ability to define, run, and manage tests entirely through code was the key differentiator that led them to switch to the platform.
The team values being realistic about what AI can and cannot do. They engineer their product around the ability to audit AI behavior, validate outputs, and integrate into existing healthcare workflows. Hamming fits that philosophy.
Basata chose Hamming because:
“I find talking to the agents completely socially exhausting. The platform offers all of these different personas that the team is not able to replicate. We want to make sure that our diverse customer base is represented in the agents themselves.”
Blake Jones, AI Engineer at Basata
The Results
Hamming has fundamentally changed how Basata validates their healthcare voice agents. What once required dedicated hours of exhausting manual calling now runs automatically, giving the team more time to focus on building better agents and serving more healthcare practices.
The impact grows with every new customer. Each new healthcare practice adds unique configurations that would have multiplied the manual testing burden. With Hamming, the testing infrastructure scales automatically.
6x Faster Testing Cycles
Testing time dropped from 90 minutes of manual phone calls to 15 minutes of automated concurrent execution. The team fires off all tests at once and reviews results, instead of spending an hour and a half on individual calls. This time savings compounds with every new customer deployment.
Tighter Development Feedback Loops
Before Hamming, each iteration cycle during development required another round of manual phone calls. The cognitive and social exhaustion of talking to AI agents drained energy from actual engineering work. Now the team can iterate rapidly: make a change, run the suite, review results, and repeat, all without picking up a phone.
Automated Prompt Optimization
The team built an agentic feedback loop that takes Hamming's evaluator descriptions and automatically tunes agent configuration. This closed-loop system replaces manual prompt engineering with data-driven optimization, improving agent quality while reducing human effort to near zero.
“It's been such a relief to be able to start shifting a lot of this stuff off our plates so we can go focus on more of the interesting problems.”
Blake Jones, AI Engineer at Basata