Try our Voice Characters to automate voice agent testing.

Launch trustworthy AI voice apps in weeks

Prompt optimization, automated voice testing, monitoring and more.

Test your AI voice agent against 1000s of simulated users in minutes.

Backed by Y Combinator (S24). Featured on Product Hunt.

Experimentation platform for AI voice agents

AI voice agents are hard to get right. A small change in prompts, function call definitions or model providers can cause large changes in LLM outputs.

We're the only end-to-end platform that supports you from development to production.

Our voice agents call your voice agents

Teams currently spend hours testing their voice agents by hand. Use our voice characters (see demo) to place 1000s of concurrent phone calls to your voice agents and surface bugs.

This is 1000x more efficient than testing your voice agents by hand.
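The idea of fanning out many simulated callers at once can be sketched generically. This is a minimal illustration, not Hamming's actual API: `simulate_call` is a hypothetical stand-in for whatever places one scripted test call against the agent under test.

```python
# Hypothetical sketch (not Hamming's real SDK): driving many simulated
# callers concurrently with asyncio, with a semaphore capping how many
# calls are in flight at once.
import asyncio

async def simulate_call(character: str) -> dict:
    # A real harness would dial the agent and play out a persona script;
    # here we return a stub result immediately.
    await asyncio.sleep(0)  # yield control, as real network I/O would
    return {"character": character, "passed": True}

async def run_suite(characters: list[str], concurrency: int = 100) -> list[dict]:
    sem = asyncio.Semaphore(concurrency)  # cap simultaneous calls

    async def bounded(c: str) -> dict:
        async with sem:
            return await simulate_call(c)

    return await asyncio.gather(*(bounded(c) for c in characters))

results = asyncio.run(run_suite([f"caller-{i}" for i in range(1000)]))
print(sum(r["passed"] for r in results), "of", len(results), "calls passed")
```

The semaphore is the key design choice: it lets the suite scale to thousands of scenarios while keeping concurrent load on the agent bounded.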

Featured in our YC launch here.

Prompt Management

For B2B teams, each of your customers requires a slightly different prompt.

Store, manage, and version your prompts in Hamming, and keep them synced with your voice infra providers.


Prompt Optimizer & Playground

Writing prompts by hand is slow and tedious. Use our prompt optimizer (free to try) to automatically generate optimized prompts for your LLM.

Use our prompt playground to test LLM outputs on a dataset of inputs. Our LLM judges the quality of generated outputs. Save 80% of manual prompt engineering effort.


Active monitoring

Go beyond passive monitoring. We actively track and score how users interact with your AI app in production, and use LLM judges to flag cases that need your attention.

Easily convert calls and traces into test cases and add them to your golden dataset.


Trusted by AI-forward enterprises

Shelden Shi

Co-Founder & CTO @ Lilac Labs

We automate the person taking orders at the drive-thru with voice. Order accuracy is extremely critical for our customers. Getting an order wrong (i.e., missing allergies) means significant financial loss.

Hamming helps us simulate 1000s of customer calls (dietary restrictions, allergies, large group orders, etc.) and find gaps in our system in minutes. This gives us a huge peace of mind and clarity on where we need to improve.

Hamming is an essential part of our infra that directly unlocks revenue.

Yossi Eliyahu
VP of Engineering @ Fora
There are a lot of low quality AI apps out there. We care a lot about quality. Hamming helps us launch accurate, robust and resilient AI apps that our users love.
Chris Chen
PM @ Fora
Hamming allows me to test new changes to my AI pipeline 100x faster than vibe checking.
Mark Wai
Co-Founder & CTO @ Inkly
At Inkly, we're building the modern legal experience for startups using GenAI. Being able to test our system against a dataset of test cases gives us a huge peace of mind and clarity on where we need to improve.
Conner Swann
Co-Founder @ Intuitive Systems
The team is tackling a huge pain point for me - running evaluations continuously while I'm fine-tuning custom models.

Built for inbound & outbound agents

We're experts in supporting companies tackling high-stakes domains where making mistakes leads to high churn or regulatory consequences.

Our agents call your agent

Built for teams

Our platform scales with you

Building AI agents is a team effort. Hamming is built to support cross-team collaboration.

Mark
ML Engineer
I love being able to press a button and run 1000s of parallel test calls to my voice agent in minutes. I get detailed reports on where I need to improve, and I can make changes to my prompts and see how my agent performs in an extremely tight feedback loop.
Julia
Data Scientist
I love being able to understand the reasoning behind why the AI judge picked a specific score.
Victor
Product Engineer
This is like Optimizely for building AI products.
Sarah
DevOps Engineer
We catch regressions before they reach users.

Create scenarios

Create scenarios and characters for your agents that test the entire conversational space.

Experiment Tracking

For each experiment, track your hypothesis, proposed changes and learnings.

Manual override

Override AI scores. Every override aligns the AI judge with your preferences.

Powerful Search

Search across all traces and quickly root-cause why your AI system produced a particular answer.

Sampling

AI pipelines are non-deterministic. Run the same experiment multiple times to visualize performance distributions and isolate flaky tests.
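Why repeated runs matter can be shown with a tiny generic sketch. This is an illustration only, not Hamming's implementation: `run_experiment` is a hypothetical stand-in that returns a noisy score, as a non-deterministic pipeline would.

```python
# Hypothetical sketch: repeating a non-deterministic evaluation and
# summarizing the score distribution instead of trusting one run.
import random
import statistics

def run_experiment(seed: int) -> float:
    # Stub: a real run would execute the full pipeline and score it.
    random.seed(seed)
    return 0.85 + random.uniform(-0.05, 0.05)  # noisy score around 0.85

scores = [run_experiment(seed) for seed in range(10)]
print(f"mean={statistics.mean(scores):.3f} stdev={statistics.stdev(scores):.3f}")
# A test case whose score swings widely across runs is flaky, not failing:
# the spread (stdev), not a single score, tells you which it is.
```

Reporting a mean with a spread is what lets you tell a real regression apart from run-to-run noise.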

Sharing

Share datasets, experiment results and traces with teammates.

FAQs

Use any LLM, agent, and voice infra provider

We provide platform-agnostic hooks to simulate conversations, evaluate and log your traces.

Anthropic
OpenAI
Google
Bland
Vocode
Retell
Vapi
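The general shape of a provider-agnostic hook can be sketched as follows. This is a hypothetical illustration, not Hamming's real SDK: the point is that the harness only needs a callable per turn, so any provider's client can be wrapped the same way.

```python
# Hypothetical sketch of a provider-agnostic tracing hook: wrap any
# provider's generate() so every exchange is logged, regardless of
# which LLM or voice vendor sits underneath.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Trace:
    turns: list[dict] = field(default_factory=list)

    def log(self, role: str, content: str) -> None:
        self.turns.append({"role": role, "content": content})

def wrap_provider(generate: Callable[[str], str], trace: Trace) -> Callable[[str], str]:
    # Returns a drop-in replacement for generate() that also records
    # both sides of each turn into the trace.
    def traced(prompt: str) -> str:
        trace.log("user", prompt)
        reply = generate(prompt)
        trace.log("assistant", reply)
        return reply
    return traced

trace = Trace()
echo = wrap_provider(lambda p: f"echo: {p}", trace)  # stand-in provider
echo("hello")
print(len(trace.turns))  # one user turn + one assistant turn logged
```

Because the wrapper depends only on the `generate` callable's signature, swapping providers never touches the evaluation or logging code.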

Ship reliable AI voice agents with confidence

We've built mission-critical data products at
  • Tesla
  • Microsoft
  • Anduril
  • Square
  • Citizen