Test LiveKit Agents

Run LiveKit-to-LiveKit WebRTC tests with no phone numbers required. Auto-provision rooms, auto-generate test scenarios from your prompt, and replay sessions with full transcripts and 50+ quality metrics.

Hamming works with

LiveKit
Vapi
Retell AI
Pipecat
OpenAI
Synthflow
Daily
ElevenLabs

Time to value

First test report in under 10 minutes

Connect your provider, sync your agents, and validate real calls in one workflow.

1. Connect LiveKit

Add your LiveKit API key, secret, and server URL.

2. Configure rooms

Choose auto-provisioned or customer-controlled rooms.

3. Run a WebRTC test

Trigger a test room and replay the session after the run.
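The API key and secret from step 1 are what sign the access tokens used to join test rooms. As a minimal stdlib-only sketch of how such a token is minted (the claim names follow LiveKit's documented JWT format; in production the livekit-server-sdk handles this for you):

```python
import base64
import hashlib
import hmac
import json
import time

def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the JWT spec requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def mint_livekit_token(api_key, api_secret, identity, room, ttl=600):
    """Mint an HS256 JWT granting join access to a single room.

    LiveKit access tokens are standard JWTs signed with the API secret:
    `iss` carries the API key and the `video` claim carries the grant.
    """
    header = {"alg": "HS256", "typ": "JWT"}
    now = int(time.time())
    claims = {
        "iss": api_key,
        "sub": identity,
        "nbf": now,
        "exp": now + ttl,
        "video": {"roomJoin": True, "room": room},
    }
    signing_input = (
        b64url(json.dumps(header).encode())
        + "."
        + b64url(json.dumps(claims).encode())
    )
    sig = hmac.new(api_secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + b64url(sig)

# "APIxxxxxxx" / "secret" are placeholder credentials.
token = mint_livekit_token("APIxxxxxxx", "secret", "test-caller", "test-room")
print(len(token.split(".")))  # → 3
```

The same secret that signs this token is what Hamming stores when you connect the provider, which is why no SIP or telephony credentials are involved.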

What you need

  • LiveKit API key and secret (LiveKit Cloud > Settings > Keys).
  • Room creation plan (auto-provisioned or customer-controlled).
  • Optional: enable recording if you want MP4 archives.

Connect in minutes

  1. Go to Agents > Providers > Connect LiveKit.
  2. Enter API key, secret, and server URL.
  3. Select auto or customer-controlled room provisioning.
  4. Run a test room and confirm transcripts populate.
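With customer-controlled provisioning, your side creates the room. LiveKit's server API is Twirp over HTTP, one POST per RPC method; here is a sketch that builds (but does not send) the CreateRoom call using only the standard library. The JWT is assumed to come from your own token-minting code or the livekit-server-sdk, and must carry the `roomCreate` grant.

```python
import json
import urllib.request

def build_create_room_request(server_url, jwt, room_name):
    """Build the Twirp CreateRoom request for LiveKit's server API.

    Returns an unsent urllib Request so the shape of the call is easy
    to inspect; pass it to urllib.request.urlopen() to execute it.
    """
    url = server_url.rstrip("/") + "/twirp/livekit.RoomService/CreateRoom"
    body = json.dumps({"name": room_name}).encode()
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": "Bearer " + jwt,
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Placeholder host and token; use your own server URL and a signed JWT.
req = build_create_room_request(
    "https://example.livekit.cloud", "JWT_HERE", "hamming-test-room"
)
print(req.full_url)
```

Auto-provisioned rooms skip this step entirely: Hamming makes the equivalent call on your behalf.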

Validation checklist

Confirm the integration is working before scaling your tests.

  • Provider shows Connected in Agents > Providers.
  • Agents appear in Agents > List with the provider badge.
  • A test run produces transcripts and audio in the run summary.
  • Test room appears and session replay works after the run.

Provider-specific capabilities

Built for LiveKit teams

Provider-aware testing and monitoring without changing your stack.

WebRTC-only testing

Test LiveKit agents with no phone numbers or SIP setup.

Flexible room provisioning

Auto-create rooms or use your own webhook workflow.

Replayable sessions

Review LiveKit sessions with full audio and transcripts.

50+ quality metrics

What we measure

Comprehensive evaluation across accuracy, conversation quality, voice performance, and task completion.

Accuracy & Correctness

  • Factual accuracy
  • Intent recognition
  • Response relevance
  • Hallucination detection

Conversation Quality

  • Turn-taking flow
  • Interruption handling
  • Context retention
  • Conversation completion

Voice & Audio

  • Latency (time to first word)
  • Speech clarity
  • Background noise handling
  • Accent robustness
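Time to first word is derived from session timestamps: the gap between the caller finishing and the agent starting to speak. A sketch using a hypothetical event-record shape (`user_speech_end` and `agent_speech_start` are illustrative names, not a LiveKit or Hamming API):

```python
def time_to_first_word(events):
    """Seconds from end of caller speech to the agent's first spoken word.

    `events` is a list of {"type", "t"} records in session order, such as
    a test harness might emit; returns None if either boundary is missing.
    """
    user_end = agent_start = None
    for e in events:
        if e["type"] == "user_speech_end" and user_end is None:
            user_end = e["t"]
        elif e["type"] == "agent_speech_start" and user_end is not None:
            agent_start = e["t"]
            break
    if user_end is None or agent_start is None:
        return None
    return round(agent_start - user_end, 3)

events = [
    {"type": "user_speech_end", "t": 3.20},
    {"type": "agent_speech_start", "t": 4.05},
]
print(time_to_first_word(events))  # → 0.85
```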

Task Completion

  • Tool call success rate
  • API integration reliability
  • Goal completion rate
  • Error recovery
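Tool call success rate is the simplest of these: the fraction of tool invocations that returned without error. A sketch over an illustrative `{"name", "ok"}` record shape (real runs would pull tool call records from the provider's session data):

```python
def tool_call_success_rate(calls):
    """Fraction of tool invocations that completed without error."""
    if not calls:
        return 0.0
    return sum(1 for c in calls if c["ok"]) / len(calls)

calls = [
    {"name": "lookup_order", "ok": True},
    {"name": "lookup_order", "ok": True},
    {"name": "send_sms", "ok": False},
    {"name": "send_sms", "ok": True},
]
print(tool_call_success_rate(calls))  # → 0.75
```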

Independent evaluation

Why vendor-neutral testing?

Get unbiased results with consistent metrics across all providers—not self-reported scores from your vendor.

Comparison between provider built-in testing and Hamming
| Aspect | Provider built-in testing | Hamming |
| --- | --- | --- |
| Objectivity | Optimized for their platform | Vendor-neutral evaluation |
| Consistency | Metrics vary by provider | Same 50+ metrics across all providers |
| Cross-vendor comparison | Can't compare across vendors | A/B test agents across any provider |
| Independence | Self-reported results | Third-party validation |
| Compliance | Limited audit trail | SOC 2 certified, audit-ready reports |
| Scale | Playground-level testing | 1000+ concurrent production tests |

What you get with Hamming

  • Auto-generate test cases and assertions from your prompt.
  • Pull tool call data, transcripts, and recordings directly from your provider.
  • Run your first test in under 10 minutes with 50+ built-in quality metrics.
  • Test both voice and chat agents with unified evaluation.

Frequently Asked Questions

Everything you need to know about testing LiveKit agents with Hamming.

How does Hamming test LiveKit agents?

Connect your LiveKit API credentials to Hamming, configure room provisioning (auto or customer-controlled), and run automated tests. Hamming simulates real conversations via WebRTC and evaluates agent responses with 50+ quality metrics.

Do I need phone numbers to test LiveKit agents?

No. Hamming tests LiveKit agents via native WebRTC, eliminating the need for phone numbers or SIP infrastructure. Tests run directly through LiveKit rooms.

What metrics does Hamming evaluate?

Hamming evaluates 50+ metrics including response accuracy, latency, conversation flow, intent recognition, and custom assertions. All metrics are audio-native with 95-96% agreement with human evaluators.

Can I replay and debug test sessions?

Yes. Every test run captures full transcripts, audio recordings, and tool call data. You can replay sessions and debug issues with complete conversation context.

How long does setup take?

Most teams run their first test within 10-15 minutes. Connect your LiveKit API key and secret, configure room settings, and Hamming auto-generates test scenarios from your agent's prompt.

How does Hamming test interruptions and barge-in?

Hamming simulates real user behavior including interruptions, overlapping speech, and mid-sentence changes. Tests validate that agents handle barge-in gracefully without losing conversation context.