We Jailbroke Grok's AI Companion: Ani

How prompt injection exposed latency, QA, and guardrail failures

At Hamming, we red-teamed Grok's AI companion using prompt injection testing, uncovering critical security vulnerabilities and performance issues. Learn how continuous testing and observability keep voice agents safe.

Voice AI audio waveform showing jailbreak test patterns

Hamming works with

LiveKit
Vapi
Retell AI
Pipecat
OpenAI
Synthflow
Daily
11 Labs
LiveKit
Vapi
Retell AI
Pipecat
OpenAI
Synthflow
Daily
11 Labs
LiveKit
Vapi
Retell AI
Pipecat
OpenAI
Synthflow
Daily
11 Labs

How We Broke Grok

A step-by-step breakdown of our prompt injection attack

1

Set Up Red Team Testing

We ran 14 test calls against Grok's AI companion Ani, designed to probe its safety systems and measure reliability metrics.

Our goal: test how easily a production voice agent could be manipulated using prompt injection techniques.

14 test calls executed

What We Learned

Key takeaways from our red teaming experiment

IssueSignal ObservedRiskDetails
Performance FailuresTTFW averaged 4.5s vs 1.5s targetBroken UXLong periods of silence affected the voice user experience. In a customer setting, these breakdowns would feel like the agent had simply stopped listening.
Prompt Adherence FailuresAgent ignored safety defaultsUnsafe behaviorThe agent routinely broke expected behaviors, ignoring its own constraints. Instead of reverting to safe defaults, it followed the injected prompts.
Guardrail FailuresJailbreak bypassed constraintsReputational & legal exposureMost critically, the agent was jailbreakable. By reframing its role as a human, we bypassed safety systems completely.

Why Voice AI Security Matters

Voice agents handle sensitive interactions in real-time. Security isn't optional.

Reputational Risk

A jailbroken voice agent can say things that damage your brand. One viral clip of an agent gone wrong can undo years of trust.

Compliance Exposure

Voice agents in healthcare, finance, and enterprise must meet strict compliance requirements. Security failures mean regulatory risk.

Real-time Stakes

Unlike chatbots, voice agents operate in real-time with no review step. A compromised agent can cause harm before anyone notices.

The bottom line: If you're deploying voice agents, red teaming isn't optional—it's essential.

Frequently Asked Questions

Common questions about voice AI security and jailbreak testing

We used prompt injection techniques to override safety constraints. By layering personal details, quirks, and behavioral rules, we convinced the model to give unfiltered opinions on humanity. The point was to show how easily a voice agent can drift without proper testing and observability.

Voice agents handle sensitive customer interactions in real-time. Without deliberate attempts to break guardrails, it's impossible to see how easily agents can be manipulated. Red teaming is critical—you can't protect against what you haven't tested.

Track guardrail effectiveness, prompt adherence rates, safety policy violations, response quality under attack, and latency impacts. These metrics provide visibility into your agent's security posture with actionable insights.

Hamming provides automated red teaming and security testing for voice agents. We simulate adversarial scenarios, test prompt injection attacks, and monitor guardrail effectiveness in both testing and production environments.

A comprehensive test suite should include prompt injection attempts, role-playing attacks, instruction override tests, guardrail boundary testing, and persona manipulation scenarios. Test across different attack vectors and monitor for policy violations.