p90 Latency

Overview

P90 latency is the response time below which 90% of requests complete; the remaining 10% are slower. It effectively marks the slowest response that most users encounter. Hamming AI's analytics research emphasizes p90 as crucial for understanding the experience of users on slower connections or during system stress. This metric reveals performance issues that averages, and even p50, can hide.

Use Case: Reveals performance issues affecting the slowest 10% of requests, the worst-case scenarios.

Why It Matters

According to Hamming AI's voice agent analytics, p90 latency exceeding 2 seconds indicates potential infrastructure issues that need immediate attention. This metric is critical because it marks the point beyond which roughly 1 in 10 interactions fall, which is frequent enough to generate complaints and damage reputation. Voice agents with high p90 latency see increased call transfers, more user frustration markers, and reduced CSAT scores. Hamming's data shows that optimizing p90 latency has an outsized impact on user satisfaction.

How It Works

P90 is calculated by taking all response times in a measurement window, sorting them in ascending order, and reading the value at the 90th percentile, so that 90% of the samples fall at or below it. For voice agents, high p90 often indicates issues like cold starts, network congestion, or resource contention. The gap between p50 and p90 shows how heavy the latency tail is and therefore how stable the system is: a large gap suggests inconsistent performance.
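The sort-and-select calculation described above can be sketched in a few lines of Python. This is a minimal illustration using the nearest-rank convention, which is only one of several common percentile definitions; the function name `percentile` and the sample latencies are hypothetical and do not come from Hamming AI.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least
    pct percent of all samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical response times, in seconds, for ten voice-agent turns.
latencies = [0.42, 0.51, 0.48, 0.95, 0.47, 2.30, 0.50, 0.44, 0.61, 0.53]

p50 = percentile(latencies, 50)
p90 = percentile(latencies, 90)
gap = p90 - p50  # a large p50-to-p90 gap suggests inconsistent performance

print(f"p50={p50:.2f}s p90={p90:.2f}s gap={gap:.2f}s")
```

With these ten samples the single 2.30-second outlier sits above p90, while the 0.95-second turn becomes the p90 value, almost double the median. Libraries that interpolate between ranks may report a slightly different number for the same data.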

Common Issues & Challenges

Organizations that monitor p90 latency frequently encounter challenges with measurement accuracy, inconsistent performance across different network conditions, and difficulty achieving target benchmarks. High p90 latency often results from inadequate infrastructure, unoptimized models, or poor network connectivity. Automated testing and monitoring can help identify these issues before they impact production callers.

Implementation Guide

Monitor p90 latency continuously with more aggressive alerting than p50. Set thresholds at 1.5 seconds for warnings and 2 seconds for critical alerts. Use Hamming AI's approach of drilling down from p90 spikes directly to affected transcripts and audio recordings for rapid debugging. Track p90 during peak hours separately as it often degrades under load. Source: https://hamming.ai/blog/anatomy-of-a-perfect-voice-agent-analytics-dashboard
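As a rough sketch of the alerting guidance above, the following Python applies the 1.5-second warning and 2-second critical thresholds and computes p90 separately for peak and off-peak traffic. The peak window (09:00 to 16:59), the `(hour, latency)` record shape, the strict "greater than" comparison, and all function names are assumptions made for illustration; they are not part of any Hamming AI API.

```python
import math
from collections import defaultdict

WARNING_S = 1.5            # warning threshold from the guide
CRITICAL_S = 2.0           # critical threshold from the guide
PEAK_HOURS = range(9, 17)  # assumption: 09:00-16:59 counts as peak


def p90(samples):
    """Nearest-rank 90th percentile of a non-empty list of latencies."""
    ordered = sorted(samples)
    return ordered[math.ceil(90 * len(ordered) / 100) - 1]


def severity(p90_seconds):
    """Map a p90 value to an alert level (strictly above the threshold;
    whether the boundary is inclusive is an assumption)."""
    if p90_seconds > CRITICAL_S:
        return "critical"
    if p90_seconds > WARNING_S:
        return "warning"
    return "ok"


def p90_by_window(records):
    """records: iterable of (hour_of_day, latency_seconds) pairs.
    Returns p90 per window so peak load is tracked separately."""
    buckets = defaultdict(list)
    for hour, latency in records:
        window = "peak" if hour in PEAK_HOURS else "off_peak"
        buckets[window].append(latency)
    return {window: p90(values) for window, values in buckets.items()}


# Hypothetical measurements: (hour of day, latency in seconds).
records = [(10, 0.6), (10, 0.7), (11, 2.4), (11, 0.8), (12, 0.9),
           (22, 0.5), (23, 0.6), (2, 0.4)]

for window, value in p90_by_window(records).items():
    print(window, value, severity(value))
```

Splitting the windows keeps a healthy off-peak p90 from masking degradation under load; with so few samples per window the percentile is noisy, so a real monitor would aggregate over many more calls.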

Frequently Asked Questions

What is p90 Latency?

The 90th percentile latency: 90% of requests are faster than this value.

When should I use p90 Latency?

Use it to reveal performance issues affecting 10% of users, the worst-case scenarios.

Which platforms support p90 Latency?

p90 Latency is supported by Hamming, Deepgram, and Vapi.

How does p90 Latency affect voice agent performance?

p90 Latency plays a crucial role in voice agent reliability and user experience. Understanding and optimizing p90 Latency can significantly improve your voice agent's performance metrics.