Poor Quality Metrics
Diagnose and resolve low MOS scores, high jitter, packet loss, high RTT, video freezes, and other quality issues in CallMeter test results.
When tests complete successfully but the metrics indicate poor call quality, this guide helps you identify the root cause and take corrective action. Poor quality metrics mean the SIP infrastructure is functional (calls connect and media flows) but the user experience would be unacceptable.
Start by identifying which metric is degraded, then follow the targeted diagnosis section for that metric.
Quick Diagnosis Matrix
| Symptom | Primary Metric | Most Likely Cause | Section |
|---|---|---|---|
| Calls sound choppy or robotic | Packet loss above 3% | Network congestion, firewall drops | High Packet Loss |
| Calls have echo or delay | RTT above 200ms | Geographic distance, routing inefficiency | High Round-Trip Time |
| Audio sounds jittery or stuttering | Jitter above 40ms | Network congestion, unstable path | High Jitter |
| Overall bad quality score | MOS below 3.5 | Combination of above factors | Low MOS Score |
| Video freezes or pixelation | Freeze count, PLI count | Bandwidth limitation, packet loss | Video Quality Issues |
| Audio drops or gaps | PLC events high | Burst packet loss | Audio-Specific Issues |
Low MOS Score
MOS (Mean Opinion Score) is a composite quality metric derived from jitter, packet loss, RTT, and codec characteristics. A low MOS score indicates that one or more underlying metrics are degraded.
Interpreting MOS Values
| MOS Range | Quality Level | User Perception |
|---|---|---|
| 4.0 - 5.0 | Good to Excellent | Clear conversation, no noticeable issues |
| 3.5 - 4.0 | Acceptable | Minor quality reduction, still usable |
| 3.0 - 3.5 | Poor | Noticeable issues, some effort required |
| 2.5 - 3.0 | Bad | Significant degradation, difficult conversation |
| Below 2.5 | Unusable | Communication is effectively impossible |
Diagnosis Steps
- Identify the contributing factor. Open the endpoint detail view and compare the individual metrics against their expected ranges:
| Metric | Good Range | Warning Range | Critical Range |
|---|---|---|---|
| Jitter | Below 20ms | 20ms - 40ms | Above 40ms |
| Packet Loss | Below 1% | 1% - 3% | Above 3% |
| RTT | Below 100ms | 100ms - 200ms | Above 200ms |
- Check if the issue is symmetric or asymmetric. Use the direction filter (send/receive) on the endpoint detail view. If only one direction shows poor quality, the issue is on one network path, not both.
- Check if the issue is consistent or intermittent. Look at the time-series chart for the degraded metric. A consistent value suggests a steady-state condition (e.g., a long geographic path). Spikes suggest transient events (e.g., congestion bursts).
- Compare across endpoints. If all endpoints show similar degradation, the issue is likely in the shared network infrastructure. If only some endpoints are affected, the issue may be specific to certain workers, registrars, or network paths.
- Follow the specific metric section below for the primary degraded metric.
MOS Diagnostic Decision Tree
When MOS is below your threshold, check metrics in this order to find the root cause:
- Packet loss above 1%? This is the most common MOS killer. Go to High Packet Loss. Even 2% loss drops MOS from 4.3 to below 3.5 with G.711.
- Jitter above 30ms? High jitter means the jitter buffer is working overtime and may be discarding late packets. Go to High Jitter.
- RTT above 200ms? RTT above 200ms introduces conversational delay and is factored into the E-model calculation. Go to High Round-Trip Time.
- All three metrics look acceptable? Check the codec. G.711 starts with a higher theoretical MOS ceiling (4.4) than G.729 (3.9). A codec with a high impairment factor will produce a lower MOS even on a perfect network.
- MOS is low on receive but not send? The problem is on the inbound network path. Check the receive-direction metrics specifically.
- MOS degrades over time within a call? Look for network saturation that builds as more calls ramp up during the test's buildup phase.
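The relationship between these metrics and MOS can be approximated numerically. The sketch below uses a simplified E-model-style calculation (the ITU-T G.107 R-factor-to-MOS mapping with crude delay and loss penalties); it is illustrative only, not CallMeter's exact scoring formula:

```python
def estimate_mos(rtt_ms: float, jitter_ms: float, loss_pct: float) -> float:
    """Rough E-model-style MOS estimate. Simplified sketch, not CallMeter's formula."""
    # Effective one-way latency: half the RTT, a jitter-buffer penalty,
    # and a fixed 10 ms codec/processing allowance.
    eff_latency = rtt_ms / 2 + jitter_ms * 2 + 10
    if eff_latency < 160:
        delay_impairment = eff_latency / 40
    else:
        delay_impairment = (eff_latency - 120) / 10
    loss_impairment = 2.5 * loss_pct  # crude linear loss penalty
    r = max(0.0, min(100.0, 93.2 - delay_impairment - loss_impairment))
    # Standard G.107 mapping from R-factor to MOS
    return 1 + 0.035 * r + 7e-6 * r * (r - 60) * (100 - r)
```

For example, a clean path (`estimate_mos(50, 5, 0)`) lands near the 4.4 ceiling, while a degraded one (`estimate_mos(300, 50, 3)`) falls into the "Poor" band, matching the interpretation table above.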
High Jitter
Jitter measures the variation in packet arrival timing. Consistent, evenly-spaced packets have low jitter. Packets arriving with irregular spacing have high jitter. The jitter buffer at the receiving end compensates for moderate jitter, but excessive jitter overwhelms the buffer and causes audio artifacts.
What "High" Means
- Below 20ms: Normal for most networks
- 20ms to 40ms: Elevated. The jitter buffer is working harder. Minor quality impact.
- 40ms to 80ms: High. The jitter buffer may not fully compensate. Noticeable quality impact.
- Above 80ms: Very high. Expect audible glitches, late packet discards, and poor MOS.
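Jitter as reported in RTP statistics is typically the RFC 3550 smoothed interarrival estimate. A minimal sketch of that calculation, assuming matched send and receive timestamps in milliseconds:

```python
def interarrival_jitter(send_ts_ms, recv_ts_ms):
    """RFC 3550 smoothed interarrival jitter over a packet trace, in ms.

    For each packet, transit = receive time - send time; the estimate
    tracks the variation in transit with a 1/16 smoothing gain.
    """
    jitter = 0.0
    prev_transit = None
    for s, r in zip(send_ts_ms, recv_ts_ms):
        transit = r - s
        if prev_transit is not None:
            d = abs(transit - prev_transit)
            jitter += (d - jitter) / 16.0
        prev_transit = transit
    return jitter
```

Perfectly even spacing yields zero jitter regardless of absolute delay; only the variation in delay counts.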
Common Causes and Resolutions
Network congestion
- Indicator: Jitter spikes correlate with high overall network utilization. Jitter may be higher during business hours.
- Diagnosis: Check if other traffic on the same network path is competing for bandwidth.
- Resolution: Apply QoS markings (DSCP EF for voice, AF41 for video) and configure network equipment to prioritize marked traffic. Alternatively, use a dedicated network segment for VoIP traffic.
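The DSCP markings above can be applied at the application layer when you control the sender. A sketch (the helper names are illustrative; network equipment must still be configured to honor the marks):

```python
import socket

# DSCP values from the resolution above; the TOS byte is DSCP shifted
# left by 2 (the low two bits are ECN).
DSCP_EF = 46    # Expedited Forwarding, for voice RTP
DSCP_AF41 = 34  # Assured Forwarding 41, for video RTP

def dscp_to_tos(dscp: int) -> int:
    """Convert a 6-bit DSCP value to the 8-bit TOS byte."""
    return dscp << 2

def open_marked_rtp_socket(dscp: int) -> socket.socket:
    """Open a UDP socket whose outgoing packets carry the given DSCP mark."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock
```

Setting the mark on the socket alone does not guarantee prioritization; routers and switches along the media path must be configured to honor it.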
Shared or oversubscribed links
- Indicator: Jitter varies significantly by time of day or correlates with other network activity.
- Resolution: Move VoIP testing to a dedicated bandwidth allocation or schedule tests during low-traffic periods.
Wireless network segments
- Indicator: Irregular jitter patterns with high variance, not correlated with traffic volume.
- Resolution: Use wired connections for workers and SIP infrastructure. Wi-Fi introduces unpredictable jitter due to contention, interference, and retransmission.
Buffer bloat
- Indicator: Jitter is high but relatively stable (not spiky). RTT is also elevated.
- Resolution: Reduce buffer sizes on intermediate network equipment. Enable AQM (Active Queue Management) or ECN (Explicit Congestion Notification) on routers in the media path.
VPN or tunnel overhead
- Indicator: Jitter is significantly higher when traffic traverses a VPN or encrypted tunnel.
- Resolution: Exclude VoIP traffic from the VPN tunnel, or use a VPN solution optimized for real-time traffic. If testing must go through a VPN, establish baseline jitter through the VPN and set thresholds accordingly.
High Packet Loss
Packet loss occurs when RTP packets are transmitted but never arrive at the receiver. Even small amounts of packet loss significantly degrade voice quality because each lost packet removes 20ms (default ptime) of audio.
What "High" Means
- Below 0.5%: Normal. Imperceptible to users.
- 0.5% to 1%: Elevated. May cause occasional artifacts with PLC concealment.
- 1% to 3%: High. Noticeable quality degradation. PLC cannot fully conceal.
- 3% to 5%: Very high. Conversation becomes difficult.
- Above 5%: Severe. Communication is significantly impaired.
Common Causes and Resolutions
Network congestion
- Indicator: Loss rate correlates with bandwidth utilization. Loss increases during peak periods.
- Resolution: Increase available bandwidth, apply QoS prioritization, or reduce the number of concurrent media streams on the affected link.
Firewall or security device dropping packets
- Indicator: Consistent loss from the start of the call, not correlated with traffic volume. Loss rate is steady.
- Resolution: Verify that firewall rules allow UDP traffic on the negotiated RTP port range. Check for deep packet inspection (DPI) or intrusion prevention systems (IPS) that may be dropping or rate-limiting UDP traffic. Ensure connection tracking timeouts for UDP are long enough for the call duration (at least the call duration plus 30 seconds).
MTU / fragmentation issues
- Indicator: Loss appears at specific bitrates (typically when video is enabled). Loss is absent at lower bitrates.
- Resolution: Check the path MTU between the worker and the SIP infrastructure. If RTP packets are being fragmented, the loss of a single fragment causes the entire packet to be discarded. Reduce the video bitrate to keep packets below the path MTU, or configure the media encoder to generate smaller packets.
Rate limiting on network equipment
- Indicator: Steady, consistent low-level loss (e.g., a flat 1-2%) that does not vary with traffic conditions.
- Resolution: Check for per-flow or per-port rate limits on routers, switches, or firewalls in the media path. Disable rate limiting for VoIP traffic or increase the limit.
ISP-level issues
- Indicator: Loss correlates with geographic routing. All endpoints using the same ISP or transit path show similar loss.
- Resolution: Contact the ISP or use a different transit path. Deploying workers in the same data center or cloud region as the SIP infrastructure eliminates ISP transit as a variable.
Burst Loss vs Random Loss
Check the sequence gap metric alongside packet loss. A high sequence gap count with moderate overall loss indicates burst loss (multiple consecutive packets lost together). Burst loss is more damaging to audio quality than random loss because PLC cannot conceal multiple consecutive missing packets. Burst loss typically indicates queue overflow events on a congested link.
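The burst-versus-random distinction can be made directly from received RTP sequence numbers. A sketch, assuming sorted sequence numbers without wraparound handling:

```python
def classify_loss(seq_numbers):
    """Summarize RTP loss from received sequence numbers.

    Returns (loss_pct, burst_events), where a burst event is a gap of
    two or more consecutive missing packets. Assumes the list is sorted
    and does not wrap past the 16-bit sequence number boundary.
    """
    expected = seq_numbers[-1] - seq_numbers[0] + 1
    lost = expected - len(seq_numbers)
    bursts = 0
    for prev, cur in zip(seq_numbers, seq_numbers[1:]):
        gap = cur - prev - 1
        if gap >= 2:
            bursts += 1
    return 100.0 * lost / expected, bursts
```

The same overall loss percentage with a nonzero burst count (consecutive packets gone together) is worse for audio than the same loss scattered one packet at a time, since PLC conceals isolated gaps far better than runs.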
High Round-Trip Time (RTT)
RTT measures the time it takes for a signal to travel from the sender to the receiver and back. High RTT causes perceptible delay in conversation, making interactive communication difficult.
What "High" Means
- Below 100ms: Normal. Imperceptible delay.
- 100ms to 200ms: Elevated. Slight delay perceptible in conversation.
- 200ms to 300ms: High. Noticeable delay. Users begin to talk over each other.
- Above 300ms: Very high. Conversation becomes awkward. Users perceive significant echo.
- Above 500ms: Satellite-level delay. Real-time conversation is extremely difficult.
Common Causes and Resolutions
Geographic distance
- Indicator: RTT is consistently high but stable, and the numeric value roughly corresponds to the physical distance (propagation in fiber is approximately 5ms per 1000km one-way, so about 10ms of RTT per 1000km of path length).
- Resolution: Deploy workers geographically closer to the SIP infrastructure. If testing an international route, the high RTT is inherent to the path and should be factored into quality expectations.
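A quick sanity check for whether a measured RTT is explained by distance alone:

```python
def min_fiber_rtt_ms(path_km: float) -> float:
    """Lower bound on RTT from fiber propagation alone (~200,000 km/s).

    Real paths add routing detours, queuing, and equipment delay, so
    measured RTT is normally well above this floor.
    """
    one_way_ms = path_km / 200.0  # 5 ms per 1000 km, one-way
    return 2 * one_way_ms
```

For a roughly 6,000 km path the propagation floor is about 60ms of RTT; measuring 80ms on such a path is normal, while measuring 250ms points to suboptimal routing rather than distance.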
Suboptimal routing
- Indicator: RTT is higher than expected for the geographic distance. For example, 150ms for endpoints in the same city.
- Resolution: Investigate the network routing with traceroute. Traffic may be taking an indirect path through distant peering points. Work with your ISP or cloud provider to optimize routing.
SIP proxy overhead
- Indicator: RTT increases proportionally with the number of SIP proxies in the path. RTCP RTT is lower than SIP-measured RTT.
- Resolution: Check if SIP proxies or B2BUAs in the path are adding processing delay. Optimize or remove unnecessary intermediaries.
VPN or tunnel latency
- Indicator: RTT increases significantly when traffic passes through a VPN endpoint in a distant location.
- Resolution: Ensure VPN routing is geographically optimal. The VPN endpoint should be in the same region as the SIP infrastructure or the worker.
Video Quality Issues
Video calls have additional quality dimensions beyond audio: visual clarity, frame rate, freeze events, and keyframe recovery.
Video Freezes
- Metric: `freeze_count` (number of freeze events), `total_freezes_duration_ms` (total time frozen)
- Cause: Packet loss causes the decoder to lose reference frames. Until a new keyframe arrives, the video appears frozen.
- Resolution:
- Reduce packet loss (see High Packet Loss section above)
- Reduce video bitrate to decrease the impact of each lost packet
- Increase keyframe frequency (shorter GOP length) so recovery happens faster
- Enable NACK-based retransmission (requires AVPF or SAVPF profile)
Low Frame Rate
- Metric: `frames_decoded_rate` (frames per second at the decoder)
- Cause: Insufficient bandwidth, encoder performance issue, or packet loss causing frame drops.
- Resolution:
- Verify available bandwidth supports the configured video bitrate
- Reduce video resolution or frame rate in the test configuration
- Check for competing bandwidth consumers on the network path
High PLI/FIR Count
- Metric: `pli_count`, `fir_count`
- Cause: PLI and FIR requests indicate the receiver is losing video picture integrity and needs a keyframe. High counts mean the receiver is frequently losing sync.
- Resolution:
- Address the underlying packet loss causing picture corruption
- Consider switching to a more error-resilient codec (VP8 has built-in error resilience features)
- Reduce video bitrate to lower the packet loss impact
High NACK Count
- Metric: `nack_count`
- Cause: The receiver is detecting lost packets and requesting retransmission. High NACK counts indicate a lossy network path.
- Resolution:
- Address the underlying packet loss
- NACK-based retransmission adds delay. If RTT is already high, NACK retransmission may arrive too late. In this case, relying on keyframe recovery (PLI) may be more effective.
Audio-Specific Issues
High PLC Events
- Metric: `plc_events` (packet loss concealment activations per second)
- Cause: Missing audio packets trigger PLC in the decoder.
- Resolution: Address the underlying packet loss. Opus codec provides significantly better PLC than G.711.
Audio Level Issues
- Metric: `audio_level`
- Cause: If audio levels are very low or zero, the media path may be broken (one-way audio) or the media file may be silent.
- Resolution:
- Check for one-way audio (NAT issue, firewall issue)
- Verify the media file or synthetic audio configuration is correct
- Check the send-direction audio level to confirm the endpoint is transmitting
One-Way Audio Diagnosis
One-way audio means one side can hear the other, but not vice versa. This is almost always a network-layer issue, not a codec or application problem.
How to Identify
- Check the `audio_level` metric in both directions (send and receive) on the endpoint detail page
- If the send direction shows normal audio levels but the receive direction shows zero (or vice versa), you have one-way audio
- Check the `packets_received` metric. If one direction shows zero packets received, media is not arriving at all in that direction.
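The identification steps above can be collapsed into a small classifier. The parameter names here are illustrative, not exact CallMeter field names:

```python
def diagnose_media_direction(send_level, recv_level, send_packets, recv_packets):
    """Classify one-way audio from per-direction metrics (illustrative sketch)."""
    sending = send_packets > 0 and send_level > 0
    receiving = recv_packets > 0 and recv_level > 0
    if sending and receiving:
        return "two-way media"
    if sending and not receiving:
        return "one-way: inbound media missing (check inbound firewall/NAT)"
    if receiving and not sending:
        return "one-way: outbound media missing (check SDP c= line and outbound path)"
    return "no media in either direction"
```

The "no media in either direction" case usually indicates a fully blocked RTP path or a wrong SDP address rather than one-way audio, and is covered by the SIP infrastructure checks rather than this section.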
Common Causes
- Asymmetric NAT/firewall: Outbound UDP from the worker reaches the remote endpoint, but the return path is blocked. The remote endpoint's RTP packets are dropped by the worker's NAT or firewall. Check that the firewall permits inbound UDP on the RTP port range.
- SDP IP address issue: The SDP `c=` line contains a private IP that the remote endpoint cannot route to. The remote side sends media to an unreachable address. See the Firewall and NAT Troubleshooting section.
- Media relay failure: If an SBC or media proxy relays the RTP stream, a failure in the relay will break one direction while the other continues through a different path.
- SDP direction attribute: Check the SDP for `a=sendonly` or `a=recvonly` attributes. If the media direction was inadvertently restricted during negotiation, one side will not transmit.
Echo and Noise Issues
Echo and background noise degrade call quality even when network metrics (loss, jitter, RTT) look acceptable.
Echo
Echo occurs when audio from the speaker feeds back into the microphone on the far end, returning to the original speaker with a delay. In CallMeter tests, echo typically indicates an issue on the remote SIP infrastructure side (since CallMeter endpoints use media files, not live microphones).
- Acoustic echo: The remote endpoint's speaker output is picked up by its microphone. This is a hardware/endpoint issue, not a network issue. CallMeter's metrics will show normal values, but a human listener would perceive echo.
- Hybrid echo: Impedance mismatch at a 2-wire to 4-wire conversion point (PSTN gateway). Common when the call path includes a PSTN leg.
- RTT contribution: Higher RTT makes echo more noticeable. If RTT is above 100ms, even small amounts of echo become perceptible.
Noise
If audio quality sounds noisy despite good network metrics:
- Comfort noise generation: Some codecs and endpoints generate comfort noise during silence periods. This is expected behavior, not a network problem.
- Transcoding artifacts: If an SBC or gateway transcodes between codecs (e.g., Opus to G.711 to G.729), each transcoding step introduces quality loss. Check whether the call path includes unnecessary codec conversions.
- Low bitrate codecs: Codecs configured at very low bitrates (e.g., Opus at 6 kbps) will produce audible compression artifacts even on a perfect network. Increase the bitrate or switch to a higher-quality codec.
Video-Specific Quality Problems
Video Freezes
Video freezes are the most common video quality complaint. The video image stops updating while audio continues.
Reading the metrics:
- `freeze_count` tells you how many distinct freeze events occurred
- `total_freezes_duration_ms` tells you the total time the video was frozen
- A few short freezes (under 500ms each) may be imperceptible. Freezes longer than 1 second are noticeable and disruptive.
Root cause analysis:
- Check `packet_loss` for the video stream. Even 0.5% loss on a high-bitrate video stream causes frequent keyframe corruption.
- Check `pli_count`. A high PLI count means the receiver is repeatedly asking for keyframes because it lost reference frames. Each PLI request and keyframe response adds delay before the video recovers.
- Check available bandwidth. If the video bitrate exceeds the available bandwidth, the network drops packets, causing freezes. Reduce the configured bitrate.
Low Resolution or Blurry Video
- Bandwidth adaptation: Some SBCs and endpoints dynamically reduce video resolution when bandwidth is limited. Check whether the test's configured resolution matches what the endpoint actually sends.
- Bitrate too low for resolution: A 720p video stream at 256 kbps will produce a blurry, heavily compressed image. Ensure the bitrate is appropriate for the configured resolution (at least 1 Mbps for 720p, 2-4 Mbps for 1080p).
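The bitrate-for-resolution rule of thumb can be encoded as a simple pre-test check. The floors below are this guide's suggestions, not codec requirements:

```python
# Rough minimum bitrates per resolution, in kbps (rules of thumb from
# this guide, not hard codec limits).
MIN_BITRATE_KBPS = {"720p": 1000, "1080p": 2000}

def bitrate_sufficient(resolution: str, bitrate_kbps: int) -> bool:
    """Return True if the configured bitrate meets the rule-of-thumb floor."""
    return bitrate_kbps >= MIN_BITRATE_KBPS[resolution]
```

Running this against a planned test configuration before launch catches obviously under-provisioned video streams, such as 720p at 256 kbps.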
Reading Patterns in Metric Charts
The time-series charts on the endpoint detail page reveal patterns that point to specific root causes. Here are the most common patterns and what they mean.
Steady Degradation from Call Start
Metrics start bad and stay bad for the entire call duration. This indicates a steady-state condition: geographic distance (high RTT), persistent network congestion, or a misconfigured path. The cause is not transient. Look at the infrastructure between the worker and the registrar.
Degradation That Builds During the Test
Metrics are good at the beginning but worsen as the test progresses. This pattern is classic for network saturation. As more endpoints come online during the buildup phase, they consume more bandwidth, pushing the network past its capacity. The fix is to reduce endpoint count, increase available bandwidth, or apply QoS.
Periodic Spikes
Metrics spike at regular intervals (e.g., every 30 seconds or every minute). This often indicates a competing periodic process on the network: backup jobs, cron tasks, or monitoring polls that burst traffic at fixed intervals. Identify and reschedule the competing traffic, or schedule tests during quiet periods.
Sudden Step Change Mid-Call
Metrics are stable, then abruptly shift to a new level and stay there. This indicates a routing change, a link failure with failover, or a new traffic source appearing on the network path. Check network logs for route changes or link events at the timestamp of the shift.
Compare Send vs Receive Charts Side by Side
Always check both send and receive direction charts for the same endpoint. If degradation appears only in one direction, the problem is on that specific network path. If both directions degrade simultaneously, the problem is likely at a shared point (the worker's network interface, or a bidirectional bottleneck).
Systematic Quality Analysis
When multiple metrics are degraded simultaneously, follow this systematic approach:
- Start with the network layer. Check packet loss and jitter first. These are the most common root causes of quality degradation.
- Check RTT next. High RTT affects MOS and creates conversational delay.
- Check video metrics if applicable. Video issues are almost always caused by underlying network problems (loss, insufficient bandwidth).
- Compare send vs receive. Asymmetric quality problems point to a directional network issue.
- Compare across endpoints. Uniform degradation points to shared infrastructure. Variable degradation points to per-path or per-worker issues.
- Check time-series patterns. Degradation that starts mid-call may indicate a transient network event. Degradation present from the start indicates a steady-state condition.
Related Pages
- Metrics Reference -- Full metric definitions and interpretation
- MOS Score -- Detailed MOS documentation
- Common Test Failures -- When tests fail entirely
- SIP Response Codes -- Response code reference
- Threshold Configuration -- Setting quality thresholds for probes