Threshold Configuration
Set warning and critical thresholds for probe metrics to control when health status changes and alerts fire.
Thresholds define the quality boundaries that determine a probe's health status. After each probe execution, every configured metric is compared against its thresholds. When a breach changes the probe's health status, configured webhooks fire. This guide covers how thresholds work, which metrics support them, and best practices for setting values.
How Thresholds Work
Each metric has two threshold levels:
| Level | Meaning |
|---|---|
| Warning | The metric has deviated from acceptable quality. The degradation may be noticeable to users, but calls are still usable. The probe transitions to DEGRADED. |
| Critical | The metric has reached unacceptable quality. Calls are significantly impaired or unusable. The probe transitions to UNHEALTHY. |
After each probe execution, the platform evaluates every metric against its thresholds:
- Retrieve the latest metric values from the completed probe call
- Compare each metric against its warning and critical thresholds
- Determine the individual result for each metric (Healthy, Degraded, or Unhealthy)
- The worst result across all metrics determines the overall probe health status
This means a single metric breaching its critical threshold makes the entire probe UNHEALTHY, even if all other metrics are within normal range.
Understanding Metric Direction
Metrics fall into two categories based on whether higher or lower values indicate better quality:
Higher Is Better (Inverted Metrics)
For these metrics, quality decreases as the value goes down:
| Metric | Healthy | Warning Triggers Below | Critical Triggers Below |
|---|---|---|---|
| MOS | Above warning threshold | Configured warning value | Configured critical value |
Example: MOS warning = 3.5, MOS critical = 2.5
- MOS 4.1 = HEALTHY (above 3.5)
- MOS 3.2 = DEGRADED (below 3.5, above 2.5)
- MOS 2.1 = UNHEALTHY (below 2.5)
Lower Is Better (Standard Metrics)
For these metrics, quality decreases as the value goes up:
| Metric | Healthy | Warning Triggers Above | Critical Triggers Above |
|---|---|---|---|
| Jitter | Below warning threshold | Configured warning value | Configured critical value |
| Packet Loss | Below warning threshold | Configured warning value | Configured critical value |
| RTT | Below warning threshold | Configured warning value | Configured critical value |
| Setup Time | Below warning threshold | Configured warning value | Configured critical value |
Example: Jitter warning = 40ms, Jitter critical = 50ms
- Jitter 25ms = HEALTHY (below 40ms)
- Jitter 45ms = DEGRADED (above 40ms, below 50ms)
- Jitter 55ms = UNHEALTHY (above 50ms)
Automatic direction handling
You do not need to configure the direction logic. CallMeter automatically applies the correct comparison direction based on each metric. You only need to set the numeric threshold values.
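To illustrate the direction handling described above, here is a minimal sketch of how a single metric could be classified. It is not CallMeter's implementation: the `Status` enum, the `classify` function, and the `higher_is_better` flag are hypothetical names, and treating a value exactly equal to a threshold as not breached is an assumption.

```python
from enum import Enum


class Status(Enum):
    HEALTHY = 0
    DEGRADED = 1
    UNHEALTHY = 2


def classify(value: float, warning: float, critical: float,
             higher_is_better: bool = False) -> Status:
    """Classify one metric value against its warning and critical thresholds."""
    if higher_is_better:
        # Flip signs so the "lower is better" comparison below covers both directions.
        value, warning, critical = -value, -warning, -critical
    # Assumption: a value exactly on a threshold does not count as a breach.
    if value > critical:
        return Status.UNHEALTHY
    if value > warning:
        return Status.DEGRADED
    return Status.HEALTHY


print(classify(3.2, warning=3.5, critical=2.5, higher_is_better=True).name)  # DEGRADED
print(classify(55, warning=40, critical=50).name)                            # UNHEALTHY
```

The two calls reproduce the MOS and jitter examples from this section.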
Configurable Metrics and Defaults
The following metrics support threshold configuration:
| Metric | Unit | Default Warning | Default Critical | Direction |
|---|---|---|---|---|
| MOS | Score (1.0-5.0) | Below 3.5 | Below 2.5 | Higher is better |
| Jitter | Milliseconds | Above 40ms | Above 50ms | Lower is better |
| Packet Loss | Percentage | Above 3% | Above 5% | Lower is better |
| RTT | Milliseconds | Above 200ms | Above 400ms | Lower is better |
| Setup Time | Milliseconds | Above 3000ms | Above 5000ms | Lower is better |
| Ring Time | Milliseconds | Above 5000ms | Above 8000ms | Lower is better |
| Connect Time | Milliseconds | Above 8000ms | Above 12000ms | Lower is better |
The default values provide a reasonable starting point based on industry standards and ITU-T recommendations. You should tune them based on your specific quality requirements.
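If you keep threshold values in your own tooling before entering them in the UI, a quick consistency check can catch swapped warning and critical values. The sketch below copies the defaults from the table above; the `Threshold` class and the key names such as `jitter_ms` are hypothetical and not part of any CallMeter API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    warning: float
    critical: float
    higher_is_better: bool = False

    def is_consistent(self) -> bool:
        """True when critical lies further in the 'worse' direction than warning."""
        if self.higher_is_better:
            return self.critical < self.warning
        return self.critical > self.warning


# Default values from the table above; the key names are illustrative only.
DEFAULTS = {
    "mos": Threshold(3.5, 2.5, higher_is_better=True),
    "jitter_ms": Threshold(40, 50),
    "packet_loss_pct": Threshold(3, 5),
    "rtt_ms": Threshold(200, 400),
    "setup_time_ms": Threshold(3000, 5000),
    "ring_time_ms": Threshold(5000, 8000),
    "connect_time_ms": Threshold(8000, 12000),
}

inconsistent = [name for name, t in DEFAULTS.items() if not t.is_consistent()]
print(inconsistent)  # []
```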
Adjusting Thresholds
- Open your probe's detail page
- Navigate to the Settings tab
- Locate the Thresholds section
- Modify warning and critical values for each metric
- Click Save
Forward-looking changes
Threshold changes apply to the next probe execution and all future executions. Historical health evaluations are not recalculated with the new thresholds. Past health status entries reflect the thresholds that were active at the time.
Health Status Determination
The overall probe health status follows this logic after each execution:
For each metric with a configured threshold:
1. Retrieve the measured value from the probe call
2. Compare against warning and critical thresholds
3. Assign individual metric status:
   - Within acceptable range --> HEALTHY
   - Breaches warning but not critical --> DEGRADED
   - Breaches critical --> UNHEALTHY
Overall probe status = WORST individual metric status
If no data was collected: status = UNKNOWN
Examples:
| MOS | Jitter | Packet Loss | Overall Status | Reason |
|---|---|---|---|---|
| 4.2 | 15ms | 0.5% | HEALTHY | All metrics within acceptable range |
| 3.3 | 15ms | 0.5% | DEGRADED | MOS below warning threshold (3.5) |
| 4.2 | 55ms | 0.5% | UNHEALTHY | Jitter above critical threshold (50ms) |
| 3.3 | 55ms | 4% | UNHEALTHY | Multiple thresholds breached; jitter is above critical (50ms), so the worst status wins |
| 4.4 | 15ms | 6% | UNHEALTHY | Packet loss above critical threshold (5%); an excellent MOS does not override a bad metric |
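The "worst status wins" rule from this section can be sketched in a few lines. This is an illustration only: the `SEVERITY` ordering and the `overall_status` function are hypothetical names, and the statuses are passed in as plain strings for simplicity.

```python
from typing import Iterable

# Higher number = worse status.
SEVERITY = {"HEALTHY": 0, "DEGRADED": 1, "UNHEALTHY": 2}


def overall_status(metric_statuses: Iterable[str]) -> str:
    """Return the worst per-metric status, or UNKNOWN when no data was collected."""
    statuses = list(metric_statuses)
    if not statuses:
        return "UNKNOWN"
    return max(statuses, key=SEVERITY.__getitem__)


print(overall_status(["HEALTHY", "UNHEALTHY", "HEALTHY"]))  # UNHEALTHY
print(overall_status(["DEGRADED", "HEALTHY", "HEALTHY"]))   # DEGRADED
print(overall_status([]))                                   # UNKNOWN
```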
Status Transitions and Webhooks
When the probe's health status changes from one state to another, configured webhooks are triggered:
| Transition | Severity | Common Cause |
|---|---|---|
| HEALTHY to DEGRADED | Warning | Quality starting to degrade -- investigate soon |
| HEALTHY to UNHEALTHY | Critical | Major quality problem -- investigate immediately |
| DEGRADED to UNHEALTHY | Escalation | Degradation has worsened to critical levels |
| DEGRADED to HEALTHY | Recovery | Issue resolved, quality returned to normal |
| UNHEALTHY to HEALTHY | Recovery | Critical issue resolved |
| UNHEALTHY to DEGRADED | Partial recovery | Situation improving but not yet fully resolved |
| Any to UNKNOWN | Data issue | Probe execution failed or no data collected |
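A webhook consumer might want to derive the severity label from the previous and current status. The lookup below mirrors the transition table; `transition_severity` is a hypothetical helper, and it makes no assumption about the actual webhook payload fields. Transitions out of UNKNOWN are not listed in the table, so the sketch returns `None` for them.

```python
from typing import Optional

SEVERITY_BY_TRANSITION = {
    ("HEALTHY", "DEGRADED"): "Warning",
    ("HEALTHY", "UNHEALTHY"): "Critical",
    ("DEGRADED", "UNHEALTHY"): "Escalation",
    ("DEGRADED", "HEALTHY"): "Recovery",
    ("UNHEALTHY", "HEALTHY"): "Recovery",
    ("UNHEALTHY", "DEGRADED"): "Partial recovery",
}


def transition_severity(previous: str, current: str) -> Optional[str]:
    """Return the severity label for a status change.

    Returns None when the status did not change or the transition
    is not listed in the table (for example UNKNOWN to HEALTHY).
    """
    if previous == current:
        return None
    if current == "UNKNOWN":
        return "Data issue"
    return SEVERITY_BY_TRANSITION.get((previous, current))


print(transition_severity("HEALTHY", "UNHEALTHY"))   # Critical
print(transition_severity("UNHEALTHY", "DEGRADED"))  # Partial recovery
```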
See Webhooks for details on webhook payload format and delivery.
Consecutive Failure Counting
To avoid alerting on transient network glitches, you can configure consecutive failure counting. Instead of changing health status on a single threshold breach, the probe requires N consecutive breaches before transitioning.
Example with consecutive count of 3:
| Execution | MOS | Would Be | Status After |
|---|---|---|---|
| 1 | 3.2 | DEGRADED | HEALTHY (1 of 3 consecutive) |
| 2 | 3.1 | DEGRADED | HEALTHY (2 of 3 consecutive) |
| 3 | 3.3 | DEGRADED | DEGRADED (3 of 3 -- threshold met) |
| 4 | 4.1 | HEALTHY | DEGRADED (counter resets) |
| 5 | 4.0 | HEALTHY | HEALTHY (back to normal) |
This prevents false alerts from single-execution anomalies while still catching persistent problems.
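The breach side of consecutive counting can be sketched as a check over the most recent raw results. This is an assumption-laden illustration, not the platform's logic: `breach_confirmed` is a hypothetical function, and it deliberately does not model how recovery back to HEALTHY is handled.

```python
from typing import Sequence

BREACH_STATUSES = ("DEGRADED", "UNHEALTHY")


def breach_confirmed(raw_results: Sequence[str], required: int) -> bool:
    """True when the most recent `required` raw results all breached a threshold."""
    if required < 1 or len(raw_results) < required:
        return False
    return all(r in BREACH_STATUSES for r in raw_results[-required:])


raw = ["DEGRADED", "DEGRADED", "DEGRADED"]  # executions 1-3 from the table
print(breach_confirmed(raw[:1], required=3))  # False
print(breach_confirmed(raw[:2], required=3))  # False
print(breach_confirmed(raw[:3], required=3))  # True
```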
Best Practices for Setting Thresholds
Start with Industry Standards
The ITU-T G.107 recommendation and industry experience provide these guidelines:
| Metric | Good | Acceptable | Poor |
|---|---|---|---|
| MOS | Above 4.0 | 3.5-4.0 | Below 3.5 |
| Jitter | Below 20ms | 20-50ms | Above 50ms |
| Packet Loss | Below 1% | 1-3% | Above 3% |
| RTT | Below 100ms | 100-300ms | Above 300ms |
Tune Based on Your Environment
After running probes for a week, review the metric history:
- If metrics consistently hover near the warning threshold, either your infrastructure needs attention or your threshold is too tight
- If metrics never come close to warning, you can tighten thresholds for earlier detection
- Set warning thresholds at the point where quality becomes noticeable to users
- Set critical thresholds at the point where calls become unusable
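One simple way to review a week of history is to compute how often a metric breached its warning threshold. The sketch below assumes you have exported the measured values yourself; `breach_rate` is a hypothetical helper and the sample jitter values are invented for illustration.

```python
from typing import Sequence


def breach_rate(values: Sequence[float], warning: float,
                higher_is_better: bool = False) -> float:
    """Fraction of samples that breach the warning threshold."""
    if not values:
        raise ValueError("no samples to evaluate")
    if higher_is_better:
        breaches = sum(v < warning for v in values)
    else:
        breaches = sum(v > warning for v in values)
    return breaches / len(values)


# Invented jitter samples in milliseconds, checked against a 40ms warning threshold.
jitter_week = [22, 41, 38, 45, 30, 42, 39, 44]
print(breach_rate(jitter_week, warning=40))  # 0.5
```

A rate near zero over a long period suggests room to tighten the threshold, while a high rate points to either an infrastructure problem or a threshold that is too tight.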
Account for Geography
Probes running from distant regions will naturally have higher RTT. Adjust RTT thresholds accordingly:
| Probe Location vs. SIP Server | RTT Warning | RTT Critical |
|---|---|---|
| Same region | 100ms | 200ms |
| Same continent | 200ms | 400ms |
| Cross-continent | 300ms | 500ms |
Example Configurations
Enterprise voice (strict quality requirements):
| Metric | Warning | Critical |
|---|---|---|
| MOS | Below 4.0 | Below 3.5 |
| Jitter | Above 20ms | Above 30ms |
| Packet Loss | Above 1% | Above 2% |
| RTT | Above 100ms | Above 200ms |
Standard SIP trunk monitoring:
| Metric | Warning | Critical |
|---|---|---|
| MOS | Below 3.5 | Below 2.5 |
| Jitter | Above 40ms | Above 50ms |
| Packet Loss | Above 3% | Above 5% |
| RTT | Above 200ms | Above 400ms |
Best-effort / development environment:
| Metric | Warning | Critical |
|---|---|---|
| MOS | Below 3.0 | Below 2.0 |
| Jitter | Above 60ms | Above 80ms |
| Packet Loss | Above 5% | Above 10% |
| RTT | Above 300ms | Above 500ms |
Next Steps
- Webhooks -- Get alerted when thresholds are breached
- Creating a Probe -- Set up a new probe with these thresholds
- Status Pages -- Share probe health status publicly
- Probe Health States -- Complete health state reference