
Incident Conditions

Define the conditions that determine when incidents are created. Incident conditions evaluate healthcheck results and create incidents when real problems are detected.

Overview

Incident conditions are the rules that tell upti.my when to create an incident. Every healthcheck can have one or more conditions attached to it. When a condition is met, upti.my automatically opens a new incident and records the affected healthcheck, timestamp, and severity.

Conditions focus on one thing: deciding whether a problem is real enough to warrant an incident. They do not handle notifications, enrichment, or routing. That part is handled by Workflows, where you configure destinations, message formatting, escalation chains, and everything else using a visual drag-and-drop builder.

ℹ️ Conditions Create. Workflows Notify.

Think of it this way: incident conditions decide when an incident is created. Workflows decide what happens next. This separation keeps your setup clean. You define detection logic in conditions and notification logic in workflows.

Common Settings

All condition types share the following configurable settings:

  • Severity - The severity assigned to incidents created by this condition: critical, warning, or info. Workflows can use severity to route notifications to the right channels.
  • Cooldown Period - Minimum time between incidents from the same condition. Prevents creating duplicate incidents for the same ongoing problem.
  • Working Hours - Optionally restrict incident creation to specific hours and days. Issues outside working hours are still recorded but do not create incidents until the next working period.
  • Tags - Organize and filter conditions with custom tags. Tags carry through to the created incident and can be used in workflow routing.
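
The cooldown period behaves like a simple time gate on incident creation. The sketch below is illustrative Python, not upti.my's actual implementation; the function name and the idea of storing the last incident timestamp per condition are assumptions:

```python
import time

def should_create_incident(last_incident_at, cooldown_seconds, now=None):
    """Illustrative cooldown gate: suppress a new incident if this
    condition already created one within the cooldown window."""
    now = time.time() if now is None else now
    if last_incident_at is None:
        return True  # no prior incident from this condition
    return (now - last_incident_at) >= cooldown_seconds
```

With a cooldown of 300 seconds, a condition that keeps matching on every check still produces at most one incident per five minutes.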

Condition Types

1. Simple

The most straightforward condition. It creates an incident when a healthcheck fails a specified number of consecutive times. This is the default condition type and works well for clear-cut "is it up or down" monitoring.

  • threshold_count (integer) - Number of consecutive failures before creating an incident. Default: 3.
Simple Condition Example
{
  "type": "simple",
  "threshold_count": 3,
  "severity": "critical",
  "cooldown_seconds": 300
}
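
The consecutive-failure check behind a simple condition fits in a few lines. This is a sketch of the technique, not upti.my's implementation; `results` is an assumed oldest-to-newest list in which True marks a failed check:

```python
def simple_condition_met(results, threshold_count=3):
    """True when the most recent threshold_count results are all failures.
    results: chronological list of booleans, True = failed check."""
    if len(results) < threshold_count:
        return False  # not enough history yet
    return all(results[-threshold_count:])
```

A single recovery anywhere in the last `threshold_count` results resets the decision, which is why this type suits clear-cut up/down monitoring.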

2. Threshold

Creates an incident based on the percentage of failures within a rolling time window. This is useful for services with occasional transient failures: you might tolerate a 10% failure rate but want an incident at 50%.

  • failure_percentage (integer, 0-100) - Failure percentage that triggers incident creation, e.g., 50 means 50% failures.
  • window_seconds (integer) - Rolling time window in seconds. Default: 300 (5 minutes).
Threshold Condition Example
{
  "type": "threshold",
  "failure_percentage": 50,
  "window_seconds": 600,
  "severity": "warning",
  "cooldown_seconds": 600
}
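
A rolling-window failure rate can be computed roughly as follows. This is an illustrative Python sketch under the assumption that each sample is a (timestamp, failed) pair; it is not upti.my's actual code:

```python
def threshold_condition_met(samples, failure_percentage, window_seconds, now):
    """True when the failure rate inside the rolling window ending at
    `now` reaches the configured percentage."""
    window = [failed for ts, failed in samples if now - ts <= window_seconds]
    if not window:
        return False  # no samples in the window yet
    rate = 100 * sum(window) / len(window)  # booleans sum as 0/1
    return rate >= failure_percentage
```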

ℹ️ Window Size Matters

Shorter windows (1 to 5 minutes) detect issues faster but may create incidents from transient failures. Longer windows (10 to 30 minutes) are more stable but slower to react. Match the window size to the criticality of the service.

3. Pattern

Detects specific failure patterns rather than simple counts or percentages. Pattern conditions excel at identifying flapping services (rapidly alternating between up and down) and specific sequences of errors that indicate a degrading system.

  • pattern_type (string) - Pattern to detect: flapping or consecutive_errors.
  • flap_threshold (integer) - For flapping: number of state changes within the window that triggers an incident. Default: 5.
  • consecutive_count (integer) - For consecutive_errors: number of errors in a row. Default: 5.
  • window_seconds (integer) - Time window for pattern evaluation. Default: 600 (10 minutes).
Flapping Detection Example
{
  "type": "pattern",
  "pattern_type": "flapping",
  "flap_threshold": 5,
  "window_seconds": 600,
  "severity": "warning",
  "cooldown_seconds": 900
}
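
Flapping detection reduces to counting up/down transitions inside the window. The following is a minimal sketch of that counting, assuming a chronological list of up/down states, not upti.my's implementation:

```python
def flapping_detected(states, flap_threshold=5):
    """True when the number of state changes inside the window reaches
    flap_threshold. states: chronological booleans, True = up."""
    changes = sum(1 for a, b in zip(states, states[1:]) if a != b)
    return changes >= flap_threshold
```

A service that alternates up/down on every check hits the default threshold of 5 within six results, while a service that fails once and stays down produces only one state change.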

4. Escalation

Multi-stage conditions that create incidents with increasing severity over time if the problem persists. Each stage has its own delay and severity level. The incident is initially created at the first stage's severity, then automatically escalated through subsequent stages if unresolved.

  • stages (array) - Array of escalation stages, each with a delay and severity.
  • stages[].delay_seconds (integer) - Time in seconds after the initial failure before this stage activates.
  • stages[].severity (string) - Severity for this stage: info, warning, or critical.
Escalation Condition Example
{
  "type": "escalation",
  "stages": [
    {
      "delay_seconds": 0,
      "severity": "info"
    },
    {
      "delay_seconds": 300,
      "severity": "warning"
    },
    {
      "delay_seconds": 900,
      "severity": "critical"
    }
  ]
}
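
Selecting the active stage is a matter of comparing elapsed time against each stage's delay. The helper below is hypothetical (the name `current_severity` is an assumption), but it shows how the stages in the example above map time since the initial failure to a severity:

```python
def current_severity(stages, seconds_since_failure):
    """Severity of the latest stage whose delay has elapsed, or None if
    no stage is active yet. stages must be sorted by delay_seconds."""
    active = [s for s in stages if seconds_since_failure >= s["delay_seconds"]]
    return active[-1]["severity"] if active else None
```

With the example configuration, the incident opens at info, becomes warning after 5 minutes, and critical after 15 minutes if still unresolved.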

💡 Pair with Workflow Escalation

Escalation conditions work great with conditional workflows. The condition bumps the incident severity over time, and your workflow uses severity-based routing to send initial notifications to Slack, then escalate to PagerDuty if the incident reaches critical.

5. Composite

Combines conditions from multiple healthchecks into a single rule using logical operators. Composite conditions create an incident only when the combined condition is met, reducing noise in complex environments. For example, create an incident only when both the API and database checks fail simultaneously.

  • operator (string) - Logical operator: AND (all must fail) or OR (any must fail).
  • conditions (array) - Array of sub-conditions, each referencing a healthcheck and failure criteria.
  • conditions[].healthcheck_id (string) - ID of the healthcheck to evaluate.
  • conditions[].threshold_count (integer) - Number of consecutive failures for this sub-condition. Default: 1.
Composite Condition Example (AND)
{
  "type": "composite",
  "operator": "AND",
  "conditions": [
    {
      "healthcheck_id": "hc_api_server",
      "threshold_count": 3
    },
    {
      "healthcheck_id": "hc_database",
      "threshold_count": 2
    }
  ],
  "severity": "critical",
  "cooldown_seconds": 600
}
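
The evaluation of a composite rule can be sketched as follows. This is illustrative Python under the assumption that the platform tracks a consecutive-failure count per healthcheck; the function and parameter names are not upti.my's API:

```python
def composite_condition_met(operator, sub_conditions, failures_by_check):
    """Evaluate a composite rule. failures_by_check maps healthcheck_id
    to its current consecutive-failure count."""
    results = [
        failures_by_check.get(c["healthcheck_id"], 0) >= c.get("threshold_count", 1)
        for c in sub_conditions
    ]
    return all(results) if operator == "AND" else any(results)
```

Applied to the AND example above, three consecutive API failures alone do not open an incident; the database must also reach two consecutive failures.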

ℹ️ When to Use Composite Conditions

Use AND composite conditions to reduce false positives. If your API depends on a database, a composite that requires both to fail confirms a real outage. Use OR composite conditions to monitor redundant systems where any single failure is worth investigating.

Working Hours

Restrict when incidents are created by configuring working hours. Failures that occur outside the defined window are still recorded in healthcheck results but do not create incidents until the next working period.

Working Hours Configuration
{
  "working_hours": {
    "enabled": true,
    "timezone": "America/New_York",
    "schedule": {
      "monday": { "start": "09:00", "end": "18:00" },
      "tuesday": { "start": "09:00", "end": "18:00" },
      "wednesday": { "start": "09:00", "end": "18:00" },
      "thursday": { "start": "09:00", "end": "18:00" },
      "friday": { "start": "09:00", "end": "18:00" }
    }
  }
}
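
Conceptually, the schedule check converts the failure time into the configured timezone and compares it against that day's window. The sketch below is an assumption about how such a check could work (a real implementation would also apply the critical-severity bypass described next before this check):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

def within_working_hours(config, when_utc):
    """True if when_utc (an aware datetime) falls inside the configured
    schedule. Days missing from the schedule never match."""
    if not config.get("enabled", False):
        return True  # no restriction configured
    local = when_utc.astimezone(ZoneInfo(config["timezone"]))
    slot = config["schedule"].get(local.strftime("%A").lower())
    if slot is None:
        return False  # e.g. weekends in the example above
    return slot["start"] <= local.strftime("%H:%M") < slot["end"]
```

Comparing "HH:MM" strings works here because zero-padded 24-hour times sort lexicographically in chronological order.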

⚠️ Critical Severity Bypasses Working Hours

By default, conditions with critical severity always bypass working hours restrictions. Critical incidents are created immediately regardless of the schedule. This ensures genuine outages are never missed, even outside business hours.

What Happens After an Incident is Created?

Once a condition creates an incident, the incident enters the incident lifecycle (Detected, Acknowledged, Investigating, Resolved). From there, Workflows take over.

In the workflow builder, you configure everything that happens after detection:

  • Destinations - where notifications go (Slack, Discord, Email, Teams, Telegram, PagerDuty, custom webhooks)
  • Enrichment - add context from external APIs, format messages with templates, attach runbook links
  • Routing - use conditions to send critical incidents to PagerDuty and warnings to Slack
  • Escalation chains - add delays between notification stages so your team has time to respond
  • Rate limiting - prevent notification floods during major outages

This separation means you can change how you get notified without touching your detection logic, and vice versa.