upti.my

The right alert, to the right person, at the right time

Alert Routing & Smart Workflows

Most monitoring tools send every alert to the same channel. Staging and production are treated the same. Everyone is paged at once. Nothing escalates when someone misses an alert. upti.my routes alerts by severity, service, and team so that the right person gets notified, and nobody else.

Why Bad Alert Routing Is an Operational Problem

The issue is not that teams fail to get alerted. They get too many alerts, routed badly. Four failure modes come up repeatedly:

Staging and production look identical

A CSS change on staging fires the same Slack message as a production database outage. The team learns to ignore the channel. When the real incident arrives, it looks like all the others.

The wrong person gets paged

No routing by service or team means the backend engineer wakes up at 2am for an infrastructure issue they cannot fix. Or everyone gets paged. Both erode trust in the alerting system.

Alert fatigue buries what matters

When most alerts turn out to be low-priority noise, on-call engineers stop treating every notification as urgent. The one that needs immediate action arrives looking the same as the rest.

No escalation when the first alert is missed

If the on-call engineer does not see the alert, nothing happens. No second notification, no escalation to a backup. The incident goes unhandled until a customer reports it.

How Alert Workflows Work

Trigger

Starts when a check fails, an incident is created, or a condition is met.

Filter

Route based on severity, service, environment, region, or custom tags.

Delay & Deduplicate

Wait before alerting, group duplicate events, batch low-priority notifications.

Escalate

If unacknowledged within a time window, escalate to the next responder or team.

Notify

Send to Slack, Discord, email, SMS, Microsoft Teams, PagerDuty, WhatsApp, custom webhooks, or status pages.

Automate

Trigger self-healing agents, run recovery scripts, or call external APIs.
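The Filter stage above can be pictured as first-match rule evaluation. The following is a minimal sketch under stated assumptions, not upti.my's actual configuration format or API: the `Alert` and `Rule` classes, the `route` function, and the channel strings are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    # Hypothetical alert shape; field names are assumptions for this sketch.
    service: str
    environment: str
    severity: str
    tags: set = field(default_factory=set)


@dataclass
class Rule:
    name: str
    channels: list
    # Every key in `match` must equal the alert attribute of the same name.
    match: dict = field(default_factory=dict)
    # Every tag listed here must be present on the alert.
    required_tags: set = field(default_factory=set)

    def matches(self, alert):
        fields_ok = all(getattr(alert, k) == v for k, v in self.match.items())
        return fields_ok and self.required_tags <= alert.tags


# Rules are checked in order, so more specific rules come first.
RULES = [
    Rule("staging", ["slack:#staging-alerts"], {"environment": "staging"}),
    Rule("prod-critical", ["sms:on-call"],
         {"environment": "production", "severity": "critical"}),
    Rule("prod-default", ["slack:#prod-incidents"], {"environment": "production"}),
]


def route(alert, rules=RULES):
    """Return the channels of the first matching rule, or an empty list."""
    for rule in rules:
        if rule.matches(alert):
            return rule.channels
    return []
```

Ordering the rules from most to least specific keeps a critical production alert from falling through to the general production rule.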

Example: How One Team Routes Alerts

A SaaS team has a staging environment, a production API, and a production database. Each needs different alert behavior. Here is how they set it up in upti.my:

Staging check fails

Severity: warning → Slack #staging-alerts

No escalation. No SMS. No status page update. Just a Slack message so the team sees it during work hours.

Production API returns 5xx

Severity: high → Slack #prod-incidents + after 5 min → SMS to on-call

Slack first. If nobody acknowledges within 5 minutes, SMS goes to the on-call engineer. Status page component moves to Degraded.

Production database unreachable

Severity: critical → SMS to on-call immediately + after 10 min → SMS to team lead

Skips Slack entirely. SMS goes out immediately. If not acknowledged in 10 minutes, escalates to team lead. Status page moves to Major Outage. Self-healing agent attempts a connection pool reset.

Three services, three different behaviors, one platform. The staging alert does not wake anyone up. The database outage skips Slack entirely and goes straight to SMS. Setup for each takes about 2 minutes.
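The three workflows above can be written down as escalation steps with a delay on each. This is a rough sketch only: the `Step` class, the `WORKFLOWS` table, the workflow keys, and the `fired_channels` helper are hypothetical and do not reflect how upti.my stores or evaluates workflows. Status page updates and the self-healing action are left out to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    after_minutes: int  # minutes without acknowledgement before this step fires
    channel: str


# Delays and channels are taken from the example; the keys are made up.
WORKFLOWS = {
    "staging-check": [Step(0, "slack:#staging-alerts")],
    "prod-api-5xx": [Step(0, "slack:#prod-incidents"), Step(5, "sms:on-call")],
    "prod-db-unreachable": [Step(0, "sms:on-call"), Step(10, "sms:team-lead")],
}


def fired_channels(workflow, minutes_unacknowledged):
    """Channels notified so far for an alert that nobody has acknowledged."""
    return [
        step.channel
        for step in WORKFLOWS[workflow]
        if step.after_minutes <= minutes_unacknowledged
    ]
```

For instance, `fired_channels("prod-api-5xx", 3)` yields only the Slack channel, while the same call with `5` also includes the SMS to on-call.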

[Figure: Alert workflow builder showing routing rules: staging alerts to Slack, production API errors with SMS escalation, database failures with immediate SMS]

Without Alert Routing vs. With upti.my

Without routing

  • Staging and production go to the same channel
  • Everyone on the team gets paged for every failure
  • Missed alert means no escalation, no backup notification
  • Status page requires manual update during an active incident
  • Alert channels become noise, real incidents get missed

With upti.my

  • Staging alerts go to Slack only, production escalates to SMS
  • Alerts route by service and severity to the right person
  • Escalation runs automatically if unacknowledged
  • Status page updates as part of the alert workflow
  • Deduplication keeps channels clean and actionable

What You Can Do with Alert Routing

Route critical alerts to on-call via SMS, warnings to Slack
Escalate unacknowledged alerts after a set delay, such as 5, 15, or 30 minutes
Deduplicate repeated failures into a single notification
Suppress alerts during scheduled maintenance windows
Route different services to different teams
Batch low-priority alerts into daily digests
Trigger self-healing actions before paging a human
Auto-update status pages when incidents are created
Enrich alerts with AI-powered analysis before delivery
Use pre-built templates for common routing patterns
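Two items in the list above, deduplication and daily digests, lend themselves to a short illustration. The sketch below assumes events arrive as `(timestamp_seconds, check_id)` pairs sorted by time, and it uses a five-minute suppression window; the function names, the event shape, and the window length are all assumptions for this example rather than upti.my behavior.

```python
from collections import defaultdict


def deduplicate(events, window_seconds=300):
    """Collapse repeated failures of one check into a single notification.

    An event is dropped when the same check already produced a
    notification less than `window_seconds` earlier.
    """
    last_notified = {}
    kept = []
    for timestamp, check_id in events:
        previous = last_notified.get(check_id)
        if previous is None or timestamp - previous >= window_seconds:
            kept.append((timestamp, check_id))
            last_notified[check_id] = timestamp
    return kept


def daily_digest(events):
    """Count low-priority events per check for a once-a-day summary."""
    counts = defaultdict(int)
    for _, check_id in events:
        counts[check_id] += 1
    return dict(counts)
```

The window is measured from the last notification that was actually sent, so a check that keeps failing still produces a reminder once per window instead of going silent.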

Notification Channels

Slack

Channels, DMs, threads

Discord

Server channels, DMs

Email

Individual or team addresses

SMS

Direct text messages

Microsoft Teams

Channels and conversations

PagerDuty

On-call integration

WhatsApp

Premium messaging

Custom Webhooks

Any HTTP endpoint

Status Pages

Automatic public updates

Frequently Asked Questions

What are alert routing workflows?

Alert routing workflows filter, deduplicate, and route alerts based on rules you define. Instead of every check failure notifying every team member, alerts go to the right person based on severity, service, time of day, or on-call rotation. Duplicate alerts are grouped, and low-priority issues can be batched or delayed.

Can I route alerts from different services to different teams?

Yes. You can create separate workflows for different services, environments, or teams. A database check failure can route to the infrastructure team, while an API failure routes to the backend team. Each workflow has its own escalation rules.

Can I suppress alerts during scheduled maintenance?

You can configure maintenance windows that suppress alerts for specific checks or services. Monitoring continues during maintenance so you have data, but notifications are held until the window ends. If a failure persists after maintenance, the alert fires.
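The maintenance-window behavior can be sketched as a per-check decision: the check still runs, but the notification is held while a window is active for that service. The `MaintenanceWindow` class, the `notification_decision` function, and the three return values are hypothetical names used for illustration, with times given as plain epoch seconds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaintenanceWindow:
    service: str
    start: int  # inclusive, epoch seconds
    end: int    # exclusive, epoch seconds


def notification_decision(service, check_failed, now, windows):
    """Return 'none', 'held', or 'notify' for one check result."""
    if not check_failed:
        return "none"
    in_window = any(
        w.service == service and w.start <= now < w.end for w in windows
    )
    # The result is still recorded either way; only the notification is held.
    return "held" if in_window else "notify"
```

Because the decision is re-evaluated on every check result, a failure that is still present once the window has ended returns `'notify'` on the next run.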

Which notification channels are supported?

Slack, Discord, email, SMS, Microsoft Teams, PagerDuty, WhatsApp, custom webhooks, and automatic status page updates. You can use different channels for different severity levels or escalation stages.

Can alert workflows trigger automated actions?

Yes. Alert workflows can trigger webhooks, self-healing agents, or status page updates as automated actions. For example, a critical alert can restart a service via a self-healing agent, notify the on-call engineer, and update the status page all in one workflow.

Run reliability as one connected workflow

Detect failures early, route alerts clearly, coordinate incidents, and keep status updates in sync from one system.