upti.my

Agent Local Checks

Reference for the exact check types and fields supported by the open-source uptimy-agent.

Overview

Local checks run from the agent and are defined in YAML under checks. Duration fields use Go-style durations (for example 15s, 60s, 1h). Source of truth: github.com/uptimy/uptimy-agent.
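
To make the duration format concrete, the following is a minimal Python sketch of how strings such as 15s, 60s, or 1h map to seconds. It is an illustration only, not the agent's code: the function name parse_duration is hypothetical, and the sketch handles just the units ns, us, ms, s, m, and h without signs, which is a subset of what Go's duration parser accepts.

```python
import re

# Seconds per unit for the subset of Go-style units this sketch handles.
_UNIT_SECONDS = {"ns": 1e-9, "us": 1e-6, "ms": 1e-3, "s": 1.0, "m": 60.0, "h": 3600.0}
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ns|us|ms|s|m|h)")


def parse_duration(text):
    """Convert a duration such as "15s", "1h", or "1h30m" to seconds."""
    if not text:
        raise ValueError("empty duration")
    pos = 0
    total = 0.0
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"invalid duration: {text!r}")
        total += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    return total
```

A value such as 30 without a unit is rejected by this sketch, which matches the guidance to always write interval and timeout as duration strings.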

ℹ️ No Inbound Access Required

Local checks operate entirely from within your network. The agent can run fully standalone. Control plane connectivity is optional and only required for centralized telemetry/visibility features.

Common Settings

Every check entry shares these core fields:

| Setting | Type | Description |
| --- | --- | --- |
| type | string | Check kind, for example http, tcp, or disk |
| name | string | Unique check identifier referenced by repair rules |
| service | string | Logical service grouping label |
| interval | duration | How often the check runs (for example 30s) |
| timeout | duration | Max runtime before timeout failure |
| tags | string array | Optional tags for filtering and grouping |
| metadata | object | Optional extra key/value metadata |

Check Types Reference

1. HTTP Check

HTTP/HTTPS endpoint check.

| Field | Type | Description |
| --- | --- | --- |
| url | string | Full URL to check, e.g., http://localhost:8080/health |
| method | string | HTTP method (for example GET, POST) |
| expected_status | integer | Expected HTTP status code. Default: 200. |
| headers | object | Optional custom headers to include in the request. |

HTTP Check Example
{
  "type": "http",
  "name": "api-health",
  "service": "api",
  "url": "http://localhost:3000/api/health",
  "method": "GET",
  "expected_status": 200,
  "interval": "30s",
  "timeout": "5s",
  "tags": ["api", "internal"]
}
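
The logic behind an HTTP check can be sketched in a few lines of Python. This is an assumption-laden illustration rather than the agent's implementation: the function name http_check is hypothetical, the parameters simply mirror the fields in the table above, and unlike a dedicated checker, urllib follows redirects automatically.

```python
import urllib.error
import urllib.request


def http_check(url, method="GET", expected_status=200, headers=None, timeout=5.0):
    """Return True when the endpoint answers with the expected status code."""
    request = urllib.request.Request(url, method=method, headers=headers or {})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the status code is still available.
        status = err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout: treat as a failed check.
        return False
    return status == expected_status
```

With the example configuration above, the equivalent call would be http_check("http://localhost:3000/api/health", timeout=5.0).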

2. TCP Check

TCP connectivity check to a host and port.

| Field | Type | Description |
| --- | --- | --- |
| address | string | Target host:port, for example localhost:5432 |

TCP Check Example
{
  "type": "tcp",
  "name": "postgres",
  "service": "postgres",
  "address": "localhost:5432",
  "interval": "15s",
  "timeout": "5s"
}
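
A TCP connectivity check amounts to opening a connection within the timeout and closing it again. The sketch below shows that idea in Python; the function name tcp_check is hypothetical and the behavior is an assumption about what "connectivity" means here, not a description of the agent's code.

```python
import socket


def tcp_check(address, timeout=5.0):
    """Return True if a TCP connection to "host:port" can be opened in time."""
    host, _, port = address.rpartition(":")
    host = host.strip("[]")  # allow bracketed IPv6 literals such as [::1]:5432
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except (OSError, ValueError):
        # OSError covers refused connections, DNS errors, and timeouts;
        # ValueError covers an address without a numeric port.
        return False
```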

3. Process Check

Verify that a specific process is running on the host. The agent scans the process list and matches by process name. If the process is not found, the check fails.

| Field | Type | Description |
| --- | --- | --- |
| service_name | string | Name of the process to look for, e.g., nginx or postgres |

Process Check Example
{
  "type": "process",
  "name": "nginx-running",
  "service": "nginx",
  "service_name": "nginx",
  "interval": "10s",
  "timeout": "5s"
}
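
The process-list scan described above can be approximated on Linux by reading the comm file of every numeric directory under /proc. This is a sketch under stated assumptions: it is Linux-only, the function name process_running and the proc_root parameter are hypothetical, and the agent may match names differently. Note that the kernel truncates comm to 15 characters.

```python
from pathlib import Path


def process_running(service_name, proc_root="/proc"):
    """Return True if any process under proc_root has the given name."""
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # only numeric directories correspond to process IDs
        try:
            comm = (entry / "comm").read_text().strip()
        except OSError:
            continue  # process exited, or it is not readable by this user
        if comm == service_name:
            return True
    return False
```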

4. Docker Container Check

Monitor the status of a Docker container by name. The check verifies that the container is running and, if a health check is configured on the container, that it reports a healthy state.

| Field | Type | Description |
| --- | --- | --- |
| container_name | string | Name of the Docker container to monitor, e.g., redis-cache |

Docker Container Check Example
{
  "type": "docker_container",
  "name": "redis-container",
  "service": "redis",
  "container_name": "redis-cache",
  "interval": "30s",
  "timeout": "10s"
}
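
The "running, and healthy if a health check exists" rule can be expressed against the State object that docker inspect reports. The sketch below shells out to the docker CLI for illustration; whether the agent uses the CLI or talks to the Docker socket directly is not stated here, and the function names are hypothetical. Treating a container whose health status is still starting as failing is also an assumption.

```python
import json
import subprocess


def container_healthy(state):
    """Interpret the State object from `docker inspect` output."""
    if not state.get("Running", False):
        return False
    health = state.get("Health")
    # Containers without a configured health check have no Health entry,
    # so being in the running state is sufficient for them.
    return health is None or health.get("Status") == "healthy"


def docker_container_check(container_name, timeout=10.0):
    """Run `docker inspect` and evaluate the container's state."""
    try:
        result = subprocess.run(
            ["docker", "inspect", "--format", "{{json .State}}", container_name],
            capture_output=True, text=True, timeout=timeout, check=True,
        )
        return container_healthy(json.loads(result.stdout))
    except (OSError, subprocess.SubprocessError, ValueError):
        # Missing docker binary, unknown container, timeout, or bad output.
        return False
```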

5. Docker Swarm Check

Docker Swarm health check. No additional type-specific fields are required beyond the common fields.

Docker Swarm Check Example
{
  "type": "docker_swarm",
  "name": "swarm-health",
  "service": "swarm",
  "interval": "30s",
  "timeout": "10s"
}

6. Disk Check

Monitor disk usage on a specified file system path. The check fails when the usage percentage exceeds your configured threshold, helping you prevent disk-full outages before they happen.

| Field | Type | Description |
| --- | --- | --- |
| path | string | File system path to monitor, e.g., / or /var/log |
| threshold | integer | Usage percentage that triggers a failure. Default: 90. |

Disk Usage Check Example
{
  "type": "disk",
  "name": "disk-check",
  "service": "system",
  "path": "/",
  "threshold": 85,
  "interval": "120s",
  "timeout": "10s"
}
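
The threshold comparison for a disk check can be sketched with Python's standard library. This assumes the usage percentage is computed as used bytes divided by total bytes, which may differ slightly from how the agent (or a tool like df) calculates it; the function name disk_check is hypothetical.

```python
import shutil


def disk_check(path="/", threshold=90):
    """Return (ok, used_percent); ok is False when usage exceeds threshold."""
    usage = shutil.disk_usage(path)
    used_percent = usage.used / usage.total * 100
    return used_percent <= threshold, used_percent
```

With the example configuration above, disk_check("/", threshold=85) would report a failure once the root file system is more than 85% full.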

7. Memory Check

Monitor system memory utilization. When total memory usage exceeds the configured threshold, the check fails. This helps you detect memory leaks and resource exhaustion early.

| Field | Type | Description |
| --- | --- | --- |
| threshold | integer | Memory usage percentage that triggers a failure. Default: 90. |

Memory Check Example
{
  "type": "memory",
  "name": "memory-check",
  "service": "system",
  "threshold": 90,
  "interval": "60s",
  "timeout": "10s"
}
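
On Linux, one way to derive a memory usage percentage is from the MemTotal and MemAvailable fields of /proc/meminfo. The sketch below takes that approach; it is Linux-only, the definition of "usage" as total minus available is an assumption rather than the agent's documented formula, and the function names are hypothetical.

```python
def memory_used_percent(meminfo_text):
    """Compute the used-memory percentage from /proc/meminfo content."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[key.strip()] = int(parts[0])  # first token is the numeric value
    total = fields["MemTotal"]
    available = fields["MemAvailable"]
    return (total - available) / total * 100


def memory_check(threshold=90, meminfo_path="/proc/meminfo"):
    """Return True while memory usage is at or below the threshold."""
    with open(meminfo_path) as handle:
        return memory_used_percent(handle.read()) <= threshold
```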

8. CPU Check

Monitor CPU utilization across all cores. The check samples CPU usage over the timeout window and fails if the average exceeds the configured threshold. Useful for detecting runaway processes and unexpected load spikes.

| Field | Type | Description |
| --- | --- | --- |
| threshold | integer | CPU usage percentage that triggers a failure. Default: 90. |

CPU Check Example
{
  "type": "cpu",
  "name": "cpu-check",
  "service": "system",
  "threshold": 85,
  "interval": "60s",
  "timeout": "10s"
}
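
Averaging CPU usage over a window is typically done by taking two snapshots of cumulative CPU time and comparing the idle share of the difference. The Linux-only sketch below does this with the aggregate cpu line of /proc/stat. The sampling window, the treatment of iowait as idle time, and all function names are assumptions for illustration, not the agent's implementation.

```python
import time


def cpu_times(stat_text):
    """Return (idle, total) ticks from the aggregate "cpu" line of /proc/stat."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            values = [int(v) for v in parts[1:]]
            idle = values[3] + values[4]  # idle + iowait
            # Sum user..steal only; guest time is already counted in user.
            return idle, sum(values[:8])
    raise ValueError("no aggregate cpu line found")


def cpu_used_percent(before, after):
    """Average CPU utilization between two /proc/stat snapshots."""
    idle_before, total_before = cpu_times(before)
    idle_after, total_after = cpu_times(after)
    total_delta = total_after - total_before
    if total_delta <= 0:
        return 0.0
    return (1 - (idle_after - idle_before) / total_delta) * 100


def cpu_check(threshold=90, window=1.0, stat_path="/proc/stat"):
    """Sample CPU usage over `window` seconds and compare to the threshold."""
    with open(stat_path) as handle:
        before = handle.read()
    time.sleep(window)
    with open(stat_path) as handle:
        after = handle.read()
    return cpu_used_percent(before, after) <= threshold
```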

9. Certificate Check

TLS certificate expiry check. The agent supports endpoint-based checks with cert_url and file-based checks with cert_path.

| Field | Type | Description |
| --- | --- | --- |
| cert_url | string | Endpoint form, for example api.example.com:443 |
| cert_path | string | Optional path-based certificate source |
| days_before_expiry | integer | Warning threshold in days |

Certificate Check Example
{
  "type": "certificate",
  "name": "api-cert",
  "service": "api",
  "cert_url": "api.example.com:443",
  "days_before_expiry": 30,
  "interval": "3600s",
  "timeout": "30s"
}
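
For the endpoint form, an expiry check boils down to reading the served certificate's notAfter date and comparing the remaining days with days_before_expiry. The Python sketch below illustrates this under a few assumptions: the function names are hypothetical, the check is treated as failing when fewer than days_before_expiry days remain, and the default TLS context rejects certificates that are already expired or untrusted during the handshake.

```python
import socket
import ssl
import time


def fetch_not_after(cert_url, timeout=30.0):
    """Return the notAfter string of the certificate served at "host:port"."""
    host, _, port = cert_url.rpartition(":")
    context = ssl.create_default_context()
    with socket.create_connection((host, int(port)), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after, now=None):
    """Whole days between `now` and the certificate's notAfter timestamp."""
    expires = ssl.cert_time_to_seconds(not_after)
    now = time.time() if now is None else now
    return int((expires - now) // 86400)


def certificate_check(cert_url, days_before_expiry=30, timeout=30.0):
    """Return True while more than the warning threshold of days remains."""
    remaining = days_until_expiry(fetch_not_after(cert_url, timeout))
    return remaining >= days_before_expiry
```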

💡 Field Names Matter

Use the exact field names from the open-source examples: threshold (not threshold_percent), service_name for process checks, and duration strings like 30s for interval/timeout.

Summary Table

| Check Type | Key Config Fields | Typical Interval |
| --- | --- | --- |
| HTTP | url, method, expected_status, headers | 30s |
| TCP | address | 15s to 30s |
| Process | service_name | 15s |
| Docker Container | container_name | 30s |
| Docker Swarm | (no extra type-specific fields) | 30s |
| Disk | path, threshold | 60s |
| Memory | threshold | 30s |
| CPU | threshold | 30s |
| Certificate | cert_url or cert_path, days_before_expiry | 3600s |
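
Because misspelled field names are an easy mistake to make, a small lint pass over the check definitions can catch them before deployment. The sketch below encodes the field names from the tables on this page and flags anything else. It is a hypothetical helper, not part of the agent; whether the agent itself rejects or silently ignores unknown fields is not covered here.

```python
# Type-specific fields per check type, taken from the tables on this page.
KNOWN_FIELDS = {
    "http": {"url", "method", "expected_status", "headers"},
    "tcp": {"address"},
    "process": {"service_name"},
    "docker_container": {"container_name"},
    "docker_swarm": set(),
    "disk": {"path", "threshold"},
    "memory": {"threshold"},
    "cpu": {"threshold"},
    "certificate": {"cert_url", "cert_path", "days_before_expiry"},
}
COMMON_FIELDS = {"type", "name", "service", "interval", "timeout", "tags", "metadata"}


def lint_checks(checks):
    """Return a list of human-readable problems found in check definitions."""
    problems = []
    seen = set()
    for index, check in enumerate(checks):
        kind = check.get("type")
        name = check.get("name", f"#{index}")
        if kind not in KNOWN_FIELDS:
            problems.append(f"{name}: unknown type {kind!r}")
            continue
        if name in seen:
            problems.append(f"{name}: duplicate name")
        seen.add(name)
        for field in sorted(set(check) - COMMON_FIELDS - KNOWN_FIELDS[kind]):
            problems.append(f"{name}: unexpected field {field!r} for type {kind}")
    return problems
```

For example, a disk check that uses threshold_percent instead of threshold is reported as an unexpected field.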

⚠️ Agent Permissions

Some checks require elevated permissions. Docker Container checks need access to the Docker socket. Process checks may need root access to see all running processes. Certificate checks using cert_path need read access to that file. Make sure your agent runs with the appropriate permissions.