Agent Local Checks
Configure local health checks on your agents to monitor internal services, processes, disk usage, memory, CPU, and more.
Overview
Local checks run directly on the host where the upti.my agent is installed. They monitor services, resources, and certificates from inside your infrastructure, giving you visibility into systems that external probes cannot reach. Each check type has its own configuration fields, and all checks share common settings for interval, timeout, and tagging.
ℹ️ No Inbound Access Required
Local checks operate entirely from within your network. The agent only needs outbound HTTPS access to report results back to upti.my. No inbound firewall rules or port openings are required.
Common Settings
Every local check type shares the following configuration options:
| Setting | Type | Description |
|---|---|---|
| interval | integer (seconds) | How often the check runs. Minimum 10 seconds, default 30 seconds. |
| timeout | integer (seconds) | Maximum time a check is allowed to run before being marked as failed. Default 10 seconds. |
| tags | string array | Optional labels for organizing and filtering checks (e.g., "production", "database"). |
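The defaults and limits in the table above can be sketched as a small normalization step. This is only an illustration of the documented rules, not the agent's actual validation code; the helper name `apply_common_settings` and the use of `ValueError` are assumptions.

```python
def apply_common_settings(check: dict) -> dict:
    """Return a copy of `check` with the documented common defaults filled in.

    Hypothetical helper: the real agent may validate settings differently.
    """
    result = dict(check)
    result.setdefault("interval", 30)  # documented default: 30 seconds
    result.setdefault("timeout", 10)   # documented default: 10 seconds
    result.setdefault("tags", [])      # tags are optional

    if result["interval"] < 10:        # documented minimum: 10 seconds
        raise ValueError("interval must be at least 10 seconds")
    if result["timeout"] <= 0:         # assumption: a timeout must be positive
        raise ValueError("timeout must be a positive number of seconds")
    if not all(isinstance(tag, str) for tag in result["tags"]):
        raise ValueError("tags must be an array of strings")
    return result
```

For example, `apply_common_settings({"type": "process", "process_name": "nginx"})` yields a check with `interval` 30, `timeout` 10, and an empty `tags` list.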
Check Types Reference
1. HTTP Check
Send an HTTP request to a local or internal endpoint and validate the response. This is ideal for monitoring internal APIs, admin panels, and microservices that are not publicly accessible.
| Field | Type | Description |
|---|---|---|
| url | string | Full URL to check, e.g., http://localhost:8080/health |
| method | string | HTTP method: GET, POST, PUT, DELETE, HEAD. Default: GET. |
| expected_status | integer | Expected HTTP status code. Default: 200. |
| expected_body | string | Optional substring that must appear in the response body. |
| headers | object | Optional custom headers to include in the request. |
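How `expected_status` and `expected_body` combine can be sketched as follows. This is a minimal illustration of the pass/fail rule described by the fields above, assuming the response has already been fetched; the function name `evaluate_http_response` is hypothetical and is not part of the agent.

```python
from typing import Optional


def evaluate_http_response(
    status: int,
    body: str,
    expected_status: int = 200,
    expected_body: Optional[str] = None,
) -> bool:
    """Return True when the response satisfies the check's expectations."""
    if status != expected_status:
        return False
    # expected_body is optional; when set, it must appear as a substring.
    if expected_body and expected_body not in body:
        return False
    return True
```

With this rule, a `200` response whose body contains `"status":"ok"` passes, while a `503`, or a `200` with a different body, fails.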
{
"type": "http",
"url": "http://localhost:3000/api/health",
"method": "GET",
"expected_status": 200,
"expected_body": "\"status\":\"ok\"",
"interval": 30,
"timeout": 10,
"tags": ["api", "internal"]
}
2. Process Check
Verify that a specific process is running on the host. The agent scans the process list and matches by process name. If the process is not found, the check fails.
| Field | Type | Description |
|---|---|---|
| process_name | string | Name of the process to look for, e.g., nginx or postgres |
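Conceptually, a process check is a lookup of `process_name` in the host's process list. The sketch below shows one way to do this on Linux by reading `/proc`; it is not the agent's implementation. Exact-name matching and both function names are assumptions, and the agent's matching rule may differ.

```python
import os
from typing import Iterable, List


def list_process_names(proc_root: str = "/proc") -> List[str]:
    """Collect process names from /proc/<pid>/comm (Linux only).

    Note: the kernel truncates names in `comm` to 15 characters.
    """
    names = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories represent processes
        try:
            with open(os.path.join(proc_root, entry, "comm")) as handle:
                names.append(handle.read().strip())
        except OSError:
            continue  # the process exited or is not readable
    return names


def process_is_running(process_name: str, running: Iterable[str]) -> bool:
    """Exact-name match (assumption) against the collected process names."""
    return process_name in set(running)
```

On a Linux host, `process_is_running("nginx", list_process_names())` returns `True` only if a process named `nginx` is visible to the current user.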
{
"type": "process",
"process_name": "nginx",
"interval": 15,
"timeout": 5,
"tags": ["web-server"]
}
3. Docker Container Check
Monitor the status of a Docker container by name. The check verifies that the container is running and, if a health check is configured on the container, that it reports a healthy state.
| Field | Type | Description |
|---|---|---|
| container_name | string | Name of the Docker container to monitor, e.g., redis-cache |
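The two conditions (running, and healthy if a health check exists) can be sketched against the `State` object that `docker inspect` reports for a container. This is an illustrative sketch, not the agent's code: the function name `container_is_ok` is hypothetical, and treating a `starting` health status as a failure is an assumption.

```python
def container_is_ok(state: dict) -> bool:
    """Evaluate the `State` object from `docker inspect` output."""
    if state.get("Status") != "running":
        return False
    health = state.get("Health")
    if health is None:
        # No health check is configured on the container, so running is enough.
        return True
    # Assumption: only "healthy" passes; "starting" and "unhealthy" fail.
    return health.get("Status") == "healthy"
```

For example, `{"Status": "running"}` passes, while `{"Status": "running", "Health": {"Status": "unhealthy"}}` and `{"Status": "exited"}` both fail.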
{
"type": "docker_container",
"container_name": "redis-cache",
"interval": 30,
"timeout": 10,
"tags": ["cache", "docker"]
}
4. Disk Usage Check
Monitor disk usage on a specified file system path. The check fails when the usage percentage exceeds your configured threshold, helping you prevent disk-full outages before they happen.
| Field | Type | Description |
|---|---|---|
| path | string | File system path to monitor, e.g., / or /var/log |
| threshold_percent | integer | Usage percentage that triggers a failure. Default: 90. |
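The threshold comparison can be sketched with Python's standard `shutil.disk_usage`. How the agent actually measures usage is not documented here, so computing it as used bytes divided by total bytes is an assumption, as are the function names.

```python
import shutil


def disk_usage_percent(path: str) -> float:
    """Return used space on the file system containing `path`, as a percentage."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0  # guard against pseudo file systems that report no capacity
    return usage.used / usage.total * 100


def disk_check_passes(path: str, threshold_percent: int = 90) -> bool:
    """The check fails once usage exceeds the threshold."""
    return disk_usage_percent(path) <= threshold_percent
```

Calling `disk_check_passes("/", 85)` mirrors the `"path": "/"`, `"threshold_percent": 85` configuration style used for this check type.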
{
"type": "disk_usage",
"path": "/",
"threshold_percent": 85,
"interval": 60,
"timeout": 5,
"tags": ["infrastructure"]
}
5. Memory Check
Monitor system memory utilization. When total memory usage exceeds the configured threshold, the check fails. This helps you detect memory leaks and resource exhaustion early.
| Field | Type | Description |
|---|---|---|
| threshold_percent | integer | Memory usage percentage that triggers a failure. Default: 90. |
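As a rough illustration of what "memory usage percentage" can mean on Linux, the sketch below derives it from the contents of `/proc/meminfo`. The formula (total minus available, over total) and the function name are assumptions; the agent may define usage differently, especially on other operating systems.

```python
def memory_usage_percent(meminfo_text: str) -> float:
    """Compute used memory as a percentage from /proc/meminfo-style text."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[key.strip()] = int(parts[0])  # sizes are reported in kB
    total = fields["MemTotal"]
    available = fields["MemAvailable"]
    return (total - available) / total * 100


# Hypothetical sample data, trimmed to the fields used above.
SAMPLE_MEMINFO = """MemTotal:       16000000 kB
MemFree:         2000000 kB
MemAvailable:    4000000 kB
"""
```

With the sample data, 12,000,000 kB of 16,000,000 kB is in use, so usage is 75% and a check with `threshold_percent` set to 90 would pass.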
{
"type": "memory",
"threshold_percent": 90,
"interval": 30,
"timeout": 5,
"tags": ["infrastructure"]
}
6. CPU Check
Monitor CPU utilization across all cores. The check samples CPU usage over the timeout window and fails if the average exceeds the configured threshold. Useful for detecting runaway processes and unexpected load spikes.
| Field | Type | Description |
|---|---|---|
| threshold_percent | integer | CPU usage percentage that triggers a failure. Default: 90. |
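Because the check compares an average rather than a single reading, a brief spike does not necessarily fail it. A minimal sketch of that rule, assuming the utilization samples (in percent) have already been collected; the function name is hypothetical:

```python
from statistics import mean
from typing import Sequence


def cpu_check_passes(samples: Sequence[float], threshold_percent: int = 90) -> bool:
    """Pass when the average of the sampled CPU percentages is within the threshold."""
    if not samples:
        raise ValueError("at least one CPU sample is required")
    return mean(samples) <= threshold_percent
```

For example, samples of 80, 95, and 70 average about 81.7, so the check passes at a threshold of 85 even though one sample exceeded it.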
{
"type": "cpu",
"threshold_percent": 85,
"interval": 30,
"timeout": 10,
"tags": ["infrastructure"]
}
7. Certificate Check
Monitor local TLS certificate files for upcoming expiration. The agent reads the certificate file from disk and checks how many days remain before it expires. If the remaining days fall below the warning threshold, the check fails.
| Field | Type | Description |
|---|---|---|
| cert_path | string | Absolute path to the TLS certificate file, e.g., /etc/ssl/certs/app.crt |
| warning_days | integer | Warning threshold in days. The check fails when fewer days than this remain before expiry. Default: 30. |
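The expiry comparison itself is simple date arithmetic. The sketch below assumes the certificate's expiry timestamp (`notAfter`) has already been read from the file; parsing the certificate is out of scope here, and both function names are hypothetical.

```python
from datetime import datetime, timezone


def days_until_expiry(not_after: datetime, now: datetime) -> int:
    """Whole days remaining before the certificate expires (negative if expired)."""
    return (not_after - now).days


def certificate_check_passes(
    not_after: datetime, now: datetime, warning_days: int = 30
) -> bool:
    """The check fails when the remaining days fall below `warning_days`."""
    return days_until_expiry(not_after, now) >= warning_days


# Example with fixed, made-up dates so the result is reproducible.
NOW = datetime(2024, 1, 1, tzinfo=timezone.utc)
EXPIRES = datetime(2024, 3, 1, tzinfo=timezone.utc)
```

Here `days_until_expiry(EXPIRES, NOW)` is 60, so the check passes with the default `warning_days` of 30 and would fail if `warning_days` were set to 90.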
{
"type": "certificate",
"cert_path": "/etc/ssl/certs/app.crt",
"warning_days": 30,
"interval": 3600,
"timeout": 5,
"tags": ["ssl", "security"]
}
💡 Choosing Check Intervals
Use shorter intervals (10 to 30 seconds) for critical service checks like HTTP and Process. Resource checks like Disk, Memory, and CPU can usually run less often (30 to 60 seconds or more). Certificate checks only need to run about once per hour (3600 seconds), since the time remaining before expiry changes slowly.
Summary Table
| Check Type | Key Config Fields | Typical Interval |
|---|---|---|
| HTTP | url, method, expected_status, expected_body, headers | 30s |
| Process | process_name | 15s |
| Docker Container | container_name | 30s |
| Disk Usage | path, threshold_percent | 60s |
| Memory | threshold_percent | 30s |
| CPU | threshold_percent | 30s |
| Certificate | cert_path, warning_days | 3600s |
⚠️ Agent Permissions
Some checks require elevated permissions. Docker Container checks need access to the Docker socket. Process checks may need root access to see all running processes. Certificate checks need read access to the certificate file. Make sure your agent runs with the appropriate permissions.