Restart Policy
Some agents are long-running daemons rather than one-shot tasks. When such an agent crashes (non-zero exit, panic, or runner error), the aide daemon's supervisor can auto-restart it according to a [daemon] block in the Aidefile.
Aidefile schema
[daemon]
restart = "on-failure" # always | on-failure | never
max_restarts = 10 # supervisor gives up after this many consecutive restarts
backoff = "exponential" # exponential | linear | fixed
All fields are optional. If the [daemon] block is omitted entirely, agents keep their existing single-shot behaviour (restart = "never"), so all pre-existing Aidefiles continue to work unchanged.
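The defaulting rule for restart can be sketched in a few lines of Python. This is an illustration only, not aide's actual parser: the helper name `effective_restart_mode` and the idea of passing the parsed Aidefile as a plain dict are assumptions made for the example.

```python
VALID_MODES = {"always", "on-failure", "never"}


def effective_restart_mode(aidefile: dict) -> str:
    """Return the restart mode implied by a parsed Aidefile (hypothetical helper)."""
    daemon = aidefile.get("daemon")
    if daemon is None:
        # No [daemon] block at all: legacy single-shot behaviour.
        return "never"
    # [daemon] is present: restart falls back to "on-failure" when unset.
    mode = daemon.get("restart", "on-failure")
    if mode not in VALID_MODES:
        raise ValueError(f"unknown restart mode: {mode!r}")
    return mode
```

Under these assumptions, an empty dict yields "never", while a dict containing an empty "daemon" table yields "on-failure".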
Restart modes
| Mode | When the supervisor restarts |
|---|---|
| always | After every run, both success and failure. Use for true daemons. |
| on-failure | Only when the run failed (non-zero exit / runner Err / budget exhausted). Default when [daemon] is present. |
| never | Never restart. Default for legacy Aidefiles without [daemon]. |
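The table above boils down to a small decision function. The following Python sketch is hypothetical (the name `should_restart` and the boolean `succeeded` flag are not taken from aide) and deliberately ignores max_restarts and backoff, which are described separately.

```python
def should_restart(mode: str, succeeded: bool) -> bool:
    """Decide whether a finished run should be restarted (illustrative only)."""
    if mode == "always":
        # Restart after every run, whether it succeeded or failed.
        return True
    if mode == "on-failure":
        # Restart only when the run failed.
        return not succeeded
    if mode == "never":
        return False
    raise ValueError(f"unknown restart mode: {mode!r}")
```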
max_restarts
A safety net against infinite crash loops. The supervisor counts consecutive restart attempts per agent and stops once max_restarts is reached. The counter resets to zero after any successful run, so a flaky agent that eventually succeeds gets a fresh budget.
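A minimal sketch of that counter in Python, assuming one budget object per agent. The class and method names are hypothetical, and the exact boundary (whether the restart numbered max_restarts is itself allowed) is an assumption of this example rather than something the text above specifies.

```python
class RestartBudget:
    """Tracks consecutive restart attempts for one agent (illustrative only)."""

    def __init__(self, max_restarts: int) -> None:
        self.max_restarts = max_restarts
        self.consecutive = 0

    def record_success(self) -> None:
        # Any successful run gives the agent a fresh budget.
        self.consecutive = 0

    def try_restart(self) -> bool:
        """Consume one attempt; return False once the budget is exhausted."""
        if self.consecutive >= self.max_restarts:
            return False
        self.consecutive += 1
        return True
```

With `RestartBudget(2)`, two calls to `try_restart()` succeed and the third is refused until `record_success()` resets the counter.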
Backoff strategies
The supervisor sleeps between restarts to avoid hot-looping. For the exponential and linear strategies the sleep duration grows with the attempt number (0-indexed); for fixed it stays constant:
| Strategy | Sleep sequence (seconds) |
|---|---|
| exponential | 1, 2, 4, 8, 16, 32, 64, 128, 256, 300, 300, … |
| linear | 1, 2, 3, 4, 5, … |
| fixed | 1, 1, 1, 1, … |
All strategies are capped at 300 seconds (5 minutes) per individual sleep, so even pathological attempt counts stay bounded.
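The sequences in the table are consistent with the formulas 2^attempt, attempt + 1, and a constant 1, each clamped to 300 seconds. The Python sketch below reproduces them under that reading; the formulas are inferred from the table, and the function name `backoff_seconds` is hypothetical.

```python
MAX_SLEEP_SECONDS = 300  # per-sleep cap of 5 minutes


def backoff_seconds(strategy: str, attempt: int) -> int:
    """Sleep duration before restart number `attempt` (0-indexed), illustrative only."""
    if attempt < 0:
        raise ValueError("attempt must be non-negative")
    if strategy == "exponential":
        # 2**9 = 512 already exceeds the cap, so clamp the exponent
        # to avoid building huge integers for large attempt counts.
        raw = 2 ** min(attempt, 9)
    elif strategy == "linear":
        raw = attempt + 1
    elif strategy == "fixed":
        raw = 1
    else:
        raise ValueError(f"unknown backoff strategy: {strategy!r}")
    return min(raw, MAX_SLEEP_SECONDS)
```

For example, `[backoff_seconds("exponential", a) for a in range(11)]` gives `[1, 2, 4, 8, 16, 32, 64, 128, 256, 300, 300]`, matching the first row of the table.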
Example: monitoring agent that should never stop
[persona]
name = "uptime-monitor"
[trigger]
on = "cron:*/5 * * * *" # every 5 minutes
[daemon]
restart = "always" # respawn on success and failure
max_restarts = 100
backoff = "exponential"
Use cases
- Monitoring agents (restart = "always") that should never stop
- Stream processors that crash intermittently (restart = "on-failure")
- Any agent with [trigger] on = "cron:..." that you want kept alive without an external watchdog
Notes
- The restart counter is in-memory only: it lives inside the running daemon process and is reset on aide up/aide down. This is intentional: persistent state for crash budgets is overkill for v2.
- Restart policy is independent of trigger type. The cron trigger schedules the first run; the supervisor then decides whether to keep it alive.