aide policy-update
Update the routing policy from dispatch history.
Usage
# Standard update: LinUCB bandit learns from events.jsonl
aide policy-update
# Full update: bandit + MEDS failure clustering
aide policy-update --full
What it does
- Reads `~/.aide/events.jsonl` for finished dispatch events
- Extracts task features (12-category keyword classification)
- Computes reward per event: `r = 0.7 * success + 0.3 * token_efficiency`
- Updates per-agent LinUCB bandit parameters (A matrix + b vector)
- Writes routing weights to `~/.aide/policy.toml`
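The reward and bandit update steps above can be sketched as follows. This is a minimal illustration and not aide's actual implementation: it assumes a one-hot feature vector over the 12 categories, `token_efficiency` scaled to [0, 1], an identity-initialised A matrix, and an exploration weight `alpha`. The names `LinUCBArm`, `one_hot`, and `solve` are hypothetical.

```python
import math


def reward(success: bool, token_efficiency: float) -> float:
    """Per-event reward; token_efficiency is assumed to lie in [0, 1]."""
    return 0.7 * float(success) + 0.3 * token_efficiency


def one_hot(index: int, size: int = 12) -> list[float]:
    """Assumed feature encoding: one slot per task category."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def dot(u: list[float], v: list[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def solve(A: list[list[float]], v: list[float]) -> list[float]:
    """Solve A y = v by Gauss-Jordan elimination.

    No pivoting is done; A here is identity plus outer products,
    so it is symmetric positive definite and the pivots stay positive.
    """
    n = len(v)
    M = [row[:] + [v[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = M[col][col]
        M[col] = [e / pivot for e in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0.0:
                f = M[r][col]
                M[r] = [a - f * p for a, p in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


class LinUCBArm:
    """State for one agent: the A matrix and b vector."""

    def __init__(self, d: int = 12):
        self.A = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
        self.b = [0.0] * d

    def update(self, x: list[float], r: float) -> None:
        # A += x x^T, b += r x
        for i in range(len(x)):
            for j in range(len(x)):
                self.A[i][j] += x[i] * x[j]
            self.b[i] += r * x[i]

    def ucb(self, x: list[float], alpha: float = 1.0) -> float:
        # theta = A^-1 b; score = theta.x + alpha * sqrt(x^T A^-1 x)
        theta = solve(self.A, self.b)
        width = alpha * math.sqrt(dot(x, solve(self.A, x)))
        return dot(theta, x) + width
```

Under these assumptions, routing for a task would compare `arm.ucb(x)` across agents and pick the highest value. How aide maps these scores to the weights in `policy.toml` is not specified here.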
With --full:
- Embeds failed tasks via local ollama (`nomic-embed-text`)
- Clusters failures by cosine similarity (threshold 0.75)
- Saves failure centroids to `~/.aide/failure_patterns.json`
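The clustering step can be illustrated with a simple greedy scheme. This is a sketch under assumptions, not necessarily the algorithm aide uses: each embedding joins the most similar existing cluster if its cosine similarity to that centroid is at least 0.75, otherwise it starts a new cluster. Embeddings are taken as plain lists of floats, so no ollama call is made, and the function names are hypothetical.

```python
import math


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    d = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return d / (nu * nv) if nu and nv else 0.0


def cluster_failures(embeddings: list[list[float]], threshold: float = 0.75):
    """Greedy single-pass clustering by cosine similarity to centroids."""
    clusters = []  # each cluster: {"centroid": [...], "members": [indices]}
    for idx, vec in enumerate(embeddings):
        best, best_sim = None, threshold
        for c in clusters:
            sim = cosine(vec, c["centroid"])
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append({"centroid": list(vec), "members": [idx]})
        else:
            # Running mean keeps the centroid equal to the average of its members.
            n = len(best["members"])
            best["centroid"] = [
                (c * n + x) / (n + 1) for c, x in zip(best["centroid"], vec)
            ]
            best["members"].append(idx)
    return clusters
```

For example, `cluster_failures([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])` groups the first two vectors together and leaves the third in its own cluster. The result depends on input order, which is a known trade-off of single-pass clustering.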
Output
Policy updated: 25 events processed, 5 agents, 14529μs CPU
State: /home/user/.aide/bandit.json
Policy: /home/user/.aide/policy.toml
CATEGORY    TOP AGENT         SCORE
──────────────────────────────────────────
infra       infra-guardian    1.00
gpu         infra-guardian    1.00
pipeline    pipeline-doctor   1.00
web         crossmem-chrome   1.00
...
Options
| Flag | Description |
|---|---|
| `--full` | Also run failure clustering (requires ollama + nomic-embed-text) |
Daemon integration
The daemon runs policy-update (without --full) on startup and every hour. You only need to run this manually for immediate updates or --full clustering.
See also
- Smart Routing guide — how the bandit works
- aide dispatch --auto — auto-select agent using bandit scores