aide policy-update

Update the routing policy from dispatch history.

Usage

# Standard update: LinUCB bandit learns from events.jsonl
aide policy-update

# Full update: bandit + MEDS failure clustering
aide policy-update --full

What it does

  1. Reads ~/.aide/events.jsonl for finished dispatch events
  2. Extracts task features (12-category keyword classification)
  3. Computes reward per event: r = 0.7 * success + 0.3 * token_efficiency
  4. Updates per-agent LinUCB bandit parameters (A matrix + b vector)
  5. Writes routing weights to ~/.aide/policy.toml
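
Steps 3 and 4 can be sketched as follows. This is a minimal illustration, not aide's actual implementation: the `reward` helper, the `LinUCBArm` class, the `alpha` exploration parameter, the one-hot 12-dimensional feature vector, and the assumption that token_efficiency lies in [0, 1] are all hypothetical. Only the reward formula and the per-agent A matrix and b vector come from the description above.

```python
import numpy as np

NUM_CATEGORIES = 12  # matches the 12-category classification in step 2


def reward(success: bool, token_efficiency: float) -> float:
    """r = 0.7 * success + 0.3 * token_efficiency (token_efficiency assumed in [0, 1])."""
    return 0.7 * float(success) + 0.3 * token_efficiency


class LinUCBArm:
    """Hypothetical per-agent LinUCB state: A (d x d matrix) and b (d vector)."""

    def __init__(self, dim: int = NUM_CATEGORIES, alpha: float = 1.0):
        self.A = np.eye(dim)    # identity prior keeps A invertible
        self.b = np.zeros(dim)
        self.alpha = alpha      # exploration weight (assumed, not from the source)

    def update(self, x: np.ndarray, r: float) -> None:
        """Standard LinUCB update for one observed (features, reward) pair."""
        self.A += np.outer(x, x)
        self.b += r * x

    def score(self, x: np.ndarray) -> float:
        """Upper confidence bound: theta . x + alpha * sqrt(x^T A^-1 x)."""
        A_inv = np.linalg.inv(self.A)
        theta = A_inv @ self.b
        return float(theta @ x + self.alpha * np.sqrt(x @ A_inv @ x))


def one_hot(category_index: int, dim: int = NUM_CATEGORIES) -> np.ndarray:
    """Assumed feature encoding: one-hot vector over task categories."""
    x = np.zeros(dim)
    x[category_index] = 1.0
    return x
```

Under these assumptions, each finished event would call `arm.update(one_hot(category), reward(success, token_efficiency))` on the arm of the agent that handled it, and routing would prefer the agent with the highest `score` for the task's category.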

With --full, it also:

  1. Embeds failed tasks via local ollama (nomic-embed-text)
  2. Clusters failures by cosine similarity (threshold 0.75)
  3. Saves failure centroids to ~/.aide/failure_patterns.json
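
The clustering step could look roughly like the sketch below. The source only states that failures are clustered by cosine similarity with a 0.75 threshold, so the greedy single-pass assignment and the running-mean centroid update here are assumptions, and `cluster_by_cosine` is a hypothetical name. The embeddings are taken as plain vectors; the call to ollama is deliberately left out.

```python
import numpy as np


def cluster_by_cosine(embeddings, threshold=0.75):
    """Greedy clustering (assumed strategy): each vector joins the most similar
    existing centroid if cosine similarity >= threshold, else starts a new cluster.

    Vectors must be nonzero. Returns (centroids, members), where members[k]
    lists the input indices assigned to cluster k.
    """
    centroids = []  # running mean of the unit-normalized members
    members = []
    for i, v in enumerate(embeddings):
        v = np.asarray(v, dtype=float)
        v = v / np.linalg.norm(v)  # normalize so the dot product is cosine-based
        best, best_sim = -1, threshold
        for k, c in enumerate(centroids):
            sim = float(v @ c / np.linalg.norm(c))
            if sim >= best_sim:
                best, best_sim = k, sim
        if best < 0:
            centroids.append(v.copy())
            members.append([i])
        else:
            n = len(members[best])
            centroids[best] = (centroids[best] * n + v) / (n + 1)
            members[best].append(i)
    return centroids, members
```

A greedy pass like this depends on input order, which is one reason the real implementation may differ. The resulting centroids are the kind of data that would be saved to failure_patterns.json, whose exact schema is not specified here.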

Output

Policy updated: 25 events processed, 5 agents, 14529μs CPU
State: /home/user/.aide/bandit.json
Policy: /home/user/.aide/policy.toml

CATEGORY     TOP AGENT            SCORE
──────────────────────────────────────────
infra        infra-guardian       1.00
gpu          infra-guardian       1.00
pipeline     pipeline-doctor      1.00
web          crossmem-chrome      1.00
...

Options

FLAG         DESCRIPTION
──────────────────────────────────────────
--full       Also run failure clustering (requires ollama + nomic-embed-text)

Daemon integration

The daemon runs policy-update (without --full) on startup and every hour. Run the command manually only when you need an immediate update or want --full clustering.

See also