Agent Local-Network Runbook

LAN test setup, operational checks, troubleshooting, and regression checklist.

# Agent Runbook (Local Network)

This runbook is for local testing when:
- The app and worker run on one machine.
- An agent client runs on the same LAN and calls the API remotely.

## 1. Preconditions

- App host machine has:
  - `.env` configured
  - database reachable
  - web + worker processes running
- For deterministic API error-contract testing, set:
  - `ENABLE_TEST_FAULT_INJECTION=true` (non-production only)
- For local/private webhook receiver URLs, set:
  - `ALLOW_PRIVATE_WEBHOOK_URLS=true` (non-production only)
- LAN clients can reach the host on TCP port `3000`.
- The host firewall allows inbound connections on TCP port `3000`.
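
Both flags above are marked non-production only. A minimal guard is sketched below, assuming the deployment marks production through `NODE_ENV` (an assumption; this runbook does not say how the app detects its environment). `unsafeFlags` is a hypothetical helper, not part of the app.

```typescript
// Sketch only: using NODE_ENV as the production marker is an assumption.
type Env = Record<string, string | undefined>;

// Flags this runbook marks as non-production only.
const NON_PROD_FLAGS: string[] = [
  "ENABLE_TEST_FAULT_INJECTION",
  "ALLOW_PRIVATE_WEBHOOK_URLS",
];

// Returns the test-only flags that are set to "true" while NODE_ENV is "production".
function unsafeFlags(env: Env): string[] {
  if (env["NODE_ENV"] !== "production") return [];
  return NON_PROD_FLAGS.filter((name) => env[name] === "true");
}
```

Calling `unsafeFlags(process.env)` at startup and refusing to boot when the result is non-empty is one way to keep these flags out of production.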

## 2. Base URL

Use the host's LAN address:
- `http://<HOST_LAN_IP>:3000`

Example:
- `http://192.168.1.70:3000`
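
An agent client might build its URLs from the host IP as sketched below. `agentBaseUrl` and `endpoint` are hypothetical helper names; the IPv4 check is a convenience of this sketch, not a requirement of the API.

```typescript
// Hypothetical helper: builds the base URL from the host's LAN IPv4 address.
function agentBaseUrl(hostLanIp: string, port: number = 3000): string {
  const octets = hostLanIp.split(".");
  const valid =
    octets.length === 4 &&
    octets.every((o) => /^\d{1,3}$/.test(o) && Number(o) <= 255);
  if (!valid) throw new Error("not an IPv4 address: " + hostLanIp);
  return "http://" + hostLanIp + ":" + port;
}

// Joins a base URL and an API path without doubling or dropping the slash.
function endpoint(base: string, path: string): string {
  return base.replace(/\/+$/, "") + "/" + path.replace(/^\/+/, "");
}
```

For example, `endpoint(agentBaseUrl("192.168.1.70"), "/api/v1/content/jobs")` yields `http://192.168.1.70:3000/api/v1/content/jobs`.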

## 3. Human Operator Startup Checklist

1. Start app + worker:
```bash
npm run dev
```

2. Confirm web app opens at:
- `http://localhost:3000`

3. Confirm LAN client can reach:
- `http://<HOST_LAN_IP>:3000`

4. Log in as the development user in browser.

5. In Admin UI:
- verify provider/API connections
- verify agent settings (rate limits, key TTL, webhook retry schedule)

## 4. Agent Bootstrap Flow

1. Ensure account:
- `POST /api/v1/agent/accounts/ensure`

2. Activate account:
- `POST /api/v1/agent/accounts/activate`

3. Create initial API key:
- `POST /api/v1/agent/keys` with `email` + `activation_code`

4. Store API key securely.

5. Use header:
```http
Authorization: Bearer <api_key>
```
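
The bootstrap calls above can be sketched as request descriptors. Only the `email` and `activation_code` fields of the key-creation call come from this runbook; the request bodies for `ensure` and `activate` are assumptions, so check `docs/agent-api.md` for the real shapes. All function names here are hypothetical.

```typescript
interface AgentRequest {
  method: "POST";
  url: string;
  headers: Record<string, string>;
  body: string;
}

function post(baseUrl: string, path: string, payload: object): AgentRequest {
  return {
    method: "POST",
    url: baseUrl + path,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  };
}

// Step 1. ASSUMPTION: the body carries only the account email.
function ensureAccountRequest(baseUrl: string, email: string): AgentRequest {
  return post(baseUrl, "/api/v1/agent/accounts/ensure", { email: email });
}

// Step 2. ASSUMPTION: activation takes the same fields as key creation.
function activateAccountRequest(baseUrl: string, email: string, code: string): AgentRequest {
  return post(baseUrl, "/api/v1/agent/accounts/activate", { email: email, activation_code: code });
}

// Step 3. Fields taken from the runbook: `email` + `activation_code`.
function createKeyRequest(baseUrl: string, email: string, code: string): AgentRequest {
  return post(baseUrl, "/api/v1/agent/keys", { email: email, activation_code: code });
}

// Step 5. Header attached to every authenticated call.
function authHeader(apiKey: string): Record<string, string> {
  return { Authorization: "Bearer " + apiKey };
}
```

Each descriptor can be passed to any HTTP client, for example `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`.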

## 5. Content Job Flow (Agent)

1. Create job:
- `POST /api/v1/content/jobs`
- Required header: `Idempotency-Key`

2. Track progress:
- Poll: `GET /api/v1/content/jobs/{id}`
- Stream: `GET /api/v1/content/jobs/{id}/events` with `Accept: text/event-stream`

3. Optional cancel:
- `DELETE /api/v1/content/jobs/{id}`

4. Fetch result:
- `GET /api/v1/content/jobs/{id}/result`

5. Handle terminal statuses:
- `ready`
- `error`
- `canceled`
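
The flow above can be sketched as a job-creation descriptor plus a small decision function for the three terminal statuses. The job payload schema is not covered in this runbook, so the sketch passes it through untouched; `createJobRequest`, `jobUrls`, and `nextStep` are hypothetical names.

```typescript
interface JobRequest {
  method: string;
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Builds the create-job call. The caller supplies a unique idempotency key
// (for example a UUID) and reuses it when retrying the same logical job.
function createJobRequest(baseUrl: string, apiKey: string, idempotencyKey: string, payload: unknown): JobRequest {
  return {
    method: "POST",
    url: baseUrl + "/api/v1/content/jobs",
    headers: {
      Authorization: "Bearer " + apiKey,
      "Content-Type": "application/json",
      "Idempotency-Key": idempotencyKey,
    },
    body: JSON.stringify(payload),
  };
}

// URLs for tracking, streaming, and fetching the result of one job.
function jobUrls(baseUrl: string, id: string): { status: string; events: string; result: string } {
  const job = baseUrl + "/api/v1/content/jobs/" + encodeURIComponent(id);
  return { status: job, events: job + "/events", result: job + "/result" };
}

type NextStep = "keep_polling" | "fetch_result" | "inspect_error" | "stop";

// Maps a job status to what the agent should do next.
function nextStep(status: string): NextStep {
  switch (status) {
    case "ready":
      return "fetch_result";
    case "error":
      return "inspect_error";
    case "canceled":
      return "stop";
    default:
      return "keep_polling"; // `queued`, `processing`, or any other non-terminal status
  }
}
```

Treating every unrecognized status as non-terminal keeps the client polling if new intermediate statuses appear; a stricter client could fail fast instead.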

## 6. Webhook Setup (Optional)

1. Register endpoint:
- `POST /api/v1/agent/webhooks`

2. Test delivery:
- `POST /api/v1/agent/webhooks/{id}/test`

3. Verify signature headers:
- `X-Webhook-Id`
- `X-Webhook-Timestamp`
- `X-Webhook-Signature`
- `X-Webhook-Event`

4. Verify that retries follow the admin-configured retry schedule.
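
This runbook lists the signature headers but not the signing scheme. The sketch below ASSUMES a common convention: a hex-encoded HMAC-SHA256 over `<timestamp>.<raw body>` using a shared webhook secret. Confirm the actual scheme in `docs/agent-api.md` before relying on it; `verifyWebhookSignature` is a hypothetical helper.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// ASSUMPTION: signature = hex(HMAC-SHA256(secret, timestamp + "." + rawBody)).
// `timestamp` is the X-Webhook-Timestamp value, `signatureHex` the X-Webhook-Signature value.
function verifyWebhookSignature(secret: string, timestamp: string, rawBody: string, signatureHex: string): boolean {
  const expected = createHmac("sha256", secret)
    .update(timestamp + "." + rawBody)
    .digest();
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}
```

The comparison uses `timingSafeEqual` rather than `===` so that the check does not leak how many leading bytes matched. Verify against the raw request body, not a re-serialized JSON object.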

## 7. Local Test Use Cases

1. Queue and observe `queued`
- Pass when job starts as `queued` and transitions to `processing`.

2. Observe live progress over SSE
- Pass when `job.update` events keep arriving while the job is processing.

3. Observe partial content growth
- Pass when `partial_content.completed` and the length of `elements[]` increase during processing.

4. Verify granular stage counters
- Pass when `stage_step` and `stage_total` are present and change as the job advances.

5. Verify partial result endpoint
- Pass when `/result` returns `result: null` plus `partial_result` before the job is `ready`.

6. Verify completion
- Pass when terminal status is `ready` and `job.done` arrives in SSE.

7. Verify cancel of queued run
- Pass when final status remains `canceled`.

8. Verify cancel of processing run
- Pass when final status remains `canceled` (no re-entry to processing).

9. Verify structured error payload
- Pass when failed run includes `error.stage`, `error.code`, `error.message`, `error.timestamp`.

10. Verify timing metrics
- Pass when payload includes `queued_at`, `started_at`, `completed_at`, `time_in_queue_ms`, `processing_time_ms`, `age_ms`.

11. Verify rate-limit headers
- Pass when responses include `X-RateLimit-Limit`, `X-RateLimit-Remaining`, `X-RateLimit-Reset`.

12. Verify dashboard live feeds
- Pass when `/api/runs/live/events` updates UI without full refresh.

13. Verify dashboard cancel action
- Pass when cancel button stops run and UI state updates.

14. Verify key rotation flow
- Pass when rotated key works and revoked key no longer authenticates.

15. Regression checks
- `npm run typecheck`
- `npx vitest run --config vitest.config.ts tests/use-cases/agent-api-core.test.ts tests/unit/run-state.test.ts`
- One-command host validation:
  - `npm run test:agent-api-e2e`
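
Use cases 2 and 6 depend on reading `job.update` and `job.done` events from the SSE stream. A minimal sketch of a frame parser for `text/event-stream` data follows; it handles only the `event` and `data` fields, and `parseSse` is a hypothetical helper rather than part of the project.

```typescript
interface SseEvent {
  event: string;
  data: string;
}

// Parses all complete SSE frames in `buffer` and returns the unparsed tail,
// which the caller prepends to the next network chunk.
function parseSse(buffer: string): { events: SseEvent[]; rest: string } {
  const frames = buffer.replace(/\r\n/g, "\n").split("\n\n");
  const rest = frames.pop() ?? ""; // last piece is incomplete until a blank line arrives
  const events: SseEvent[] = [];
  for (const frame of frames) {
    let event = "message"; // default event name when no `event:` field is sent
    const data: string[] = [];
    for (const line of frame.split("\n")) {
      if (line.charAt(0) === ":") continue; // comment or keep-alive line
      const idx = line.indexOf(":");
      const field = idx === -1 ? line : line.slice(0, idx);
      let value = idx === -1 ? "" : line.slice(idx + 1);
      if (value.charAt(0) === " ") value = value.slice(1); // strip one leading space
      if (field === "event") event = value;
      else if (field === "data") data.push(value);
    }
    if (data.length > 0) events.push({ event: event, data: data.join("\n") });
  }
  return { events: events, rest: rest };
}
```

A test harness can feed each received chunk through `parseSse`, count the `job.update` events, and pass use case 6 once an event named `job.done` appears.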

## 8. Default Testing Settings

- `jobs_per_minute_per_user = 10`
- `reads_per_minute_per_user = 60`
- `webhook_tests_per_minute_per_user = 10`
- `key_ttl_days = 30`
- `runs_retention_days = 30`
- `webhook_logs_retention_days = 30`
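
As a worked example of what these defaults imply for a test client: the sketch below assumes the per-minute limits are plain quotas and that the key TTL counts from key creation. Neither detail is stated in this runbook, and both helper names are hypothetical.

```typescript
const MS_PER_MINUTE = 60000;
const MS_PER_DAY = 86400000;

// Smallest even spacing between requests that stays within a per-minute quota.
function minSpacingMs(perMinute: number): number {
  return Math.ceil(MS_PER_MINUTE / perMinute);
}

// ASSUMPTION: a key expires `ttlDays` after its creation time (epoch milliseconds).
function keyExpiresAtMs(createdAtMs: number, ttlDays: number): number {
  return createdAtMs + ttlDays * MS_PER_DAY;
}

// jobs_per_minute_per_user = 10  -> one job creation every 6000 ms
// reads_per_minute_per_user = 60 -> one status poll every 1000 ms
const jobSpacingMs = minSpacingMs(10);
const readSpacingMs = minSpacingMs(60);
```

Polling a single job no faster than once per second therefore stays inside the default read limit; polling several jobs in parallel needs a proportionally longer interval.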

## 9. Troubleshooting

`ERR_CONNECTION_REFUSED`
- Web app process is not running, wrong host/IP, or firewall blocks port `3000`.

Forced-error test skipped
- Enable `ENABLE_TEST_FAULT_INJECTION=true` on host and restart app/worker.

No progress updates
- Worker is down or cannot claim jobs.

Stuck in queued
- Worker not running, DB lock issue, or rate-limit pressure.

Agent unauthorized
- Missing/invalid/expired API key or missing required scope.

Missing activation mail
- Email provider not integrated yet; use `activation_code_preview` in non-production.
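
When the client is a Node.js process rather than a browser, connection failures surface as socket error codes instead of `ERR_CONNECTION_REFUSED`. The sketch below maps the common codes to the checks in this section; the mapping is a rule of thumb, not a guarantee, and `connectHint` is a hypothetical helper.

```typescript
// Maps a Node.js socket error code to the most likely runbook check.
function connectHint(code: string): string {
  switch (code) {
    case "ECONNREFUSED":
      return "Host answered but nothing accepted the connection: check that the web process is running on port 3000.";
    case "ETIMEDOUT":
      return "No reply from the host: check the firewall rule for inbound 3000 and the host IP.";
    case "EHOSTUNREACH":
    case "ENETUNREACH":
      return "No route to the host: check the LAN IP and that both machines are on the same network.";
    case "ENOTFOUND":
      return "Hostname did not resolve: use the host LAN IP in the base URL.";
    default:
      return "Unclassified error code: " + code;
  }
}
```

With `fetch` in Node.js the code is usually found on the error's `cause` (for example `err.cause.code`), so a caller might pass that value to `connectHint`.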

## 10. Related Docs

- API reference: `docs/agent-api.md`
- Runtime contract: `content-v2.md`
- Project overview: `README.md`