Agents

Worktruck runs agents as a first-class service. You register a config once, POST to an endpoint, and an agent worker picks up the run, works against your data, and records every step. Runs are journaled, so a worker restart doesn’t lose the work done so far. The agent executor is not a framework: you don’t bundle a Python runtime or write a LangChain graph. You configure a kind, enqueue a run, and collect the result.

When to use this

Reach for a Worktruck agent run when:

You want a task to execute against your Worktruck data (contacts, CRM, tasks, calendar, notes) without standing up your own inference stack
The work is bounded: a dedupe pass, a follow-up generation, a weekly cleanup, a daily briefing — something that starts, does a thing, and stops
You need the work to survive process restarts and mid-run failures
You want per-run cost caps, guardrails on which tools can run, and structured audit trails

Reach for the REST API or MCP server instead when:

You already have an agent runner (Claude Code, Cursor, a custom loop). In that case, connect it to the MCP server and skip the agent executor — your runner orchestrates, Worktruck serves data.
You want open-ended conversational access with the human in the loop.

Available agent kinds

Today, one kind is generally available:

Kind	What it does
`contact_deduper`	Scans your contacts for likely duplicates (email/phone/name collisions) and proposes merges.

Additional kinds land as they graduate from the internal catalog.

Integration keys — credentials for external services

Some agent kinds call external APIs on your behalf — Cloudflare, GitHub, Netdata, Postmark, Outstand. You store one credential per provider once; agents load it automatically when they need it.

curl -X POST https://api.worktruck.app/api/v1/integration-keys \
  -H "Authorization: Bearer bsk_live_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "provider": "cloudflare",
    "key": "cf_live_...",
    "metadata": { "account_id": "abc123" }
  }'

Keys are encrypted with your tenant’s DEK. Responses never expose the raw key, only a key_hint (the last four characters). Use POST /integration-keys/{provider}/verify to probe the credential against the real provider and update its active / invalid status.

The site_health agent kind reads Cloudflare and Netdata integration keys at run time. If a required integration key is not stored, tools that need it fail with auth_failed, and the error message includes a prompt for the missing key.
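For callers who prefer Python over curl, the request above might be assembled as in the following sketch. It only builds the request object and does not send it. The helper names build_integration_key_request and key_hint are hypothetical, and key_hint merely illustrates the "last four characters" rule on the client side; it is not Worktruck's server code.

```python
import json
import urllib.request

# Base URL taken from the curl example above.
API_BASE = "https://api.worktruck.app/api/v1"


def build_integration_key_request(api_key, provider, key, metadata=None):
    """Build (but do not send) a POST /integration-keys request.

    Hypothetical helper: the body fields mirror the curl example.
    """
    body = {"provider": provider, "key": key}
    if metadata is not None:
        body["metadata"] = metadata
    return urllib.request.Request(
        f"{API_BASE}/integration-keys",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def key_hint(key):
    """Illustrative only: the last four characters of a credential."""
    return key[-4:]


if __name__ == "__main__":
    req = build_integration_key_request(
        "bsk_live_your_key",
        "cloudflare",
        "cf_live_example",  # placeholder credential, not a real key
        {"account_id": "abc123"},
    )
    # Sending is left to the caller, e.g. urllib.request.urlopen(req).
    print(req.get_method(), req.full_url)
```

Keeping request construction separate from sending makes the payload easy to inspect or log (minus the secret) before it leaves your process.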

Your keys run the show (BYOK)

Agent runs use your Anthropic API key. Worktruck never charges you for LLM tokens — the cost lands on your Anthropic invoice, not ours. You attach a key once per blueprint with a PUT /api/v1/agent-configs/{blueprint}/byok-key call; Worktruck encrypts it with your tenant DEK and uses it on every run. Rotate or revoke any time. A probe call verifies the key works before you enqueue a real run:

curl -X POST https://api.worktruck.app/api/v1/agent-configs/validate-key \
  -H "Authorization: Bearer bsk_live_your_key" \
  -H "Content-Type: application/json" \
  -d '{"api_key": "sk-ant-api03-..."}'

The response is a ProbeResult: either {"status": "valid"} or {"status": "invalid", "category": ..., "message": ...}. The category falls into one of two groups. Terminal categories require action on your side: invalid_key, revoked_or_expired, insufficient_permissions, quota_exhausted. Transient categories are safe to retry: provider_unavailable, rate_limited.
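A client might turn a ProbeResult into a next step along these lines. This is a minimal sketch: the category names come from the list above, while the function name next_action and its return values ("proceed", "retry", "fix_key", "unknown") are assumptions made for illustration.

```python
# Category groups as described in the text above.
TERMINAL_CATEGORIES = {
    "invalid_key",
    "revoked_or_expired",
    "insufficient_permissions",
    "quota_exhausted",
}
TRANSIENT_CATEGORIES = {"provider_unavailable", "rate_limited"}


def next_action(probe_result):
    """Map a ProbeResult dict to a hypothetical client-side action."""
    if probe_result.get("status") == "valid":
        return "proceed"
    category = probe_result.get("category")
    if category in TRANSIENT_CATEGORIES:
        return "retry"      # safe to probe again later
    if category in TERMINAL_CATEGORIES:
        return "fix_key"    # needs a new or repaired key
    return "unknown"        # category not covered by this sketch


if __name__ == "__main__":
    print(next_action({"status": "invalid", "category": "rate_limited", "message": "slow down"}))
```

Treating unrecognized categories as "unknown" rather than retrying keeps the client from looping on a failure mode it does not understand.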

Run lifecycle

queued ──► running ──► (succeeded | failed | cancelled)
              │
              └──► waiting ──► running ──► ...

Status	Meaning
`queued`	Your `POST /runs` call accepted the work; no worker has claimed it yet
`running`	A worker holds a 30-second lease and is driving the executor loop
`waiting`	A guardrail paused the run for approval (see below). Lease is cleared; no worker is spending time on it
`succeeded`	Terminal. Final output is available; cost was rolled up
`failed`	Terminal. `failure_category` explains why (`auth_failed`, `tool_failed`, `guardrail_blocked`, `config_error`, `timeout`, `budget_exhausted`)
`cancelled`	Terminal. Operator or API caller killed the run via `POST /runs/{id}/cancel`

Every state transition is auditable. Every tool call is journaled. Every LLM call has cost attached.
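The lifecycle diagram can be read as a small state machine. The sketch below encodes only the arrows drawn in the diagram; whether other transitions exist (for example, cancelling a queued or waiting run) is not stated here, so they are left out. The names TRANSITIONS, can_transition, and is_terminal are illustrative, not part of the Worktruck API.

```python
# Allowed transitions, copied from the lifecycle diagram above.
TRANSITIONS = {
    "queued": {"running"},
    "running": {"succeeded", "failed", "cancelled", "waiting"},
    "waiting": {"running"},
    "succeeded": set(),
    "failed": set(),
    "cancelled": set(),
}


def can_transition(src, dst):
    """True if the diagram draws an arrow from src to dst."""
    return dst in TRANSITIONS.get(src, set())


def is_terminal(status):
    """A status is terminal when no arrows leave it."""
    return status in TRANSITIONS and not TRANSITIONS[status]


if __name__ == "__main__":
    print(can_transition("running", "waiting"), is_terminal("failed"))
```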

Enqueueing a run

curl -X POST https://api.worktruck.app/api/v1/agents/contact_deduper/runs \
  -H "Authorization: Bearer bsk_live_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "input": {
      "max_candidates": 50
    },
    "budget_usd_cents": 25,
    "deadline_secs": 300
  }'

Response:

{
  "id": "0194a8b3-7e5a-77c1-9d0f-7c1a2f3b4c5d",
  "agent_blueprint": "contact_deduper",
  "status": "queued",
  "created_at": "2026-04-14T18:32:10Z"
}

The POST returns immediately. A worker wakes via Postgres LISTEN/NOTIFY within milliseconds and starts the run. Poll GET /runs/{id} for status or set up a webhook.
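If you poll rather than use a webhook, the loop might look like the sketch below. It assumes the run detail is a dict with a "status" field, as in the response above. The fetch_run callable is a hypothetical stand-in for whatever HTTP client you use to call GET /runs/{id}; it is injected so the loop itself stays free of network code. The interval and poll limit are arbitrary defaults, not documented values.

```python
import time

TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}


def wait_for_run(fetch_run, run_id, interval_secs=2.0, max_polls=150, sleep=time.sleep):
    """Poll until the run reaches a terminal status, then return its detail.

    fetch_run(run_id) must return a dict with a "status" key
    (hypothetical callable wrapping GET /runs/{id}).
    """
    for attempt in range(max_polls):
        run = fetch_run(run_id)
        if run["status"] in TERMINAL_STATUSES:
            return run
        if attempt < max_polls - 1:
            sleep(interval_secs)
    raise TimeoutError(f"run {run_id} not terminal after {max_polls} polls")


if __name__ == "__main__":
    # Fake fetcher that walks through a typical lifecycle.
    states = iter(["queued", "running", "succeeded"])
    result = wait_for_run(
        lambda rid: {"id": rid, "status": next(states)},
        "run-1",
        sleep=lambda secs: None,  # skip real sleeping in the demo
    )
    print(result["status"])
```

A run in waiting will not progress until it is approved, so callers may want to treat that status as a stop condition as well instead of polling until the limit.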

Inspecting a run

curl https://api.worktruck.app/api/v1/agents/contact_deduper/runs/<id> \
  -H "Authorization: Bearer bsk_live_your_key"

The detail payload includes every step (LLM turn or tool call), token usage, cost, and the final output. Large tool payloads are stored out-of-line and retrieved by step ID.

Guardrails

Every agent config carries a guardrail set — a list of rules the worker enforces before each tool call. Six primitives:

Rule	Purpose
`allowlist`	Only listed tools may run
`denylist`	Listed tools may never run
`rate_limit`	Sliding-window cap per tool per run
`approval_gate`	Matching call pauses the run to `waiting` for human approval
`io_validation`	JSON Schema check on tool input before the call lands
`quiet_hours`	Block tool calls outside a tenant-local time window

Rules run in one of two modes:

shadow — the evaluator logs its decision but doesn’t block. Use this to test a new rule against live traffic.
enforce — block or suspend as the rule specifies.

If you leave the guardrail set empty, Worktruck falls back to default-deny: no tools can run. This is intentional — a brand-new config can’t accidentally hit mutating endpoints on day one.
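To make the interaction between rule modes and default-deny concrete, here is a minimal sketch of an evaluator covering only allowlist and denylist. It is not Worktruck's implementation. The Rule type, the evaluate function, and the tool names are hypothetical, and the behavior of a set containing only shadow rules (everything passes, decisions are logged) is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    kind: str               # "allowlist" or "denylist" in this sketch
    tools: frozenset        # tool names the rule refers to
    mode: str = "enforce"   # "enforce" or "shadow"


def evaluate(rules, tool):
    """Return (allowed, log) for a proposed tool call."""
    if not rules:
        # Empty guardrail set falls back to default-deny.
        return False, ["default-deny: empty guardrail set"]
    allowed = True
    log = []
    for rule in rules:
        if rule.kind == "allowlist":
            violates = tool not in rule.tools
        elif rule.kind == "denylist":
            violates = tool in rule.tools
        else:
            raise ValueError(f"rule kind not covered by this sketch: {rule.kind}")
        if violates:
            log.append(f"{rule.mode}: {rule.kind} would block {tool}")
            if rule.mode == "enforce":
                allowed = False  # shadow rules log but never block
    return allowed, log


if __name__ == "__main__":
    rules = [
        Rule("allowlist", frozenset({"contacts.search", "contacts.merge"})),
        Rule("denylist", frozenset({"contacts.merge"}), mode="shadow"),
    ]
    print(evaluate(rules, "contacts.merge"))
```

Running a candidate rule in shadow mode alongside enforced rules, as in the demo, shows what it would have blocked without changing the outcome of live runs.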

Cost tracking and auto-disable

Every step’s cost (in USD cents) is recorded on agent_steps. When a run reaches a terminal status, its row gets a rolled-up cost_usd_cents. You can pull daily totals per (tenant, agent_blueprint) from the billing view for your own dashboards.

On the operational side, Worktruck watches for runaway agents. If a (tenant, agent_blueprint) pair accumulates 10 failed runs in 48 hours, the config auto-disables: disabled_at gets stamped and future POST /runs calls return 409 Conflict with a reason. Re-enabling requires manual operator action; there is no self-healing. This is the runaway-agent circuit breaker.
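The cost rollup and the circuit-breaker rule can be sketched as two small functions. The thresholds (10 failures, 48 hours) come from the text; treating the window as a trailing 48 hours ending at the current time, and tripping at exactly 10 failures, are assumptions of this sketch. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

FAILURE_THRESHOLD = 10          # failed runs, from the text
WINDOW = timedelta(hours=48)    # assumed to be a trailing window


def rolled_up_cost_cents(step_costs_cents):
    """Sum per-step costs into a run-level total (USD cents)."""
    return sum(step_costs_cents)


def should_auto_disable(failed_at, now):
    """True if enough failures fall inside the trailing window.

    failed_at: timestamps of failed runs for one (tenant, agent_blueprint).
    """
    cutoff = now - WINDOW
    recent = [t for t in failed_at if cutoff < t <= now]
    return len(recent) >= FAILURE_THRESHOLD


if __name__ == "__main__":
    now = datetime(2026, 4, 14, 18, 0, tzinfo=timezone.utc)
    failures = [now - timedelta(hours=h) for h in range(10)]
    print(rolled_up_cost_cents([3, 5, 2]), should_auto_disable(failures, now))
```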

What’s in a step

Each agent_step is content-hashed. If a worker crashes mid-run and another picks it up, the replay logic skips every step whose hash matches a prior record and resumes from the first un-journaled point. You don’t rebuild state by hand. The database is the state.
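The replay idea can be illustrated with a short sketch. How Worktruck actually computes step hashes is not specified here; this example assumes SHA-256 over canonical JSON, and the step shape and function names (step_hash, resume_index) are hypothetical.

```python
import hashlib
import json


def step_hash(step):
    """Content hash of a step: SHA-256 over canonical JSON (an assumption)."""
    canonical = json.dumps(step, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def resume_index(planned_steps, journaled_hashes):
    """Index of the first planned step with no matching journal record.

    Steps before this index are skipped on replay; execution resumes here.
    """
    for i, step in enumerate(planned_steps):
        if i >= len(journaled_hashes) or step_hash(step) != journaled_hashes[i]:
            return i
    return len(planned_steps)


if __name__ == "__main__":
    steps = [
        {"tool": "contacts.search", "input": {"q": "a"}},     # hypothetical tools
        {"tool": "contacts.merge", "input": {"ids": [1, 2]}},
        {"tool": "contacts.merge", "input": {"ids": [3, 4]}},
    ]
    journal = [step_hash(s) for s in steps[:2]]  # worker crashed after step 2
    print(resume_index(steps, journal))
```

Sorting keys before hashing means two steps with the same content hash identically regardless of key order, which is what lets a second worker recognize work the first one already recorded.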

Limits and deferred features

One run type at a time per kind. Keep your configs focused.
Approval resume endpoint is not yet exposed. waiting runs currently need operator intervention to unblock. This lands in Phase 3.
Mid-run budget action hooks are not exposed. Configs still honor budget_usd_cents as a terminal cap: if the rolled-up total exceeds the cap, the run fails with budget_exhausted.
Model routing is static. The executor talks to Anthropic via the key you provide. Multi-model failover is on the roadmap under the model-failover.md design doc — contact us if you need it.

Next steps

API reference for agents — the full endpoint list
Multi-tenancy — how tenant isolation applies to agent runs
MCP server — use your existing agent runner with Worktruck data
