autonomy-orchestrator

When to use this reference

This reference covers self-hosted operation, meaning you run your own autonomy-orchestrator binary: starting the HTTP API with serve, migrating data with migrate, and the full flag surface for TLS, logging, and metrics.

If you have been issued a hosted control-plane URL by an AutonomyOps tenant and you only need to point your autonomy CLI at it (no binary to run, no local infrastructure to set up), use the SaaS tier’s Hosted Orchestrator connect-path documentation.

Both paths share the same autonomy CLI surface (fleet, rollout, logs, …) — the only difference is who operates the orchestrator process.

Deeper deployment guidance

This reference is the public-safe surface: flags, endpoints, error responses. Topology decisions (single-node SQLite vs HA PostgreSQL cutover timing), sizing rules, CRL distribution patterns for multi-replica deployments, pre-deployment checklists, and operational pitfalls observed in past pilots are consolidated in an engineering-and-partner-tier deployment guide that is not published to this site. If you are operating a tenant deployment and need that material, reach out via autonomyops.ai and an engineer can share the internal guide (docs/internal/orchestrator-self-hosted-deployment.md in the adk repository) under the appropriate access agreement.

Common usage

autonomy-orchestrator --help
autonomy-orchestrator version
autonomy-orchestrator serve --listen 127.0.0.1:8888 --data-dir /var/lib/autonomy/orchestrator
autonomy-orchestrator serve --listen 0.0.0.0:8443 --tls-cert-file server.crt --tls-key-file server.key --tls-ca-file ca.crt
autonomy-orchestrator migrate --sqlite-dir /var/lib/autonomy/orchestrator --postgres-url "$AUTONOMY_POSTGRES_URL"
autonomy-orchestrator migrate --dry-run --sqlite-dir /var/lib/autonomy/orchestrator --postgres-url "$AUTONOMY_POSTGRES_URL"

Commands

serve

Starts the control-plane HTTP API on --listen backed by a local SQLite store. serve does not accept a PostgreSQL connection — SQLite is the only backend. Use migrate to transfer data to PostgreSQL for HA deployments.

Endpoints:

Method  Path                              Description
POST    /v1/events                        Ingest a JSON batch: {"events":[{...},...]}
GET     /v1/events                        Query stored events (newest first); params: node_id, event_type, since (RFC3339), limit (default 100, max 1000)
GET     /v1/health                        Liveness probe: {"status":"ok"}
GET     /v1/certs/crl                     Serve CRL bytes (Content-Type: application/pkix-crl); 404 when --tls-crl-file is not set; ETag-aware
POST    /v1/releases                      Publish a desired-state release
GET     /v1/releases/latest               Get the latest release for a channel (?channel=...)
GET     /v1/releases/{release_id}/acks    Get all node acks for a release
POST    /v1/nodes/{node_id}/ack           Upsert a node’s ack for a release
GET     /v1/nodes/{node_id}/acks          Get all acks submitted by a node
GET     /v1/fleet/summary                 Per-node observational summary (?channel=...&stale_threshold=N)
POST    /v1/rollouts                      Publish a rollout plan
GET     /v1/rollouts                      List rollout plans
GET     /v1/rollouts/{plan_id}            Get rollout plan and status
GET     /v1/rollouts/{plan_id}/plan       Immutable signed plan spec
GET     /v1/rollouts/{plan_id}/status     Mutable rollout status
POST    /v1/rollouts/{plan_id}/pause      Pause an active rollout
POST    /v1/rollouts/{plan_id}/resume     Resume a paused or halted rollout
POST    /v1/rollouts/{plan_id}/promote    Manually promote the current stage
POST    /v1/rollouts/{plan_id}/halt       Halt a rollout
DELETE  /v1/rollouts/{plan_id}            Cancel a rollout (terminal)
GET     /metrics                          Prometheus metrics; only registered when --metrics-addr is set
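As an illustration of the two /v1/events routes listed above, the following is a minimal client-side sketch that builds the documented {"events":[...]} envelope and a query URL using the documented parameters. The helper names (build_ingest_body, build_query_url) and the fields inside each event are assumptions made for this example; only the envelope shape, the parameter names, and the limit bounds come from this reference.

```python
import json
from urllib.parse import urlencode

DEFAULT_LIMIT = 100  # documented default for GET /v1/events
MAX_LIMIT = 1000     # documented maximum for GET /v1/events


def build_ingest_body(events):
    """Wrap event dicts in the documented {"events": [...]} envelope.

    The per-event fields are not specified in this reference; callers
    supply whatever their nodes emit (hypothetical in this sketch).
    """
    return json.dumps({"events": list(events)})


def build_query_url(base_url, node_id=None, event_type=None,
                    since=None, limit=DEFAULT_LIMIT):
    """Build a GET /v1/events URL from the documented query parameters.

    `since` is expected to be an RFC3339 timestamp string.
    """
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    params = {}
    if node_id is not None:
        params["node_id"] = node_id
    if event_type is not None:
        params["event_type"] = event_type
    if since is not None:
        params["since"] = since
    params["limit"] = limit
    return f"{base_url.rstrip('/')}/v1/events?{urlencode(params)}"
```

The resulting body and URL can be passed to any HTTP client. Rejecting an out-of-range limit on the client side is a choice made for this sketch, not documented server behavior.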

HA health endpoints are not available in standalone mode. The routes under /v1/health/read-ready, /v1/health/write-ready, /v1/health/quorum, /v1/health/audit, /v1/health/leader, and /v1/ha/* require a PostgreSQL HealthServer to be wired at startup (RegisterPGHealth). The standalone autonomy-orchestrator serve command does not call RegisterPGHealth and therefore does not expose these endpoints.

Key flags:

Flag                        Default        Description
--listen                    0.0.0.0:8888   TCP address to listen on
--data-dir                  XDG cache dir  Data directory for SQLite storage
--metrics-addr              (disabled)     Prometheus metrics listen address (e.g. :9090); also registers GET /metrics on the main port
--log-format                json           Log output format: json or text
--log-level                 info           Minimum log level: debug, info, warn, error
--tls-cert-file             (none)         Server TLS certificate (PEM); enables TLS with --tls-key-file
--tls-key-file              (none)         Server TLS private key (PEM)
--tls-ca-file               (none)         CA certificate (PEM) for client verification (enables mTLS)
--tls-crl-file              (none)         CRL file (PEM) to reject revoked client certificates; also enables GET /v1/certs/crl
--tls-crl-sync-url          (none)         Control-plane CRL endpoint to pull (repeatable)
--tls-crl-sync-min-sources  1              Minimum CRL publishers that must agree before accepting an update
--tls-crl-sync-interval     1m             CRL pull refresh interval; 0 disables background refresh
--tls-crl-server-name       (none)         TLS server name override for --tls-crl-sync-url
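To make the --tls-crl-sync-min-sources description more concrete, here is a rough sketch of one way a "minimum publishers must agree" rule could work. It assumes that "agree" means the publishers returned byte-identical CRLs; this reference does not define the comparison the orchestrator actually performs, and select_agreed_crl is a hypothetical name used only for illustration.

```python
import hashlib
from collections import Counter


def select_agreed_crl(fetched, min_sources=1):
    """Return the CRL bytes served identically by at least `min_sources`
    publishers, or None when no candidate reaches that threshold.

    ASSUMPTION: agreement is modelled as byte-for-byte equality, compared
    via SHA-256 digests. The real agreement rule is not documented here.
    """
    if min_sources < 1:
        raise ValueError("min_sources must be at least 1")
    digests = [hashlib.sha256(crl).hexdigest() for crl in fetched]
    if not digests:
        return None
    best_digest, votes = Counter(digests).most_common(1)[0]
    if votes < min_sources:
        return None  # keep the current CRL; too few publishers agree
    return fetched[digests.index(best_digest)]
```

Under this model, raising the threshold above 1 means a single compromised or stale publisher cannot replace the CRL on its own.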

Notes:

  • The control plane holds observational authority only: it stores events as received and derives nothing from them (v1.13 §1.2.3).

  • Duplicate event_id values are silently ignored (idempotent ingest).

  • When --tls-crl-file is set, the CRL is hot-reloaded on each new client handshake; no restart is required after running autonomy cert revoke.

  • Fail-closed guarantee: if --tls-crl-file is set but the file is missing or has an invalid CA signature, the server refuses to start.

  • GET /metrics is registered on the main mux in addition to the dedicated --metrics-addr server when --metrics-addr is set.
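The idempotent-ingest note above can be pictured with a small SQLite sketch. This is not the orchestrator's schema or code: the table layout, the open_store and ingest helpers, and the use of INSERT OR IGNORE are all assumptions chosen to show how duplicate event_id values can be dropped silently while the first copy is kept as received.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open a SQLite store with a hypothetical two-column events table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        " event_id TEXT PRIMARY KEY,"
        " body TEXT NOT NULL)"
    )
    return conn


def count_events(conn):
    """Return the number of stored events."""
    return conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]


def ingest(conn, events):
    """Store a batch and return how many rows were new.

    INSERT OR IGNORE skips rows whose event_id already exists, so
    replaying a batch is harmless and the first-written copy is kept.
    """
    before = count_events(conn)
    with conn:  # commit on success, roll back on error
        conn.executemany(
            "INSERT OR IGNORE INTO events (event_id, body) VALUES (?, ?)",
            [(e["event_id"], json.dumps(e, sort_keys=True)) for e in events],
        )
    return count_events(conn) - before
```

With a store like this, a node that resends a batch after a timeout adds no duplicate rows, which is what makes client-side retries safe.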

Error responses:

Code     Condition
400      Malformed JSON or missing required field
405      Method not allowed
503      SQLite writer busy beyond busy_timeout
507      Disk full
(fatal)  SQLITE_CORRUPT: process exits immediately
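One way a client might react to the codes above is sketched below. The retry policy is an assumption made for this example, not documented behavior: it treats 503 (writer busy) as transient and everything else in the table as needing a fixed request or operator action. The names should_retry and backoff_delays are hypothetical.

```python
# ASSUMPTION: only 503 is treated as transient. 400 and 405 indicate a
# client bug, and 507 (disk full) is unlikely to clear without an operator.
RETRYABLE_STATUS = frozenset({503})


def should_retry(status_code):
    """Return True when a request with this status is worth retrying."""
    return status_code in RETRYABLE_STATUS


def backoff_delays(attempts, base=0.5, cap=5.0):
    """Return capped exponential backoff delays, in seconds."""
    if attempts < 0:
        raise ValueError("attempts must not be negative")
    return [min(cap, base * (2 ** i)) for i in range(attempts)]
```

Because duplicate event_id values are ignored on ingest, resending a POST /v1/events batch after a 503 should not create duplicate rows.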

migrate

Migrates all control-plane data from a SQLite source database to a PostgreSQL target. The SQLite source is read-only during migration; no source data is modified or deleted.

Migration order: schema (idempotent) → releases + node_acks → events → rollout_plans + stage_status → row-count validation → HACutoverAt record.

Key flags:

Flag            Default                   Description
--sqlite-dir    same as serve --data-dir  Directory containing the SQLite events.db
--postgres-url  $AUTONOMY_POSTGRES_URL    PostgreSQL connection URL
--batch-size    500                       Number of rows per INSERT batch
--dry-run       false                     Preview counts without writing to PostgreSQL
--validate      false                     Check row counts only; skip data copy
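The --batch-size flag and the row-count validation step can be illustrated with a short, database-free sketch. It shows only the general technique (fixed-size chunking followed by a per-table count comparison); the helper names batches and count_mismatches are hypothetical, and the actual migrate implementation may differ.

```python
from itertools import islice


def batches(rows, batch_size=500):
    """Yield lists of at most `batch_size` rows, preserving order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, batch_size))
        if not chunk:
            return
        yield chunk


def count_mismatches(source_counts, target_counts):
    """Return {table: (source, target)} for tables whose counts differ.

    A table missing from the target is treated as having zero rows.
    """
    return {
        table: (expected, target_counts.get(table, 0))
        for table, expected in source_counts.items()
        if target_counts.get(table, 0) != expected
    }
```

For example, 1,201 source rows with the default batch size would be written as batches of 500, 500, and 201 rows, and an empty result from count_mismatches would correspond to a passing validation.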

Rollback: stop the PostgreSQL CP nodes, revert the configuration to backend=sqlite, and restart. Because migration never modifies the SQLite source, it is safe to roll back to.

version

Prints the binary version, build SHA, build date, tier, and runtime platform.

autonomy-orchestrator version
autonomy-orchestrator version --output json

Evidence

  • cmd/autonomy-orchestrator/main.go

  • cmd/autonomy-orchestrator/commands/root.go

  • cmd/autonomy-orchestrator/commands/serve.go

  • cmd/autonomy-orchestrator/commands/migrate.go

  • cmd/autonomy-orchestrator/commands/version.go

  • orchestrator/server.go — route registrations (NewServer, SetMetrics, RegisterPGHealth)

See also

autonomy-orchestrator is the standalone fleet orchestrator binary for the AutonomyOps ADK. It stores incoming telemetry events from edge nodes and exposes a query interface for fleet-wide observability.