Recovery
Break-glass access to an Arctic agent and clearing a stuck cluster lock
This guide covers the break-glass paths for regaining control of an Arctic agent when normal operator access is unavailable, and for clearing a cluster lock that is stuck.
Recovery token
Each agent generates a recovery token at startup. The token is an out-of-band credential that grants admin scope to each request that presents it, intended for operators with local root access on the host.
- Path: /etc/arctic/recovery.token
- Contents: 32 random bytes, base64url-encoded
- File mode: 0600, owned by the agent's user
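As a quick sanity check on the properties listed above, the following sketch inspects a token file from the shell. It assumes GNU coreutils stat (Linux) and accepts the encoded token with or without = padding; check_recovery_token_file is a hypothetical helper name, not part of the Arctic CLI.

```shell
# Hypothetical helper: verify the mode and shape of a recovery token file.
# Assumes GNU coreutils stat; the stat invocation differs on other platforms.
check_recovery_token_file() {
  token_path="$1"
  [ -r "$token_path" ] || { echo "cannot read $token_path" >&2; return 1; }

  file_mode="$(stat -c '%a' "$token_path")" || return 1
  if [ "$file_mode" != "600" ]; then
    echo "unexpected mode $file_mode on $token_path (expected 600)" >&2
    return 1
  fi

  # Drop whitespace and any '=' padding, then check the shape:
  # 32 bytes encode to 43 base64url characters.
  token_body="$(tr -d '[:space:]=' < "$token_path")"
  if ! printf '%s' "$token_body" | grep -Eq '^[A-Za-z0-9_-]{43}$'; then
    echo "contents do not look like 32 base64url-encoded bytes" >&2
    return 1
  fi
}

# Usage (needs read access to the file, so typically from a root shell):
#   check_recovery_token_file /etc/arctic/recovery.token && echo "token file looks sane"
```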
The token is presented to the agent in the X-Arctic-Recovery HTTP header. Any
request carrying a matching token is granted admin scope and is logged at WARN so
its use can be audited. The token does not establish a session: it authenticates
each request that presents it, and it stays valid until the next agent restart.
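For callers that talk to the agent's HTTP API directly, a request could be assembled roughly as sketched below. Only the X-Arctic-Recovery header name comes from this guide; the helper name recovery_header, the agent address, and the endpoint path are placeholders.

```shell
# Hypothetical helper: print the recovery header line for a given token file.
recovery_header() {
  # Strip the trailing newline (and any stray whitespace) from the file contents.
  printf 'X-Arctic-Recovery: %s' "$(tr -d '[:space:]' < "$1")"
}

# Usage with curl, from a shell that can read the token file.
# The address and path are placeholders; substitute your agent's.
# Note that the header value is visible in the process list while curl runs.
#   curl -H "$(recovery_header /etc/arctic/recovery.token)" \
#     "https://agent.example.internal/<endpoint>"
```

Because the token authenticates each request independently, every request made this way needs the header, and each one produces a WARN log entry.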
The token rotates on every restart
A fresh token is generated on every agent startup, which invalidates the
previous one. Do not cache or distribute it for long-term use; read it fresh
from /etc/arctic/recovery.token each time you need it, and restart the agent
after recovery work completes to retire the token you used.
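One way to confirm that a restart retired the token you used is to compare a fingerprint of the file before and after. This is a sketch that assumes sha256sum from GNU coreutils; token_fingerprint is a hypothetical helper, and the restart step is left as a comment because the service manager and unit name depend on your installation.

```shell
# Hypothetical helper: print a SHA-256 fingerprint of the token file, so the
# token itself never needs to be echoed or stored.
token_fingerprint() {
  sha256sum < "$1" | cut -d ' ' -f 1
}

# Usage:
#   before="$(token_fingerprint /etc/arctic/recovery.token)"
#   ... restart the agent using your service manager ...
#   after="$(token_fingerprint /etc/arctic/recovery.token)"
#   [ "$before" != "$after" ] && echo "token rotated"
```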
When to use it
Reach for the recovery token when:
- Operator credentials are lost. The client ID / secret for the cluster are gone and you need to authenticate to rotate them or create new credentials.
- You need to reach an api_access: internal peer. Internal-only peers reject requests to user-facing endpoints from normal clients; the recovery token is the only way to run operator operations against them.
Passing the token to the CLI
The CLI resolves the recovery token from three sources, in order of precedence:
1. --recovery-token <value> flag
2. $ARCTIC_RECOVERY_TOKEN environment variable
3. --recovery-token-file <path> flag
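The precedence can be pictured with the following shell sketch. It illustrates the documented order only and is not the CLI's actual implementation; resolve_recovery_token and its two positional arguments (standing in for the two flags) are hypothetical.

```shell
# Illustration of the documented lookup order (not the CLI's real code).
#   $1: value given to --recovery-token (may be empty)
#   $2: path given to --recovery-token-file (may be empty)
resolve_recovery_token() {
  flag_value="$1"
  flag_file="$2"
  if [ -n "$flag_value" ]; then
    printf '%s' "$flag_value"                # 1. the explicit flag wins
  elif [ -n "${ARCTIC_RECOVERY_TOKEN:-}" ]; then
    printf '%s' "$ARCTIC_RECOVERY_TOKEN"     # 2. then the environment variable
  elif [ -n "$flag_file" ]; then
    # 3. then the token file; trimming whitespace is a choice of this sketch
    tr -d '[:space:]' < "$flag_file"
  else
    return 1                                 # no source supplied
  fi
}
```

A practical consequence of this order is that a stale $ARCTIC_RECOVERY_TOKEN left in the shell takes priority over --recovery-token-file, so unset the variable before relying on the file.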
For example, reading the token straight off the host:

    arctic --recovery-token-file /etc/arctic/recovery.token peers list

Or with the value supplied through the environment variable:

    ARCTIC_RECOVERY_TOKEN="$(sudo cat /etc/arctic/recovery.token)" \
      arctic credentials rotate

Recovery can be disabled
If the token path is not writable when the agent starts, recovery is disabled for that run; the agent does not fall back to open access. Internal-only mode then becomes strict and there is no break-glass path until the path is made writable and the agent is restarted.
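Because an unwritable token path removes the break-glass path for the whole run, it can be worth checking writability before starting the agent. The sketch below approximates that check; token_path_writable is a hypothetical helper, and it needs to be run as the agent's user for the result to be meaningful. How the agent itself tests writability is not specified here, so treat this as a preflight hint rather than a guarantee.

```shell
# Hypothetical preflight: can the current user write the token path?
# Run it as the user the agent runs as.
token_path_writable() {
  token_path="$1"
  if [ -e "$token_path" ]; then
    [ -w "$token_path" ]                  # an existing file must be writable
  else
    [ -w "$(dirname "$token_path")" ]     # otherwise its directory must be
  fi
}

# Usage:
#   token_path_writable /etc/arctic/recovery.token \
#     || echo "token path not writable: recovery would be disabled" >&2
```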
The security boundary for the token is filesystem permissions alone. An attacker with local root on the host can read the token, but that attacker can already do anything to the host, so the token grants no additional reach.
Clearing a stuck cluster lock
Compose apply takes a cluster-wide lock so that two operators cannot apply conflicting changes at once. If an apply is interrupted (for example, if the machine running it is killed mid-run), the lock can be left held. Subsequent applies then fail with a contention error that includes the lock ID.
Release the stale lock with:

    arctic state unlock <lock-id>

The contention error reports the lock ID to pass here. Useful flags:
| Flag | Description |
|---|---|
| --force | Admin override on the cluster tier; bypasses the holder-id check. |
| --cluster-only | Release only the cluster lock, not the local file lock. |
| --local-only | Release only the local file lock, not the cluster lock. |
| --state-dir <path> | Override the .arctic/ state directory location. |
The unlock operation uses the cluster.lock scope. Only run it once you are
sure no other apply is genuinely in progress; releasing a live lock can let two
applies collide.
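The caution above can be built into a small wrapper that refuses to run without a lock ID and asks for explicit confirmation before releasing anything. This is a sketch: confirm_unlock and the ARCTIC_BIN override are hypothetical conveniences, and only arctic state unlock and the flags in the table come from this guide.

```shell
# Hypothetical wrapper around `arctic state unlock`.
# ARCTIC_BIN is an override used by this sketch only (handy for dry runs).
confirm_unlock() {
  lock_id="${1:-}"
  if [ -z "$lock_id" ]; then
    echo "usage: confirm_unlock <lock-id> [unlock flags...]" >&2
    return 2
  fi
  shift

  printf 'Release lock %s? No other apply may be running. Type "yes": ' "$lock_id" >&2
  read -r answer || return 1
  if [ "$answer" != "yes" ]; then
    echo "aborted" >&2
    return 1
  fi

  # Remaining arguments (for example --cluster-only) are passed through unchanged.
  "${ARCTIC_BIN:-arctic}" state unlock "$lock_id" "$@"
}

# Usage:
#   confirm_unlock <lock-id> --cluster-only
# Dry run that only prints the command that would be executed:
#   ARCTIC_BIN=echo confirm_unlock <lock-id> --cluster-only
```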
See Also
- Upgrades - upgrading agents and the CLI
- Declarative Cluster Management - the compose apply workflow