ananke
ananke is the host-side power + bootstrap orchestrator for Titan.
It runs outside Kubernetes (systemd on host), so it can:
- shut the cluster down gracefully before battery/runtime redlines
- bring the cluster back after power returns
- recover common Flux/Kustomize startup deadlocks
- validate service health from the outside before declaring startup done
Why ananke
I wanted a name that fits Titan/mythology, but also describes what this service actually does.
In Greek myth, Ananke is inevitability/necessity. That matches this tool: when power events happen, graceful sequencing is not optional.
UPS names in this cluster are also part of the story:
- Statera: powers titan-23, titan-24, titan-jh
- Pyrphoros: powers all other nodes
Breakglass reminder
Vault unseal breakglass is wired for remote retrieval (magic mirror host). If local key retrieval fails, Ananke can use the configured breakglass command.
What “startup complete” means now
Ananke does not stop at “Flux says Ready”. Startup only completes when all configured gates pass:
- Flux source drift guard passes (expected_flux_source_url + branch expectation)
- Flux kustomizations are healthy
- controller convergence is healthy (deployments/statefulsets/daemonsets)
- external service checklist passes (Gitea, Grafana, Keycloak OIDC, Harbor registry auth challenge, Longhorn auth redirect)
- stability soak window passes (no regressions, no CrashLoop/ImagePull failures)
If any gate fails, startup is blocked with a concrete reason.
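The gate sequence above can be pictured with a minimal sketch. This is not Ananke's actual implementation: the Gate type, the run_gates function, and the gate names are hypothetical, and stopping at the first failing gate is an assumption (the real tool may evaluate and report every gate).

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# A gate check returns (passed, reason). All names here are hypothetical.
GateResult = Tuple[bool, str]


@dataclass
class Gate:
    name: str
    check: Callable[[], GateResult]


def run_gates(gates: List[Gate]) -> Tuple[bool, Optional[str]]:
    """Evaluate gates in order; block on the first failure with its reason."""
    for gate in gates:
        passed, reason = gate.check()
        if not passed:
            # Assumption: startup is blocked as soon as one gate fails.
            return False, f"{gate.name}: {reason}"
    return True, None


# Illustrative stand-ins for the real checks.
example_gates = [
    Gate("flux-source-drift", lambda: (True, "")),
    Gate("flux-kustomizations", lambda: (True, "")),
    Gate("controller-convergence", lambda: (False, "deployment not ready")),
    Gate("service-checklist", lambda: (True, "")),
    Gate("stability-soak", lambda: (True, "")),
]

ok, why = run_gates(example_gates)
print("startup complete" if ok else f"startup blocked: {why}")
```

Ordering the cheap, upstream checks first means a source drift problem is reported before time is spent on the soak window.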
Command quick sheet
From titan-db (coordinator):
sudo /usr/local/bin/ananke status --config /etc/ananke/ananke.yaml
sudo /usr/local/bin/ananke startup --config /etc/ananke/ananke.yaml --execute --force-flux-branch main
sudo /usr/local/bin/ananke shutdown --config /etc/ananke/ananke.yaml --execute --reason graceful-maintenance --mode cluster-only
sudo /usr/local/bin/ananke shutdown --config /etc/ananke/ananke.yaml --execute --reason emergency-power --mode poweroff --skip-drain --skip-etcd-snapshot
From titan-24 (tethys peer):
sudo /usr/local/bin/ananke shutdown --config /etc/ananke/ananke.yaml --execute --reason graceful-maintenance --mode cluster-only
Systemd:
sudo systemctl status ananke.service
sudo systemctl start ananke-bootstrap.service
sudo systemctl start ananke-update.service
Shutdown modes (explicit)
ananke shutdown now supports explicit mode selection:
- default behavior is cluster-only (host poweroff is not performed)
- --mode config: use config default (shutdown.poweroff_enabled)
- --mode cluster-only: stop cluster services only (no host poweroff)
- --mode poweroff: include host poweroff path (explicit only)
This removes ambiguity during drills.
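The mode rules can be summarized as a small decision function. This is a sketch of the documented behavior only, not Ananke's code: the function name resolve_poweroff and the rejection of unknown modes are assumptions.

```python
from typing import Optional

VALID_MODES = ("config", "cluster-only", "poweroff")


def resolve_poweroff(mode: Optional[str], poweroff_enabled: bool) -> bool:
    """Return True only when the host poweroff path should run.

    mode is the value of --mode, or None when the flag is omitted.
    poweroff_enabled mirrors shutdown.poweroff_enabled from the config.
    """
    if mode is None:
        mode = "cluster-only"  # documented default: no host poweroff
    if mode not in VALID_MODES:
        # Assumption: unknown modes are rejected rather than ignored.
        raise ValueError(f"unknown shutdown mode: {mode!r}")
    if mode == "config":
        return poweroff_enabled
    return mode == "poweroff"


for m in (None, "config", "cluster-only", "poweroff"):
    print(m, resolve_poweroff(m, poweroff_enabled=True))
```

Note that with these rules the config value only matters when --mode config is passed; omitting the flag never powers off the host.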
Config file
Primary path:
/etc/ananke/ananke.yaml
Core settings to keep accurate:
- expected_flux_branch
- expected_flux_source_url
- startup.service_checklist
- startup.service_checklist_stability_seconds
- startup.ignore_unavailable_nodes (for planned temporary node outages)
- coordination.role, coordination.peer_hosts
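A quick way to catch a missing core setting before a drill is to walk the parsed config and look up each dotted key. The sketch below assumes the dotted names map to nested mappings in ananke.yaml, which this README does not confirm. The missing_keys helper and every sample value are hypothetical, and startup.ignore_unavailable_nodes is left out on the assumption that it is optional.

```python
from typing import Any, Dict, List

# Core settings named in this README (dotted path = assumed nesting).
REQUIRED_KEYS = [
    "expected_flux_branch",
    "expected_flux_source_url",
    "startup.service_checklist",
    "startup.service_checklist_stability_seconds",
    "coordination.role",
    "coordination.peer_hosts",
]


def missing_keys(config: Dict[str, Any], required: List[str] = REQUIRED_KEYS) -> List[str]:
    """Return the dotted keys that are absent from a parsed config mapping."""
    missing = []
    for dotted in required:
        node: Any = config
        for part in dotted.split("."):
            if not isinstance(node, dict) or part not in node:
                missing.append(dotted)
                break
            node = node[part]
    return missing


# Hypothetical parsed config; values are placeholders, not real settings.
sample = {
    "expected_flux_branch": "main",
    "expected_flux_source_url": "https://git.example.invalid/flux.git",
    "startup": {"service_checklist": [], "service_checklist_stability_seconds": 120},
    "coordination": {"role": "coordinator"},
}

print(missing_keys(sample))
```

Loading the YAML file itself is left out so the sketch needs only the standard library; any YAML parser that yields nested dicts would feed it.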
Install / update
sudo ./scripts/install.sh
Installer behavior:
- builds and installs /usr/local/bin/ananke
- installs ananke*.service units
- migrates and enforces current ananke config/state paths
Notes
- Apply changes through Git/Flux manifests; avoid manual in-cluster edits for durable changes.
- For controlled shutdown/startup drills, treat any manual intervention as a bug and fold the logic back into Ananke.