2026-04-03 01:43:16 -03:00
# Hecate

Hecate is the host-level bootstrap and power-protection service for Titan.

It runs on `titan-db` and handles:
- Staged **startup** (including Flux/Gitea bootstrap deadlock fallback)
- Graceful **shutdown**
- UPS-driven automatic shutdown decisions based on discharge/runtime
- Multi-UPS operation via multiple Hecate instances (for example `titan-db` + `tethys`)
- Full hardware poweroff sequencing after graceful Kubernetes shutdown

## Why host-level

A service inside Kubernetes cannot start a cluster that is fully down.
Hecate runs outside the cluster under systemd, so it can always orchestrate bring-up.

## Commands

- `hecate startup --config /etc/hecate/hecate.yaml --execute --force-flux-branch main`
- `hecate shutdown --config /etc/hecate/hecate.yaml --execute`
- `hecate daemon --config /etc/hecate/hecate.yaml`
- `hecate status --config /etc/hecate/hecate.yaml`

## Manual install on titan-db

```bash
git clone git@gitea-admin:bstein/hecate.git
cd hecate
sudo HECATE_ENABLE_BOOTSTRAP=1 ./scripts/install.sh
sudoedit /etc/hecate/hecate.yaml
sudo systemctl restart hecate.service
```

The installer is idempotent:
- Re-runs safely on every update
- Preserves existing `/etc/hecate/hecate.yaml`
- Ensures required dependencies are installed (`kubectl`, `nut-*`, `ssh`, `go`, etc.)
- Installs/refreshes systemd units and enables boot-time self-update
- Applies declarative NUT + udev UPS configuration by default (can be tuned via env vars)

Installer knobs (optional):
- `HECATE_ENABLE_BOOTSTRAP=1` enables `hecate-bootstrap.service` on this host.
- `HECATE_ENABLE_BOOTSTRAP=0` disables it; the default, `auto`, preserves the current bootstrap enablement state.
- `HECATE_MANAGE_NUT=0` skips writing NUT/udev files.
- `HECATE_NUT_UPS_NAME` (default `atlasups`)
- `HECATE_NUT_VENDOR_ID` / `HECATE_NUT_PRODUCT_ID` (defaults `0764` / `0601`)
- `HECATE_NUT_MONITOR_USER` / `HECATE_NUT_MONITOR_PASSWORD` (defaults `monuser` / `atlasupsmon`)

Bootstrap now (without reboot):

```bash
sudo systemctl start hecate-bootstrap.service
```

## Preconditions on titan-db

- `kubectl` installed and configured (`kubeconfig` path in config)
- SSH reachability to all cluster nodes
- Remote sudo rights to run:
  - `systemctl start/stop k3s`
  - `systemctl start/stop k3s-agent`
- UPS telemetry available via NUT (`upsc`)
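
The preconditions above can be spot-checked before the first run. The following is a minimal sketch, not something shipped with Hecate: the function names are hypothetical, the node and unit are supplied by the caller, and the UPS is assumed to be reachable as `atlasups@localhost` (the installer's default UPS name).

```bash
#!/usr/bin/env bash
# Hypothetical preflight helpers; not part of Hecate itself.

# Print each named tool that is not found on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Check SSH reachability and non-interactive sudo rights on one node.
# The unit (k3s or k3s-agent) depends on the node's role.
check_node() {
  local node="$1" unit="$2"
  ssh -o BatchMode=yes -o ConnectTimeout=5 "$node" \
    "sudo -n -l systemctl start $unit && sudo -n -l systemctl stop $unit" >/dev/null
}

# Check that NUT answers for the given UPS (assumed: atlasups@localhost).
check_ups() {
  upsc "${1:-atlasups@localhost}" >/dev/null 2>&1
}
```

For example, `missing_tools kubectl ssh upsc` prints nothing when all three tools are installed, and `check_node <node> k3s-agent` succeeds only if the node is reachable and sudo would permit both `systemctl` calls without a password prompt.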

## Multi-UPS topology

Recommended:
- `titan-db` runs Hecate as the shutdown coordinator (local UPS target + local shutdown execution).
- `tethys` runs Hecate with local UPS target and forwards shutdown triggers to `titan-db`.
- If forwarding fails, fallback local shutdown can remain enabled.
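
The forward-then-fall-back behavior recommended above can be pictured as a small decision flow. This is only a sketch of that flow: how Hecate actually delivers a trigger to the coordinator is not described here, so both actions are passed in by the caller and the function name is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch of the decision flow only; the forwarding mechanism is an assumption
# left to the caller.

# Usage: forward_or_fallback FORWARD_CMD FALLBACK_CMD
# Runs FORWARD_CMD; if it fails, runs FALLBACK_CMD. Prints which path was taken.
forward_or_fallback() {
  local forward="$1" fallback="$2"
  if "$forward"; then
    echo "forwarded"
  elif "$fallback"; then
    echo "fallback"
  else
    echo "failed"
    return 1
  fi
}
```

On `tethys`, `FORWARD_CMD` would be a function that hands the trigger to `titan-db`, and `FALLBACK_CMD` a function wrapping a local `hecate shutdown --config /etc/hecate/hecate.yaml --execute`.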

## Config

See `configs/hecate.example.yaml`.

The UPS auto-shutdown trigger uses:
- runtime threshold = `runtime_safety_factor * estimated_shutdown_budget`
- default safety factor `1.10`
- debounce across multiple polls to avoid noise

The estimated shutdown budget is derived from historical successful shutdown runs (`/var/lib/hecate/runs.json`), falling back to a default from the config.
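
To make the trigger arithmetic concrete, here is a minimal sketch under stated assumptions: runtimes are whole seconds, a breach means the reported runtime is at or below the threshold, and the debounce is a count of consecutive breaching polls. The function names and the debounce count are illustrative, not Hecate's implementation.

```bash
#!/usr/bin/env bash
# Illustrative only; names and the debounce rule are assumptions.

# Print the runtime threshold in seconds:
# runtime_safety_factor * estimated_shutdown_budget, rounded.
runtime_threshold() {
  local factor="$1" budget_s="$2"
  awk -v f="$factor" -v b="$budget_s" 'BEGIN { printf "%.0f\n", f * b }'
}

# Usage: should_shutdown THRESHOLD REQUIRED SAMPLE...
# Succeeds once REQUIRED consecutive runtime samples are at or below THRESHOLD.
should_shutdown() {
  local threshold="$1" required="$2" streak=0 sample
  shift 2
  for sample in "$@"; do
    if [ "$sample" -le "$threshold" ]; then
      streak=$((streak + 1))
    else
      streak=0   # a healthy reading resets the debounce
    fi
    if [ "$streak" -ge "$required" ]; then
      return 0
    fi
  done
  return 1
}
```

With the default factor of `1.10` and a 300-second budget, `runtime_threshold 1.10 300` yields a 330-second threshold. A single noisy reading below that does not trigger on its own, because the streak resets as soon as a reading above the threshold arrives.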

Power metrics:
- Hecate exposes Prometheus metrics on `:9560/metrics` by default.
- This is intended for a dedicated Grafana power dashboard and a high-level overview row.
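
Before building dashboard panels, it can help to see which metrics the endpoint actually exposes. The helper below is a hypothetical sketch that lists the unique metric names found in Prometheus text exposition output; it makes no assumption about what Hecate's metrics are called.

```bash
#!/usr/bin/env bash
# Hypothetical helper: reads Prometheus text exposition on stdin and
# prints the unique metric names, dropping comments, labels, and values.
metric_names() {
  grep -v -e '^#' -e '^$' | sed -E 's/[{ ].*$//' | sort -u
}

# Example (assumes Hecate is running locally on the default port):
#   curl -fsS http://localhost:9560/metrics | metric_names
```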

## Notes

- By default, `startup` and `shutdown` perform a dry run; pass `--execute` to apply changes.
- When enabled, `hecate-bootstrap.service` runs at host boot and performs the staged startup automatically.
- `HECATE_ENABLE_BOOTSTRAP=1` enables `hecate-bootstrap.service` (recommended on `titan-db`; keep disabled on non-coordinator hosts).
- `hecate-update.timer` runs at boot and periodically thereafter to pull the latest `main` and reinstall Hecate declaratively.