soteria

Soteria is the backup and restore console for Atlas PVCs.

It is currently built mainly around Longhorn. It lists bound PVCs, starts backups, restores a backup into a new PVC, runs namespace-wide backup and restore jobs, and exposes backup health metrics for Grafana. It also has a small React UI, so the common restore path does not require calling the API by hand.

Soteria never overwrites an existing target PVC. Restore work is meant to be explicit and reversible.
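
The no-overwrite rule can be pictured as a guard that runs before any restore is started. The following is a minimal sketch, not Soteria's actual code: `pvcKey`, `ErrTargetExists`, and `checkRestoreTarget` are hypothetical names, and the in-memory map stands in for a lookup against the Kubernetes API.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrTargetExists is a hypothetical sentinel error used only in this sketch.
var ErrTargetExists = errors.New("target PVC already exists")

// pvcKey identifies a PVC by namespace and name.
type pvcKey struct {
	Namespace string
	Name      string
}

// checkRestoreTarget refuses a restore whose target PVC is already present.
// The existing map is an assumption standing in for a Kubernetes API lookup.
func checkRestoreTarget(existing map[pvcKey]bool, target pvcKey) error {
	if existing[target] {
		return fmt.Errorf("%s/%s: %w", target.Namespace, target.Name, ErrTargetExists)
	}
	return nil
}

func main() {
	existing := map[pvcKey]bool{{Namespace: "media", Name: "library"}: true}

	// Restoring over an existing PVC is rejected.
	fmt.Println(checkRestoreTarget(existing, pvcKey{Namespace: "media", Name: "library"}))
	// Restoring into a new name is allowed (prints <nil>).
	fmt.Println(checkRestoreTarget(existing, pvcKey{Namespace: "media", Name: "library-restore"}))
}
```

Returning an error instead of silently picking another name keeps the restore explicit: the caller has to choose a fresh target.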

How it works

The service runs in-cluster and talks to Kubernetes plus the Longhorn backend. For each PVC it resolves the backing volume, asks Longhorn to snapshot/backup it, and records enough inventory for humans and dashboards to see whether the backup is fresh.
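
To illustrate what "fresh" could mean for an inventory record, here is a small sketch. It assumes each entry carries the time of its last completed backup and that freshness is a simple maximum-age check; the `inventoryEntry` type, the `freshness` function, and the three labels are hypothetical and not taken from Soteria itself.

```go
package main

import (
	"fmt"
	"time"
)

// inventoryEntry is a hypothetical record of the last backup for one PVC.
type inventoryEntry struct {
	PVC        string
	LastBackup time.Time // zero value means no backup has been recorded
}

// freshness classifies an entry as "missing", "stale", or "fresh"
// by comparing the age of its last backup with maxAge.
func freshness(e inventoryEntry, now time.Time, maxAge time.Duration) string {
	if e.LastBackup.IsZero() {
		return "missing"
	}
	if now.Sub(e.LastBackup) > maxAge {
		return "stale"
	}
	return "fresh"
}

func main() {
	now := time.Now()
	entries := []inventoryEntry{
		{PVC: "media/library", LastBackup: now.Add(-2 * time.Hour)},
		{PVC: "db/postgres", LastBackup: now.Add(-72 * time.Hour)},
		{PVC: "scratch/tmp"},
	}
	for _, e := range entries {
		// A 24-hour threshold is an arbitrary choice for the example.
		fmt.Printf("%s: %s\n", e.PVC, freshness(e, now, 24*time.Hour))
	}
}
```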

Policies are stored in a Kubernetes secret and evaluated on a timer. Metrics are published at /metrics; the UI and API share the same backend.
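
Evaluating policies on a timer usually comes down to a ticker loop that stops when the service shuts down. The sketch below shows that pattern only; it does not read the Kubernetes secret or model the policy format, and `runEvery` and `evaluationsWithin` are hypothetical helpers, not names from the Soteria codebase.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// runEvery calls evaluate on each tick until ctx is cancelled.
func runEvery(ctx context.Context, interval time.Duration, evaluate func()) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			evaluate()
		}
	}
}

// evaluationsWithin runs the loop for the given window and reports how many
// times the callback fired. It exists only to make the sketch observable.
func evaluationsWithin(interval, window time.Duration) int {
	ctx, cancel := context.WithTimeout(context.Background(), window)
	defer cancel()
	count := 0
	runEvery(ctx, interval, func() { count++ })
	return count
}

func main() {
	// Short durations keep the example quick; a real interval would be much longer.
	n := evaluationsWithin(10*time.Millisecond, 100*time.Millisecond)
	fmt.Printf("policy evaluation ran %d times\n", n)
}
```

Tying the loop to a context means a shutdown signal stops evaluation cleanly instead of leaving a ticker running.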

Main endpoints:

  • GET /healthz, GET /readyz, GET /metrics
  • GET /v1/inventory
  • GET /v1/backups?namespace=<ns>&pvc=<name>
  • POST /v1/backup
  • POST /v1/backup/namespace
  • POST /v1/restores
  • POST /v1/restores/namespace
  • GET|POST|DELETE /v1/policies
  • GET /v1/b2
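
As an example of calling one of these endpoints, the sketch below builds the URL for GET /v1/backups with its two query parameters. The base URL is a placeholder, `backupsURL` is a hypothetical helper, and the response format is not covered because this README does not describe it.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// backupsURL builds the GET /v1/backups URL for one PVC.
// base is the address where Soteria is reachable (an assumption here).
func backupsURL(base, namespace, pvc string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = strings.TrimRight(u.Path, "/") + "/v1/backups"

	// url.Values escapes the parameters and encodes them sorted by key.
	q := url.Values{}
	q.Set("namespace", namespace)
	q.Set("pvc", pvc)
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	// soteria.example.internal is a placeholder host, not a real deployment.
	u, err := backupsURL("http://soteria.example.internal", "media", "library")
	if err != nil {
		panic(err)
	}
	fmt.Println(u)
	// prints: http://soteria.example.internal/v1/backups?namespace=media&pvc=library
}
```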

When auth is enabled, Soteria expects trusted identity headers from the fronting proxy and checks the caller's groups against SOTERIA_ALLOWED_GROUPS.
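
A group check of this kind might look like the sketch below. Several details are assumptions rather than facts about Soteria: the header name `X-Auth-Request-Groups`, the comma-separated format for both the header and SOTERIA_ALLOWED_GROUPS, and the helper names `splitList` and `groupAllowed`.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// splitList splits a comma-separated list and drops empty items.
func splitList(s string) []string {
	var out []string
	for _, part := range strings.Split(s, ",") {
		if p := strings.TrimSpace(part); p != "" {
			out = append(out, p)
		}
	}
	return out
}

// groupAllowed reports whether the caller belongs to at least one allowed group.
// Both arguments are assumed to be comma-separated lists.
func groupAllowed(callerGroups, allowedGroups string) bool {
	allowed := make(map[string]struct{})
	for _, g := range splitList(allowedGroups) {
		allowed[g] = struct{}{}
	}
	for _, g := range splitList(callerGroups) {
		if _, ok := allowed[g]; ok {
			return true
		}
	}
	return false
}

func main() {
	// The header name is hypothetical; use whatever the fronting proxy sets.
	h := http.Header{}
	h.Set("X-Auth-Request-Groups", "dev, admins")

	// In a real deployment this value would come from SOTERIA_ALLOWED_GROUPS.
	allowedGroups := "admins,ops"

	fmt.Println(groupAllowed(h.Get("X-Auth-Request-Groups"), allowedGroups)) // prints: true
	fmt.Println(groupAllowed("guests", allowedGroups))                       // prints: false
}
```

Headers like these are only trustworthy when the service is reachable exclusively through the proxy; otherwise any client could set them directly.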

Development

go test ./...
./scripts/check.sh

The local deploy manifests live in deploy/. Production wiring should still go through the Flux repo, not one-off cluster edits.