# soteria
Soteria is a small in-cluster service that orchestrates Longhorn backups for PVCs. It is intended to be called by Ariadne (or another controller) and focuses on:
- Longhorn-managed backups to an S3-compatible backend (Backblaze B2 by default).
- On-demand restore tests into a target PVC.
- Minimal long-running footprint (the backup work happens in Longhorn).

Snapshots are managed by Longhorn; backups are crash-consistent for the PVC as mounted.
## API
### POST /v1/backup (Longhorn)
```json
{
  "namespace": "ai",
  "pvc": "llm-cache",
  "tags": ["namespace=ai", "service=llm"],
  "dry_run": false
}
```
Response:
```json
{
  "job_name": "soteria-backup-llm-cache-20260131-013001",
  "namespace": "ai",
  "secret": "soteria-soteria-backup-llm-cache-20260131-013001-restic",
  "dry_run": false
}
```
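
A caller such as Ariadne could build this request as in the following minimal Python sketch. The base URL is an assumption (a hypothetical in-cluster Service name combined with the default `:8080` listen address), and the `Content-Type` header is assumed rather than documented. The sketch only constructs the request; `send` is defined but not called.

```python
import json
import urllib.request

# Assumption: hypothetical in-cluster address of the Soteria Service.
SOTERIA_URL = "http://soteria.soteria.svc:8080"


def build_backup_request(namespace, pvc, tags=None, dry_run=False, base_url=SOTERIA_URL):
    """Build (but do not send) a POST /v1/backup request."""
    payload = {
        "namespace": namespace,
        "pvc": pvc,
        "tags": list(tags or []),
        "dry_run": dry_run,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/backup",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed, not documented
        method="POST",
    )


def send(request, timeout=30):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)


# Build a dry-run request for the example PVC and inspect it without sending.
req = build_backup_request("ai", "llm-cache", tags=["namespace=ai", "service=llm"], dry_run=True)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Calling `send(req)` from inside the cluster should return a JSON object shaped like the response above.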
### POST /v1/restore-test (Longhorn)
```json
{
  "namespace": "ai",
  "snapshot": "latest",
  "pvc": "ollama-models",
  "target_pvc": "restore-sandbox",
  "dry_run": false
}
```
Notes:
- `pvc` is required to resolve the Longhorn volume and locate the latest backup.
- `snapshot` may be `latest` or a specific backup/snapshot name. You can also pass `backup_url`.
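
The rules in these notes can be checked on the client before calling the API. The helper below is a hypothetical sketch, not part of Soteria: it rejects an empty `pvc`, defaults `snapshot` to `latest`, and includes `backup_url` only when one is given. How the service treats `backup_url` relative to `snapshot` is not specified here, so sending both is an assumption.

```python
import json


def build_restore_test_payload(namespace, pvc, target_pvc,
                               snapshot="latest", backup_url=None, dry_run=False):
    """Build the JSON body for POST /v1/restore-test (hypothetical client helper)."""
    if not pvc:
        # The notes state that pvc is required to resolve the Longhorn volume.
        raise ValueError("pvc is required to resolve the Longhorn volume")
    payload = {
        "namespace": namespace,
        "snapshot": snapshot,
        "pvc": pvc,
        "target_pvc": target_pvc,
        "dry_run": dry_run,
    }
    if backup_url:
        # Assumption: backup_url is sent alongside the other fields.
        payload["backup_url"] = backup_url
    return payload


payload = build_restore_test_payload("ai", "ollama-models", "restore-sandbox")
print(json.dumps(payload, indent=2))
```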
## Configuration
Environment variables:
- `SOTERIA_BACKUP_DRIVER` (default: `longhorn`, allowed: `longhorn`, `restic`)
- `SOTERIA_LONGHORN_URL` (default: `http://longhorn-backend.longhorn-system.svc:9500`)
- `SOTERIA_LONGHORN_BACKUP_MODE` (default: `incremental`, allowed: `incremental`, `full`)
- `SOTERIA_RESTIC_REPOSITORY` (required for the restic driver) Example: `s3:s3.us-west-004.backblazeb2.com/atlas-backups`
- `SOTERIA_RESTIC_SECRET_NAME` (default: `soteria-restic`)
- `SOTERIA_SECRET_NAMESPACE` (default: service namespace)
- `SOTERIA_RESTIC_IMAGE` (default: `restic/restic:0.16.4`)
- `SOTERIA_RESTIC_BACKUP_ARGS` (optional) Extra args for `restic backup`
- `SOTERIA_RESTIC_FORGET_ARGS` (optional) Extra args for `restic forget` (include `--prune` if desired)
- `SOTERIA_S3_ENDPOINT` (optional) Example: `s3.us-west-004.backblazeb2.com`
- `SOTERIA_S3_REGION` (optional) Example: `us-west-004`
- `SOTERIA_JOB_TTL_SECONDS` (default: 86400)
- `SOTERIA_JOB_NODE_SELECTOR` (optional) Comma-separated node selector, e.g. `kubernetes.io/arch=arm64,node-role.kubernetes.io/worker=true`
- `SOTERIA_JOB_SERVICE_ACCOUNT` (optional) ServiceAccount for backup Jobs
- `SOTERIA_LISTEN_ADDR` (default: `:8080`)

The restic repository is encrypted with `RESTIC_PASSWORD` from the secret below.
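
To illustrate the comma-separated format of `SOTERIA_JOB_NODE_SELECTOR`, the sketch below parses such a value into the key/value map that a Kubernetes `nodeSelector` expects. It is an assumption about how the value can be interpreted, not Soteria's actual parser, and `parse_node_selector` is a hypothetical name.

```python
def parse_node_selector(raw):
    """Parse 'key=value,key=value' into a dict suitable for a pod nodeSelector."""
    selector = {}
    for pair in raw.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate empty input and trailing commas
        key, sep, value = pair.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"invalid node selector entry: {pair!r}")
        selector[key.strip()] = value.strip()
    return selector


selector = parse_node_selector(
    "kubernetes.io/arch=arm64,node-role.kubernetes.io/worker=true"
)
print(selector)
```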
## Secrets
If you use the restic driver, create a secret named `soteria-restic` in the Soteria namespace (or point `SOTERIA_RESTIC_SECRET_NAME` at a different secret). Required keys:
- `AWS_ACCESS_KEY_ID`
- `AWS_SECRET_ACCESS_KEY`
- `RESTIC_PASSWORD`

The service copies this secret into the target namespace per job and attaches an owner reference so it gets cleaned up with the Job.

A template is in `deploy/secret-example.yaml` (do not commit real credentials).
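
As an alternative to editing the template by hand, the manifest can be generated. The sketch below builds a `v1` Secret with the three required keys using the standard `stringData` field; the namespace `soteria` and the placeholder values are assumptions for illustration, and `build_restic_secret` is a hypothetical helper.

```python
import json

REQUIRED_KEYS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "RESTIC_PASSWORD")


def build_restic_secret(namespace, values, name="soteria-restic"):
    """Return a Kubernetes Secret manifest (as a dict) holding the required keys."""
    missing = [key for key in REQUIRED_KEYS if not values.get(key)]
    if missing:
        raise ValueError("missing required secret keys: " + ", ".join(missing))
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        # stringData lets the API server handle base64 encoding.
        "stringData": {key: values[key] for key in REQUIRED_KEYS},
    }


# Placeholder values only; load real credentials from a secure source.
manifest = build_restic_secret(
    "soteria",  # assumption: the namespace Soteria is deployed in
    {
        "AWS_ACCESS_KEY_ID": "REPLACE_ME",
        "AWS_SECRET_ACCESS_KEY": "REPLACE_ME",
        "RESTIC_PASSWORD": "REPLACE_ME",
    },
)
print(json.dumps(manifest, indent=2))
```

The printed JSON can be piped to `kubectl apply -f -`, since `kubectl` accepts JSON manifests as well as YAML.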
## Deployment
The `deploy/` folder includes Kustomize-ready manifests:
- `namespace.yaml`
- `configmap.yaml` (set your repository and endpoint)
- `serviceaccount.yaml`
- `clusterrole.yaml`
- `clusterrolebinding.yaml`
- `deployment.yaml`
- `service.yaml`

Apply with:
```sh
kubectl apply -k deploy
```
## Notes
- With the restic driver, backups mount the PVC read-only at `/data` and run `restic backup /data`.
- Restore tests write into `/restore` (either an emptyDir or a target PVC).
- For Backblaze B2, use the S3 endpoint and region for your bucket (example: `s3.us-west-004.backblazeb2.com` ).
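
Backblaze B2 S3 endpoints follow the pattern `s3.<region>.backblazeb2.com`, so the value for `SOTERIA_S3_REGION` can be derived from the endpoint. The helper below is a hypothetical convenience sketch, not something Soteria provides; it also accepts a restic-style repository string such as the `SOTERIA_RESTIC_REPOSITORY` example.

```python
import re

# Matches hosts such as s3.us-west-004.backblazeb2.com and captures the region.
_B2_HOST = re.compile(r"^s3\.([a-z0-9-]+)\.backblazeb2\.com$")


def b2_region_from_endpoint(endpoint):
    """Extract the region from a Backblaze B2 S3 endpoint or restic repository string."""
    if endpoint.startswith("s3:"):
        endpoint = endpoint[3:]  # strip restic's "s3:" repository prefix
    # Drop an optional scheme and any bucket/path suffix to isolate the host.
    host = endpoint.split("://", 1)[-1].split("/", 1)[0]
    match = _B2_HOST.match(host)
    if match is None:
        raise ValueError(f"not a Backblaze B2 S3 endpoint: {endpoint!r}")
    return match.group(1)


region = b2_region_from_endpoint("s3.us-west-004.backblazeb2.com")
print(region)
```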