This stack is staged for Flux and intentionally starts the app deployments at `replicas: 0` until images, native OIDC/session support, and smoke gates are ready.
- Backend/frontend deployments remain scaled to `0` until native OIDC/session support, image tags, and smoke gates are ready. Services route to a named `http` target port so Ingress does not depend on numeric container ports.
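As a minimal sketch of the scaled-to-zero Deployment and the named port: the resource name `veles-backend`, the `app` label, the image reference, and the port numbers `8080` and `80` are illustrative assumptions, not values taken from this repository.

```yaml
# Sketch only. Names, labels, image, and port numbers are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: veles-backend              # hypothetical name
  namespace: veles
spec:
  replicas: 0                      # stays 0 until images, OIDC/session support, and smoke gates are ready
  selector:
    matchLabels:
      app: veles-backend           # hypothetical label
  template:
    metadata:
      labels:
        app: veles-backend
    spec:
      containers:
        - name: backend
          image: registry.example.com/veles-backend:placeholder   # placeholder image
          ports:
            - name: http           # the port name the Service targets
              containerPort: 8080  # assumed number; can change without touching the Service
---
apiVersion: v1
kind: Service
metadata:
  name: veles-backend
  namespace: veles
spec:
  selector:
    app: veles-backend
  ports:
    - name: http
      port: 80                     # assumed Service port
      targetPort: http             # resolved by name against the container's ports
```

Because `targetPort` refers to the port name, changing `containerPort` only requires editing the Deployment. An Ingress backend can likewise reference the Service port by the name `http` rather than by number.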
Veles owns authorization in the app. The `veles` Ingress does not use oauth2-proxy or Traefik forward-auth, so there is no ingress or auth layer in front of the app that could strip OIDC token claims. The app should validate tokens issued by `https://sso.bstein.dev/realms/veles` and expect stable `sub`, `email`, `preferred_username`, `groups`, and `realm_access.roles` claims. Do not scale Veles for real user traffic until the native OIDC login/session flow is implemented and smoke-tested.
The Keycloak realm setup creates both groups and realm roles named `alpha` and `admin`. Members of the `alpha` group receive the `alpha` realm role; members of `admin` receive both `alpha` and `admin`. Built-in/meta strategies can stay universal, while runs and user-created strategies should remain user-scoped in the Veles database.
Backend runtime secrets are synced from Vault by `veles-vault` into the generated Kubernetes Secret `veles-runtime-secrets`; no secret values are committed. The backend consumes that secret with `envFrom`.
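A sketch of how the backend container might consume that Secret; the container name `backend` is an assumption, while the Secret name comes from the paragraph above.

```yaml
# Fragment of the backend Deployment's pod template (spec.template.spec).
# The container name is an assumption.
containers:
  - name: backend
    envFrom:
      - secretRef:
          name: veles-runtime-secrets   # generated by veles-vault from Vault
```

With `envFrom`, each key in the Secret becomes an environment variable of the same name in the container, so adding a key in Vault does not require a manifest change.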
## Artifact Contract
`veles-artifacts` is an RWO Longhorn PVC mounted into backend pods at `/data/veles-artifacts`. Backend pods own artifact writes and serving. Simulation Jobs should not mount or write directly to this PVC unless they are explicitly scheduled on Oceanus with the Veles toleration and the app has chosen a same-node direct-write model. Queue-mediated upload/copy through the backend remains the safer default until the app contract settles.
Backend, simulation workers, and retention/cleanup workers must run on Oceanus/titan-23 when they need artifact access. Frontend pods must not mount `veles-artifacts`.
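The backend mount described above could look roughly like the fragment below. The volume name `artifacts` and the container name `backend` are assumptions; the claim name and mount path are the ones stated in this section.

```yaml
# Fragment of the backend pod template (spec.template.spec).
# Volume and container names are assumptions.
volumes:
  - name: artifacts
    persistentVolumeClaim:
      claimName: veles-artifacts
containers:
  - name: backend
    volumeMounts:
      - name: artifacts
        mountPath: /data/veles-artifacts
```

A `ReadWriteOnce` volume can be mounted read-write by only one node at a time, which is why every pod that needs artifact access has to land on the same node as the backend.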
The backend service account can create, watch, and delete Jobs only inside the `veles` namespace. Simulation pods should use service account `veles-sim`, set `automountServiceAccountToken: false`, and use:
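A minimal sketch of such a Job follows. Only `veles-sim`, `automountServiceAccountToken: false`, and the `veles` namespace come from the text; the Job name, image, node selector label, and toleration key are placeholders to be replaced with the values this stack actually uses. The second document sketches a Role limited to the three Job verbs named above, and its name is likewise hypothetical.

```yaml
# Sketch only. Placeholder and assumed values are marked; replace them with the real ones.
apiVersion: batch/v1
kind: Job
metadata:
  name: veles-sim-example                   # hypothetical name
  namespace: veles
spec:
  backoffLimit: 0                           # assumption: no automatic retries
  template:
    spec:
      serviceAccountName: veles-sim
      automountServiceAccountToken: false   # the pod gets no Kubernetes API token
      restartPolicy: Never
      nodeSelector:
        kubernetes.io/hostname: titan-23    # assumption: pinning via the standard hostname label
      tolerations:
        - key: example.com/veles            # placeholder key for the Veles toleration
          operator: Exists
          effect: NoSchedule                # assumed taint effect
      containers:
        - name: sim
          image: registry.example.com/veles-sim:placeholder   # placeholder image
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: veles-backend-jobs                  # hypothetical name
  namespace: veles                          # a Role is namespaced, so access stops at this namespace
rules:
  - apiGroups: ["batch"]
    resources: ["jobs"]
    verbs: ["create", "watch", "delete"]
```

Using a namespaced Role rather than a ClusterRole is what keeps the backend's Job permissions confined to `veles`; the Role still needs a RoleBinding to the backend service account, which is omitted here.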
Retention/cleanup Jobs that touch artifacts should use the same node selector and toleration. If they do not need Kubernetes API access, use `veles-sim`; otherwise keep control-plane actions in the backend/controller and run artifact cleanup through a no-token worker.
- `veles-oceanus-artifacts` is RWO for alpha; simulation workers should either run on Oceanus with the backend or stream logs to the backend, which owns writes.
- Longhorn default backup target is `s3://atlas-soteria@us-west-004/` with credential secret `longhorn-backup-b2`; the live `BackupTarget/default` currently reports available. Postgres and artifact volumes have Longhorn recurring snapshot and backup jobs attached by their StorageClasses. This is not a substitute for a tested restore drill.