diff --git a/AGENTS.md b/AGENTS.md
deleted file mode 100644
index a8d49c8..0000000
--- a/AGENTS.md
+++ /dev/null
@@ -1,68 +0,0 @@
-
-
-Repository Guidelines
-
-## Project Structure & Module Organization
-- `infrastructure/`: cluster-scoped building blocks (core, flux-system, traefik, longhorn). Add new platform features by mirroring this layout.
-- `services/`: workload manifests per app (`services/gitea/`, etc.) with `kustomization.yaml` plus one file per kind; keep diffs small and focused.
-- `dockerfiles/` hosts bespoke images, while `scripts/` stores operational Fish/Bash helpers—extend these directories instead of relying on ad-hoc commands.
-
-## Build, Test, and Development Commands
-- `kustomize build services/` (or `kubectl kustomize ...`) renders manifests exactly as Flux will.
-- `kubectl apply --server-side --dry-run=client -k services/` checks schema compatibility without touching the cluster.
-- `flux reconcile kustomization --namespace flux-system --with-source` pulls the latest Git state after merges or hotfixes.
-- `fish scripts/flux_hammer.fish --help` explains the recovery tool; read it before running against production workloads.
-
-## Coding Style & Naming Conventions
-- YAML uses two-space indents; retain the leading path comment (e.g. `# services/gitea/deployment.yaml`) to speed code review.
-- Keep resource names lowercase kebab-case, align labels/selectors, and mirror namespaces with directory names.
-- List resources in `kustomization.yaml` from namespace/config, through storage, then workloads and networking for predictable diffs.
-- Scripts start with `#!/usr/bin/env fish` or bash, stay executable, and follow snake_case names such as `flux_hammer.fish`.
-
-## Testing Guidelines
-- Run `kustomize build` and the dry-run apply for every service you touch; capture failures before opening a PR.
-- `flux diff kustomization --path services/` previews reconciliations—link notable output when behavior shifts.
-- Docker edits: `docker build -f dockerfiles/Dockerfile.monerod .` (swap the file you changed) to verify image builds.
-
-## Commit & Pull Request Guidelines
-- Keep commit subjects short, present-tense, and optionally scoped (`gpu(titan-24): add RuntimeClass`); squash fixups before review.
-- Describe linked issues, affected services, and required operator steps (e.g. `flux reconcile kustomization services-gitea`) in the PR body.
-- Focus each PR on one kustomization or service and update `infrastructure/flux-system` when Flux must track new folders.
-- Record the validation you ran (dry-runs, diffs, builds) and add screenshots only when ingress or UI behavior changes.
-
-## Security & Configuration Tips
-- Never commit credentials; use Vault workflows (`services/vault/`) or SOPS-encrypted manifests wired through `infrastructure/flux-system`.
-- Node selectors and tolerations gate workloads to hardware like `hardware: rpi4`; confirm labels before scaling or renaming nodes.
-- Pin external images by digest or rely on Flux image automation to follow approved tags and avoid drift.
-
-## Dashboard roadmap / context (2025-12-02)
-- Atlas dashboards are generated via `scripts/dashboards_render_atlas.py --build`, which writes JSON under `services/monitoring/dashboards/` and ConfigMaps under `services/monitoring/`. Keep the Grafana manifests in sync by regenerating after edits.
-- Atlas Overview panels are paired with internal dashboards (pods, nodes, storage, network, GPU). A new `atlas-gpu` internal dashboard holds the detailed GPU metrics that feed the overview share pie.
-- Old Grafana folders (`Atlas Storage`, `Atlas SRE`, `Atlas Public`, `Atlas Nodes`) should be removed in Grafana UI when convenient; only `Atlas Overview` and `Atlas Internal` should remain provisioned.
-- Future work: add a separate generator (e.g., `dashboards_render_oceanus.py`) for SUI/oceanus validation dashboards, mirroring the atlas pattern of internal dashboards feeding a public overview.
-
-## Monitoring state (2025-12-03)
-- dcgm-exporter DaemonSet pulls `registry.bstein.dev/monitoring/dcgm-exporter:4.4.2-4.7.0-ubuntu22.04` with nvidia runtime/imagePullSecret; titan-24 exports metrics, titan-22 remains NotReady.
-- Atlas Overview is the Grafana home (1h range, 1m refresh), Overview folder UID `overview`, internal folder `atlas-internal` (oceanus-internal stub).
-- Panels standardized via generator; hottest row compressed, worker/control rows taller, root disk row taller and top12 bar gauge with labels. GPU share pie uses 1h avg_over_time to persist idle activity.
-- Internal dashboards are provisioned without Viewer role; if anonymous still sees them, restart Grafana and tighten auth if needed.
-
-## Upcoming priorities (SSO/storage/mail)
-- Establish SSO (Keycloak or similar) and federate Grafana, Gitea, Zot, Nextcloud, Pegasus/Jellyfin; keep Vaultwarden separate until safe.
-- Add Nextcloud (limit to rpi5 workers) with office suite; integrate with SSO; plan storage class and ingress.
-- Plan mail: mostly self-hosted, relay through trusted provider for outbound; integrate with services (Nextcloud, Vaultwarden, etc.) for notifications and account flows.
-
-## SSO plan sketch (2025-12-03)
-- IdP: use Keycloak (preferred) in a new `sso` namespace, Bitnami or codecentric chart with Postgres backing store (single PVC), ingress `sso.bstein.dev`, admin user bound to brad@bstein.dev; stick with local DB initially (no external IdP).
-- Auth flow goals: Grafana (OIDC), Gitea (OAuth2/Keycloak), Zot (via Traefik forward-auth/oauth2-proxy), Jellyfin/Pegasus via Jellyfin OAuth/OpenID plugin (map existing usernames; run migration to pre-create users in Keycloak with same usernames/emails and temporary passwords), Pegasus keeps using Jellyfin tokens.
-- Steps to implement:
-  1) Add service folder `services/keycloak/` (namespace, PVC, HelmRelease, ingress, secret for admin creds). Verify with kustomize + Flux reconcile.
-  2) Seed realm `atlas` with users (import CSV/realm). Create client for Grafana (public/implicit), Gitea (confidential), and a “jellyfin” client for the OAuth plugin; set email for brad@bstein.dev as admin.
-  3) Reconfigure Grafana to OIDC (disable anonymous to internal folders, leave Overview public via folder permissions). Reconfigure Gitea to OIDC (app.ini).
-  4) Add Traefik forward-auth (oauth2-proxy) in front of Zot and any other services needing headers-based auth.
-  5) Deploy Jellyfin OpenID plugin; map Keycloak users to existing Jellyfin usernames; communicate password reset path.
-- Migration caution: do not delete existing local creds until SSO validated; keep Pegasus working via Jellyfin tokens during transition.
-
-## Postgres centralization (2025-12-03)
-- Prefer a shared in-cluster Postgres deployment with per-service databases to reduce resource sprawl on Pi nodes. Use it for services that can easily point at an external DB.
-- Candidates to migrate to shared Postgres: Keycloak (realm DB), Gitea (git DB), Nextcloud (app DB), possibly Grafana (if persistence needed beyond current provisioner), Jitsi prosody/JVB state (if external DB supported). Keep tightly-coupled or lightweight embedded DBs as-is when migration is painful or not supported.
diff --git a/clusters/atlas/flux-system/applications/keycloak/kustomization.yaml b/clusters/atlas/flux-system/applications/keycloak/kustomization.yaml
new file mode 100644
index 0000000..4634b5c
--- /dev/null
+++ b/clusters/atlas/flux-system/applications/keycloak/kustomization.yaml
@@ -0,0 +1,15 @@
+# clusters/atlas/flux-system/applications/keycloak/kustomization.yaml
+apiVersion: kustomize.toolkit.fluxcd.io/v1
+kind: Kustomization
+metadata:
+  name: keycloak
+  namespace: flux-system
+spec:
+  interval: 10m
+  prune: true
+  sourceRef:
+    kind: GitRepository
+    name: flux-system
+  path: ./services/keycloak
+  targetNamespace: sso
+  timeout: 2m
diff --git a/clusters/atlas/flux-system/applications/kustomization.yaml b/clusters/atlas/flux-system/applications/kustomization.yaml
index 7d2f8ee..1bc2700 100644
--- a/clusters/atlas/flux-system/applications/kustomization.yaml
+++ b/clusters/atlas/flux-system/applications/kustomization.yaml
@@ -13,3 +13,5 @@ resources:
   - jellyfin/kustomization.yaml
   - xmr-miner/kustomization.yaml
   - sui-metrics/kustomization.yaml
+  - keycloak/kustomization.yaml
+  - oauth2-proxy/kustomization.yaml
diff --git a/clusters/atlas/flux-system/applications/oauth2-proxy/kustomization.yaml b/clusters/atlas/flux-system/applications/oauth2-proxy/kustomization.yaml
new file mode 100644
index 0000000..187572d
--- /dev/null
+++ b/clusters/atlas/flux-system/applications/oauth2-proxy/kustomization.yaml
@@ -0,0 +1,15 @@
+# clusters/atlas/flux-system/applications/oauth2-proxy/kustomization.yaml
+apiVersion: kustomize.toolkit.fluxcd.io/v1
+kind: Kustomization
+metadata:
+  name: oauth2-proxy
+  namespace: flux-system
+spec:
+  interval: 10m
+  prune: true
+  sourceRef:
+    kind: GitRepository
+    name: flux-system
+  path: ./services/oauth2-proxy
+  targetNamespace: sso
+  timeout: 2m
diff --git a/clusters/atlas/flux-system/gotk-sync.yaml b/clusters/atlas/flux-system/gotk-sync.yaml
index 46f65d3..4076ef6 100644
--- a/clusters/atlas/flux-system/gotk-sync.yaml
+++ b/clusters/atlas/flux-system/gotk-sync.yaml
@@ -8,7 +8,7 @@ metadata:
 spec:
   interval: 1m0s
   ref:
-    branch: feature/atlas-monitoring
+    branch: feature/sso
   secretRef:
     name: flux-system-gitea
   url: ssh://git@scm.bstein.dev:2242/bstein/titan-iac.git
diff --git a/infrastructure/longhorn/ui-ingress/ingress.yaml b/infrastructure/longhorn/ui-ingress/ingress.yaml
index 6250cfa..94daeed 100644
--- a/infrastructure/longhorn/ui-ingress/ingress.yaml
+++ b/infrastructure/longhorn/ui-ingress/ingress.yaml
@@ -7,7 +7,7 @@ metadata:
   annotations:
     traefik.ingress.kubernetes.io/router.entrypoints: websecure
     traefik.ingress.kubernetes.io/router.tls: "true"
-    traefik.ingress.kubernetes.io/router.middlewares: longhorn-system-longhorn-basicauth@kubernetescrd,longhorn-system-longhorn-headers@kubernetescrd
+    traefik.ingress.kubernetes.io/router.middlewares: ""
 spec:
   ingressClassName: traefik
   tls:
@@ -21,6 +21,6 @@ spec:
         pathType: Prefix
         backend:
           service:
-            name: longhorn-frontend
+            name: oauth2-proxy-longhorn
            port:
              number: 80
diff --git a/infrastructure/longhorn/ui-ingress/kustomization.yaml b/infrastructure/longhorn/ui-ingress/kustomization.yaml
index 1d497dc..a2ae5f3 100644
--- a/infrastructure/longhorn/ui-ingress/kustomization.yaml
+++ b/infrastructure/longhorn/ui-ingress/kustomization.yaml
@@ -4,3 +4,4 @@ kind: Kustomization
 resources:
   - middleware.yaml
   - ingress.yaml
+  - oauth2-proxy-longhorn.yaml
diff --git a/infrastructure/longhorn/ui-ingress/middleware.yaml b/infrastructure/longhorn/ui-ingress/middleware.yaml
index c670cef..3bf2ff5 100644
--- a/infrastructure/longhorn/ui-ingress/middleware.yaml
+++ b/infrastructure/longhorn/ui-ingress/middleware.yaml
@@ -20,3 +20,20 @@ spec:
   headers:
     customRequestHeaders:
       X-Forwarded-Proto: "https"
+
+---
+
+apiVersion: traefik.io/v1alpha1
+kind: Middleware
+metadata:
+  name: longhorn-forward-auth
+  namespace: longhorn-system
+spec:
+  forwardAuth:
+    address: https://auth.bstein.dev/oauth2/auth
+    trustForwardHeader: true
+    authResponseHeaders:
+      - Authorization
+      - X-Auth-Request-Email
+      - X-Auth-Request-User
+      - X-Auth-Request-Groups
diff --git a/infrastructure/longhorn/ui-ingress/oauth2-proxy-longhorn.yaml b/infrastructure/longhorn/ui-ingress/oauth2-proxy-longhorn.yaml
new file mode 100644
index 0000000..b8d4f34
--- /dev/null
+++ b/infrastructure/longhorn/ui-ingress/oauth2-proxy-longhorn.yaml
@@ -0,0 +1,102 @@
+# infrastructure/longhorn/ui-ingress/oauth2-proxy-longhorn.yaml
+apiVersion: v1
+kind: Service
+metadata:
+  name: oauth2-proxy-longhorn
+  namespace: longhorn-system
+  labels:
+    app: oauth2-proxy-longhorn
+spec:
+  ports:
+    - name: http
+      port: 80
+      targetPort: 4180
+  selector:
+    app: oauth2-proxy-longhorn
+
+---
+
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: oauth2-proxy-longhorn
+  namespace: longhorn-system
+  labels:
+    app: oauth2-proxy-longhorn
+spec:
+  replicas: 2
+  selector:
+    matchLabels:
+      app: oauth2-proxy-longhorn
+  template:
+    metadata:
+      labels:
+        app: oauth2-proxy-longhorn
+    spec:
+      nodeSelector:
+        node-role.kubernetes.io/worker: "true"
+      affinity:
+        nodeAffinity:
+          preferredDuringSchedulingIgnoredDuringExecution:
+            - weight: 90
+              preference:
+                matchExpressions:
+                  - key: hardware
+                    operator: In
+                    values: ["rpi5","rpi4"]
+      containers:
+        - name: oauth2-proxy
+          image: quay.io/oauth2-proxy/oauth2-proxy:v7.6.0
+          imagePullPolicy: IfNotPresent
+          args:
+            - --provider=oidc
+            - --redirect-url=https://longhorn.bstein.dev/oauth2/callback
+            - --oidc-issuer-url=https://sso.bstein.dev/realms/atlas
+            - --scope=openid profile email groups
+            - --email-domain=*
+            - --allowed-group=admin
+            - --set-xauthrequest=true
+            - --pass-access-token=true
+            - --set-authorization-header=true
+            - --cookie-secure=true
+            - --cookie-samesite=lax
+            - --cookie-refresh=20m
+            - --cookie-expire=168h
+            - --insecure-oidc-allow-unverified-email=true
+            - --upstream=http://longhorn-frontend.longhorn-system.svc.cluster.local
+            - --http-address=0.0.0.0:4180
+            - --skip-provider-button=true
+            - --skip-jwt-bearer-tokens=true
+            - --oidc-groups-claim=groups
+            - --cookie-domain=longhorn.bstein.dev
+          env:
+            - name: OAUTH2_PROXY_CLIENT_ID
+              valueFrom:
+                secretKeyRef:
+                  name: oauth2-proxy-longhorn-oidc
+                  key: client_id
+            - name: OAUTH2_PROXY_CLIENT_SECRET
+              valueFrom:
+                secretKeyRef:
+                  name: oauth2-proxy-longhorn-oidc
+                  key: client_secret
+            - name: OAUTH2_PROXY_COOKIE_SECRET
+              valueFrom:
+                secretKeyRef:
+                  name: oauth2-proxy-longhorn-oidc
+                  key: cookie_secret
+          ports:
+            - containerPort: 4180
+              name: http
+          readinessProbe:
+            httpGet:
+              path: /ping
+              port: 4180
+            initialDelaySeconds: 5
+            periodSeconds: 10
+          livenessProbe:
+            httpGet:
+              path: /ping
+              port: 4180
+            initialDelaySeconds: 20
+            periodSeconds: 20
diff --git a/scripts/dashboards_render_atlas.py b/scripts/dashboards_render_atlas.py
index 93de006..f577eab 100644
--- a/scripts/dashboards_render_atlas.py
+++ b/scripts/dashboards_render_atlas.py
@@ -232,7 +232,7 @@ NAMESPACE_GPU_ALLOC = (
     ' or kube_pod_container_resource_limits{namespace!="",resource="nvidia.com/gpu"})) by (namespace)'
 )
 NAMESPACE_GPU_USAGE_SHARE = (
-    'sum by (namespace) (avg_over_time(DCGM_FI_DEV_GPU_UTIL{namespace!=\"\",pod!=\"\"}[1h]))'
+    'sum by (namespace) (max_over_time(DCGM_FI_DEV_GPU_UTIL{namespace!=\"\",pod!=\"\"}[$__range]))'
 )
 NAMESPACE_GPU_USAGE_INSTANT = 'sum(DCGM_FI_DEV_GPU_UTIL{namespace!="",pod!=""}) by (namespace)'
 NAMESPACE_GPU_RAW = (
diff --git a/services/keycloak/README.md b/services/keycloak/README.md
new file mode 100644
index 0000000..bf7c21b
--- /dev/null
+++ b/services/keycloak/README.md
@@ -0,0 +1,27 @@
+# services/keycloak
+
+Keycloak is deployed via raw manifests and backed by the shared Postgres (`postgres-service.postgres.svc.cluster.local:5432`).
+Create these secrets before applying:
+
+```bash
+# DB creds (per-service DB/user in shared Postgres)
+kubectl -n sso create secret generic keycloak-db \
+  --from-literal=username=keycloak \
+  --from-literal=password='' \
+  --from-literal=database=keycloak
+
+# Admin console creds (maps to KC admin user)
+kubectl -n sso create secret generic keycloak-admin \
+  --from-literal=username=brad@bstein.dev \
+  --from-literal=password=''
+```
+
+Apply:
+
+```bash
+kubectl apply -k services/keycloak
+```
+
+Notes
+- Service: `keycloak.sso.svc:80` (Ingress `sso.bstein.dev`, TLS via cert-manager).
+- Uses Postgres schema `public`; DB/user should be provisioned in the shared Postgres instance.
+- Health endpoints on :9000 are wired for probes.
diff --git a/services/keycloak/deployment.yaml b/services/keycloak/deployment.yaml
new file mode 100644
index 0000000..af7839f
--- /dev/null
+++ b/services/keycloak/deployment.yaml
@@ -0,0 +1,132 @@
+# services/keycloak/deployment.yaml
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: keycloak
+  namespace: sso
+  labels:
+    app: keycloak
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app: keycloak
+  template:
+    metadata:
+      labels:
+        app: keycloak
+    spec:
+      affinity:
+        nodeAffinity:
+          requiredDuringSchedulingIgnoredDuringExecution:
+            nodeSelectorTerms:
+              - matchExpressions:
+                  - key: hardware
+                    operator: In
+                    values: ["rpi5","rpi4"]
+                  - key: node-role.kubernetes.io/worker
+                    operator: Exists
+              - matchExpressions:
+                  - key: kubernetes.io/hostname
+                    operator: In
+                    values: ["titan-24"]
+          preferredDuringSchedulingIgnoredDuringExecution:
+            - weight: 90
+              preference:
+                matchExpressions:
+                  - key: hardware
+                    operator: In
+                    values: ["rpi5"]
+            - weight: 70
+              preference:
+                matchExpressions:
+                  - key: hardware
+                    operator: In
+                    values: ["rpi4"]
+      securityContext:
+        runAsUser: 1000
+        runAsGroup: 0
+        fsGroup: 1000
+        fsGroupChangePolicy: OnRootMismatch
+      containers:
+        - name: keycloak
+          image: quay.io/keycloak/keycloak:26.0.7
+          imagePullPolicy: IfNotPresent
+          args:
+            - start
+          env:
+            - name: KC_DB
+              value: postgres
+            - name: KC_DB_URL_HOST
+              value: postgres-service.postgres.svc.cluster.local
+            - name: KC_DB_URL_DATABASE
+              valueFrom:
+                secretKeyRef:
+                  name: keycloak-db
+                  key: database
+            - name: KC_DB_USERNAME
+              valueFrom:
+                secretKeyRef:
+                  name: keycloak-db
+                  key: username
+            - name: KC_DB_PASSWORD
+              valueFrom:
+                secretKeyRef:
+                  name: keycloak-db
+                  key: password
+            - name: KC_DB_SCHEMA
+              value: public
+            - name: KC_HOSTNAME
+              value: sso.bstein.dev
+            - name: KC_HOSTNAME_URL
+              value: https://sso.bstein.dev
+            - name: KC_PROXY
+              value: edge
+            - name: KC_PROXY_HEADERS
+              value: xforwarded
+            - name: KC_HTTP_ENABLED
+              value: "true"
+            - name: KC_HTTP_MANAGEMENT_PORT
+              value: "9000"
+            - name: KC_HTTP_MANAGEMENT_BIND_ADDRESS
+              value: 0.0.0.0
+            - name: KC_HEALTH_ENABLED
+              value: "true"
+            - name: KC_METRICS_ENABLED
+              value: "true"
+            - name: KEYCLOAK_ADMIN
+              valueFrom:
+                secretKeyRef:
+                  name: keycloak-admin
+                  key: username
+            - name: KEYCLOAK_ADMIN_PASSWORD
+              valueFrom:
+                secretKeyRef:
+                  name: keycloak-admin
+                  key: password
+          ports:
+            - containerPort: 8080
+              name: http
+            - containerPort: 9000
+              name: metrics
+          readinessProbe:
+            httpGet:
+              path: /health/ready
+              port: 9000
+            initialDelaySeconds: 15
+            periodSeconds: 10
+            failureThreshold: 6
+          livenessProbe:
+            httpGet:
+              path: /health/live
+              port: 9000
+            initialDelaySeconds: 60
+            periodSeconds: 15
+            failureThreshold: 6
+          volumeMounts:
+            - name: data
+              mountPath: /opt/keycloak/data
+      volumes:
+        - name: data
+          persistentVolumeClaim:
+            claimName: keycloak-data
diff --git a/services/keycloak/ingress.yaml b/services/keycloak/ingress.yaml
new file mode 100644
index 0000000..39f6cb0
--- /dev/null
+++ b/services/keycloak/ingress.yaml
@@ -0,0 +1,24 @@
+# services/keycloak/ingress.yaml
+apiVersion: networking.k8s.io/v1
+kind: Ingress
+metadata:
+  name: keycloak
+  namespace: sso
+  annotations:
+    cert-manager.io/cluster-issuer: letsencrypt
+spec:
+  ingressClassName: traefik
+  rules:
+    - host: sso.bstein.dev
+      http:
+        paths:
+          - path: /
+            pathType: Prefix
+            backend:
+              service:
+                name: keycloak
+                port:
+                  number: 80
+  tls:
+    - hosts: [sso.bstein.dev]
+      secretName: keycloak-tls
diff --git a/services/keycloak/kustomization.yaml b/services/keycloak/kustomization.yaml
new file mode 100644
index 0000000..a65715c
--- /dev/null
+++ b/services/keycloak/kustomization.yaml
@@ -0,0 +1,10 @@
+# services/keycloak/kustomization.yaml
+apiVersion: kustomize.config.k8s.io/v1beta1
+kind: Kustomization
+namespace: sso
+resources:
+  - namespace.yaml
+  - pvc.yaml
+  - deployment.yaml
+  - service.yaml
+  - ingress.yaml
diff --git a/services/keycloak/namespace.yaml b/services/keycloak/namespace.yaml
new file mode 100644
index 0000000..b4c731d
--- /dev/null
+++ b/services/keycloak/namespace.yaml
@@ -0,0 +1,5 @@
+# services/keycloak/namespace.yaml
+apiVersion: v1
+kind: Namespace
+metadata:
+  name: sso
diff --git a/services/keycloak/pvc.yaml b/services/keycloak/pvc.yaml
new file mode 100644
index 0000000..b57ec61
--- /dev/null
+++ b/services/keycloak/pvc.yaml
@@ -0,0 +1,12 @@
+# services/keycloak/pvc.yaml
+apiVersion: v1
+kind: PersistentVolumeClaim
+metadata:
+  name: keycloak-data
+  namespace: sso
+spec:
+  accessModes: ["ReadWriteOnce"]
+  resources:
+    requests:
+      storage: 10Gi
+  storageClassName: astreae
diff --git a/services/keycloak/service.yaml b/services/keycloak/service.yaml
new file mode 100644
index 0000000..5d93ef6
--- /dev/null
+++ b/services/keycloak/service.yaml
@@ -0,0 +1,15 @@
+# services/keycloak/service.yaml
+apiVersion: v1
+kind: Service
+metadata:
+  name: keycloak
+  namespace: sso
+  labels:
+    app: keycloak
+spec:
+  selector:
+    app: keycloak
+  ports:
+    - name: http
+      port: 80
+      targetPort: http
diff --git a/services/monitoring/dashboards/atlas-gpu.json b/services/monitoring/dashboards/atlas-gpu.json
index e67b3d2..9071b0a 100644
--- a/services/monitoring/dashboards/atlas-gpu.json
+++ b/services/monitoring/dashboards/atlas-gpu.json
@@ -20,7 +20,7 @@
       },
       "targets": [
         {
-          "expr": "100 * ( ( (sum by (namespace) (avg_over_time(DCGM_FI_DEV_GPU_UTIL{namespace!=\"\",pod!=\"\"}[1h]))) or on(namespace) (sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) * 0) ) and on(namespace) ( (topk(10, ( sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) ) + (sum(container_memory_working_set_bytes{namespace!=\"\",pod!=\"\",container!=\"\"}) by (namespace) / 1e9) + ((sum((kube_pod_container_resource_requests{namespace!=\"\",resource=\"nvidia.com/gpu\"} or kube_pod_container_resource_limits{namespace!=\"\",resource=\"nvidia.com/gpu\"})) by (namespace)) or on(namespace) (sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) * 0) * 100)) >= bool 0) ) ) / clamp_min(sum( ( (sum by (namespace) (avg_over_time(DCGM_FI_DEV_GPU_UTIL{namespace!=\"\",pod!=\"\"}[1h]))) or on(namespace) (sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) * 0) ) and on(namespace) ( (topk(10, ( sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) ) + (sum(container_memory_working_set_bytes{namespace!=\"\",pod!=\"\",container!=\"\"}) by (namespace) / 1e9) + ((sum((kube_pod_container_resource_requests{namespace!=\"\",resource=\"nvidia.com/gpu\"} or kube_pod_container_resource_limits{namespace!=\"\",resource=\"nvidia.com/gpu\"})) by (namespace)) or on(namespace) (sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) * 0) * 100)) >= bool 0) ) ), 1)",
+          "expr": "100 * ( ( (sum by (namespace) (max_over_time(DCGM_FI_DEV_GPU_UTIL{namespace!=\"\",pod!=\"\"}[$__range]))) or on(namespace) (sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) * 0) ) and on(namespace) ( (topk(10, ( sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) ) + (sum(container_memory_working_set_bytes{namespace!=\"\",pod!=\"\",container!=\"\"}) by (namespace) / 1e9) + ((sum((kube_pod_container_resource_requests{namespace!=\"\",resource=\"nvidia.com/gpu\"} or kube_pod_container_resource_limits{namespace!=\"\",resource=\"nvidia.com/gpu\"})) by (namespace)) or on(namespace) (sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) * 0) * 100)) >= bool 0) ) ) / clamp_min(sum( ( (sum by (namespace) (max_over_time(DCGM_FI_DEV_GPU_UTIL{namespace!=\"\",pod!=\"\"}[$__range]))) or on(namespace) (sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) * 0) ) and on(namespace) ( (topk(10, ( sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) ) + (sum(container_memory_working_set_bytes{namespace!=\"\",pod!=\"\",container!=\"\"}) by (namespace) / 1e9) + ((sum((kube_pod_container_resource_requests{namespace!=\"\",resource=\"nvidia.com/gpu\"} or kube_pod_container_resource_limits{namespace!=\"\",resource=\"nvidia.com/gpu\"})) by (namespace)) or on(namespace) (sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) * 0) * 100)) >= bool 0) ) ), 1)",
           "refId": "A",
           "legendFormat": "{{namespace}}"
         }
diff --git a/services/monitoring/dashboards/atlas-overview.json b/services/monitoring/dashboards/atlas-overview.json
index 9eda81d..beb676e 100644
--- a/services/monitoring/dashboards/atlas-overview.json
+++ b/services/monitoring/dashboards/atlas-overview.json
@@ -975,7 +975,7 @@
       },
       "targets": [
         {
-          "expr": "100 * ( ( (sum by (namespace) (avg_over_time(DCGM_FI_DEV_GPU_UTIL{namespace!=\"\",pod!=\"\"}[1h]))) or on(namespace) (sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) * 0) ) and on(namespace) ( (topk(10, ( sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) ) + (sum(container_memory_working_set_bytes{namespace!=\"\",pod!=\"\",container!=\"\"}) by (namespace) / 1e9) + ((sum((kube_pod_container_resource_requests{namespace!=\"\",resource=\"nvidia.com/gpu\"} or kube_pod_container_resource_limits{namespace!=\"\",resource=\"nvidia.com/gpu\"})) by (namespace)) or on(namespace) (sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) * 0) * 100)) >= bool 0) ) ) / clamp_min(sum( ( (sum by (namespace) (avg_over_time(DCGM_FI_DEV_GPU_UTIL{namespace!=\"\",pod!=\"\"}[1h]))) or on(namespace) (sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) * 0) ) and on(namespace) ( (topk(10, ( sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) ) + (sum(container_memory_working_set_bytes{namespace!=\"\",pod!=\"\",container!=\"\"}) by (namespace) / 1e9) + ((sum((kube_pod_container_resource_requests{namespace!=\"\",resource=\"nvidia.com/gpu\"} or kube_pod_container_resource_limits{namespace!=\"\",resource=\"nvidia.com/gpu\"})) by (namespace)) or on(namespace) (sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) * 0) * 100)) >= bool 0) ) ), 1)",
+          "expr": "100 * ( ( (sum by (namespace) (max_over_time(DCGM_FI_DEV_GPU_UTIL{namespace!=\"\",pod!=\"\"}[$__range]))) or on(namespace) (sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) * 0) ) and on(namespace) ( (topk(10, ( sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) ) + (sum(container_memory_working_set_bytes{namespace!=\"\",pod!=\"\",container!=\"\"}) by (namespace) / 1e9) + ((sum((kube_pod_container_resource_requests{namespace!=\"\",resource=\"nvidia.com/gpu\"} or kube_pod_container_resource_limits{namespace!=\"\",resource=\"nvidia.com/gpu\"})) by (namespace)) or on(namespace) (sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) * 0) * 100)) >= bool 0) ) ) / clamp_min(sum( ( (sum by (namespace) (max_over_time(DCGM_FI_DEV_GPU_UTIL{namespace!=\"\",pod!=\"\"}[$__range]))) or on(namespace) (sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) * 0) ) and on(namespace) ( (topk(10, ( sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) ) + (sum(container_memory_working_set_bytes{namespace!=\"\",pod!=\"\",container!=\"\"}) by (namespace) / 1e9) + ((sum((kube_pod_container_resource_requests{namespace!=\"\",resource=\"nvidia.com/gpu\"} or kube_pod_container_resource_limits{namespace!=\"\",resource=\"nvidia.com/gpu\"})) by (namespace)) or on(namespace) (sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) * 0) * 100)) >= bool 0) ) ), 1)",
           "refId": "A",
           "legendFormat": "{{namespace}}"
         }
diff --git a/services/monitoring/grafana-dashboard-gpu.yaml b/services/monitoring/grafana-dashboard-gpu.yaml
index 3af8717..b5c2c18 100644
--- a/services/monitoring/grafana-dashboard-gpu.yaml
+++ b/services/monitoring/grafana-dashboard-gpu.yaml
@@ -29,7 +29,7 @@ data:
           },
           "targets": [
             {
-              "expr": "100 * ( ( (sum by (namespace) (avg_over_time(DCGM_FI_DEV_GPU_UTIL{namespace!=\"\",pod!=\"\"}[1h]))) or on(namespace) (sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) * 0) ) and on(namespace) ( (topk(10, ( sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) ) + (sum(container_memory_working_set_bytes{namespace!=\"\",pod!=\"\",container!=\"\"}) by (namespace) / 1e9) + ((sum((kube_pod_container_resource_requests{namespace!=\"\",resource=\"nvidia.com/gpu\"} or kube_pod_container_resource_limits{namespace!=\"\",resource=\"nvidia.com/gpu\"})) by (namespace)) or on(namespace) (sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) * 0) * 100)) >= bool 0) ) ) / clamp_min(sum( ( (sum by (namespace) (avg_over_time(DCGM_FI_DEV_GPU_UTIL{namespace!=\"\",pod!=\"\"}[1h]))) or on(namespace) (sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) * 0) ) and on(namespace) ( (topk(10, ( sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) ) + (sum(container_memory_working_set_bytes{namespace!=\"\",pod!=\"\",container!=\"\"}) by (namespace) / 1e9) + ((sum((kube_pod_container_resource_requests{namespace!=\"\",resource=\"nvidia.com/gpu\"} or kube_pod_container_resource_limits{namespace!=\"\",resource=\"nvidia.com/gpu\"})) by (namespace)) or on(namespace) (sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) * 0) * 100)) >= bool 0) ) ), 1)",
+              "expr": "100 * ( ( (sum by (namespace) (max_over_time(DCGM_FI_DEV_GPU_UTIL{namespace!=\"\",pod!=\"\"}[$__range]))) or on(namespace) (sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) * 0) ) and on(namespace) ( (topk(10, ( sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) ) + (sum(container_memory_working_set_bytes{namespace!=\"\",pod!=\"\",container!=\"\"}) by (namespace) / 1e9) + ((sum((kube_pod_container_resource_requests{namespace!=\"\",resource=\"nvidia.com/gpu\"} or kube_pod_container_resource_limits{namespace!=\"\",resource=\"nvidia.com/gpu\"})) by (namespace)) or on(namespace) (sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) * 0) * 100)) >= bool 0) ) ) / clamp_min(sum( ( (sum by (namespace) (max_over_time(DCGM_FI_DEV_GPU_UTIL{namespace!=\"\",pod!=\"\"}[$__range]))) or on(namespace) (sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) * 0) ) and on(namespace) ( (topk(10, ( sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) ) + (sum(container_memory_working_set_bytes{namespace!=\"\",pod!=\"\",container!=\"\"}) by (namespace) / 1e9) + ((sum((kube_pod_container_resource_requests{namespace!=\"\",resource=\"nvidia.com/gpu\"} or kube_pod_container_resource_limits{namespace!=\"\",resource=\"nvidia.com/gpu\"})) by (namespace)) or on(namespace) (sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) * 0) * 100)) >= bool 0) ) ), 1)",
               "refId": "A",
               "legendFormat": "{{namespace}}"
             }
diff --git a/services/monitoring/grafana-dashboard-overview.yaml b/services/monitoring/grafana-dashboard-overview.yaml
index 928098e..ef17ebf 100644
--- a/services/monitoring/grafana-dashboard-overview.yaml
+++ b/services/monitoring/grafana-dashboard-overview.yaml
@@ -984,7 +984,7 @@ data:
           },
           "targets": [
             {
-              "expr": "100 * ( ( (sum by (namespace) (avg_over_time(DCGM_FI_DEV_GPU_UTIL{namespace!=\"\",pod!=\"\"}[1h]))) or on(namespace) (sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) * 0) ) and on(namespace) ( (topk(10, ( sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) ) + (sum(container_memory_working_set_bytes{namespace!=\"\",pod!=\"\",container!=\"\"}) by (namespace) / 1e9) + ((sum((kube_pod_container_resource_requests{namespace!=\"\",resource=\"nvidia.com/gpu\"} or kube_pod_container_resource_limits{namespace!=\"\",resource=\"nvidia.com/gpu\"})) by (namespace)) or on(namespace) (sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) * 0) * 100)) >= bool 0) ) ) / clamp_min(sum( ( (sum by (namespace) (avg_over_time(DCGM_FI_DEV_GPU_UTIL{namespace!=\"\",pod!=\"\"}[1h]))) or on(namespace) (sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) * 0) ) and on(namespace) ( (topk(10, ( sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) ) + (sum(container_memory_working_set_bytes{namespace!=\"\",pod!=\"\",container!=\"\"}) by (namespace) / 1e9) + ((sum((kube_pod_container_resource_requests{namespace!=\"\",resource=\"nvidia.com/gpu\"} or kube_pod_container_resource_limits{namespace!=\"\",resource=\"nvidia.com/gpu\"})) by (namespace)) or on(namespace) (sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) * 0) * 100)) >= bool 0) ) ), 1)",
+              "expr": "100 * ( ( (sum by (namespace) (max_over_time(DCGM_FI_DEV_GPU_UTIL{namespace!=\"\",pod!=\"\"}[$__range]))) or on(namespace) (sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) * 0) ) and on(namespace) ( (topk(10, ( sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) ) + (sum(container_memory_working_set_bytes{namespace!=\"\",pod!=\"\",container!=\"\"}) by (namespace) / 1e9) + ((sum((kube_pod_container_resource_requests{namespace!=\"\",resource=\"nvidia.com/gpu\"} or kube_pod_container_resource_limits{namespace!=\"\",resource=\"nvidia.com/gpu\"})) by (namespace)) or on(namespace) (sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) * 0) * 100)) >= bool 0) ) ) / clamp_min(sum( ( (sum by (namespace) (max_over_time(DCGM_FI_DEV_GPU_UTIL{namespace!=\"\",pod!=\"\"}[$__range]))) or on(namespace) (sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) * 0) ) and on(namespace) ( (topk(10, ( sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) ) + (sum(container_memory_working_set_bytes{namespace!=\"\",pod!=\"\",container!=\"\"}) by (namespace) / 1e9) + ((sum((kube_pod_container_resource_requests{namespace!=\"\",resource=\"nvidia.com/gpu\"} or kube_pod_container_resource_limits{namespace!=\"\",resource=\"nvidia.com/gpu\"})) by (namespace)) or on(namespace) (sum(rate(container_cpu_usage_seconds_total{namespace!=\"\",pod!=\"\",container!=\"\"}[5m])) by (namespace) * 0) * 100)) >= bool 0) ) ), 1)",
              "refId": "A",
              "legendFormat": "{{namespace}}"
            }
diff --git a/services/monitoring/helmrelease.yaml b/services/monitoring/helmrelease.yaml
index 2546dc1..d7d7579 100644
--- a/services/monitoring/helmrelease.yaml
+++ b/services/monitoring/helmrelease.yaml
@@ -249,9 +249,27 @@ spec:
       service:
         type: ClusterIP
       env:
-        GF_AUTH_ANONYMOUS_ENABLED: "true"
-        GF_AUTH_ANONYMOUS_ORG_ROLE: Viewer
+        GF_AUTH_ANONYMOUS_ENABLED: "false"
         GF_SECURITY_ALLOW_EMBEDDING: "true"
+        GF_AUTH_GENERIC_OAUTH_ENABLED: "true"
+        GF_AUTH_GENERIC_OAUTH_NAME: "Keycloak"
+        GF_AUTH_GENERIC_OAUTH_ALLOW_SIGN_UP: "true"
+        GF_AUTH_GENERIC_OAUTH_SCOPES: "openid profile email groups"
+        GF_AUTH_GENERIC_OAUTH_AUTH_URL: "https://sso.bstein.dev/realms/atlas/protocol/openid-connect/auth"
+        GF_AUTH_GENERIC_OAUTH_TOKEN_URL: "https://sso.bstein.dev/realms/atlas/protocol/openid-connect/token"
+        GF_AUTH_GENERIC_OAUTH_API_URL: "https://sso.bstein.dev/realms/atlas/protocol/openid-connect/userinfo"
+        GF_AUTH_GENERIC_OAUTH_ROLE_ATTRIBUTE_PATH: "contains(groups, 'admin') && 'Admin' || 'Viewer'"
+        GF_AUTH_GENERIC_OAUTH_TLS_SKIP_VERIFY_INSECURE: "false"
+        GF_AUTH_SIGNOUT_REDIRECT_URL:
"https://sso.bstein.dev/realms/atlas/protocol/openid-connect/logout?redirect_uri=https://metrics.bstein.dev/" + envValueFrom: + GF_AUTH_GENERIC_OAUTH_CLIENT_ID: + secretKeyRef: + name: grafana-oidc + key: client_id + GF_AUTH_GENERIC_OAUTH_CLIENT_SECRET: + secretKeyRef: + name: grafana-oidc + key: client_secret grafana.ini: server: domain: metrics.bstein.dev diff --git a/services/oauth2-proxy/deployment.yaml b/services/oauth2-proxy/deployment.yaml new file mode 100644 index 0000000..7c22a93 --- /dev/null +++ b/services/oauth2-proxy/deployment.yaml @@ -0,0 +1,82 @@ +# services/oauth2-proxy/deployment.yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: oauth2-proxy + namespace: sso + labels: + app: oauth2-proxy +spec: + replicas: 2 + selector: + matchLabels: + app: oauth2-proxy + template: + metadata: + labels: + app: oauth2-proxy + spec: + nodeSelector: + node-role.kubernetes.io/worker: "true" + affinity: + nodeAffinity: + preferredDuringSchedulingIgnoredDuringExecution: + - weight: 90 + preference: + matchExpressions: + - key: hardware + operator: In + values: ["rpi5","rpi4"] + containers: + - name: oauth2-proxy + image: quay.io/oauth2-proxy/oauth2-proxy:v7.6.0 + imagePullPolicy: IfNotPresent + args: + - --provider=oidc + - --redirect-url=https://auth.bstein.dev/oauth2/callback + - --oidc-issuer-url=https://sso.bstein.dev/realms/atlas + - --scope=openid profile email groups + - --email-domain=* + - --set-xauthrequest=true + - --pass-access-token=true + - --set-authorization-header=true + - --cookie-secure=true + - --cookie-samesite=lax + - --cookie-refresh=20m + - --cookie-expire=168h + - --upstream=static://200 + - --http-address=0.0.0.0:4180 + - --skip-provider-button=true + - --skip-jwt-bearer-tokens=true + - --oidc-groups-claim=groups + env: + - name: OAUTH2_PROXY_CLIENT_ID + valueFrom: + secretKeyRef: + name: oauth2-proxy-oidc + key: client_id + - name: OAUTH2_PROXY_CLIENT_SECRET + valueFrom: + secretKeyRef: + name: oauth2-proxy-oidc + key: 
client_secret + - name: OAUTH2_PROXY_COOKIE_SECRET + valueFrom: + secretKeyRef: + name: oauth2-proxy-oidc + key: cookie_secret + ports: + - containerPort: 4180 + name: http + readinessProbe: + httpGet: + path: /ping + port: 4180 + initialDelaySeconds: 5 + periodSeconds: 10 + livenessProbe: + httpGet: + path: /ping + port: 4180 + initialDelaySeconds: 20 + periodSeconds: 20 diff --git a/services/oauth2-proxy/ingress.yaml b/services/oauth2-proxy/ingress.yaml new file mode 100644 index 0000000..0f5830c --- /dev/null +++ b/services/oauth2-proxy/ingress.yaml @@ -0,0 +1,25 @@ +# services/oauth2-proxy/ingress.yaml +apiVersion: networking.k8s.io/v1 +kind: Ingress +metadata: + name: oauth2-proxy + namespace: sso + annotations: + cert-manager.io/cluster-issuer: letsencrypt + traefik.ingress.kubernetes.io/router.middlewares: sso-oauth2-proxy-errors@kubernetescrd +spec: + ingressClassName: traefik + rules: + - host: auth.bstein.dev + http: + paths: + - path: / + pathType: Prefix + backend: + service: + name: oauth2-proxy + port: + number: 80 + tls: + - hosts: [auth.bstein.dev] + secretName: auth-tls diff --git a/services/oauth2-proxy/kustomization.yaml b/services/oauth2-proxy/kustomization.yaml new file mode 100644 index 0000000..ff4705a --- /dev/null +++ b/services/oauth2-proxy/kustomization.yaml @@ -0,0 +1,10 @@ +# services/oauth2-proxy/kustomization.yaml +apiVersion: kustomize.config.k8s.io/v1beta1 +kind: Kustomization +namespace: sso +resources: + - deployment.yaml + - service.yaml + - ingress.yaml + - middleware.yaml + - middleware-errors.yaml diff --git a/services/oauth2-proxy/middleware-errors.yaml b/services/oauth2-proxy/middleware-errors.yaml new file mode 100644 index 0000000..55e092a --- /dev/null +++ b/services/oauth2-proxy/middleware-errors.yaml @@ -0,0 +1,15 @@ +# services/oauth2-proxy/middleware-errors.yaml +apiVersion: traefik.io/v1alpha1 +kind: Middleware +metadata: + name: oauth2-proxy-errors + namespace: sso +spec: + errors: + status: + - "401" + - "403" + 
service: + name: oauth2-proxy + port: 80 + query: /oauth2/start?rd={url} diff --git a/services/oauth2-proxy/middleware.yaml b/services/oauth2-proxy/middleware.yaml new file mode 100644 index 0000000..db5f3a4 --- /dev/null +++ b/services/oauth2-proxy/middleware.yaml @@ -0,0 +1,15 @@ +# services/oauth2-proxy/middleware.yaml +apiVersion: traefik.io/v1alpha1 +kind: Middleware +metadata: + name: oauth2-proxy-forward-auth + namespace: sso +spec: + forwardAuth: + address: http://oauth2-proxy.sso.svc.cluster.local:4180/oauth2/auth + trustForwardHeader: true + authResponseHeaders: + - Authorization + - X-Auth-Request-Email + - X-Auth-Request-User + - X-Auth-Request-Groups diff --git a/services/oauth2-proxy/service.yaml b/services/oauth2-proxy/service.yaml new file mode 100644 index 0000000..1eb5481 --- /dev/null +++ b/services/oauth2-proxy/service.yaml @@ -0,0 +1,15 @@ +# services/oauth2-proxy/service.yaml +apiVersion: v1 +kind: Service +metadata: + name: oauth2-proxy + namespace: sso + labels: + app: oauth2-proxy +spec: + selector: + app: oauth2-proxy + ports: + - name: http + port: 80 + targetPort: 4180 diff --git a/services/vault/ingress.yaml b/services/vault/ingress.yaml index 306556d..91d9ca4 100644 --- a/services/vault/ingress.yaml +++ b/services/vault/ingress.yaml @@ -7,7 +7,6 @@ metadata: annotations: kubernetes.io/ingress.class: traefik traefik.ingress.kubernetes.io/router.entrypoints: websecure - traefik.ingress.kubernetes.io/router.middlewares: vault-vault-basicauth@kubernetescrd traefik.ingress.kubernetes.io/service.serversscheme: https traefik.ingress.kubernetes.io/service.serversTransport: vault-vault-to-https@kubernetescrd spec: diff --git a/services/vault/kustomization.yaml b/services/vault/kustomization.yaml index 4c3fbc5..1d7af87 100644 --- a/services/vault/kustomization.yaml +++ b/services/vault/kustomization.yaml @@ -7,5 +7,4 @@ resources: - helmrelease.yaml - certificate.yaml - ingress.yaml - - middleware.yaml - serverstransport.yaml diff --git 
a/services/vault/middleware.yaml b/services/vault/middleware.yaml deleted file mode 100644 index 0a41961..0000000 --- a/services/vault/middleware.yaml +++ /dev/null @@ -1,9 +0,0 @@ -# services/vault/middleware.yaml -apiVersion: traefik.io/v1alpha1 -kind: Middleware -metadata: - name: vault-basicauth - namespace: vault -spec: - basicAuth: - secret: vault-basic-auth
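Note for reviewers: the diff above defines the `oauth2-proxy-forward-auth` middleware but no ingress in this change set references it yet. A protected service would attach it the same way the error middleware is attached here, via Traefik's `<namespace>-<name>@kubernetescrd` annotation convention. The sketch below is illustrative only; the host `example.bstein.dev`, the ingress/service names, and the port are hypothetical, not part of this diff.

```yaml
# Hypothetical sketch: protecting an app ingress with the new forward-auth
# middleware. Traefik calls oauth2-proxy's /oauth2/auth endpoint before
# routing, and the errors middleware redirects 401/403 to /oauth2/start.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-app            # assumption: placeholder name
  namespace: example           # assumption: placeholder namespace
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt
    traefik.ingress.kubernetes.io/router.middlewares: >-
      sso-oauth2-proxy-forward-auth@kubernetescrd,sso-oauth2-proxy-errors@kubernetescrd
spec:
  ingressClassName: traefik
  rules:
    - host: example.bstein.dev  # assumption: placeholder host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-app
                port:
                  number: 80
```

The middleware names (`oauth2-proxy-forward-auth`, `oauth2-proxy-errors`) and the `sso` namespace come from the manifests in this diff; everything else is a stand-in.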