302 Commits

Author SHA1 Message Date
9f226c1584 Merge pull request 'feature/sso' (#4) from feature/sso into main
Reviewed-on: #4
2025-12-11 20:43:34 +00:00
319b515882 zot: restore main branch config 2025-12-11 17:26:15 -03:00
cb2b2ec1cd zot: revert to unauthenticated registry 2025-12-11 17:22:16 -03:00
20cd185c0b vault: drop traefik basicauth 2025-12-11 17:09:05 -03:00
2f368f6975 zot,vault: remove oauth2-proxy sso 2025-12-11 17:04:19 -03:00
6c62d42f7a longhorn/vault: gate via oauth2-proxy 2025-12-07 19:44:02 -03:00
a7e9f1f7d8 auth: remove error middleware to allow redirect 2025-12-07 13:19:45 -03:00
ceb692f7ee oauth2-proxy: drop groups scope to avoid invalid_scope 2025-12-07 13:09:29 -03:00
24fbaad040 auth: forward-auth via external auth host (svc traffic flaky) 2025-12-07 13:03:29 -03:00
04aa32a762 oauth2-proxy: schedule on worker rpis 2025-12-07 12:49:38 -03:00
25ee698021 oauth2-proxy: ensure error middleware on auth ingress 2025-12-07 12:03:14 -03:00
4a089876ba auth: use internal oauth2-proxy svc for forward-auth 2025-12-07 11:25:29 -03:00
20bb776625 auth: add 401 redirect middleware to oauth2-proxy 2025-12-07 11:14:25 -03:00
5e59f20bc3 auth: point forward-auth to external auth host 2025-12-07 11:09:09 -03:00
dbede55ad4 oauth2-proxy: temporarily drop group restriction 2025-12-07 10:42:13 -03:00
27e5c9391c auth: add namespace-local forward-auth middlewares 2025-12-07 10:25:44 -03:00
8d5e6c267c auth: wire oauth2-proxy and enable grafana oidc 2025-12-07 02:01:21 -03:00
a55502fe27 add oauth2-proxy for SSO forward-auth 2025-12-06 14:42:24 -03:00
598bdfc727 keycloak: restrict to worker rpis with titan-24 fallback 2025-12-06 01:44:23 -03:00
88c7a1c2aa keycloak: require rpi nodes with titan-24 fallback 2025-12-06 01:40:24 -03:00
f4da27271e keycloak: prefer rpi nodes, avoid titan-24 2025-12-06 01:36:33 -03:00
141c05b08f keycloak: honor xforwarded headers and hostname url 2025-12-06 01:23:07 -03:00
f0a8f6d35e keycloak: enable health/metrics management port 2025-12-06 00:51:47 -03:00
1b01052eda keycloak: set fsGroup for data volume 2025-12-06 00:49:17 -03:00
1d346edd28 keycloak: remove optimized flag for first start 2025-12-06 00:43:24 -03:00
b14a9dcb98 chore: drop AGENTS.md from repo 2025-12-06 00:43:17 -03:00
47caf08885 notes: capture GPU share change and flux branch 2025-12-03 12:28:45 -03:00
0db149605d monitoring: show GPU share over dashboard range 2025-12-02 20:28:35 -03:00
f64e60c5a2 flux: add keycloak kustomization 2025-12-02 18:10:20 -03:00
61c5db5c99 flux: track feature/sso 2025-12-02 18:00:49 -03:00
2db550afdd keycloak: add raw manifests backed by shared postgres 2025-12-02 17:58:19 -03:00
65d389193f Merge pull request 'feature/atlas-monitoring' (#3) from feature/atlas-monitoring into main
Reviewed-on: #3
2025-12-02 20:52:35 +00:00
e80505a773 notes: add postgres centralization guidance 2025-12-02 17:36:37 -03:00
762aa7bb0f notes: add sso plan sketch 2025-12-02 17:14:45 -03:00
839fb94836 notes: update monitoring and next steps 2025-12-02 17:01:32 -03:00
6eba26b359 monitoring: show top12 root disks 2025-12-02 15:21:02 -03:00
ace383bedd monitoring: expand worker/control/root rows 2025-12-02 15:15:21 -03:00
b93636ecb9 monitoring: shrink hottest node row height 2025-12-02 15:12:16 -03:00
5df94a7937 monitoring: fix gpu share query and root bar labels 2025-12-02 14:56:36 -03:00
a3dc9391ee monitoring: polish dashboards and folders 2025-12-02 14:41:39 -03:00
eed67b3db0 monitoring: regen dashboards with gpu details 2025-12-02 13:16:00 -03:00
f1d0970aa0 monitoring: mirror dcgm-exporter as multi-arch 2025-12-02 12:36:24 -03:00
e26ef44d1a monitoring: run dcgm-exporter with nvidia runtime 2025-12-02 12:25:30 -03:00
a18c3e6f67 monitoring: always pull dcgm-exporter tag 2025-12-02 12:19:16 -03:00
ee923df567 monitoring: add registry pull secret for dcgm-exporter 2025-12-02 12:07:11 -03:00
d87a1dbc47 monitoring: allow dcgm rollout with unavailable node 2025-12-02 11:59:55 -03:00
5b89b0533e monitoring: use mirrored dcgm-exporter tag 2025-12-02 11:54:53 -03:00
d99bb06eeb monitoring: reenable dcgm exporter 2025-11-20 13:11:13 -03:00
75f6a59316 traefik: use responding timeouts only 2025-11-18 20:01:16 -03:00
630f1f2a81 traefik: extend upload timeouts 2025-11-18 19:43:19 -03:00