306 Commits

SHA1 Message Date
4a6aa907f6 vault: fix ingress tls annotation 2025-12-09 03:25:28 -03:00
1f5ae50989 zot: restore oauth2-proxy front; vault: point ingress to vault-ui 2025-12-09 02:34:16 -03:00
27214e7294 zot/vault: simplify to native OIDC and redirect to login 2025-12-09 02:26:01 -03:00
7c9fc9008a zot: route ingress directly to zot (native OIDC) 2025-12-09 02:08:22 -03:00
0a76fc3612 zot: use generic oidc provider key 2025-12-09 01:29:05 -03:00
cdbad50c02 zot: fix oidc config keys 2025-12-09 01:15:53 -03:00
ea4c04ba04 zot: fix oidc provider map shape 2025-12-08 23:36:19 -03:00
dba4d270ff sso: fix vault OIDC bootstrap and render zot oidc config 2025-12-08 23:23:21 -03:00
c8254d6eec longhorn/vault: zot oauth2-proxy integration 2025-12-07 20:28:45 -03:00
6c62d42f7a longhorn/vault: gate via oauth2-proxy 2025-12-07 19:44:02 -03:00
a7e9f1f7d8 auth: remove error middleware to allow redirect 2025-12-07 13:19:45 -03:00
ceb692f7ee oauth2-proxy: drop groups scope to avoid invalid_scope 2025-12-07 13:09:29 -03:00
24fbaad040 auth: forward-auth via external auth host (svc traffic flaky) 2025-12-07 13:03:29 -03:00
04aa32a762 oauth2-proxy: schedule on worker rpis 2025-12-07 12:49:38 -03:00
25ee698021 oauth2-proxy: ensure error middleware on auth ingress 2025-12-07 12:03:14 -03:00
4a089876ba auth: use internal oauth2-proxy svc for forward-auth 2025-12-07 11:25:29 -03:00
20bb776625 auth: add 401 redirect middleware to oauth2-proxy 2025-12-07 11:14:25 -03:00
5e59f20bc3 auth: point forward-auth to external auth host 2025-12-07 11:09:09 -03:00
dbede55ad4 oauth2-proxy: temporarily drop group restriction 2025-12-07 10:42:13 -03:00
27e5c9391c auth: add namespace-local forward-auth middlewares 2025-12-07 10:25:44 -03:00
8d5e6c267c auth: wire oauth2-proxy and enable grafana oidc 2025-12-07 02:01:21 -03:00
a55502fe27 add oauth2-proxy for SSO forward-auth 2025-12-06 14:42:24 -03:00
598bdfc727 keycloak: restrict to worker rpis with titan-24 fallback 2025-12-06 01:44:23 -03:00
88c7a1c2aa keycloak: require rpi nodes with titan-24 fallback 2025-12-06 01:40:24 -03:00
f4da27271e keycloak: prefer rpi nodes, avoid titan-24 2025-12-06 01:36:33 -03:00
141c05b08f keycloak: honor xforwarded headers and hostname url 2025-12-06 01:23:07 -03:00
f0a8f6d35e keycloak: enable health/metrics management port 2025-12-06 00:51:47 -03:00
1b01052eda keycloak: set fsGroup for data volume 2025-12-06 00:49:17 -03:00
1d346edd28 keycloak: remove optimized flag for first start 2025-12-06 00:43:24 -03:00
b14a9dcb98 chore: drop AGENTS.md from repo 2025-12-06 00:43:17 -03:00
47caf08885 notes: capture GPU share change and flux branch 2025-12-03 12:28:45 -03:00
0db149605d monitoring: show GPU share over dashboard range 2025-12-02 20:28:35 -03:00
f64e60c5a2 flux: add keycloak kustomization 2025-12-02 18:10:20 -03:00
61c5db5c99 flux: track feature/sso 2025-12-02 18:00:49 -03:00
2db550afdd keycloak: add raw manifests backed by shared postgres 2025-12-02 17:58:19 -03:00
65d389193f Merge pull request 'feature/atlas-monitoring' (#3) from feature/atlas-monitoring into main (Reviewed-on: #3) 2025-12-02 20:52:35 +00:00
e80505a773 notes: add postgres centralization guidance 2025-12-02 17:36:37 -03:00
762aa7bb0f notes: add sso plan sketch 2025-12-02 17:14:45 -03:00
839fb94836 notes: update monitoring and next steps 2025-12-02 17:01:32 -03:00
6eba26b359 monitoring: show top12 root disks 2025-12-02 15:21:02 -03:00
ace383bedd monitoring: expand worker/control/root rows 2025-12-02 15:15:21 -03:00
b93636ecb9 monitoring: shrink hottest node row height 2025-12-02 15:12:16 -03:00
5df94a7937 monitoring: fix gpu share query and root bar labels 2025-12-02 14:56:36 -03:00
a3dc9391ee monitoring: polish dashboards and folders 2025-12-02 14:41:39 -03:00
eed67b3db0 monitoring: regen dashboards with gpu details 2025-12-02 13:16:00 -03:00
f1d0970aa0 monitoring: mirror dcgm-exporter as multi-arch 2025-12-02 12:36:24 -03:00
e26ef44d1a monitoring: run dcgm-exporter with nvidia runtime 2025-12-02 12:25:30 -03:00
a18c3e6f67 monitoring: always pull dcgm-exporter tag 2025-12-02 12:19:16 -03:00
ee923df567 monitoring: add registry pull secret for dcgm-exporter 2025-12-02 12:07:11 -03:00
d87a1dbc47 monitoring: allow dcgm rollout with unavailable node 2025-12-02 11:59:55 -03:00