15 Commits

SHA1 Message Date
84710b99e8 monitoring: add glue dashboard and tag cronjobs 2026-01-18 02:50:07 -03:00
fddf58346d monitoring: treat cert-manager as infrastructure 2026-01-12 00:26:46 -03:00
98d405bc42 monitoring: regenerate dashboards with expanded infra namespaces 2026-01-11 23:55:43 -03:00
879ff7c16b monitoring: fix infra scopes and add jetson metrics 2026-01-11 23:46:24 -03:00
05a888aeb6 monitoring(dashboards): tune namespace share metrics 2026-01-05 13:30:51 -03:00
ceea2539bc monitoring: per-panel namespace share filters 2026-01-01 14:44:33 -03:00
bcc1ceef6d monitoring: ensure gpu idle share renders 2026-01-01 14:21:43 -03:00
91de1c1d8d gpu: enable time-slicing and refresh dashboards 2026-01-01 14:16:08 -03:00
2baa537ec7 Use table format for namespace plurality panel 2025-12-13 18:23:19 -03:00
2e18a4e1c5 atlas dashboards: fix pod share display and zero/red stat thresholds 2025-12-12 20:40:32 -03:00
da8ed7a3b0 atlas dashboards: show pod counts (not %) and make zero-friendly stats 2025-12-12 20:30:00 -03:00
2906e3e5d9 monitoring: show GPU share over dashboard range 2025-12-02 20:28:35 -03:00
12fd5229dc monitoring: fix gpu share query and root bar labels 2025-12-02 14:56:36 -03:00
1963fadec1 monitoring: polish dashboards and folders 2025-12-02 14:41:39 -03:00
d23e2fe78c monitoring: regen dashboards with gpu details 2025-12-02 13:16:00 -03:00