43 Commits

SHA1 Message Date
22cd934b15 Fix namespace plurality table query 2025-12-13 17:29:55 -03:00
f2ca30dcb1 atlas pods: plurality table v11 (deterministic top node) 2025-12-13 17:19:03 -03:00
c289924cb2 atlas pods: plurality table v10 2025-12-13 16:36:25 -03:00
e95cdd6b27 atlas pods: per-namespace top node via topk 2025-12-13 15:51:45 -03:00
b0389b219b atlas pods: simplify plurality table (no filter) 2025-12-13 15:29:08 -03:00
d367d0164f atlas pods: stabilize plurality query to avoid 422 2025-12-13 15:11:21 -03:00
4f08872205 atlas pods: show per-namespace top node without vars 2025-12-13 15:02:52 -03:00
e64beee718 atlas pods: drop non-leading nodes in plurality table 2025-12-13 13:39:06 -03:00
c76bef69f2 atlas pods: simplify plurality table query 2025-12-13 12:06:18 -03:00
ca42b32b9e atlas pods: fix plurality table query 2025-12-13 12:00:31 -03:00
789ace779f atlas pods: use prom share() for plurality table 2025-12-13 11:53:27 -03:00
c82bbf32f6 atlas pods: fix plurality query with bool max match 2025-12-13 11:51:18 -03:00
f19539eb25 atlas pods: robust per-namespace top-node share 2025-12-13 11:48:44 -03:00
996f008593 atlas pods: select per-namespace top node via max match 2025-12-13 04:15:03 -03:00
b049997959 atlas pods: sort plurality table by node then share 2025-12-13 04:10:10 -03:00
f9ccd292d6 atlas pods: simplify namespace plurality query 2025-12-13 04:06:46 -03:00
0d938ad758 atlas pods: fix namespace plurality query 2025-12-13 04:00:57 -03:00
e06a6826b7 atlas pods: add namespace plurality by node table 2025-12-13 03:57:20 -03:00
c9c13372a8 atlas overview: include titan-db in control plane panels 2025-12-12 21:55:53 -03:00
f884ce8146 atlas dashboards: align percent thresholds and disk bars 2025-12-12 21:13:31 -03:00
755a6926ab atlas overview: refine alert thresholds and availability colors 2025-12-12 20:50:41 -03:00
73deee09af atlas dashboards: use threshold colors for stats 2025-12-12 20:44:20 -03:00
2e18a4e1c5 atlas dashboards: fix pod share display and zero/red stat thresholds 2025-12-12 20:40:32 -03:00
da8ed7a3b0 atlas dashboards: show pod counts (not %) and make zero-friendly stats 2025-12-12 20:30:00 -03:00
ca1b2351c0 atlas dashboards: show pod counts with top12 bars 2025-12-12 20:20:13 -03:00
0a520e1d4b atlas dashboards: drop empty nodes and enforce top12 pod bars 2025-12-12 19:09:51 -03:00
1fefca3b3e atlas dashboards: cap pod count bars at top12 2025-12-12 18:56:13 -03:00
8ed23c673c atlas dashboards: sort pod counts and add pod row to overview 2025-12-12 18:51:43 -03:00
66f537185d atlas pods: add pod count bar and tidy pie 2025-12-12 18:45:29 -03:00
c093f98522 atlas dashboards: fix overview links and add pods-by-node pie 2025-12-12 18:32:45 -03:00
4a7822d6f0 atlas internal dashboards: add SLO/burn and api health panels 2025-12-12 18:00:43 -03:00
1a38bffdf3 atlas overview: fix availability scaling 2025-12-12 16:36:47 -03:00
92a7688a2f atlas overview: show availability percent with 3 decimals 2025-12-12 16:15:37 -03:00
72d4fd60d2 atlas overview: show availability percent and keep uptime centered 2025-12-12 16:11:28 -03:00
9320d809f4 atlas overview: center uptime and reorder top row 2025-12-12 15:56:33 -03:00
27f4e60f30 atlas overview: add uptime and crashloop panels 2025-12-12 15:23:51 -03:00
2906e3e5d9 monitoring: show GPU share over dashboard range 2025-12-02 20:28:35 -03:00
42b3ac0139 monitoring: show top12 root disks 2025-12-02 15:21:02 -03:00
e53ca4dd91 monitoring: expand worker/control/root rows 2025-12-02 15:15:21 -03:00
134e39d9a4 monitoring: shrink hottest node row height 2025-12-02 15:12:16 -03:00
12fd5229dc monitoring: fix gpu share query and root bar labels 2025-12-02 14:56:36 -03:00
1963fadec1 monitoring: polish dashboards and folders 2025-12-02 14:41:39 -03:00
d23e2fe78c monitoring: regen dashboards with gpu details 2025-12-02 13:16:00 -03:00