| 0db149605d | monitoring: show GPU share over dashboard range | 2025-12-02 20:28:35 -03:00 |
| 2db550afdd | keycloak: add raw manifests backed by shared postgres | 2025-12-02 17:58:19 -03:00 |
| 6eba26b359 | monitoring: show top12 root disks | 2025-12-02 15:21:02 -03:00 |
| ace383bedd | monitoring: expand worker/control/root rows | 2025-12-02 15:15:21 -03:00 |
| b93636ecb9 | monitoring: shrink hottest node row height | 2025-12-02 15:12:16 -03:00 |
| 5df94a7937 | monitoring: fix gpu share query and root bar labels | 2025-12-02 14:56:36 -03:00 |
| a3dc9391ee | monitoring: polish dashboards and folders | 2025-12-02 14:41:39 -03:00 |
| eed67b3db0 | monitoring: regen dashboards with gpu details | 2025-12-02 13:16:00 -03:00 |
| f1d0970aa0 | monitoring: mirror dcgm-exporter as multi-arch | 2025-12-02 12:36:24 -03:00 |
| e26ef44d1a | monitoring: run dcgm-exporter with nvidia runtime | 2025-12-02 12:25:30 -03:00 |
| a18c3e6f67 | monitoring: always pull dcgm-exporter tag | 2025-12-02 12:19:16 -03:00 |
| ee923df567 | monitoring: add registry pull secret for dcgm-exporter | 2025-12-02 12:07:11 -03:00 |
| d87a1dbc47 | monitoring: allow dcgm rollout with unavailable node | 2025-12-02 11:59:55 -03:00 |
| 5b89b0533e | monitoring: use mirrored dcgm-exporter tag | 2025-12-02 11:54:53 -03:00 |
| d99bb06eeb | monitoring: reenable dcgm exporter | 2025-11-20 13:11:13 -03:00 |
| e4f93e85d2 | monitoring: control-plane stat and namespace share tweaks | 2025-11-18 17:09:13 -03:00 |
| f06be37f44 | monitoring: refine network metrics and control-plane allowance | 2025-11-18 16:18:52 -03:00 |
| c7b7bc7a6d | monitoring: adjust overview spacing and net panels | 2025-11-18 15:55:24 -03:00 |
| 7b2a69cfe3 | monitoring: disable dcgm exporter | 2025-11-18 15:10:58 -03:00 |
| 46410c9a9d | monitoring: fix dcgm image | 2025-11-18 14:19:23 -03:00 |
| ff056551c7 | monitoring: refresh overview dashboards | 2025-11-18 14:08:33 -03:00 |
| 8e6c0a3cfe | monitoring: rework gpu share + gauges | 2025-11-18 12:11:47 -03:00 |
| 497164a1ad | monitoring: clean namespace gpu share and layout | 2025-11-18 11:42:24 -03:00 |
| fab5552039 | monitoring: resolve pie errors and network data | 2025-11-18 11:30:33 -03:00 |
| 7009a4f9ff | monitoring: fix namespace gpu share and network stats | 2025-11-18 11:12:03 -03:00 |
| d7e4bcd533 | monitoring: add gpu node fallback | 2025-11-18 10:47:24 -03:00 |
| ec76563a86 | monitoring: source gpu pie from limits and node nets | 2025-11-18 01:01:10 -03:00 |
| 5144bbe1f2 | monitoring: fix gpu pie data and network panels | 2025-11-18 00:31:51 -03:00 |
| ac62387e07 | monitoring: stabilize namespace pies and labels | 2025-11-18 00:19:45 -03:00 |
| 2ba642d49f | monitoring: add gpu pie and tidy net panels | 2025-11-18 00:11:39 -03:00 |
| beb3243839 | Revert GPU pie chart additions | 2025-11-17 23:42:55 -03:00 |
| aef3176c1c | monitoring: fix hottest stats and gpu share | 2025-11-17 23:40:22 -03:00 |
| f4dd1de43f | monitoring: reorder namespace pies and add gpu data | 2025-11-17 23:18:53 -03:00 |
| 0708522b28 | monitoring: add namespace gpu share | 2025-11-17 23:12:16 -03:00 |
| c53c518301 | monitoring: express namespace share as cluster percent | 2025-11-17 22:58:57 -03:00 |
| 442a89d327 | monitoring: fix pie colors & thresholds | 2025-11-17 22:39:50 -03:00 |
| 255e014e0a | monitoring: color namespace pies | 2025-11-17 22:36:50 -03:00 |
| cc62f497e9 | monitoring: fix namespace share percentages | 2025-11-17 22:19:01 -03:00 |
| 37e51b361b | monitoring: normalize namespace share | 2025-11-17 22:06:06 -03:00 |
| be6052c47c | monitoring: unify namespace share panels | 2025-11-17 21:57:40 -03:00 |
| b59677615c | monitoring: worker/control-plane splits | 2025-11-17 21:48:12 -03:00 |
| 76d3dc6ae2 | monitoring: restore top1 hottest stats | 2025-11-17 21:20:19 -03:00 |
| 53427cc8fa | monitoring: fix net/io legend labels | 2025-11-17 20:19:20 -03:00 |
| b8998a3c6a | monitoring: attach nodes to net/io stats | 2025-11-17 20:14:11 -03:00 |
| a67a6a1f3a | monitoring: tidy hottest node labels | 2025-11-17 20:04:50 -03:00 |
| b28e7501b7 | monitoring: show hottest node labels | 2025-11-17 20:00:40 -03:00 |
| 4aece7e5cb | monitoring: fix hottest node labels | 2025-11-17 19:56:57 -03:00 |
| bcaa0a3327 | monitoring: show hottest node names | 2025-11-17 19:53:39 -03:00 |
| 41e8a6a582 | monitoring: reorder overview stats | 2025-11-17 19:49:50 -03:00 |
| a1e731e929 | monitoring: fix hottest stats and titan-db scrape | 2025-11-17 19:38:40 -03:00 |