238 Commits

SHA1 Message Date
ebfeb78e87 monitoring: fix gpu pie data and network panels 2025-11-18 00:31:51 -03:00
d5e1003de8 monitoring: stabilize namespace pies and labels 2025-11-18 00:19:45 -03:00
a411694bda monitoring: add gpu pie and tidy net panels 2025-11-18 00:11:39 -03:00
1df06f18f6 Revert GPU pie chart additions 2025-11-17 23:42:55 -03:00
9bd7effdee monitoring: fix hottest stats and gpu share 2025-11-17 23:40:22 -03:00
991d6defc4 monitoring: reorder namespace pies and add gpu data 2025-11-17 23:18:53 -03:00
43b9265cdf monitoring: add namespace gpu share 2025-11-17 23:12:16 -03:00
9233ba60fc monitoring: express namespace share as cluster percent 2025-11-17 22:58:57 -03:00
ccca363fb4 monitoring: fix pie colors & thresholds 2025-11-17 22:39:50 -03:00
f22c19bc5d monitoring: color namespace pies 2025-11-17 22:36:50 -03:00
0e9b293e95 monitoring: fix namespace share percentages 2025-11-17 22:19:01 -03:00
5a2cafb5db monitoring: normalize namespace share 2025-11-17 22:06:06 -03:00
5ce1493b3b monitoring: unify namespace share panels 2025-11-17 21:57:40 -03:00
c85c6b1bc3 monitoring: worker/control-plane splits 2025-11-17 21:48:12 -03:00
64059a08f5 monitoring: restore top1 hottest stats 2025-11-17 21:20:19 -03:00
2073ffe944 monitoring: fix net/io legend labels 2025-11-17 20:19:20 -03:00
a99e1ba227 monitoring: attach nodes to net/io stats 2025-11-17 20:14:11 -03:00
8d42f501e5 monitoring: tidy hottest node labels 2025-11-17 20:04:50 -03:00
7358f9e618 monitoring: show hottest node labels 2025-11-17 20:00:40 -03:00
831d1fe707 monitoring: fix hottest node labels 2025-11-17 19:56:57 -03:00
8c263b36b9 monitoring: show hottest node names 2025-11-17 19:53:39 -03:00
bf31272339 monitoring: reorder overview stats 2025-11-17 19:49:50 -03:00
a34e58d319 monitoring: fix hottest stats and titan-db scrape 2025-11-17 19:38:40 -03:00
6a60e4284a monitoring: tighten overview stats 2025-11-17 19:24:03 -03:00
0f7d0b7bac monitoring: polish dashboards 2025-11-17 18:55:11 -03:00
665dfa2e52 monitoring: rebuild atlas dashboards 2025-11-17 16:27:38 -03:00
5858a80c72 monitoring: restructure grafana dashboards 2025-11-17 14:22:46 -03:00
d844e068ec monitoring: enrich dashboards 2025-11-16 12:58:08 -03:00
77c3e260a3 monitoring: refresh grafana dashboards 2025-11-15 21:03:11 -03:00
2e6b9a47c8 dashboards: improve public view and fix color 2025-11-15 11:59:48 -03:00
48f9c6d715 grafana: set datasource uid 2025-11-15 11:35:27 -03:00
da82ebd469 grafana: use atlas metrics hostname 2025-11-15 11:18:40 -03:00
37b93de3e7 victoria-metrics: revert storageclass change 2025-11-15 11:16:37 -03:00
89c0fbfd44 monitoring: fix domain 2025-11-14 19:13:40 -03:00
cb402d0bb9 monitoring: fix ingress and env formats 2025-11-14 08:51:09 -03:00
597556d1c0 grafana: use string host format 2025-11-14 08:37:46 -03:00
f886e2b873 grafana: fix dashboard provider list 2025-11-14 08:33:53 -03:00
94f0cd939d monitoring: fix grafana values 2025-11-14 08:29:59 -03:00
bc757265cf monitoring: add grafana and alertmanager 2025-11-14 00:02:59 -03:00
4d3a4cd2b4 flux-system: track main branch 2025-11-12 01:06:26 -03:00
ac7863802a monitoring: disable wait on node-exporter 2025-11-09 14:03:14 -03:00
afb926439f core: disable wait to unblock reconciliation 2025-11-09 13:46:56 -03:00
ebf5a8aef9 core: remove gpu health gate 2025-11-09 13:37:59 -03:00
dca749cc04 gpu: drop runtimeClass from minipc plugin 2025-11-09 13:28:40 -03:00
65b3e3fbb8 monitoring: disable kube-state annotations 2025-11-09 13:20:50 -03:00
45ad2a2b06 monitoring: clean helm values 2025-11-09 13:16:21 -03:00
396acb818a monitoring: disable chart prometheusScrape 2025-11-09 13:11:40 -03:00
aae55a14f8 monitoring: annotate kube-state svc manually 2025-11-09 13:07:39 -03:00
8ac040a7d8 monitoring: drop duplicate annotations 2025-11-09 13:03:40 -03:00
79a17412af monitoring: reference prometheus repo 2025-11-09 12:59:03 -03:00