| Commit | Message | Date |
| --- | --- | --- |
| a01dc0813a | maintenance(soteria): enable b2 usage scan config and alert | 2026-04-12 19:47:58 -03:00 |
| 609cfcb696 | monitoring: force horizontal stat layout for power/climate panels | 2026-04-12 19:04:35 -03:00 |
| a87a5f7bff | monitoring: fix typhon low-threshold alert semantics | 2026-04-12 14:56:34 -03:00 |
| a1c8a99866 | monitoring(alerts): watch soteria authz denial spikes | 2026-04-12 12:19:42 -03:00 |
| 7b3dfa335b | maintenance(soteria): harden ingress path and add backup alerts | 2026-04-12 12:12:43 -03:00 |
| 96f923ae4c | maintenance(soteria): add protected UI, OIDC bootstrap, and backup health panel wiring | 2026-04-12 11:16:29 -03:00 |
| f4e921bb33 | scheduling: keep app workloads off control-plane | 2026-04-12 04:26:52 -03:00 |
| 40de2b59a5 | maintenance: enforce Astraios + tmpfs /tmp on worker Pis | 2026-04-11 11:54:43 -03:00 |
| 64b4f14018 | ariadne: remove remaining cronjobs and migrate schedule ownership | 2026-04-10 22:40:58 -03:00 |
| 166020ca1d | ariadne: migrate glue cronjobs to schedules | 2026-04-10 21:22:35 -03:00 |
| 9419c4b26b | dashboards: unify suite pass-rate metrics on platform counters | 2026-04-10 15:35:20 -03:00 |
| 5f4641553c | monitoring: replace failure table with 24h suite pass snapshot | 2026-04-09 20:16:44 -03:00 |
| 530f440679 | monitoring: add suite probe metrics and align fan labels | 2026-04-09 20:10:52 -03:00 |
| 5e3aadc640 | monitoring: set overview platform test panel to 7d | 2026-04-09 20:05:10 -03:00 |
| 12b85f4597 | monitoring: add platform quality push gateway for test metrics | 2026-04-09 19:30:16 -03:00 |
| ad1cbd6f85 | monitoring: make test panel point-based and failure-by-suite | 2026-04-09 19:27:48 -03:00 |
| 5cf9a16d97 | monitoring: align overview panels with jobs and point-based suite rates | 2026-04-09 16:35:14 -03:00 |
| f8c1243dfd | monitoring: add generic suite metric slots for platform tests | 2026-04-09 16:16:35 -03:00 |
| 7b0e9acbb1 | monitoring: make suite pass rate 30d rolling for sparse tests | 2026-04-09 16:14:26 -03:00 |
| 0273727cb4 | monitoring: make platform test success one line per suite | 2026-04-09 15:21:59 -03:00 |
| 09fa3e716c | monitoring/atlas: merge top rows and fix platform test pass-rate panel | 2026-04-09 14:56:43 -03:00 |
| 293cd83999 | monitoring/atlas: resize test/ops rows and source overview tests from atlas-jobs | 2026-04-09 13:39:55 -03:00 |
| 764bfe189e | monitoring/recovery: harden ananke checks and OIDC-gated service validation | 2026-04-09 01:44:26 -03:00 |
| e0b124ca4e | monitoring: switch power telemetry to ananke metrics | 2026-04-08 23:33:17 -03:00 |
| 3ce7b2eeb7 | maintenance/monitoring: wire reciprocal metis hecate key + dampen alert flapping | 2026-04-05 13:51:57 -03:00 |
| 96bc93670b | monitoring(power): rename hecate UPS peers to Pyrphoros and Statera | 2026-04-04 05:54:16 -03:00 |
| 82e1b87b8f | monitoring(overview): refine ups-climate row and climate/fan stat display | 2026-04-04 04:40:22 -03:00 |
| 1b682cc60f | monitoring(grafana): restart to pick up latest overview layout | 2026-04-04 04:35:26 -03:00 |
| 5059d2918d | monitoring(overview): swap jobs and power rows; tighten climate/fan display | 2026-04-04 04:34:18 -03:00 |
| d5fc6c89c4 | monitoring(grafana): bump restart revision for overview dashboard reload | 2026-04-04 01:34:36 -03:00 |
| 55b96c0675 | monitoring(overview): place six power/climate panels on one row and fix test/job data | 2026-04-04 01:33:15 -03:00 |
| cdc3c081f5 | monitoring(overview): replace power/climate summary row with six-panel layout | 2026-04-03 22:16:02 -03:00 |
| 7ef4c895ba | monitoring(grafana): bump restart revision to reload provisioned dashboards | 2026-04-03 20:54:12 -03:00 |
| 69a02a3352 | monitoring(power): implement six-panel UPS and climate layout | 2026-04-03 20:45:40 -03:00 |
| 4167f0f988 | monitoring(power): add UPS status snapshot table and climate placeholders | 2026-04-03 17:53:42 -03:00 |
| fd71c6644b | monitoring(power): wire generated power dashboard and split per-UPS panels | 2026-04-03 17:49:09 -03:00 |
| 7ae4746d10 | monitoring: scope hecate power queries to hecate-power job | 2026-04-03 15:23:27 -03:00 |
| bc9bf0310a | monitoring: add power dashboard and reorder atlas overview rows | 2026-04-03 14:55:16 -03:00 |
| 5a577630df | platform: expose metis on sentinel and move gitea to rpi5 | 2026-03-31 16:44:41 -03:00 |
| 9a8030bf68 | maintenance: harden metis recovery and fix harbor rollout | 2026-03-31 14:55:48 -03:00 |
| 10ae47110a | monitoring: combine Ariadne and Metis tests | 2026-03-31 14:54:54 -03:00 |
| 03ae79df3e | maintenance: harden sd-write controls and recovery workflow | 2026-03-31 00:06:44 -03:00 |
| 3cf426e23a | monitoring: roll grafana to apply latest alert rules | 2026-03-30 18:41:26 -03:00 |
| 8006540645 | monitoring: raise rootfs warning threshold to 85 percent | 2026-03-30 18:41:05 -03:00 |
| 0aeb08d375 | monitoring: fix noisy grafana email alerts and reload rules | 2026-03-30 18:33:02 -03:00 |
| bc59270202 | chore: organize one-off jobs | 2026-01-28 01:48:32 -03:00 |
| 35d5d5a1a3 | monitoring: fix grafana alert exec state | 2026-01-27 23:34:11 -03:00 |
| a49fa6dd33 | monitoring: restart grafana for alerting reload | 2026-01-27 23:29:46 -03:00 |
| 9a978c5e72 | monitoring: tune cpu and maintenance alerts | 2026-01-27 23:23:42 -03:00 |
| 32884e0b7e | monitoring: fix grafana smtp from address | 2026-01-27 22:28:37 -03:00 |