316 Commits

SHA1 Message Date
3a148c63e4 monitoring(overview): rebalance climate row widths for current/history panels 2026-04-12 22:57:25 -03:00
f17fa41207 monitoring(overview): restore single-panel cards and dynamic climate axes 2026-04-12 22:53:46 -03:00
51e35b8643 monitoring(overview): stack ups current card into draw/runtime rows 2026-04-12 22:25:34 -03:00
e53933ece7 monitoring(overview): stack climate stats into explicit 2x2 rows 2026-04-12 22:19:37 -03:00
4efd28c956 Revert "monitoring(overview): force horizontal stat cards for climate/ups wrap" (reverts commit 287c339aa0001c1daec161fd9fc73fbd4b267b48) 2026-04-12 22:14:59 -03:00
a1ab78b0c9 monitoring(grafana): mount and provision atlas-testing dashboard 2026-04-12 22:13:58 -03:00
287c339aa0 monitoring(overview): force horizontal stat cards for climate/ups wrap 2026-04-12 22:11:40 -03:00
dc1f1cbb7c monitoring(overview): split climate and ups stats into two-row query groups 2026-04-12 22:07:58 -03:00
4a10163b10 monitoring(overview): tune stat sizing for 2x2 climate/ups cards 2026-04-12 22:03:13 -03:00
f45217f98e monitoring(overview): simplify ups current card to draw/runtime 2026-04-12 21:36:42 -03:00
66da1b3aab monitoring(overview): shorten ups labels for readable stat rows 2026-04-12 21:32:48 -03:00
8d30fddd7d monitoring(overview): wrap ups and climate stats for narrow panels 2026-04-12 21:28:14 -03:00
a0f1149bbb monitoring(overview): restore readable two-row stats for ups and climate 2026-04-12 21:23:28 -03:00
d2672300a3 monitoring(jobs): switch cleanup stats to two-row layout 2026-04-12 20:38:52 -03:00
66bd705971 monitoring: tune stat text sizing for climate and ups 2026-04-12 20:30:17 -03:00
4b78e67036 monitoring: use wide stat layout for ups and climate cards 2026-04-12 20:23:38 -03:00
3a4bdbd42f monitoring: switch ups/climate/fan stats to vertical orientation 2026-04-12 20:12:17 -03:00
e222344cd9 monitoring(jobs): add schedule fallback series for cold starts 2026-04-12 20:09:43 -03:00
299a68ad95 monitoring(jobs): split testing dashboard and clean up job ops view 2026-04-12 20:06:03 -03:00
049a0deb04 maintenance(soteria): roll react ui image and wire b2 monitoring 2026-04-12 20:04:35 -03:00
7d3b12c774 monitoring: restore stat layout for ups/climate/fan rows 2026-04-12 19:56:12 -03:00
ac71b4621c monitoring: render ups/climate/fan panels as row tables 2026-04-12 19:46:39 -03:00
3271369e2d monitoring: set compact stat layout for climate and ups rows 2026-04-12 19:37:08 -03:00
931ee5944d monitoring: pack overview/power stats horizontally 2026-04-12 19:23:10 -03:00
08077f46c6 monitoring(atlas-power): force horizontal layout for stat rows 2026-04-12 19:06:07 -03:00
3096e0d7de monitoring(overview): tighten climate labels and drop duplicate temp line 2026-04-12 18:50:25 -03:00
6b0d6b017c monitoring(overview): tune climate row and restore ups card density 2026-04-12 18:35:15 -03:00
de3272e160 merge: atlas jobs ariadne schedule observability 2026-04-12 18:33:07 -03:00
cb27592272 monitoring(overview): reflow UPS/climate rows and add jenkins weather 2026-04-12 18:14:54 -03:00
f67ca30f94 monitoring(climate): add C/F history and dedupe typhon series 2026-04-12 17:56:54 -03:00
b6b1e533ed monitoring(jobs): add Ariadne schedule inventory signals 2026-04-12 17:29:27 -03:00
58ccbfb130 monitoring: add humidity and dew point to climate panels 2026-04-12 17:28:15 -03:00
a20fd995a1 monitoring: switch climate dashboards to typhon metrics 2026-04-12 17:20:05 -03:00
c325744540 monitoring(alerts): watch soteria authz denial spikes 2026-04-12 15:07:54 -03:00
241a405c05 maintenance(soteria): harden ingress path and add backup alerts 2026-04-12 15:07:54 -03:00
091e743d0e maintenance(soteria): add protected UI, OIDC bootstrap, and backup health panel wiring 2026-04-12 15:07:53 -03:00
3774b600ee scheduling: keep app workloads off control-plane 2026-04-12 04:27:43 -03:00
3ea296b552 maintenance: enforce Astraios + tmpfs /tmp on worker Pis 2026-04-11 11:55:49 -03:00
b723382ff4 dashboards: unify suite pass-rate metrics on platform counters 2026-04-10 16:39:55 -03:00
32b6e55467 monitoring: use CI-only series for platform test success panels 2026-04-10 04:52:57 -03:00
99eda351df monitoring/jenkins: add pegasus CI job and separate health probe suite 2026-04-10 03:26:51 -03:00
5f4641553c monitoring: replace failure table with 24h suite pass snapshot 2026-04-09 20:16:44 -03:00
530f440679 monitoring: add suite probe metrics and align fan labels 2026-04-09 20:10:52 -03:00
5e3aadc640 monitoring: set overview platform test panel to 7d 2026-04-09 20:05:10 -03:00
12b85f4597 monitoring: add platform quality push gateway for test metrics 2026-04-09 19:30:16 -03:00
ad1cbd6f85 monitoring: make test panel point-based and failure-by-suite 2026-04-09 19:27:48 -03:00
5cf9a16d97 monitoring: align overview panels with jobs and point-based suite rates 2026-04-09 16:35:14 -03:00
f8c1243dfd monitoring: add generic suite metric slots for platform tests 2026-04-09 16:16:35 -03:00
7b0e9acbb1 monitoring: make suite pass rate 30d rolling for sparse tests 2026-04-09 16:14:26 -03:00
0273727cb4 monitoring: make platform test success one line per suite 2026-04-09 15:21:59 -03:00