2780 Commits

Author SHA1 Message Date
flux-bot 9c4fcfffed chore(maintenance): automated image update 2026-05-24 17:14:17 +00:00
flux-bot 5100e37471 chore(bstein-dev-home): automated image update 2026-05-24 09:37:52 +00:00
flux-bot 39342e7910 chore(bstein-dev-home): automated image update 2026-05-24 09:36:52 +00:00
flux-bot c3d37fc203 chore(maintenance): automated image update 2026-05-23 17:14:51 +00:00
flux-bot 2859e0f0dd chore(bstein-dev-home): automated image update 2026-05-23 09:38:17 +00:00
flux-bot f62664c419 chore(bstein-dev-home): automated image update 2026-05-23 09:37:24 +00:00
flux-bot fae4f2bbcd chore(maintenance): automated image update 2026-05-23 01:40:16 +00:00
flux-bot 0e00181cb4 chore(maintenance): automated image update 2026-05-23 01:37:55 +00:00
flux-bot f50de4cd49 chore(maintenance): automated image update 2026-05-23 01:36:05 +00:00
jenkins cf8baafed1 maintenance: document node recovery guardrails 2026-05-22 17:21:59 -03:00
jenkins c7edc81239 maintenance: stabilize recovered worker nodes 2026-05-22 17:10:01 -03:00
jenkins 46c3e97688 maintenance: make titan-22 link keeper passive 2026-05-22 15:56:50 -03:00
jenkins 5bce6c4c04 openclaw: allow recovered workers while excluding hdd nodes 2026-05-22 15:33:28 -03:00
jenkins ee5688f297 maintenance: track titan-22 link recovery 2026-05-22 15:25:41 -03:00
flux-bot c54c7b4452 chore(maintenance): automated image update 2026-05-22 17:11:37 +00:00
jenkins 17dc9a6e52 scheduling: target hdd storage node exclusions 2026-05-22 14:02:17 -03:00
jenkins 155d7d020e scheduling: keep apps off longhorn storage nodes 2026-05-22 13:38:29 -03:00
jenkins f383818f93 nextcloud: keep collabora off descheduler 2026-05-22 06:57:01 -03:00
jenkins 1fe125b8b3 game-stream(wolf): expose runtime sockets to app containers 2026-05-22 05:37:52 -03:00
jenkins 361a4decb3 game-stream(wolf): retain failed app containers 2026-05-22 05:28:38 -03:00
jenkins 2aea5f4ace game-stream(wolf): use manual Nvidia driver mount 2026-05-22 05:12:41 -03:00
jenkins ce13ac054c game-stream(wolf): mount Nvidia driver volume 2026-05-22 05:08:57 -03:00
jenkins a19a19fbd5 maintenance(titan-24): avoid unnecessary Docker restarts 2026-05-22 05:07:40 -03:00
jenkins f1a72d64fd gpu(titan-24): populate Nvidia driver volume without exec 2026-05-22 05:05:02 -03:00
jenkins ac9c481ce7 gpu(titan-24): fix Nvidia driver volume bootstrap 2026-05-22 05:02:59 -03:00
jenkins 2ff55289a8 gpu(titan-24): prepare Wolf Nvidia driver volume 2026-05-22 04:59:52 -03:00
jenkins 2d8405d299 crypto: throttle mining during recovery 2026-05-22 04:26:29 -03:00
jenkins 5e27384ea2 monitoring(gpu): show activity share by namespace 2026-05-22 04:22:51 -03:00
flux-bot ec972a52f1 chore(bstein-dev-home): automated image update 2026-05-22 07:07:12 +00:00
flux-bot 10eed46e81 chore(bstein-dev-home): automated image update 2026-05-22 07:06:25 +00:00
jenkins d21b61f6d9 monitoring(gpu): count monitored GPU pool devices 2026-05-22 03:23:36 -03:00
jenkins b367c6dea3 monitoring: keep quality probe on worker nodes 2026-05-22 03:16:01 -03:00
jenkins 6388ef5c6d monitoring(gpu): add pool utilization counters 2026-05-22 03:09:10 -03:00
flux-bot 4ce5a67b94 chore(bstein-dev-home): automated image update 2026-05-22 06:08:50 +00:00
flux-bot 1375bac117 chore(bstein-dev-home): automated image update 2026-05-22 06:08:18 +00:00
jenkins 570b1212d7 monitoring(gpu): normalize utilization pie to pool capacity 2026-05-22 02:55:24 -03:00
jenkins ea21e106cf keycloak(portal): allow groups scope 2026-05-22 02:48:10 -03:00
jenkins 1c50af1d72 ci(data-prepper): avoid titan-04 during recovery 2026-05-22 02:37:21 -03:00
jenkins b5dc723e02 monitoring(gpu): hide zero-utilization namespaces 2026-05-22 02:35:51 -03:00
flux-bot 3f24fbdc6d chore(bstein-dev-home): automated image update 2026-05-22 05:33:39 +00:00
flux-bot 1cd9fd18f4 chore(bstein-dev-home): automated image update 2026-05-22 05:32:46 +00:00
flux-bot e7ad2c3955 chore(maintenance): automated image update 2026-05-22 05:28:56 +00:00
jenkins fd3da0e2ae monitoring(gpu): add process-level utilization attribution 2026-05-22 02:28:08 -03:00
jenkins 5513608b1a monitoring(gpu): remove ambiguous shared wording 2026-05-22 01:55:25 -03:00
jenkins 72e4dcd84b monitoring(gpu): attribute utilization to namespaces 2026-05-22 01:46:32 -03:00
jenkins 26af225f06 ci(data-prepper): allow recovered titan-04 agents 2026-05-22 01:40:44 -03:00
flux-bot e368927a0e chore(maintenance): automated image update 2026-05-22 01:50:35 +00:00
flux-bot 825a7a7f37 chore(maintenance): automated image update 2026-05-22 01:50:24 +00:00
flux-bot 0719b5317f chore(maintenance): automated image update 2026-05-22 01:47:23 +00:00
flux-bot f0ed508277 chore(maintenance): automated image update 2026-05-22 01:42:20 +00:00