From d0ed18817923b9a4ed7124c83dfc252615ec2db9 Mon Sep 17 00:00:00 2001
From: Brad Stein
Date: Sat, 13 Dec 2025 15:25:21 -0300
Subject: [PATCH] monitoring: drop README per convention

---
 services/monitoring/README.md | 28 ----------------------------
 1 file changed, 28 deletions(-)
 delete mode 100644 services/monitoring/README.md

diff --git a/services/monitoring/README.md b/services/monitoring/README.md
deleted file mode 100644
index 835ae1d..0000000
--- a/services/monitoring/README.md
+++ /dev/null
@@ -1,28 +0,0 @@
-# services/monitoring
-
-## Grafana admin secret
-
-The Grafana Helm release expects a pre-existing secret named `grafana-admin`
-in the `monitoring` namespace. Create or rotate it with:
-
-```bash
-kubectl create secret generic grafana-admin \
-  --namespace monitoring \
-  --from-literal=admin-user=admin \
-  --from-literal=admin-password='REPLACE_ME'
-```
-
-Update the password whenever you rotate credentials.
-
-## DCGM exporter image
-
-The NVIDIA GPU metrics DaemonSet expects `registry.bstein.dev/monitoring/dcgm-exporter:4.4.2-4.7.0-ubuntu22.04`, mirrored from `docker.io/nvidia/dcgm-exporter:4.4.2-4.7.0-ubuntu22.04`. Refresh it in Zot when bumping versions:
-
-```bash
-skopeo copy \
-  --all \
-  docker://docker.io/nvidia/dcgm-exporter:4.4.2-4.7.0-ubuntu22.04 \
-  docker://registry.bstein.dev/monitoring/dcgm-exporter:4.4.2-4.7.0-ubuntu22.04
-```
-
-When finished mirroring from the control-plane, you can remove temporary tooling with `sudo apt-get purge -y skopeo && sudo apt-get autoremove -y` and clear `~/.config/containers/auth.json`.