init: scaffold metis design

Brad Stein 2026-01-10 20:52:37 -03:00
commit 26c0cced39
2 changed files with 40 additions and 0 deletions

README.md Normal file

@@ -0,0 +1,28 @@
# Metis
Metis produces fully configured recovery SD cards for any node in the lab (RPi 4/5 workers, control plane Pis, amd64 nodes like tethys, titan-db, titan-jh, future titan-20/21, and non-cluster hosts). Goal: run one command, insert the SD card, and the node rejoins with the same identity, network configuration, k3s role/labels/taints, and pre-baked log/GC drop-ins.
## Objectives
- Cross-platform (Linux + Windows) CLI/GUI with dead-simple UX.
- Pull class-specific golden images from Harbor (or other artifact store), inject per-node config, and write/verify SD cards.
- Minimal image set via node classes; inject per-node deltas at burn time.
- Idempotent bootstraps: hostname/IP, k3s server/agent setup, labels/taints, journald/log GC drop-ins, Longhorn mount validation, SSH keys/users.
- Works offline once artifacts are cached; verifies hashes/signatures before writing.
## Planned high-level workflow
1) Select target node (from inventory) + target disk.
2) Tool downloads/caches the right golden image for that node class.
3) Injects per-node config (net, k3s tokens/roles/labels/taints, SSH keys, runtime drop-ins, Longhorn mount metadata) and writes SD.
4) Verifies write; prints next-step: "insert and power on." No manual follow-up.
## Early design notes
- Implemented in Go for easy static builds and a lightweight GUI (e.g., Fyne or Wails) plus CLI.
- Inventory-driven: node classes (`rpi5-ubuntu-worker`, `rpi5-ubuntu-control`, `rpi4-armbian-longhorn`, `rpi4-armbian-worker`, `amd64-agent`, `external-hosts`), as listed in `docs/node-classes.md`.
- Extensible per-node hooks for special hardware (Longhorn HDD UUIDs on titan-13/15/17/19; future titan-20/21; oceanus/titan-23; tethys/titan-jh/titan-db).
- Secure defaults: downloaded images are hash-checked; secrets are never printed; k3s tokens/certs/keys are prepared from a sealed source.
## Repo layout (initial)
- `cmd/` CLI/GUI entrypoints
- `pkg/` shared lib (inventory, imaging, injectors, platform abstraction)
- `docs/` user/operator docs (kept light; working notes live in the untracked `AGENTS.md`)
- `AGENTS.md` local, untracked working notes (do not add to git)

docs/node-classes.md Normal file

@@ -0,0 +1,12 @@
# Node classes (draft)
Initial classes to minimize golden images while covering hardware/OS deltas:
- `rpi5-ubuntu-worker`: Ubuntu 24.04, k3s agent, hardware=rpi5 (titan-04..11, 0a/0c minus control-plane bits)
- `rpi5-ubuntu-control`: Ubuntu 24.04, k3s server (titan-0a/0b/0c specifics), control-plane taints, etcd snapshot hooks
- `rpi4-armbian-longhorn`: Armbian (kernel 6.6.x), k3s agent, hardware=rpi4 with Longhorn disks (titan-13/15/17/19; astreae/asteria mounts)
- `rpi4-armbian-worker`: Armbian (kernel 6.6.x), k3s agent, hardware=rpi4 without Longhorn disks (titan-12/14/18)
- `amd64-agent`: Debian 13, k3s agent with GPU/node labels (titan-22/24, avoid by preference)
- `external-hosts`: non-cluster (tethys, titan-db, titan-jh, oceanus/titan-23, future titan-20/21) per-host config over base image template
Per-node overlays capture hostname/IP, labels/taints, Longhorn UUID mounts, and drop-ins for logging/GC.
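
One way to read "class plus overlay" is a simple merge in which the per-node value wins over the class default. The sketch below shows that for labels only. `mergeLabels` and `renderLabels` are hypothetical helpers, and every label key other than `hardware` is invented for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// mergeLabels returns class defaults overridden by per-node values.
// Neither input map is modified.
func mergeLabels(class, node map[string]string) map[string]string {
	out := make(map[string]string, len(class)+len(node))
	for k, v := range class {
		out[k] = v
	}
	for k, v := range node {
		out[k] = v
	}
	return out
}

// renderLabels turns the merged map into sorted key=value strings, so the
// output is stable across runs (Go map iteration order is randomized).
func renderLabels(labels map[string]string) []string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	out := make([]string, 0, len(keys))
	for _, k := range keys {
		out = append(out, k+"="+labels[k])
	}
	return out
}

func main() {
	classDefaults := map[string]string{"hardware": "rpi4", "storage": "none"}
	nodeOverlay := map[string]string{"storage": "longhorn"} // invented key/value

	for _, kv := range renderLabels(mergeLabels(classDefaults, nodeOverlay)) {
		fmt.Println(kv)
	}
}
```

Sorting before rendering matters for idempotence: the same inventory should always produce byte-identical injected config, which makes re-burning a card safe to diff and verify.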