Titan rpi4 Longhorn Recovery

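This flow applies to the rpi4 Longhorn nodes titan-13, titan-15, titan-17, and titan-19. The commands below use titan-13 and titan-19 as examples; substitute the node being recovered.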

Why this works

  • The replacement card is burned from a plain Armbian rpi4 image.
  • Metis injects the original node identity, k3s config, SSH key, and Longhorn disk UUIDs.
  • The image also carries a static NetworkManager profile for the node IP plus local k3s and open-iscsi payloads sourced from a healthy rpi4 Longhorn node.
  • An Armbian first-boot hook finishes the host bootstrap automatically:
    • enables SSH on port 2277
    • mounts /mnt/astreae and /mnt/asteria
    • ensures the iSCSI initiator identity exists
    • starts open-iscsi
    • starts k3s-agent
  • For this Armbian flow, the files that matter for recovery live on the root partition; NoCloud files on the boot partition are optional.
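
The first-boot steps listed above can be spot-checked on the recovered node once it is reachable. The following is a minimal sketch and not part of Metis: the `run_check` and `check_first_boot` helpers are hypothetical, and the systemd unit names `open-iscsi` and `k3s-agent` and the initiator file `/etc/iscsi/initiatorname.iscsi` are assumptions based on stock open-iscsi and k3s packaging. Only the SSH port and the two mount points come from the list above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command quietly and report ok/FAIL with a label.
run_check() {
  local desc="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "ok: ${desc}"
  else
    echo "FAIL: ${desc}"
    return 1
  fi
}

# Hypothetical check, meant to run on the node itself after first boot.
check_first_boot() {
  local rc=0
  run_check "sshd listening on 2277" sh -c "ss -tln | grep -q ':2277 '" || rc=1
  run_check "/mnt/astreae mounted" mountpoint -q /mnt/astreae || rc=1
  run_check "/mnt/asteria mounted" mountpoint -q /mnt/asteria || rc=1
  # Assumed path: the default open-iscsi initiator name file.
  run_check "iSCSI initiator identity" test -s /etc/iscsi/initiatorname.iscsi || rc=1
  # Assumed unit names; adjust if the image uses different ones.
  run_check "open-iscsi active" systemctl is-active --quiet open-iscsi || rc=1
  run_check "k3s-agent active" systemctl is-active --quiet k3s-agent || rc=1
  return "${rc}"
}
```

Calling `check_first_boot` on the node prints one line per step and returns non-zero if any step is missing, which narrows down where the hook stopped.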

Before burning

For a same-name replacement, remove the old node object first so k3s can re-register the node cleanly.

kubectl delete node titan-13
kubectl delete node titan-19

Then export the live cluster join token and the location of the base Armbian image:

export METIS_K3S_TOKEN="$(ssh titan-0a 'sudo cat /var/lib/rancher/k3s/server/node-token')"
export METIS_IMAGE_RPI4_ARMBIAN_LONGHORN="file://${HOME}/Downloads/Armbian_25.8.1_Rpi4b_noble_current_6.12.41.img"
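
Because `export VAR="$(...)"` succeeds even when the command inside fails, a broken SSH session leaves the token silently empty. A small pre-flight check along these lines may help; `check_metis_env` is a hypothetical helper, not a Metis command, and it assumes the image variable uses a `file://` URL as in the export above.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: confirm both exports are usable before burning.
check_metis_env() {
  local image_path
  if [ -z "${METIS_K3S_TOKEN:-}" ]; then
    echo "METIS_K3S_TOKEN is empty (did the ssh command fail?)" >&2
    return 1
  fi
  # Strip the file:// scheme to get the local path of the base image.
  image_path="${METIS_IMAGE_RPI4_ARMBIAN_LONGHORN:-}"
  image_path="${image_path#file://}"
  if [ ! -f "${image_path}" ]; then
    echo "base image not found: ${image_path}" >&2
    return 1
  fi
  echo "ok: token set, image present at ${image_path}"
}
```

Running `check_metis_env` right after the two exports catches both problems before any card is written.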

Burn commands

Inspect the merged config first:

go run ./cmd/metis config --inventory inventory.titan-rpi4.yaml --node titan-13
go run ./cmd/metis config --inventory inventory.titan-rpi4.yaml --node titan-19

If you want ready-to-flash artifacts before inserting SD cards, build them first:

go run ./cmd/metis image \
  --inventory inventory.titan-rpi4.yaml \
  --node titan-13 \
  --cache "${HOME}/.cache/metis" \
  --output artifacts/titan-13.img

go run ./cmd/metis image \
  --inventory inventory.titan-rpi4.yaml \
  --node titan-19 \
  --cache "${HOME}/.cache/metis" \
  --output artifacts/titan-19.img
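
When several nodes need artifacts, the same invocation can be wrapped in a loop. This is a sketch that reuses only the flags shown above; the `build_images` function and the `METIS` override variable are hypothetical conveniences, not part of Metis.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: build one image per node name given as an argument.
# METIS may be set to override the command (for example to a prebuilt binary);
# it is left unquoted on purpose so the default splits into "go run ./cmd/metis".
build_images() {
  local node
  for node in "$@"; do
    ${METIS:-go run ./cmd/metis} image \
      --inventory inventory.titan-rpi4.yaml \
      --node "${node}" \
      --cache "${HOME}/.cache/metis" \
      --output "artifacts/${node}.img" || return 1
  done
}
```

For example, `build_images titan-15 titan-17` would build both images in sequence and stop at the first failure.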

Burn the cards, replacing /dev/sdX and /dev/sdY with the actual SD card devices:

sudo -E go run ./cmd/metis burn \
  --inventory inventory.titan-rpi4.yaml \
  --node titan-13 \
  --device /dev/sdX \
  --cache "${HOME}/.cache/metis" \
  --auto-mount \
  --yes

sudo -E go run ./cmd/metis burn \
  --inventory inventory.titan-rpi4.yaml \
  --node titan-19 \
  --device /dev/sdY \
  --cache "${HOME}/.cache/metis" \
  --auto-mount \
  --yes
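
Since the burn commands above overwrite whatever device they are pointed at, it may be worth confirming the target first. The sketch below uses `lsblk` from util-linux; `confirm_burn_target` is a hypothetical helper and does not replace any checks Metis may perform itself.

```shell
#!/usr/bin/env bash
# Hypothetical guard: refuse anything that is not a whole-disk block device.
confirm_burn_target() {
  local dev="$1"
  if [ ! -b "${dev}" ]; then
    echo "${dev} is not a block device" >&2
    return 1
  fi
  # TYPE is "disk" for a whole device and "part" for a partition.
  if [ "$(lsblk -dn -o TYPE "${dev}")" != "disk" ]; then
    echo "${dev} is not a whole disk" >&2
    return 1
  fi
  # Show size, model, and mount points so the card can be identified by eye.
  lsblk -o NAME,SIZE,MODEL,TRAN,RM,MOUNTPOINT "${dev}"
}
```

Running `confirm_burn_target` against the intended device before the burn prints the device tree; a size or model that does not match the SD card is a signal to stop.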

After boot

Because the hardware stays the same, the Pi should keep the same MAC address and reclaim the same DHCP reservation.

Validate:

kubectl get nodes | grep -E 'titan-13|titan-19'
kubectl -n longhorn-system get nodes.longhorn.io
kubectl -n longhorn-system get replicas.longhorn.io -o wide | grep -E 'titan-13|titan-19'
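
The node may take a few minutes to register after first boot, so the checks above can be combined with `kubectl wait`. This is a sketch under assumptions: `node_pattern` and `validate_nodes` are hypothetical helpers, and the 10-minute timeout is an arbitrary choice.

```shell
#!/usr/bin/env bash
# Join node names into an extended-regex alternation, e.g. "titan-13|titan-19".
node_pattern() {
  local IFS='|'
  printf '%s\n' "$*"
}

# Hypothetical wrapper: wait for each node to be Ready, then list its replicas.
validate_nodes() {
  local node pattern
  pattern="$(node_pattern "$@")"
  for node in "$@"; do
    kubectl wait --for=condition=Ready "node/${node}" --timeout=10m || return 1
  done
  kubectl get nodes | grep -E "${pattern}"
  kubectl -n longhorn-system get nodes.longhorn.io
  kubectl -n longhorn-system get replicas.longhorn.io -o wide | grep -E "${pattern}"
}
```

For example, `validate_nodes titan-13 titan-19` blocks until both nodes report Ready and then prints the same listings as the manual commands.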