# Titan rpi4 Remote Replacement

This is the low-touch replacement flow for `titan-13` and `titan-19` when the person onsite can only:

1. insert an SD card into the flashing machine
2. swap the card into the Pi
3. power-cycle the Pi

The remote operator does everything else.

## What the image does by itself

After the stale Kubernetes node object is deleted and the replacement image is flashed, the booted Pi is expected to do the rest automatically:

- bring up SSH on port `2277`
- set the node hostname
- bring up the node's static `192.168.22.x` address on `end0`
- mount `/mnt/astreae` and `/mnt/asteria`
- start `open-iscsi`
- start `k3s-agent`
- rejoin the cluster with the baked-in node token and server URL

## Version clarification

As of **March 31, 2026**, the live cluster reports:

- control plane: `k3s v1.33.3+k3s1`
- healthy rpi4 Longhorn workers (`titan-15`, `titan-17`): `k3s v1.31.5+k3s1`

The `6.6.63` and `6.12.41` numbers are Linux kernel versions, not Kubernetes versions. Kubernetes' official version skew policy allows a `kubelet` to be up to three minor versions older than the `kube-apiserver`, so `1.31` workers against a `1.33` control plane are supported today:

- https://kubernetes.io/releases/version-skew-policy/

The replacement images intentionally keep the rpi4 worker `k3s` version aligned with the healthy HDD-backed rpi4 workers, to avoid introducing a Kubernetes minor-version change during node recovery.

## Remote flashing flow

Run these commands from the machine that has the `metis` repo and your SSH access.

### 1. Build the image and delete the stale node object

```bash
cd ~/Development/metis
./scripts/prepare_titan_rpi4_replacement.sh titan-13 titan-22
./scripts/prepare_titan_rpi4_replacement.sh titan-19 titan-22
```

This does all of the following:

- fetches the current cluster node token from `titan-0a`
- deletes the stale Kubernetes `Node` object
- builds the replacement image under `artifacts/`
- copies it to `titan-22:/tmp/metis-images/`

### 2. Ask the onsite helper to insert the SD card into `titan-22`

When the card is inserted, identify the target device:

```bash
./scripts/remote_sd_candidates.sh titan-22
```

### 3. Flash the card remotely

```bash
./scripts/remote_flash_titan_image.sh titan-22 titan-13 /dev/sdX
./scripts/remote_flash_titan_image.sh titan-22 titan-19 /dev/sdY
```

`titan-22` will prompt for its `sudo` password during the flash.

### 4. Ask the onsite helper to swap the card and power-cycle the Pi

That should be the end of the onsite work.

### 5. Validate remotely

```bash
kubectl get nodes -w
kubectl -n longhorn-system get nodes.longhorn.io
kubectl -n longhorn-system get replicas.longhorn.io -o wide | grep 'titan-13\|titan-19'
ssh titan-13
ssh titan-19
```

## USB boot

Raspberry Pi 4 supports USB mass storage boot via its EEPROM bootloader:

- https://www.raspberrypi.com/documentation/computers/raspberry-pi.html#usb-mass-storage-boot

That means the same general recovery-image approach can be used on a USB device instead of an SD card. For this cluster, the safer rollout is:

1. first recover `titan-13` and `titan-19` to known-good SD cards
2. pilot USB boot on one non-critical rpi4
3. only then migrate the Longhorn HDD-backed rpi4s

USB boot is attractive for wear reduction, but it adds EEPROM boot-order, adapter, and power-delivery variables. The emergency replacement process above should stay SD-based until the USB path has been tested on your actual hardware.
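When piloting USB boot, the first thing worth verifying on the candidate Pi is the EEPROM `BOOT_ORDER` (visible on the Pi via `rpi-eeprom-config`, which prints the current bootloader configuration). The hex nibbles are read right-to-left and are easy to misread, so as a small sketch, the helper below decodes a `BOOT_ORDER` value into the sequence of boot modes the bootloader will try. The function name and the `0xf41` example are illustrative, not part of the scripts above; the mode table follows the official Raspberry Pi bootloader documentation.

```bash
#!/usr/bin/env bash
# Decode a Raspberry Pi 4 EEPROM BOOT_ORDER value into the ordered list of
# boot modes the bootloader attempts. Nibbles are tried right-to-left.
decode_boot_order() {
  local hex="${1#0x}" out="" i c
  for (( i=${#hex}-1; i>=0; i-- )); do
    c="${hex:i:1}"
    case "$c" in
      1) out+="SD-CARD " ;;   # primary SD/eMMC
      2) out+="NETWORK " ;;   # network (PXE) boot
      3) out+="RPIBOOT " ;;   # rpiboot over the USB device port
      4) out+="USB-MSD " ;;   # USB mass storage device
      6) out+="NVME " ;;      # NVMe (CM4)
      e) out+="STOP " ;;      # stop and display error
      f) out+="RESTART " ;;   # loop back to the first nibble
      *) out+="UNKNOWN($c) " ;;
    esac
  done
  printf '%s\n' "${out% }"
}

# 0xf41 is the factory default: try SD first, then USB, then loop forever.
decode_boot_order 0xf41   # → SD-CARD USB-MSD RESTART
```

On the Pi itself, `rpi-eeprom-config | grep BOOT_ORDER` shows the value this would decode; for the USB pilot you want `4` to appear before (i.e. to the right of) any mode you don't intend to use.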