docs: note ananke bring-up dependencies
parent 09409660a2
commit 9a741def21
15
README.md
@@ -20,6 +20,21 @@ Recovery cordons are given short 1hr leases. If Ananke cordons a node to repair
The following are notes for future Brad.

## Bring-up dependencies
Ananke should be one of the first things working. It does not need Harbor, Gitea, Longhorn, Grafana, or the apps to be healthy before it starts; those are often the mess it is there to sort out.

It does need:

- an Ananke host that came up on its own: usually `titan-db`, with the `tethys`/`titan-24` peer path as the backup
- `/etc/ananke/ananke.yaml`, the Ananke SSH key, and enough host config to reach nodes on the Atlas SSH port
- Kubernetes API access once the control plane is answering; before that it can only do host-side checks
- Flux CRDs/controllers and the `titan-iac` source once the API is up, because most startup gates are Flux-shaped
- basic node hygiene that Ananke cannot fake forever: SSH, sudo for managed repairs, sane clocks, and Longhorn host packages like `cryptsetup`, `open-iscsi`, `dmsetup`, and `nfs-common`
- NUT/UPS access if this is making real shutdown decisions instead of just doing startup recovery

If this is a total bring-up, start Ananke after the host boots and before waiting on applications. If Ananke is not running, Atlas is missing the thing that knows the order of operations.
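Two of the host-side items in the list above can be spot-checked by hand before Ananke starts. The sketch below is not part of Ananke; the function names, the `ANANKE_CONFIG` and `LONGHORN_CMDS` variables, and the choice of one representative command per package (`iscsiadm` for `open-iscsi`, `mount.nfs` for `nfs-common`) are assumptions made for illustration. It only looks at the machine it runs on, and it does not cover the SSH key, clocks, sudo, or NUT/UPS access.

```bash
#!/bin/sh
# Hypothetical preflight helpers; names and variables are illustrative only.

# Config path taken from the notes above; override for testing.
ANANKE_CONFIG="${ANANKE_CONFIG:-/etc/ananke/ananke.yaml}"

# Assumed mapping of Longhorn host packages to one command each:
#   cryptsetup -> cryptsetup, open-iscsi -> iscsiadm,
#   dmsetup -> dmsetup, nfs-common -> mount.nfs
LONGHORN_CMDS="${LONGHORN_CMDS:-cryptsetup iscsiadm dmsetup mount.nfs}"

# Meant for the Ananke host: succeeds if the config file is readable.
check_ananke_config() {
    if [ -r "$ANANKE_CONFIG" ]; then
        return 0
    fi
    echo "MISSING config: $ANANKE_CONFIG"
    return 1
}

# Meant for each node: prints one line per command that is not on PATH
# and returns 1 if any is missing.
check_longhorn_cmds() {
    rc=0
    for cmd in $LONGHORN_CMDS; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "MISSING command: $cmd"
            rc=1
        fi
    done
    return "$rc"
}
```

After sourcing the file, `check_ananke_config` would be run on the Ananke host and `check_longhorn_cmds` on each node. Some of these commands live in `/sbin` or `/usr/sbin`, so a non-root shell may report them missing even when the package is installed.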
## Daily commands
```bash