ananke: harden recovery checks and finalize naming migration
parent c1dc50cace
commit cc316c472b
79
README.md
@ -1,3 +1,80 @@
|
|||||||
# titan-iac
|
# titan-iac
|
||||||
|
|
||||||
Flux-managed Kubernetes cluster for bstein.dev services.
|
Flux-managed Kubernetes cluster config for bstein.dev.
|
||||||
|
|
||||||
|
Canonical repo URL:
|
||||||
|
- `ssh://git@scm.bstein.dev:2242/bstein/titan-iac.git`
|
||||||
|
|
||||||
|
## Why `ananke`
|
||||||
|
|
||||||
|
`Ananke` is inevitability and constraint. That is exactly what this tooling is for:
|
||||||
|
- power events happen
|
||||||
|
- recovery windows are finite
|
||||||
|
- bootstrap has to be deterministic
|
||||||
|
|
||||||
|
The point is not clever automation. The point is boring, repeatable recovery.
|
||||||
|
|
||||||
|
## Power Domains
|
||||||
|
|
||||||
|
Two UPS domains matter during shutdown/startup drills:
|
||||||
|
- `Statera`: `titan-23`, `titan-24`, `titan-jh`
|
||||||
|
- `Pyrphoros`: all other nodes
|
||||||
|
|
||||||
|
Default UPS checks in Ananke read from `Pyrphoros` (`pyrphoros@localhost`) unless overridden.
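
The default UPS read can be sketched as follows. This is a minimal illustration, not Ananke's own code: it assumes the UPS is exposed through Network UPS Tools and queried with `upsc`, and the function names `battery_ok` and `read_battery` are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helpers, not part of Ananke.
UPS_HOST="${UPS_HOST:-pyrphoros@localhost}"
UPS_BATTERY_KEY="${UPS_BATTERY_KEY:-battery.charge}"

# Succeeds only if $1 is a non-negative integer at or above the minimum $2.
# Empty, non-numeric, or decimal readings are rejected.
battery_ok() {
  local charge="$1" minimum="$2"
  [[ "${charge}" =~ ^[0-9]+$ ]] || return 1
  (( 10#${charge} >= 10#${minimum} ))
}

# Reads the charge through NUT's upsc (assumed to be installed on the control host).
read_battery() {
  upsc "${UPS_HOST}" "${UPS_BATTERY_KEY}" 2>/dev/null
}
```

A caller might gate an operation with `battery_ok "$(read_battery)" 50 || exit 1`, where the threshold of 50 is an arbitrary example value.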
|
||||||
|
|
||||||
|
## Breakglass
|
||||||
|
|
||||||
|
If primary operator access is lost, breakglass is on the remote Magic Mirror.
|
||||||
|
|
||||||
|
## Ananke Commands
|
||||||
|
|
||||||
|
Ananke is the recovery orchestrator. Flux desired-state source remains `titan-iac.git`.
|
||||||
|
|
||||||
|
Use `titan-db` as the canonical control host. `tethys` (`titan-24`) is the backup operator host.
|
||||||
|
|
||||||
|
From `titan-db`:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
~/ananke-cluster-power status
|
||||||
|
~/ananke-cluster-power prepare --execute
|
||||||
|
~/ananke-cluster-power shutdown --execute --require-ups-battery
|
||||||
|
~/ananke-cluster-power startup --execute --force-flux-branch main --require-ups-battery
|
||||||
|
```
|
||||||
|
|
||||||
|
From `tethys` / `titan-24` (delegating to `titan-db`):
|
||||||
|
|
||||||
|
```bash
|
||||||
|
~/ananke-tools/cluster_power_console.sh --delegate-host titan-db status
|
||||||
|
~/ananke-tools/cluster_power_console.sh --delegate-host titan-db prepare --execute
|
||||||
|
~/ananke-tools/cluster_power_console.sh --delegate-host titan-db shutdown --execute --require-ups-battery
|
||||||
|
~/ananke-tools/cluster_power_console.sh --delegate-host titan-db startup --execute --force-flux-branch main --require-ups-battery
|
||||||
|
```
|
||||||
|
|
||||||
|
## Shutdown Modes
|
||||||
|
|
||||||
|
`cluster_power_recovery.sh` supports two shutdown behaviors:
|
||||||
|
- `--shutdown-mode host-poweroff` (default): graceful cluster shutdown plus scheduled host poweroff.
|
||||||
|
- `--shutdown-mode cluster-only`: graceful cluster shutdown without host poweroff (stops `k3s` / `k3s-agent` only).
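
The two modes above can be sketched as a simple dispatch. This is a hypothetical illustration of how a script might validate the flag value; `describe_shutdown_mode` is not a function in `cluster_power_recovery.sh`.

```bash
# Hypothetical sketch: map a --shutdown-mode value to the behavior described above.
# With no argument it falls back to the documented default, host-poweroff.
describe_shutdown_mode() {
  case "${1:-host-poweroff}" in
    host-poweroff) echo "graceful cluster shutdown, then scheduled host poweroff" ;;
    cluster-only)  echo "graceful cluster shutdown; stop k3s/k3s-agent only" ;;
    *) echo "unknown shutdown mode: $1" >&2; return 2 ;;
  esac
}
```

Rejecting unknown values up front keeps a typo from silently falling through to the mode that powers hosts off.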
|
||||||
|
|
||||||
|
## Startup Completion Rules
|
||||||
|
|
||||||
|
Ananke startup is not “done” just because Flux says green once.
|
||||||
|
|
||||||
|
Startup now completes only after:
|
||||||
|
- Flux source drift checks pass (expected URL and branch)
|
||||||
|
- all non-optional Flux kustomizations report `Ready=True`
|
||||||
|
- external service checklist passes (default includes Gitea, Grafana, Harbor)
|
||||||
|
- generated ingress reachability checks pass (default accepted statuses: `200,301,302,307,308,401,403,404`)
|
||||||
|
- a stability soak window passes with no `CrashLoopBackOff` / image-pull failures and checklist still healthy
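
The ingress reachability rule in the list above amounts to an allow-list check on HTTP status codes. A minimal sketch, assuming `curl` is available; the helper names `status_allowed` and `probe_status` are hypothetical, while the variable names mirror `recovery-config.env`.

```bash
# Hypothetical helpers, not part of Ananke.
STARTUP_INGRESS_ALLOWED_STATUSES="${STARTUP_INGRESS_ALLOWED_STATUSES:-200,301,302,307,308,401,403,404}"

# Succeeds if status $1 appears in the comma-separated allow-list $2
# (defaults to STARTUP_INGRESS_ALLOWED_STATUSES). Matches whole codes only.
status_allowed() {
  local status="$1" allowed="${2:-${STARTUP_INGRESS_ALLOWED_STATUSES}}"
  [[ ",${allowed}," == *",${status},"* ]]
}

# Prints only the HTTP status code for a URL; curl reports 000 when no response arrives.
probe_status() {
  curl -sS -o /dev/null -m "${STARTUP_INGRESS_CHECK_TIMEOUT_SECONDS:-10}" \
    -w '%{http_code}' "$1" 2>/dev/null || true
}
```

Accepting `401`, `403`, and `404` treats "the ingress answered" as the success condition, which is distinct from "the application behind it is healthy"; that is what the external service checklist covers.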
|
||||||
|
|
||||||
|
If you intentionally need to correct Flux source during recovery, use:
|
||||||
|
- `--force-flux-url ssh://git@scm.bstein.dev:2242/bstein/titan-iac.git`
|
||||||
|
- `--force-flux-branch main`
|
||||||
|
|
||||||
|
`--force-flux-url` is breakglass-only and requires `--allow-flux-source-mutation`.
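
The pairing rule can be sketched as an argument guard. This is an assumption about how such a check might look, not the actual parser in `cluster_power_recovery.sh`; `validate_flux_flags` is a hypothetical name.

```bash
# Hypothetical sketch: reject --force-flux-url unless
# --allow-flux-source-mutation is also present.
validate_flux_flags() {
  local force_url="" allow_mutation=0
  while [[ $# -gt 0 ]]; do
    case "$1" in
      --force-flux-url) force_url="${2:-}"; shift 2 || return 2 ;;
      --allow-flux-source-mutation) allow_mutation=1; shift ;;
      *) shift ;;
    esac
  done
  if [[ -n "${force_url}" && "${allow_mutation}" -ne 1 ]]; then
    echo "--force-flux-url requires --allow-flux-source-mutation" >&2
    return 1
  fi
}
```

Note that `--force-flux-branch` passes through this guard untouched, matching the README, which only marks the URL override as breakglass-only.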
|
||||||
|
|
||||||
|
The defaults live in:
|
||||||
|
- `scripts/bootstrap/recovery-config.env`
|
||||||
|
|
||||||
|
Detailed runbook:
|
||||||
|
- `knowledge/runbooks/cluster-power-recovery.md`
|
||||||
|
|||||||
@ -9,7 +9,7 @@ metadata:
|
|||||||
spec:
|
spec:
|
||||||
interval: 1m0s
|
interval: 1m0s
|
||||||
ref:
|
ref:
|
||||||
branch: feature/atlasbot
|
branch: main
|
||||||
secretRef:
|
secretRef:
|
||||||
name: flux-system-gitea
|
name: flux-system-gitea
|
||||||
url: ssh://git@scm.bstein.dev:2242/bstein/titan-iac.git
|
url: ssh://git@scm.bstein.dev:2242/bstein/titan-iac.git
|
||||||
|
|||||||
@ -45,33 +45,37 @@ Execute examples
|
|||||||
Manual remote console examples
|
Manual remote console examples
|
||||||
- Canonical operator hosts:
|
- Canonical operator hosts:
|
||||||
- `titan-db`
|
- `titan-db`
|
||||||
- `titan-24`
|
- `tethys` (`titan-24`)
|
||||||
- Both hosts now have:
|
- Both hosts now have:
|
||||||
- `~/hecate-tools/cluster_power_recovery.sh`
|
- `~/ananke-tools/cluster_power_recovery.sh`
|
||||||
- `~/hecate-tools/cluster_power_console.sh`
|
- `~/ananke-tools/cluster_power_console.sh`
|
||||||
- `~/hecate-tools/bootstrap/recovery-config.env`
|
- `~/ananke-tools/bootstrap/recovery-config.env`
|
||||||
- `~/hecate-tools/bootstrap/harbor-bootstrap-images.txt`
|
- `~/ananke-tools/bootstrap/harbor-bootstrap-images.txt`
|
||||||
- `~/hecate-tools/kubeconfig`
|
- `~/ananke-tools/kubeconfig`
|
||||||
- `~/hecate-cluster-power`
|
- `~/ananke-cluster-power`
|
||||||
- `~/bin/hecate-cluster-power`
|
- `~/bin/ananke-cluster-power`
|
||||||
- `~/hecate-repo/{infrastructure,services,scripts}`
|
- `~/ananke-repo/{infrastructure,services,scripts}`
|
||||||
- Both hosts also keep the Harbor bootstrap bundle at:
|
- Both hosts also keep the Harbor bootstrap bundle at:
|
||||||
- `~/.local/share/hecate/bundles/harbor-bootstrap-v2.14.1-arm64.tar.zst`
|
- `~/.local/share/ananke/bundles/harbor-bootstrap-v2.14.1-arm64.tar.zst`
|
||||||
- Remote usage:
|
- Remote usage:
|
||||||
- `ssh titan-db`
|
- `ssh titan-db`
|
||||||
- `~/hecate-cluster-power status`
|
- `~/ananke-cluster-power status`
|
||||||
- `~/hecate-cluster-power prepare --execute`
|
- `~/ananke-cluster-power prepare --execute`
|
||||||
- `~/hecate-cluster-power shutdown --execute`
|
- `~/ananke-cluster-power shutdown --execute`
|
||||||
- `~/hecate-cluster-power startup --execute --force-flux-branch main`
|
- `~/ananke-cluster-power startup --execute --force-flux-branch main`
|
||||||
- `ssh titan-24`
|
- `ssh tethys`
|
||||||
- `~/hecate-cluster-power status`
|
- `~/ananke-cluster-power status`
|
||||||
- `~/hecate-cluster-power prepare --execute`
|
- `~/ananke-cluster-power prepare --execute`
|
||||||
- `~/hecate-cluster-power shutdown --execute`
|
- `~/ananke-cluster-power shutdown --execute`
|
||||||
- `~/hecate-cluster-power startup --execute --force-flux-branch main`
|
- `~/ananke-cluster-power startup --execute --force-flux-branch main`
|
||||||
|
|
||||||
Useful options
|
Useful options
|
||||||
|
- `--shutdown-mode host-poweroff|cluster-only`
|
||||||
- `--expected-flux-branch main`
|
- `--expected-flux-branch main`
|
||||||
|
- `--expected-flux-url ssh://git@scm.bstein.dev:2242/bstein/titan-iac.git`
|
||||||
|
- `--force-flux-url ssh://git@scm.bstein.dev:2242/bstein/titan-iac.git`
|
||||||
- `--force-flux-branch main`
|
- `--force-flux-branch main`
|
||||||
|
- `--allow-flux-source-mutation` (required with `--force-flux-url`; breakglass only)
|
||||||
- `--skip-local-bootstrap` (not recommended for cold-start recovery)
|
- `--skip-local-bootstrap` (not recommended for cold-start recovery)
|
||||||
- `--skip-harbor-bootstrap` (skip the Harbor recovery stage if you know Harbor should stay deferred)
|
- `--skip-harbor-bootstrap` (skip the Harbor recovery stage if you know Harbor should stay deferred)
|
||||||
- `--skip-harbor-seed` (skip bundle import if Harbor images are already cached on the target node)
|
- `--skip-harbor-seed` (skip bundle import if Harbor images are already cached on the target node)
|
||||||
@ -81,8 +85,12 @@ Useful options
|
|||||||
- `--require-ups-battery`
|
- `--require-ups-battery`
|
||||||
- `--drain-timeout 180`
|
- `--drain-timeout 180`
|
||||||
- `--emergency-drain-timeout 45`
|
- `--emergency-drain-timeout 45`
|
||||||
- `--recovery-state-file ~/.local/share/hecate/cluster_power_recovery.state`
|
- `--flux-ready-timeout 1200`
|
||||||
- `--harbor-bundle-file ~/.local/share/hecate/bundles/harbor-bootstrap-v2.14.1-arm64.tar.zst`
|
- `--startup-checklist-timeout 900`
|
||||||
|
- `--startup-stability-window 180`
|
||||||
|
- `--startup-stability-timeout 900`
|
||||||
|
- `--recovery-state-file ~/.local/share/ananke/cluster_power_recovery.state`
|
||||||
|
- `--harbor-bundle-file ~/.local/share/ananke/bundles/harbor-bootstrap-v2.14.1-arm64.tar.zst`
|
||||||
|
|
||||||
Controlled drill checklist (recommended)
|
Controlled drill checklist (recommended)
|
||||||
- Operator host: use `titan-db` as canonical control host for the drill.
|
- Operator host: use `titan-db` as canonical control host for the drill.
|
||||||
@ -91,37 +99,48 @@ Controlled drill checklist (recommended)
|
|||||||
- Confirm they will manually power cluster nodes back on after shutdown completes.
|
- Confirm they will manually power cluster nodes back on after shutdown completes.
|
||||||
- Confirm who will announce "all nodes powered on" to resume startup.
|
- Confirm who will announce "all nodes powered on" to resume startup.
|
||||||
- Preflight on `titan-db`:
|
- Preflight on `titan-db`:
|
||||||
- `mkdir -p ~/hecate-logs`
|
- `mkdir -p ~/ananke-logs`
|
||||||
- `~/hecate-cluster-power status` and verify:
|
- `~/ananke-cluster-power status` and verify:
|
||||||
- `ups_host=pyrphoros@localhost`
|
- `ups_host=pyrphoros@localhost`
|
||||||
- `ups_battery` is numeric
|
- `ups_battery` is numeric
|
||||||
- `flux_source_ready=True`
|
- `flux_source_ready=True`
|
||||||
- Warm helper image just before shutdown:
|
- Warm helper image just before shutdown:
|
||||||
- `~/hecate-cluster-power prepare --execute`
|
- `~/ananke-cluster-power prepare --execute`
|
||||||
- Run in a persistent shell and capture logs:
|
- Run in a persistent shell and capture logs:
|
||||||
- `tmux new -s hecate-drill`
|
- `tmux new -s ananke-drill`
|
||||||
- `script -q -a ~/hecate-logs/hecate-drill-$(date +%Y%m%d-%H%M%S).log`
|
- `script -q -a ~/ananke-logs/ananke-drill-$(date +%Y%m%d-%H%M%S).log`
|
||||||
- Execute controlled shutdown with telemetry enforcement:
|
- Execute controlled shutdown with telemetry enforcement:
|
||||||
- `~/hecate-cluster-power shutdown --execute --require-ups-battery`
|
- `~/ananke-cluster-power shutdown --execute --require-ups-battery`
|
||||||
- After on-site power-on confirmation, execute startup:
|
- After on-site power-on confirmation, execute startup:
|
||||||
- `~/hecate-cluster-power startup --execute --force-flux-branch main --require-ups-battery`
|
- `~/ananke-cluster-power startup --execute --force-flux-branch main --require-ups-battery`
|
||||||
- Post-check:
|
- Post-check:
|
||||||
- `~/hecate-cluster-power status`
|
- `~/ananke-cluster-power status`
|
||||||
- Verify critical services (`longhorn`, `vault`, `postgres`, `gitea`, `harbor`, `pegasus`) and no widespread pull/crash failures.
|
- Verify critical services (`longhorn`, `vault`, `postgres`, `gitea`, `harbor`, `pegasus`) and no widespread pull/crash failures.
|
||||||
|
|
||||||
Operational notes
|
Operational notes
|
||||||
- The flow suspends Flux Kustomizations/HelmReleases during shutdown to prevent churn.
|
- The flow suspends Flux Kustomizations/HelmReleases during shutdown to prevent churn.
|
||||||
|
- Shutdown behavior is explicit:
|
||||||
|
- `host-poweroff` schedules host poweroff after service stop.
|
||||||
|
- `cluster-only` stops `k3s`/`k3s-agent` without powering hosts off.
|
||||||
- Worker drain is no longer best-effort only. The script now escalates from normal drain, to `--force`, to `--disable-eviction` once the configured timeout is exhausted.
|
- Worker drain is no longer best-effort only. The script now escalates from normal drain, to `--force`, to `--disable-eviction` once the configured timeout is exhausted.
|
||||||
- During startup, if Flux source is not `Ready`, local bootstrap fallback is applied first using the repo snapshot under `~/hecate-repo`.
|
- Startup fails fast if Flux source URL/branch drift from expected values (unless branch override is explicitly requested with `--force-flux-branch`).
|
||||||
|
- Flux desired-state source remains `titan-iac.git`. Ananke orchestrates runtime recovery and should not be used as the normal Flux source repo.
|
||||||
|
- During startup, if Flux source is not `Ready`, local bootstrap fallback is applied first using the repo snapshot under `~/ananke-repo`.
|
||||||
- Longhorn is reconciled before Vault/Postgres/Gitea so storage-backed services are not racing the volume layer.
|
- Longhorn is reconciled before Vault/Postgres/Gitea so storage-backed services are not racing the volume layer.
|
||||||
- Harbor is reconciled after the first critical stateful services.
|
- Harbor is reconciled after the first critical stateful services.
|
||||||
- Harbor bootstrap is now designed around a control-host bundle:
|
- Harbor bootstrap is now designed around a control-host bundle:
|
||||||
- Build the Harbor bundle locally with `scripts/build_harbor_bootstrap_bundle.sh`.
|
- Build the Harbor bundle locally with `scripts/build_harbor_bootstrap_bundle.sh`.
|
||||||
- Stage it on the operator host at `~/.local/share/hecate/bundles/harbor-bootstrap-v2.14.1-arm64.tar.zst`.
|
- Stage it on the operator host at `~/.local/share/ananke/bundles/harbor-bootstrap-v2.14.1-arm64.tar.zst`.
|
||||||
- Use `harbor-seed --execute` or a full `startup --execute` to stream/import that bundle onto `titan-05`.
|
- Use `harbor-seed --execute` or a full `startup --execute` to stream/import that bundle onto `titan-05`.
|
||||||
- The Harbor bundle remains arm64-only because Harbor is pinned to arm64 nodes. The node-helper image is multi-arch because Hecate uses it across both arm64 and amd64 nodes during prepare/shutdown operations.
|
- The Harbor bundle remains arm64-only because Harbor is pinned to arm64 nodes. The node-helper image is multi-arch because Ananke uses it across both arm64 and amd64 nodes during prepare/shutdown operations.
|
||||||
- Hecate uses a temporary privileged helper pod for host-side operations. The helper image is prewarmed with `prepare --execute` so later shutdown/startup steps do not stall on image pulls.
|
- Ananke uses a temporary privileged helper pod for host-side operations. The helper image is prewarmed with `prepare --execute` so later shutdown/startup steps do not stall on image pulls.
|
||||||
- The script persists outage state in `~/.local/state/cluster_power_recovery.state` by default. If startup is attempted during an outage window and power becomes unstable again, rerunning startup with insufficient UPS charge will flip into the emergency shutdown path instead of continuing to bootstrap.
|
- The script persists outage state in `~/.local/share/ananke/cluster_power_recovery.state` by default. If startup is attempted during an outage window and power becomes unstable again, rerunning startup with insufficient UPS charge will flip into the emergency shutdown path instead of continuing to bootstrap.
|
||||||
|
- Startup completion is strict now:
|
||||||
|
- all non-optional Flux kustomizations must be `Ready=True`
|
||||||
|
- external service checklist must pass (defaults include Gitea, Grafana, Harbor)
|
||||||
|
- generated ingress reachability checks must pass (default accepted codes: `200,301,302,307,308,401,403,404`)
|
||||||
|
- stability soak must pass with no crashloop/pull-failure churn
|
||||||
|
- If Flux hits immutable one-off Job drift during reconcile, Ananke now attempts self-heal by pruning failed Flux-managed Jobs and retrying reconcile.
|
||||||
- In dry-run mode, the script now skips the live API wait step so preview runs do not stall on an offline cluster.
|
- In dry-run mode, the script now skips the live API wait step so preview runs do not stall on an offline cluster.
|
||||||
- Dry-run mode no longer mutates outage recovery state.
|
- Dry-run mode no longer mutates outage recovery state.
|
||||||
- `harbor-seed --execute` was validated by:
|
- `harbor-seed --execute` was validated by:
|
||||||
|
|||||||
@ -1,14 +1,36 @@
|
|||||||
CANONICAL_CONTROL_HOST="titan-db"
|
CANONICAL_CONTROL_HOST="titan-db"
|
||||||
DEFAULT_FLUX_BRANCH="main"
|
DEFAULT_FLUX_BRANCH="main"
|
||||||
STATE_SUBDIR=".local/share/hecate"
|
EXPECTED_FLUX_URL="ssh://git@scm.bstein.dev:2242/bstein/titan-iac.git"
|
||||||
|
SHUTDOWN_MODE="host-poweroff"
|
||||||
|
STATE_SUBDIR=".local/share/ananke"
|
||||||
HARBOR_BUNDLE_BASENAME="harbor-bootstrap-v2.14.1-arm64.tar.zst"
|
HARBOR_BUNDLE_BASENAME="harbor-bootstrap-v2.14.1-arm64.tar.zst"
|
||||||
HARBOR_TARGET_NODE="titan-05"
|
HARBOR_TARGET_NODE=""
|
||||||
HARBOR_CANARY_NODE="titan-04"
|
HARBOR_CANARY_NODE=""
|
||||||
|
HARBOR_HOST_LABEL_KEY="ananke.bstein.dev/harbor-bootstrap"
|
||||||
HARBOR_CANARY_IMAGE="registry.bstein.dev/bstein/kubectl:1.35.0"
|
HARBOR_CANARY_IMAGE="registry.bstein.dev/bstein/kubectl:1.35.0"
|
||||||
NODE_HELPER_IMAGE="registry.bstein.dev/bstein/hecate-node-helper:0.1.0"
|
NODE_HELPER_IMAGE="registry.bstein.dev/bstein/ananke-node-helper:0.1.0"
|
||||||
NODE_HELPER_NAMESPACE="maintenance"
|
NODE_HELPER_NAMESPACE="maintenance"
|
||||||
NODE_HELPER_SERVICE_ACCOUNT="default"
|
NODE_HELPER_SERVICE_ACCOUNT="default"
|
||||||
REGISTRY_PULL_SECRET="harbor-regcred"
|
REGISTRY_PULL_SECRET="harbor-regcred"
|
||||||
BUNDLE_HTTP_PORT="8877"
|
BUNDLE_HTTP_PORT="8877"
|
||||||
UPS_HOST="pyrphoros@localhost"
|
UPS_HOST="pyrphoros@localhost"
|
||||||
UPS_BATTERY_KEY="battery.charge"
|
UPS_BATTERY_KEY="battery.charge"
|
||||||
|
FLUX_READY_TIMEOUT_SECONDS="1200"
|
||||||
|
FLUX_READY_POLL_SECONDS="10"
|
||||||
|
STARTUP_CHECKLIST_TIMEOUT_SECONDS="900"
|
||||||
|
STARTUP_CHECKLIST_POLL_SECONDS="10"
|
||||||
|
STARTUP_WORKLOAD_TIMEOUT_SECONDS="900"
|
||||||
|
STARTUP_WORKLOAD_POLL_SECONDS="10"
|
||||||
|
STARTUP_STABILITY_WINDOW_SECONDS="180"
|
||||||
|
STARTUP_STABILITY_TIMEOUT_SECONDS="900"
|
||||||
|
STARTUP_STABILITY_POLL_SECONDS="10"
|
||||||
|
STARTUP_OPTIONAL_KUSTOMIZATIONS=""
|
||||||
|
STARTUP_IGNORE_PODS_REGEX=""
|
||||||
|
STARTUP_IGNORE_WORKLOADS_REGEX=""
|
||||||
|
STARTUP_WORKLOAD_NAMESPACE_EXCLUDES_REGEX="^(kube-system|kube-public|kube-node-lease|flux-system)$"
|
||||||
|
STARTUP_SERVICE_CHECK_TIMEOUT_SECONDS="10"
|
||||||
|
STARTUP_INCLUDE_INGRESS_CHECKS="1"
|
||||||
|
STARTUP_INGRESS_ALLOWED_STATUSES="200,301,302,307,308,401,403,404"
|
||||||
|
STARTUP_IGNORE_INGRESS_HOSTS_REGEX=""
|
||||||
|
STARTUP_INGRESS_CHECK_TIMEOUT_SECONDS="10"
|
||||||
|
STARTUP_SERVICE_CHECKLIST='gitea|https://scm.bstein.dev/api/healthz|200|"status":"pass"||;grafana|https://metrics.bstein.dev/api/health|200|"database":"ok"||;harbor|https://registry.bstein.dev/v2/|200,401|||'
|
||||||
|
|||||||
@ -1,10 +1,10 @@
|
|||||||
#!/usr/bin/env bash
|
#!/usr/bin/env bash
|
||||||
set -euo pipefail
|
set -euo pipefail
|
||||||
|
|
||||||
IMAGE="registry.bstein.dev/bstein/hecate-node-helper:0.1.0"
|
IMAGE="registry.bstein.dev/bstein/ananke-node-helper:0.1.0"
|
||||||
DOCKER_CONFIG_PATH=""
|
DOCKER_CONFIG_PATH=""
|
||||||
PLATFORMS="linux/amd64,linux/arm64"
|
PLATFORMS="linux/amd64,linux/arm64"
|
||||||
BUILDER_NAME="hecate-node-helper-builder"
|
BUILDER_NAME="ananke-node-helper-builder"
|
||||||
|
|
||||||
while [[ $# -gt 0 ]]; do
|
while [[ $# -gt 0 ]]; do
|
||||||
case "$1" in
|
case "$1" in
|
||||||
@ -26,7 +26,7 @@ while [[ $# -gt 0 ]]; do
|
|||||||
;;
|
;;
|
||||||
-h|--help)
|
-h|--help)
|
||||||
cat <<USAGE
|
cat <<USAGE
|
||||||
Usage: scripts/build_hecate_node_helper.sh [--image <image>] [--docker-config <path>] [--platforms <csv>] [--builder <name>]
|
Usage: scripts/build_ananke_node_helper.sh [--image <image>] [--docker-config <path>] [--platforms <csv>] [--builder <name>]
|
||||||
USAGE
|
USAGE
|
||||||
exit 0
|
exit 0
|
||||||
;;
|
;;
|
||||||
@ -50,7 +50,7 @@ fi
|
|||||||
docker buildx inspect --bootstrap >/dev/null
|
docker buildx inspect --bootstrap >/dev/null
|
||||||
docker buildx build \
|
docker buildx build \
|
||||||
--platform "${PLATFORMS}" \
|
--platform "${PLATFORMS}" \
|
||||||
-f dockerfiles/Dockerfile.hecate-node-helper \
|
-f dockerfiles/Dockerfile.ananke-node-helper \
|
||||||
-t "${IMAGE}" \
|
-t "${IMAGE}" \
|
||||||
--push \
|
--push \
|
||||||
.
|
.
|
||||||
@ -7,11 +7,11 @@ Usage:
|
|||||||
scripts/cluster_power_console.sh [--repo-dir <path>] [--delegate-host <host>] [--allow-local] <prepare|status|shutdown|startup> [recovery-script-options...]
|
scripts/cluster_power_console.sh [--repo-dir <path>] [--delegate-host <host>] [--allow-local] <prepare|status|shutdown|startup> [recovery-script-options...]
|
||||||
|
|
||||||
Purpose:
|
Purpose:
|
||||||
Friendly manual entrypoint for running Hecate from a remote console.
|
Friendly manual entrypoint for running Ananke from a remote console.
|
||||||
The canonical control host is titan-db by default so bundle/state handling stays in one place.
|
The canonical control host is titan-db by default so bundle/state handling stays in one place.
|
||||||
|
|
||||||
Defaults:
|
Defaults:
|
||||||
--repo-dir \$HOME/Development/titan-iac
|
--repo-dir \$HOME/Development/ananke (fallback: \$HOME/Development/titan-iac)
|
||||||
--delegate-host titan-db
|
--delegate-host titan-db
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
@ -22,10 +22,14 @@ Examples:
|
|||||||
USAGE
|
USAGE
|
||||||
}
|
}
|
||||||
|
|
||||||
REPO_DIR="${HOME}/Development/titan-iac"
|
if [[ -d "${HOME}/Development/ananke" ]]; then
|
||||||
|
REPO_DIR="${HOME}/Development/ananke"
|
||||||
|
else
|
||||||
|
REPO_DIR="${HOME}/Development/titan-iac"
|
||||||
|
fi
|
||||||
DELEGATE_HOST="titan-db"
|
DELEGATE_HOST="titan-db"
|
||||||
ALLOW_LOCAL=0
|
ALLOW_LOCAL=0
|
||||||
REMOTE_REPO_DIR="${HECATE_REMOTE_REPO_DIR:-}"
|
REMOTE_REPO_DIR="${ANANKE_REMOTE_REPO_DIR:-}"
|
||||||
|
|
||||||
while [[ $# -gt 0 ]]; do
|
while [[ $# -gt 0 ]]; do
|
||||||
case "$1" in
|
case "$1" in
|
||||||
@ -73,6 +77,6 @@ fi
|
|||||||
quoted_args="$(printf '%q ' "$@")"
|
quoted_args="$(printf '%q ' "$@")"
|
||||||
remote_prefix=""
|
remote_prefix=""
|
||||||
if [[ -n "${REMOTE_REPO_DIR}" ]]; then
|
if [[ -n "${REMOTE_REPO_DIR}" ]]; then
|
||||||
remote_prefix="HECATE_REPO_DIR=$(printf '%q' "${REMOTE_REPO_DIR}") "
|
remote_prefix="ANANKE_REPO_DIR=$(printf '%q' "${REMOTE_REPO_DIR}") "
|
||||||
fi
|
fi
|
||||||
exec ssh -o BatchMode=yes -o ConnectTimeout=8 "${DELEGATE_HOST}" "${remote_prefix}~/hecate-tools/cluster_power_recovery.sh ${quoted_args}"
|
exec ssh -o BatchMode=yes -o ConnectTimeout=8 "${DELEGATE_HOST}" "${remote_prefix}~/ananke-tools/cluster_power_recovery.sh ${quoted_args}"
|
||||||
|
|||||||
File diff suppressed because it is too large