Compare commits


No commits in common. "main" and "feature/atlas-monitoring" have entirely different histories.

590 changed files with 5335 additions and 73166 deletions

11
.gitignore vendored
View File

@ -1,10 +1 @@
*.md
!README.md
!knowledge/**/*.md
!services/comms/knowledge/**/*.md
__pycache__/
*.py[cod]
.pytest_cache
.venv
.venv-ci
tmp/
AGENTS.md

68
AGENTS.md Normal file
View File

@ -0,0 +1,68 @@
# Repository Guidelines
## Project Structure & Module Organization
- `infrastructure/`: cluster-scoped building blocks (core, flux-system, traefik, longhorn). Add new platform features by mirroring this layout.
- `services/`: workload manifests per app (`services/gitea/`, etc.) with `kustomization.yaml` plus one file per kind; keep diffs small and focused.
- `dockerfiles/` hosts bespoke images, while `scripts/` stores operational Fish/Bash helpers—extend these directories instead of relying on ad-hoc commands.
## Build, Test, and Development Commands
- `kustomize build services/<app>` (or `kubectl kustomize ...`) renders manifests exactly as Flux will.
- `kubectl apply --server-side --dry-run=server -k services/<app>` validates manifests against the API server's schema without persisting changes (client-side dry-run is not compatible with server-side apply).
- `flux reconcile kustomization <name> --namespace flux-system --with-source` pulls the latest Git state after merges or hotfixes.
- `fish scripts/flux_hammer.fish --help` explains the recovery tool; read it before running against production workloads.
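A typical pre-PR check chains these commands for a single service; the following is a minimal sketch, assuming the app directory name (here passed as `$1`, e.g. `gitea`) matches its Flux kustomization and that `kustomize`, `kubectl`, and `flux` are installed:

```bash
#!/usr/bin/env bash
# Sketch: validate one service the way Flux will consume it.
set -euo pipefail

app="${1:?usage: $0 <app>, e.g. gitea}"

# Render the manifests exactly as Flux will.
kustomize build "services/${app}" > "/tmp/${app}.rendered.yaml"

# Validate against the API server schema without persisting anything.
kubectl apply --server-side --dry-run=server -k "services/${app}"

# Preview pending changes; flux diff exits non-zero when a diff exists, so don't abort on it.
flux diff kustomization "${app}" --path "services/${app}" || true
```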
## Coding Style & Naming Conventions
- YAML uses two-space indents; retain the leading path comment (e.g. `# services/gitea/deployment.yaml`) to speed code review.
- Keep resource names lowercase kebab-case, align labels/selectors, and mirror namespaces with directory names.
- Order resources in `kustomization.yaml` from namespace and config, through storage, to workloads and networking, so diffs stay predictable.
- Scripts start with `#!/usr/bin/env fish` or bash, stay executable, and follow snake_case names such as `flux_hammer.fish`.
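As a reference for new helpers, a minimal Bash skeleton matching these conventions might look like this (the `example_sync.sh` name is purely illustrative):

```bash
#!/usr/bin/env bash
# scripts/example_sync.sh -- illustrative helper; keep the file executable and snake_case.
set -euo pipefail

main() {
  local kustomization="${1:?usage: $(basename "$0") <kustomization>}"
  echo "would operate on ${kustomization}"
}

main "$@"
```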
## Testing Guidelines
- Run `kustomize build` and the dry-run apply for every service you touch; capture failures before opening a PR.
- `flux diff kustomization <name> --path services/<app>` previews reconciliations—link notable output when behavior shifts.
- Docker edits: `docker build -f dockerfiles/Dockerfile.monerod .` (swap the file you changed) to verify image builds.
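One way to drive those checks from the Git diff so nothing touched is missed (a sketch; the `origin/main` base and the two-level `services/<app>` layout are assumptions):

```bash
#!/usr/bin/env bash
# Sketch: run render + server-side dry-run for every service changed relative to origin/main.
set -euo pipefail

changed_services=$(git diff --name-only origin/main...HEAD -- services/ | cut -d/ -f1-2 | sort -u)

for svc in ${changed_services}; do
  echo "==> ${svc}"
  kustomize build "${svc}" > /dev/null
  kubectl apply --server-side --dry-run=server -k "${svc}"
done

# Rebuild a bespoke image when its Dockerfile changed (file name from the example above).
if git diff --name-only origin/main...HEAD -- dockerfiles/ | grep -q 'Dockerfile.monerod'; then
  docker build -f dockerfiles/Dockerfile.monerod .
fi
```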
## Commit & Pull Request Guidelines
- Keep commit subjects short, present-tense, and optionally scoped (`gpu(titan-24): add RuntimeClass`); squash fixups before review.
- Describe linked issues, affected services, and required operator steps (e.g. `flux reconcile kustomization services-gitea`) in the PR body.
- Focus each PR on one kustomization or service and update `infrastructure/flux-system` when Flux must track new folders.
- Record the validation you ran (dry-runs, diffs, builds) and add screenshots only when ingress or UI behavior changes.
## Security & Configuration Tips
- Never commit credentials; use Vault workflows (`services/vault/`) or SOPS-encrypted manifests wired through `infrastructure/flux-system`.
- Node selectors and tolerations gate workloads to hardware like `hardware: rpi4`; confirm labels before scaling or renaming nodes.
- Pin external images by digest or rely on Flux image automation to follow approved tags and avoid drift.
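To pin by digest, resolve the tag first and paste the result into the manifest; a sketch (the image reference is just the dcgm-exporter example used elsewhere in these notes, and `skopeo` is an optional alternative):

```bash
#!/usr/bin/env bash
# Sketch: resolve an image tag to its digest so manifests can reference image@sha256:... instead.
set -euo pipefail

image="registry.bstein.dev/monitoring/dcgm-exporter:4.4.2-4.7.0-ubuntu22.04"  # example reference

# Prints the manifest digest; either tool works if installed.
docker buildx imagetools inspect "${image}" | grep -i '^Digest:'
# skopeo inspect --format '{{.Digest}}' "docker://${image}"
```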
## Dashboard roadmap / context (2025-12-02)
- Atlas dashboards are generated via `scripts/dashboards_render_atlas.py --build`, which writes JSON under `services/monitoring/dashboards/` and ConfigMaps under `services/monitoring/`. Keep the Grafana manifests in sync by regenerating after edits.
- Atlas Overview panels are paired with internal dashboards (pods, nodes, storage, network, GPU). A new `atlas-gpu` internal dashboard holds the detailed GPU metrics that feed the overview share pie.
- Old Grafana folders (`Atlas Storage`, `Atlas SRE`, `Atlas Public`, `Atlas Nodes`) should be removed in Grafana UI when convenient; only `Atlas Overview` and `Atlas Internal` should remain provisioned.
- Future work: add a separate generator (e.g., `dashboards_render_oceanus.py`) for SUI/oceanus validation dashboards, mirroring the atlas pattern of internal dashboards feeding a public overview.
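A regeneration pass after dashboard edits could look like the following sketch (assumes Python 3 and the repository root as the working directory):

```bash
#!/usr/bin/env bash
# Sketch: regenerate Atlas dashboards and confirm the monitoring manifests still assemble.
set -euo pipefail

python3 scripts/dashboards_render_atlas.py --build   # writes dashboard JSON and ConfigMaps
kustomize build services/monitoring > /dev/null      # ensure the Grafana manifests still render
git status --short services/monitoring               # review regenerated files before committing
```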
## Monitoring state (2025-12-03)
- dcgm-exporter DaemonSet pulls `registry.bstein.dev/monitoring/dcgm-exporter:4.4.2-4.7.0-ubuntu22.04` with nvidia runtime/imagePullSecret; titan-24 exports metrics, titan-22 remains NotReady.
- Atlas Overview is the Grafana home (1h range, 1m refresh), Overview folder UID `overview`, internal folder `atlas-internal` (oceanus-internal stub).
- Panels are standardized via the generator: the hottest-nodes row is compressed, the worker and control rows are taller, and the root-disk row is taller with a top-12 bar gauge showing labels. The GPU share pie uses a 1h `avg_over_time` so recent activity persists through idle periods.
- Internal dashboards are provisioned without Viewer role; if anonymous still sees them, restart Grafana and tighten auth if needed.
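If anonymous viewers still see internal dashboards, restarting the Grafana workload is the first step; a sketch (the `monitoring` namespace and `grafana` Deployment name are assumptions to check against `services/monitoring/`):

```bash
# Assumed names; adjust to the actual Deployment and namespace.
kubectl -n monitoring rollout restart deployment/grafana
kubectl -n monitoring rollout status deployment/grafana
```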
## Upcoming priorities (SSO/storage/mail)
- Establish SSO (Keycloak or similar) and federate Grafana, Gitea, Zot, Nextcloud, Pegasus/Jellyfin; keep Vaultwarden separate until safe.
- Add Nextcloud (limit to rpi5 workers) with office suite; integrate with SSO; plan storage class and ingress.
- Plan mail: mostly self-hosted, relay through trusted provider for outbound; integrate with services (Nextcloud, Vaultwarden, etc.) for notifications and account flows.
## SSO plan sketch (2025-12-03)
- IdP: use Keycloak (preferred) in a new `sso` namespace, Bitnami or codecentric chart with Postgres backing store (single PVC), ingress `sso.bstein.dev`, admin user bound to brad@bstein.dev; stick with local DB initially (no external IdP).
- Auth flow goals: Grafana (OIDC), Gitea (OAuth2/Keycloak), Zot (via Traefik forward-auth/oauth2-proxy), Jellyfin/Pegasus via Jellyfin OAuth/OpenID plugin (map existing usernames; run migration to pre-create users in Keycloak with same usernames/emails and temporary passwords), Pegasus keeps using Jellyfin tokens.
- Steps to implement:
1) Add a service folder `services/keycloak/` (namespace, PVC, HelmRelease, ingress, secret for admin creds). Verify with `kustomize build` plus a Flux reconcile; see the sketch at the end of this section.
2) Seed realm `atlas` with users (import via CSV or a realm export). Create clients for Grafana (public/implicit), Gitea (confidential), and a “jellyfin” client for the OAuth plugin; set brad@bstein.dev as the admin user's email.
3) Reconfigure Grafana to OIDC (disable anonymous to internal folders, leave Overview public via folder permissions). Reconfigure Gitea to OIDC (app.ini).
4) Add Traefik forward-auth (oauth2-proxy) in front of Zot and any other services needing headers-based auth.
5) Deploy Jellyfin OpenID plugin; map Keycloak users to existing Jellyfin usernames; communicate password reset path.
- Migration caution: do not delete existing local creds until SSO validated; keep Pegasus working via Jellyfin tokens during transition.
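A rough sketch for scaffolding and verifying step 1 (file names and the `keycloak` Flux kustomization name are assumptions; they must match whatever gets added under `clusters/atlas/flux-system`):

```bash
#!/usr/bin/env bash
# Sketch: scaffold services/keycloak/ and validate it like any other service.
set -euo pipefail

mkdir -p services/keycloak
# Create namespace.yaml, pvc.yaml, helmrelease.yaml, ingress.yaml, the admin secret,
# and kustomization.yaml per the plan above, then validate:
kustomize build services/keycloak
kubectl apply --server-side --dry-run=server -k services/keycloak

# Once a Flux Kustomization tracking ./services/keycloak exists:
flux reconcile kustomization keycloak --namespace flux-system --with-source
```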
## Postgres centralization (2025-12-03)
- Prefer a shared in-cluster Postgres deployment with per-service databases to reduce resource sprawl on Pi nodes. Use it for services that can easily point at an external DB.
- Candidates to migrate to shared Postgres: Keycloak (realm DB), Gitea (git DB), Nextcloud (app DB), possibly Grafana (if persistence needed beyond current provisioner), Jitsi prosody/JVB state (if external DB supported). Keep tightly-coupled or lightweight embedded DBs as-is when migration is painful or not supported.
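Per-service databases on the shared instance can be provisioned with plain SQL; a sketch (the `postgres.postgres.svc` host, superuser access, and the `keycloak` example are assumptions):

```bash
#!/usr/bin/env bash
# Sketch: create an isolated role and database for one service on the shared Postgres.
set -euo pipefail

PGHOST="postgres.postgres.svc.cluster.local"   # assumed service DNS for the shared instance
SERVICE="keycloak"                             # example service name

psql "host=${PGHOST} user=postgres dbname=postgres" <<SQL
CREATE ROLE ${SERVICE} LOGIN PASSWORD 'change-me';   -- keep the real password in Vault
CREATE DATABASE ${SERVICE} OWNER ${SERVICE};
REVOKE ALL ON DATABASE ${SERVICE} FROM PUBLIC;
SQL
```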

77
Jenkinsfile vendored
View File

@ -1,77 +0,0 @@
// Mirror of ci/Jenkinsfile.titan-iac for multibranch discovery.
pipeline {
agent {
kubernetes {
defaultContainer 'python'
yaml """
apiVersion: v1
kind: Pod
spec:
nodeSelector:
hardware: rpi5
kubernetes.io/arch: arm64
node-role.kubernetes.io/worker: "true"
containers:
- name: python
image: python:3.12-slim
command:
- cat
tty: true
"""
}
}
environment {
PIP_DISABLE_PIP_VERSION_CHECK = '1'
PYTHONUNBUFFERED = '1'
}
stages {
stage('Checkout') {
steps {
checkout scm
}
}
stage('Install deps') {
steps {
sh 'pip install --no-cache-dir -r ci/requirements.txt'
}
}
stage('Glue tests') {
steps {
sh 'pytest -q ci/tests/glue'
}
}
stage('Resolve Flux branch') {
steps {
script {
env.FLUX_BRANCH = sh(
returnStdout: true,
script: "awk '/branch:/{print $2; exit}' clusters/atlas/flux-system/gotk-sync.yaml"
).trim()
if (!env.FLUX_BRANCH) {
error('Flux branch not found in gotk-sync.yaml')
}
echo "Flux branch: ${env.FLUX_BRANCH}"
}
}
}
stage('Promote') {
when {
expression {
def branch = env.BRANCH_NAME ?: (env.GIT_BRANCH ?: '').replaceFirst('origin/', '')
return env.FLUX_BRANCH && branch == env.FLUX_BRANCH
}
}
steps {
withCredentials([usernamePassword(credentialsId: 'gitea-pat', usernameVariable: 'GIT_USER', passwordVariable: 'GIT_TOKEN')]) {
sh '''
set +x
git config user.email "jenkins@bstein.dev"
git config user.name "jenkins"
git remote set-url origin https://${GIT_USER}:${GIT_TOKEN}@scm.bstein.dev/bstein/titan-iac.git
git push origin HEAD:${FLUX_BRANCH}
'''
}
}
}
}
}

View File

@ -1,3 +0,0 @@
# titan-iac
Flux-managed Kubernetes cluster for bstein.dev services.

View File

@ -1,76 +0,0 @@
pipeline {
agent {
kubernetes {
defaultContainer 'python'
yaml """
apiVersion: v1
kind: Pod
spec:
nodeSelector:
hardware: rpi5
kubernetes.io/arch: arm64
node-role.kubernetes.io/worker: "true"
containers:
- name: python
image: python:3.12-slim
command:
- cat
tty: true
"""
}
}
environment {
PIP_DISABLE_PIP_VERSION_CHECK = '1'
PYTHONUNBUFFERED = '1'
}
stages {
stage('Checkout') {
steps {
checkout scm
}
}
stage('Install deps') {
steps {
sh 'pip install --no-cache-dir -r ci/requirements.txt'
}
}
stage('Glue tests') {
steps {
sh 'pytest -q ci/tests/glue'
}
}
stage('Resolve Flux branch') {
steps {
script {
env.FLUX_BRANCH = sh(
returnStdout: true,
script: "awk '/branch:/{print $2; exit}' clusters/atlas/flux-system/gotk-sync.yaml"
).trim()
if (!env.FLUX_BRANCH) {
error('Flux branch not found in gotk-sync.yaml')
}
echo "Flux branch: ${env.FLUX_BRANCH}"
}
}
}
stage('Promote') {
when {
expression {
def branch = env.BRANCH_NAME ?: (env.GIT_BRANCH ?: '').replaceFirst('origin/', '')
return env.FLUX_BRANCH && branch == env.FLUX_BRANCH
}
}
steps {
withCredentials([usernamePassword(credentialsId: 'gitea-pat', usernameVariable: 'GIT_USER', passwordVariable: 'GIT_TOKEN')]) {
sh '''
set +x
git config user.email "jenkins@bstein.dev"
git config user.name "jenkins"
git remote set-url origin https://${GIT_USER}:${GIT_TOKEN}@scm.bstein.dev/bstein/titan-iac.git
git push origin HEAD:${FLUX_BRANCH}
'''
}
}
}
}
}

View File

@ -1,4 +0,0 @@
pytest==8.3.4
kubernetes==30.1.0
PyYAML==6.0.2
requests==2.32.3

View File

@ -1,16 +0,0 @@
max_success_age_hours: 48
allow_suspended:
- bstein-dev-home/vaultwarden-cred-sync
- comms/othrys-room-reset
- comms/pin-othrys-invite
- comms/seed-othrys-room
- finance/firefly-user-sync
- health/wger-admin-ensure
- health/wger-user-sync
- mailu-mailserver/mailu-sync-nightly
- nextcloud/nextcloud-mail-sync
ariadne_schedule_tasks:
- schedule.mailu_sync
- schedule.nextcloud_sync
- schedule.vaultwarden_sync
- schedule.wger_admin

View File

@ -1,46 +0,0 @@
from __future__ import annotations
from datetime import datetime, timezone
from pathlib import Path
import yaml
from kubernetes import client, config
CONFIG_PATH = Path(__file__).with_name("config.yaml")
def _load_config() -> dict:
with CONFIG_PATH.open("r", encoding="utf-8") as handle:
return yaml.safe_load(handle) or {}
def _load_kube():
try:
config.load_incluster_config()
except config.ConfigException:
config.load_kube_config()
def test_glue_cronjobs_recent_success():
cfg = _load_config()
max_age_hours = int(cfg.get("max_success_age_hours", 48))
allow_suspended = set(cfg.get("allow_suspended", []))
_load_kube()
batch = client.BatchV1Api()
cronjobs = batch.list_cron_job_for_all_namespaces(label_selector="atlas.bstein.dev/glue=true").items
assert cronjobs, "No glue cronjobs found with atlas.bstein.dev/glue=true"
now = datetime.now(timezone.utc)
for cronjob in cronjobs:
name = f"{cronjob.metadata.namespace}/{cronjob.metadata.name}"
if cronjob.spec.suspend:
assert name in allow_suspended, f"{name} is suspended but not in allow_suspended"
continue
last_success = cronjob.status.last_successful_time
assert last_success is not None, f"{name} has no lastSuccessfulTime"
age_hours = (now - last_success).total_seconds() / 3600
assert age_hours <= max_age_hours, f"{name} last success {age_hours:.1f}h ago"

View File

@ -1,48 +0,0 @@
from __future__ import annotations
import os
from pathlib import Path
import requests
import yaml
VM_URL = os.environ.get("VM_URL", "http://victoria-metrics-single-server:8428").rstrip("/")
CONFIG_PATH = Path(__file__).with_name("config.yaml")
def _load_config() -> dict:
with CONFIG_PATH.open("r", encoding="utf-8") as handle:
return yaml.safe_load(handle) or {}
def _query(promql: str) -> list[dict]:
response = requests.get(f"{VM_URL}/api/v1/query", params={"query": promql}, timeout=10)
response.raise_for_status()
payload = response.json()
return payload.get("data", {}).get("result", [])
def test_glue_metrics_present():
series = _query('kube_cronjob_labels{label_atlas_bstein_dev_glue="true"}')
assert series, "No glue cronjob label series found"
def test_glue_metrics_success_join():
query = (
"kube_cronjob_status_last_successful_time "
'and on(namespace,cronjob) kube_cronjob_labels{label_atlas_bstein_dev_glue="true"}'
)
series = _query(query)
assert series, "No glue cronjob last success series found"
def test_ariadne_schedule_metrics_present():
cfg = _load_config()
expected = cfg.get("ariadne_schedule_tasks", [])
if not expected:
return
series = _query("ariadne_schedule_next_run_timestamp_seconds")
tasks = {item.get("metric", {}).get("task") for item in series}
missing = [task for task in expected if task not in tasks]
assert not missing, f"Missing Ariadne schedule metrics for: {', '.join(missing)}"

View File

@ -0,0 +1,12 @@
# clusters/atlas/applications/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../../services/crypto
- ../../services/gitea
- ../../services/jellyfin
- ../../services/jitsi
- ../../services/monitoring
- ../../services/pegasus
- ../../services/vault
- ../../services/zot

View File

@ -1,23 +0,0 @@
# clusters/atlas/flux-system/applications/ai-llm/kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: ai-llm
namespace: flux-system
spec:
interval: 10m
path: ./services/ai-llm
targetNamespace: ai
prune: true
sourceRef:
kind: GitRepository
name: flux-system
namespace: flux-system
wait: true
healthChecks:
- apiVersion: apps/v1
kind: Deployment
name: ollama
namespace: ai
dependsOn:
- name: core

View File

@ -1,17 +0,0 @@
# clusters/atlas/flux-system/applications/bstein-dev-home-migrations/kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: bstein-dev-home-migrations
namespace: flux-system
spec:
interval: 10m
path: ./services/bstein-dev-home/oneoffs/migrations
prune: true
force: true
sourceRef:
kind: GitRepository
name: flux-system
targetNamespace: bstein-dev-home
wait: false
suspend: true

View File

@ -1,26 +0,0 @@
# clusters/atlas/flux-system/applications/bstein-dev-home/image-automation.yaml
apiVersion: image.toolkit.fluxcd.io/v1
kind: ImageUpdateAutomation
metadata:
name: bstein-dev-home
namespace: bstein-dev-home
spec:
interval: 1m0s
sourceRef:
kind: GitRepository
name: flux-system
namespace: flux-system
git:
checkout:
ref:
branch: feature/ariadne
commit:
author:
email: ops@bstein.dev
name: flux-bot
messageTemplate: "chore(bstein-dev-home): automated image update"
push:
branch: feature/ariadne
update:
strategy: Setters
path: services/bstein-dev-home

View File

@ -1,15 +0,0 @@
# clusters/atlas/flux-system/applications/bstein-dev-home/kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: bstein-dev-home
namespace: flux-system
spec:
interval: 10m
path: ./services/bstein-dev-home
prune: true
sourceRef:
kind: GitRepository
name: flux-system
targetNamespace: bstein-dev-home
wait: false

View File

@ -1,17 +0,0 @@
# clusters/atlas/flux-system/applications/comms/kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: comms
namespace: flux-system
spec:
interval: 10m
prune: true
sourceRef:
kind: GitRepository
name: flux-system
path: ./services/comms
targetNamespace: comms
timeout: 2m
dependsOn:
- name: traefik

View File

@ -1,24 +0,0 @@
# clusters/atlas/flux-system/applications/finance/kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: finance
namespace: flux-system
spec:
interval: 10m
path: ./services/finance
prune: true
sourceRef:
kind: GitRepository
name: flux-system
targetNamespace: finance
healthChecks:
- apiVersion: apps/v1
kind: Deployment
name: actual-budget
namespace: finance
- apiVersion: apps/v1
kind: Deployment
name: firefly
namespace: finance
wait: false

View File

@ -1,27 +0,0 @@
# clusters/atlas/flux-system/applications/harbor/image-automation.yaml
apiVersion: image.toolkit.fluxcd.io/v1
kind: ImageUpdateAutomation
metadata:
name: harbor
namespace: harbor
spec:
suspend: true
interval: 5m0s
sourceRef:
kind: GitRepository
name: flux-system
namespace: flux-system
git:
checkout:
ref:
branch: feature/ci-gitops
commit:
author:
email: ops@bstein.dev
name: flux-bot
messageTemplate: "chore(harbor): apply image updates"
push:
branch: feature/ci-gitops
update:
strategy: Setters
path: ./services/harbor

View File

@ -1,25 +0,0 @@
# clusters/atlas/flux-system/applications/health/kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: health
namespace: flux-system
spec:
interval: 10m
path: ./services/health
prune: true
sourceRef:
kind: GitRepository
name: flux-system
targetNamespace: health
dependsOn:
- name: keycloak
- name: postgres
- name: traefik
- name: vault
healthChecks:
- apiVersion: apps/v1
kind: Deployment
name: wger
namespace: health
wait: false

View File

@ -15,6 +15,5 @@ spec:
namespace: flux-system
dependsOn:
- name: core
- name: openldap
wait: true
timeout: 5m

View File

@ -1,27 +0,0 @@
# clusters/atlas/flux-system/applications/jenkins/kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: jenkins
namespace: flux-system
spec:
interval: 10m
path: ./services/jenkins
prune: true
sourceRef:
kind: GitRepository
name: flux-system
targetNamespace: jenkins
dependsOn:
- name: helm
- name: traefik
healthChecks:
- apiVersion: apps/v1
kind: Deployment
name: jenkins
namespace: jenkins
- apiVersion: v1
kind: Service
name: jenkins
namespace: jenkins
wait: false

View File

@ -1,18 +1,18 @@
# clusters/atlas/flux-system/applications/openldap/kustomization.yaml
# clusters/atlas/flux-system/applications/jitsi/kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: openldap
name: jitsi
namespace: flux-system
spec:
interval: 10m
path: ./services/jitsi
targetNamespace: jitsi
prune: true
sourceRef:
kind: GitRepository
name: flux-system
namespace: flux-system
path: ./services/openldap
targetNamespace: sso
dependsOn:
- name: core
wait: true

View File

@ -1,15 +0,0 @@
# clusters/atlas/flux-system/applications/keycloak/kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: keycloak
namespace: flux-system
spec:
interval: 10m
prune: true
sourceRef:
kind: GitRepository
name: flux-system
path: ./services/keycloak
targetNamespace: sso
timeout: 2m

View File

@ -2,32 +2,14 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- zot/kustomization.yaml
- gitea/kustomization.yaml
- vault/kustomization.yaml
- vaultwarden/kustomization.yaml
- comms/kustomization.yaml
- jitsi/kustomization.yaml
- crypto/kustomization.yaml
- monerod/kustomization.yaml
- pegasus/kustomization.yaml
- pegasus/image-automation.yaml
- bstein-dev-home/kustomization.yaml
- bstein-dev-home/image-automation.yaml
- bstein-dev-home-migrations/kustomization.yaml
- harbor/kustomization.yaml
- harbor/image-automation.yaml
- jellyfin/kustomization.yaml
- xmr-miner/kustomization.yaml
- wallet-monero-temp/kustomization.yaml
- sui-metrics/kustomization.yaml
- openldap/kustomization.yaml
- keycloak/kustomization.yaml
- oauth2-proxy/kustomization.yaml
- mailu/kustomization.yaml
- jenkins/kustomization.yaml
- ai-llm/kustomization.yaml
- nextcloud/kustomization.yaml
- nextcloud-mail-sync/kustomization.yaml
- outline/kustomization.yaml
- planka/kustomization.yaml
- finance/kustomization.yaml
- health/kustomization.yaml

View File

@ -1,18 +0,0 @@
# clusters/atlas/flux-system/applications/mailu/kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: mailu
namespace: flux-system
spec:
interval: 10m
sourceRef:
kind: GitRepository
name: flux-system
namespace: flux-system
path: ./services/mailu
targetNamespace: mailu-mailserver
prune: true
wait: true
dependsOn:
- name: helm

View File

@ -1,17 +0,0 @@
# clusters/atlas/flux-system/applications/nextcloud-mail-sync/kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: nextcloud-mail-sync
namespace: flux-system
spec:
interval: 10m
prune: true
sourceRef:
kind: GitRepository
name: flux-system
path: ./services/nextcloud-mail-sync
targetNamespace: nextcloud
timeout: 2m
dependsOn:
- name: keycloak

View File

@ -1,16 +0,0 @@
# clusters/atlas/flux-system/applications/nextcloud/kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1beta2
kind: Kustomization
metadata:
name: nextcloud
namespace: flux-system
spec:
interval: 10m
path: ./services/nextcloud
targetNamespace: nextcloud
prune: true
sourceRef:
kind: GitRepository
name: flux-system
namespace: flux-system
wait: true

View File

@ -1,15 +0,0 @@
# clusters/atlas/flux-system/applications/oauth2-proxy/kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: oauth2-proxy
namespace: flux-system
spec:
interval: 10m
prune: true
sourceRef:
kind: GitRepository
name: flux-system
path: ./services/oauth2-proxy
targetNamespace: sso
timeout: 2m

View File

@ -1,28 +0,0 @@
# clusters/atlas/flux-system/applications/outline/kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: outline
namespace: flux-system
spec:
interval: 10m
path: ./services/outline
prune: true
sourceRef:
kind: GitRepository
name: flux-system
targetNamespace: outline
dependsOn:
- name: keycloak
- name: mailu
- name: traefik
healthChecks:
- apiVersion: apps/v1
kind: Deployment
name: outline
namespace: outline
- apiVersion: v1
kind: Service
name: outline
namespace: outline
wait: false

View File

@ -1,26 +1,20 @@
# clusters/atlas/flux-system/applications/pegasus/image-automation.yaml
apiVersion: image.toolkit.fluxcd.io/v1
apiVersion: image.toolkit.fluxcd.io/v1beta1
kind: ImageUpdateAutomation
metadata:
name: pegasus
namespace: jellyfin
namespace: flux-system
spec:
interval: 1m0s
sourceRef:
kind: GitRepository
name: flux-system
namespace: flux-system
git:
checkout:
ref:
branch: feature/ci-gitops
commit:
author:
email: ops@bstein.dev
name: flux-bot
messageTemplate: "chore(pegasus): apply image updates"
push:
branch: feature/ci-gitops
messageTemplate: "chore(pegasus): update image to {{range .Updated.Images}}{{.}}{{end}}"
update:
strategy: Setters
path: services/pegasus
path: ./services/pegasus

View File

@ -1,28 +0,0 @@
# clusters/atlas/flux-system/applications/planka/kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: planka
namespace: flux-system
spec:
interval: 10m
path: ./services/planka
prune: true
sourceRef:
kind: GitRepository
name: flux-system
targetNamespace: planka
dependsOn:
- name: keycloak
- name: mailu
- name: traefik
healthChecks:
- apiVersion: apps/v1
kind: Deployment
name: planka
namespace: planka
- apiVersion: v1
kind: Service
name: planka
namespace: planka
wait: false

View File

@ -1,20 +0,0 @@
# clusters/atlas/flux-system/applications/vaultwarden/kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: vaultwarden
namespace: flux-system
spec:
interval: 10m
suspend: false
sourceRef:
kind: GitRepository
name: flux-system
namespace: flux-system
path: ./services/vaultwarden
targetNamespace: vaultwarden
prune: true
wait: true
dependsOn:
- name: helm
- name: traefik

View File

@ -1,19 +0,0 @@
# clusters/atlas/flux-system/applications/wallet-monero-temp/kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: wallet-monero-temp
namespace: flux-system
spec:
interval: 10m
path: ./services/crypto/wallet-monero-temp
targetNamespace: crypto
prune: true
sourceRef:
kind: GitRepository
name: flux-system
namespace: flux-system
dependsOn:
- name: crypto
- name: xmr-miner
wait: true

View File

@ -1,18 +1,18 @@
# clusters/atlas/flux-system/applications/harbor/kustomization.yaml
# clusters/atlas/flux-system/applications/zot/kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: harbor
name: zot
namespace: flux-system
spec:
interval: 10m
path: ./services/harbor
targetNamespace: harbor
path: ./services/zot
targetNamespace: zot
prune: false
sourceRef:
kind: GitRepository
name: flux-system
namespace: flux-system
wait: false
wait: true
dependsOn:
- name: core
- name: core

File diff suppressed because it is too large Load Diff

View File

@ -1,4 +1,3 @@
# clusters/atlas/flux-system/gotk-sync.yaml
# This manifest was generated by flux. DO NOT EDIT.
---
apiVersion: source.toolkit.fluxcd.io/v1
@ -9,7 +8,7 @@ metadata:
spec:
interval: 1m0s
ref:
branch: feature/ariadne
branch: feature/atlas-monitoring
secretRef:
name: flux-system-gitea
url: ssh://git@scm.bstein.dev:2242/bstein/titan-iac.git

View File

@ -1,17 +0,0 @@
# clusters/atlas/flux-system/platform/cert-manager-cleanup/kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: cert-manager-cleanup
namespace: flux-system
spec:
interval: 30m
path: ./infrastructure/cert-manager/cleanup
prune: true
force: true
sourceRef:
kind: GitRepository
name: flux-system
namespace: flux-system
targetNamespace: cert-manager
wait: true

View File

@ -1,19 +0,0 @@
# clusters/atlas/flux-system/platform/cert-manager/kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: cert-manager
namespace: flux-system
spec:
interval: 30m
path: ./infrastructure/cert-manager
prune: true
force: true
sourceRef:
kind: GitRepository
name: flux-system
namespace: flux-system
targetNamespace: cert-manager
dependsOn:
- name: helm
wait: true

View File

@ -1,20 +0,0 @@
# clusters/atlas/flux-system/platform/gitops-ui/kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: gitops-ui
namespace: flux-system
spec:
interval: 10m
timeout: 10m
path: ./services/gitops-ui
prune: true
sourceRef:
kind: GitRepository
name: flux-system
namespace: flux-system
targetNamespace: flux-system
dependsOn:
- name: helm
- name: traefik
wait: true

View File

@ -4,17 +4,6 @@ kind: Kustomization
resources:
- core/kustomization.yaml
- helm/kustomization.yaml
- cert-manager/kustomization.yaml
- metallb/kustomization.yaml
- traefik/kustomization.yaml
- gitops-ui/kustomization.yaml
- monitoring/kustomization.yaml
- logging/kustomization.yaml
- maintenance/kustomization.yaml
- maintenance/image-automation.yaml
- longhorn-adopt/kustomization.yaml
- longhorn/kustomization.yaml
- longhorn-ui/kustomization.yaml
- postgres/kustomization.yaml
- ../platform/vault-csi/kustomization.yaml
- ../platform/vault-injector/kustomization.yaml

View File

@ -1,14 +0,0 @@
# clusters/atlas/flux-system/platform/logging/kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: logging
namespace: flux-system
spec:
interval: 10m
path: ./services/logging
prune: true
sourceRef:
kind: GitRepository
name: flux-system
wait: false

View File

@ -1,17 +0,0 @@
# clusters/atlas/flux-system/platform/longhorn-adopt/kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: longhorn-adopt
namespace: flux-system
spec:
interval: 30m
path: ./infrastructure/longhorn/adopt
prune: true
force: true
sourceRef:
kind: GitRepository
name: flux-system
namespace: flux-system
targetNamespace: longhorn-system
wait: true

View File

@ -15,5 +15,4 @@ spec:
namespace: flux-system
dependsOn:
- name: core
- name: longhorn
wait: true

View File

@ -1,20 +0,0 @@
# clusters/atlas/flux-system/platform/longhorn/kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: longhorn
namespace: flux-system
spec:
interval: 30m
path: ./infrastructure/longhorn/core
prune: true
force: true
sourceRef:
kind: GitRepository
name: flux-system
namespace: flux-system
targetNamespace: longhorn-system
dependsOn:
- name: helm
- name: longhorn-adopt
wait: false

View File

@ -1,26 +0,0 @@
# clusters/atlas/flux-system/platform/maintenance/image-automation.yaml
apiVersion: image.toolkit.fluxcd.io/v1
kind: ImageUpdateAutomation
metadata:
name: maintenance
namespace: maintenance
spec:
interval: 1m0s
sourceRef:
kind: GitRepository
name: flux-system
namespace: flux-system
git:
checkout:
ref:
branch: feature/ariadne
commit:
author:
email: ops@bstein.dev
name: flux-bot
messageTemplate: "chore(maintenance): automated image update"
push:
branch: feature/ariadne
update:
strategy: Setters
path: services/maintenance

View File

@ -1,15 +0,0 @@
# clusters/atlas/flux-system/platform/maintenance/kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: maintenance
namespace: flux-system
spec:
interval: 10m
path: ./services/maintenance
prune: true
force: true
sourceRef:
kind: GitRepository
name: flux-system
wait: false

View File

@ -1,16 +0,0 @@
# clusters/atlas/flux-system/platform/metallb/kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: metallb
namespace: flux-system
spec:
interval: 30m
sourceRef:
kind: GitRepository
name: flux-system
namespace: flux-system
path: ./infrastructure/metallb
prune: true
wait: true
targetNamespace: metallb-system

View File

@ -1,24 +0,0 @@
# clusters/atlas/flux-system/platform/postgres/kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: postgres
namespace: flux-system
spec:
interval: 10m
path: ./infrastructure/postgres
prune: true
force: true
sourceRef:
kind: GitRepository
name: flux-system
targetNamespace: postgres
dependsOn:
- name: vault
- name: vault-csi
healthChecks:
- apiVersion: apps/v1
kind: StatefulSet
name: postgres
namespace: postgres
wait: true

View File

@ -15,5 +15,4 @@ spec:
namespace: flux-system
dependsOn:
- name: core
- name: metallb
wait: true

View File

@ -1,16 +0,0 @@
# clusters/atlas/flux-system/platform/vault-csi/kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: vault-csi
namespace: flux-system
spec:
interval: 30m
sourceRef:
kind: GitRepository
name: flux-system
namespace: flux-system
path: ./infrastructure/vault-csi
prune: true
wait: true
targetNamespace: kube-system

View File

@ -1,16 +0,0 @@
# clusters/atlas/flux-system/platform/vault-injector/kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: vault-injector
namespace: flux-system
spec:
interval: 30m
path: ./infrastructure/vault-injector
targetNamespace: vault
prune: true
sourceRef:
kind: GitRepository
name: flux-system
namespace: flux-system
wait: true

View File

@ -0,0 +1,7 @@
# clusters/atlas/platform/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../../../infrastructure/modules/base
- ../../../infrastructure/modules/profiles/atlas-ha
- ../../../infrastructure/sources/cert-manager/letsencrypt.yaml

View File

@ -0,0 +1,5 @@
# Oceanus Cluster Scaffold
This directory prepares the Flux and Kustomize layout for a future Oceanus-managed cluster.
Populate `flux-system/` with `gotk-components.yaml` and related manifests after running `flux bootstrap`.
Define node-specific resources under `infrastructure/modules/profiles/oceanus-validator/` and reference workloads in `applications/` as they come online.
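A bootstrap invocation for that future cluster might look like the following sketch; the Git URL mirrors the atlas `gotk-sync.yaml`, while the branch and key path are assumptions:

```bash
# Sketch: bootstrap Flux for the oceanus cluster into this repository.
flux bootstrap git \
  --url=ssh://git@scm.bstein.dev:2242/bstein/titan-iac.git \
  --branch=main \
  --path=clusters/oceanus \
  --private-key-file=/path/to/deploy-key   # assumed key location
```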

View File

@ -1,5 +0,0 @@
FROM python:3.11-slim
ENV PIP_DISABLE_PIP_VERSION_CHECK=1
RUN pip install --no-cache-dir requests psycopg2-binary

View File

@ -1,16 +0,0 @@
FROM --platform=$BUILDPLATFORM opensearchproject/data-prepper:2.8.0 AS source
FROM --platform=$TARGETPLATFORM eclipse-temurin:17-jre
ENV DATA_PREPPER_PATH=/usr/share/data-prepper
RUN useradd -u 10001 -M -U -d / -s /usr/sbin/nologin data_prepper \
&& mkdir -p /var/log/data-prepper
COPY --from=source /usr/share/data-prepper /usr/share/data-prepper
RUN chown -R 10001:10001 /usr/share/data-prepper /var/log/data-prepper
USER 10001
WORKDIR /usr/share/data-prepper
CMD ["bin/data-prepper"]

View File

@ -1,9 +0,0 @@
FROM registry.bstein.dev/infra/harbor-core:v2.14.1-arm64
USER root
COPY dockerfiles/vault-entrypoint.sh /entrypoint.sh
RUN chmod 0755 /entrypoint.sh
USER harbor
ENTRYPOINT ["/entrypoint.sh"]
CMD ["/harbor/entrypoint.sh"]

View File

@ -1,9 +0,0 @@
FROM registry.bstein.dev/infra/harbor-jobservice:v2.14.1-arm64
USER root
COPY dockerfiles/vault-entrypoint.sh /entrypoint.sh
RUN chmod 0755 /entrypoint.sh
USER harbor
ENTRYPOINT ["/entrypoint.sh"]
CMD ["/harbor/entrypoint.sh"]

View File

@ -1,9 +0,0 @@
FROM registry.bstein.dev/infra/harbor-registry:v2.14.1-arm64
USER root
COPY dockerfiles/vault-entrypoint.sh /entrypoint.sh
RUN chmod 0755 /entrypoint.sh
USER harbor
ENTRYPOINT ["/entrypoint.sh"]
CMD ["/home/harbor/entrypoint.sh"]

View File

@ -1,9 +0,0 @@
FROM registry.bstein.dev/infra/harbor-registryctl:v2.14.1-arm64
USER root
COPY dockerfiles/vault-entrypoint.sh /entrypoint.sh
RUN chmod 0755 /entrypoint.sh
USER harbor
ENTRYPOINT ["/entrypoint.sh"]
CMD ["/home/harbor/start.sh"]

View File

@ -1,10 +0,0 @@
FROM ghcr.io/element-hq/lk-jwt-service:0.3.0 AS base
FROM alpine:3.20
RUN apk add --no-cache ca-certificates
COPY --from=base /lk-jwt-service /lk-jwt-service
COPY dockerfiles/vault-entrypoint.sh /entrypoint.sh
RUN chmod 0755 /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
CMD ["/lk-jwt-service"]

View File

@ -1,10 +0,0 @@
FROM quay.io/oauth2-proxy/oauth2-proxy:v7.6.0 AS base
FROM alpine:3.20
RUN apk add --no-cache ca-certificates
COPY --from=base /bin/oauth2-proxy /bin/oauth2-proxy
COPY dockerfiles/vault-entrypoint.sh /entrypoint.sh
RUN chmod 0755 /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
CMD ["/bin/oauth2-proxy"]

View File

@ -1,10 +0,0 @@
FROM registry.bstein.dev/streaming/pegasus:1.2.32 AS base
FROM alpine:3.20
RUN apk add --no-cache ca-certificates
COPY --from=base /pegasus /pegasus
COPY dockerfiles/vault-entrypoint.sh /entrypoint.sh
RUN chmod 0755 /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
CMD ["/pegasus"]

View File

@ -1,34 +0,0 @@
#!/bin/sh
set -eu
if [ -n "${VAULT_ENV_FILE:-}" ]; then
if [ -f "${VAULT_ENV_FILE}" ]; then
# shellcheck disable=SC1090
. "${VAULT_ENV_FILE}"
else
echo "Vault env file not found: ${VAULT_ENV_FILE}" >&2
exit 1
fi
fi
if [ -n "${VAULT_COPY_FILES:-}" ]; then
old_ifs="$IFS"
IFS=','
for pair in ${VAULT_COPY_FILES}; do
src="${pair%%:*}"
dest="${pair#*:}"
if [ -z "${src}" ] || [ -z "${dest}" ]; then
echo "Vault copy entry malformed: ${pair}" >&2
exit 1
fi
if [ ! -f "${src}" ]; then
echo "Vault file not found: ${src}" >&2
exit 1
fi
mkdir -p "$(dirname "${dest}")"
cp "${src}" "${dest}"
done
IFS="$old_ifs"
fi
exec "$@"

16
docs/topology.md Normal file
View File

@ -0,0 +1,16 @@
# Titan Homelab Topology
| Hostname | Role / Function | Managed By | Notes |
|------------|--------------------------------|---------------------|-------|
| titan-0a | Kubernetes control-plane | Flux (atlas cluster)| HA leader, tainted for control only |
| titan-0b | Kubernetes control-plane | Flux (atlas cluster)| Standby control node |
| titan-0c | Kubernetes control-plane | Flux (atlas cluster)| Standby control node |
| titan-04-19| Raspberry Pi workers | Flux (atlas cluster)| Workload nodes, labelled per hardware |
| titan-22 | GPU mini-PC (Jellyfin) | Flux + Ansible | NVIDIA runtime managed via `modules/profiles/atlas-ha` |
| titan-24 | Tethys hybrid node | Flux + Ansible | Runs SUI metrics via K8s, validator via Ansible |
| titan-db | HA control plane database | Ansible | PostgreSQL / etcd backing services |
| titan-jh | Jumphost & bastion | Ansible | Entry point / future KVM services |
| oceanus | Dedicated SUI validator host | Ansible / Flux prep | Baremetal validator workloads, exposes metrics to atlas; Kustomize scaffold under `clusters/oceanus/` |
| styx | Air-gapped workstation | Manual / Scripts | Remains isolated, scripts tracked in `hosts/styx` |
Use the `clusters/` directory for cluster-scoped state and the `hosts/` directory for baremetal orchestration.

View File

@ -1,18 +1,5 @@
# hosts/roles/titan_jh/tasks/main.yaml
---
- name: Install node exporter
ansible.builtin.package:
name: prometheus-node-exporter
state: present
tags: ['jumphost', 'monitoring']
- name: Enable node exporter
ansible.builtin.service:
name: prometheus-node-exporter
enabled: true
state: started
tags: ['jumphost', 'monitoring']
- name: Placeholder for jumphost hardening
ansible.builtin.debug:
msg: "Harden SSH, manage bastion tooling, and configure audit logging here."

2
hosts/styx/README.md Normal file
View File

@ -0,0 +1,2 @@
# hosts/styx/README.md
Styx is air-gapped; provisioning scripts live under `scripts/`.

View File

@ -1,40 +0,0 @@
# infrastructure/cert-manager/cleanup/cert-manager-cleanup-job.yaml
apiVersion: batch/v1
kind: Job
metadata:
name: cert-manager-cleanup-2
namespace: cert-manager
spec:
backoffLimit: 1
template:
spec:
serviceAccountName: cert-manager-cleanup
restartPolicy: Never
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: node-role.kubernetes.io/worker
operator: Exists
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
preference:
matchExpressions:
- key: kubernetes.io/arch
operator: In
values: ["arm64"]
containers:
- name: cleanup
image: bitnami/kubectl@sha256:554ab88b1858e8424c55de37ad417b16f2a0e65d1607aa0f3fe3ce9b9f10b131
command: ["/usr/bin/env", "bash"]
args: ["/scripts/cert_manager_cleanup.sh"]
volumeMounts:
- name: script
mountPath: /scripts
readOnly: true
volumes:
- name: script
configMap:
name: cert-manager-cleanup-script
defaultMode: 0555

View File

@ -1,58 +0,0 @@
# infrastructure/cert-manager/cleanup/cert-manager-cleanup-rbac.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: cert-manager-cleanup
namespace: cert-manager
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: cert-manager-cleanup
rules:
- apiGroups: [""]
resources:
- pods
- services
- endpoints
- configmaps
- secrets
- serviceaccounts
verbs: ["get", "list", "watch", "delete"]
- apiGroups: ["apps"]
resources:
- deployments
- daemonsets
- statefulsets
- replicasets
verbs: ["get", "list", "watch", "delete"]
- apiGroups: ["batch"]
resources:
- jobs
- cronjobs
verbs: ["get", "list", "watch", "delete"]
- apiGroups: ["rbac.authorization.k8s.io"]
resources:
- roles
- rolebindings
- clusterroles
- clusterrolebindings
verbs: ["get", "list", "watch", "delete"]
- apiGroups: ["admissionregistration.k8s.io"]
resources:
- validatingwebhookconfigurations
- mutatingwebhookconfigurations
verbs: ["get", "list", "watch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: cert-manager-cleanup
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: cert-manager-cleanup
subjects:
- kind: ServiceAccount
name: cert-manager-cleanup
namespace: cert-manager

View File

@ -1,15 +0,0 @@
# infrastructure/cert-manager/cleanup/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- namespace.yaml
- cert-manager-cleanup-rbac.yaml
- cert-manager-cleanup-job.yaml
configMapGenerator:
- name: cert-manager-cleanup-script
namespace: cert-manager
files:
- cert_manager_cleanup.sh=scripts/cert_manager_cleanup.sh
options:
disableNameSuffixHash: true

View File

@ -1,5 +0,0 @@
# infrastructure/cert-manager/cleanup/namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
name: cert-manager

View File

@ -1,37 +0,0 @@
#!/usr/bin/env bash
set -euo pipefail
namespace="cert-manager"
selectors=(
"app.kubernetes.io/name=cert-manager"
"app.kubernetes.io/instance=cert-manager"
"app.kubernetes.io/instance=certmanager-prod"
)
delete_namespaced() {
local selector="$1"
kubectl -n "${namespace}" delete deployment,daemonset,statefulset,replicaset \
--selector "${selector}" --ignore-not-found --wait=false
kubectl -n "${namespace}" delete pod,service,endpoints,serviceaccount,configmap,secret \
--selector "${selector}" --ignore-not-found --wait=false
kubectl -n "${namespace}" delete role,rolebinding \
--selector "${selector}" --ignore-not-found --wait=false
kubectl -n "${namespace}" delete job,cronjob \
--selector "${selector}" --ignore-not-found --wait=false
}
delete_cluster_scoped() {
local selector="$1"
kubectl delete clusterrole,clusterrolebinding \
--selector "${selector}" --ignore-not-found --wait=false
kubectl delete mutatingwebhookconfiguration,validatingwebhookconfiguration \
--selector "${selector}" --ignore-not-found --wait=false
}
for selector in "${selectors[@]}"; do
delete_namespaced "${selector}"
delete_cluster_scoped "${selector}"
done
kubectl delete mutatingwebhookconfiguration cert-manager-webhook --ignore-not-found --wait=false
kubectl delete validatingwebhookconfiguration cert-manager-webhook --ignore-not-found --wait=false

View File

@ -1,67 +0,0 @@
# infrastructure/cert-manager/helmrelease.yaml
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
name: cert-manager
namespace: cert-manager
spec:
interval: 30m
chart:
spec:
chart: cert-manager
version: v1.17.0
sourceRef:
kind: HelmRepository
name: jetstack
namespace: flux-system
install:
crds: CreateReplace
remediation: { retries: 3 }
timeout: 10m
upgrade:
crds: CreateReplace
remediation:
retries: 3
remediateLastFailure: true
cleanupOnFail: true
timeout: 10m
values:
installCRDs: true
nodeSelector:
node-role.kubernetes.io/worker: "true"
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: hardware
operator: In
values:
- rpi5
- rpi4
webhook:
nodeSelector:
node-role.kubernetes.io/worker: "true"
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: hardware
operator: In
values:
- rpi5
- rpi4
cainjector:
nodeSelector:
node-role.kubernetes.io/worker: "true"
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: hardware
operator: In
values:
- rpi5
- rpi4

View File

@ -1,6 +0,0 @@
# infrastructure/cert-manager/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- namespace.yaml
- helmrelease.yaml

View File

@ -1,5 +0,0 @@
# infrastructure/cert-manager/namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
name: cert-manager

View File

@ -1,47 +0,0 @@
# infrastructure/core/coredns-custom.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: coredns-custom
namespace: kube-system
data:
bstein-dev.server: |
bstein.dev:53 {
errors
cache 30
hosts {
192.168.22.9 alerts.bstein.dev
192.168.22.9 auth.bstein.dev
192.168.22.9 bstein.dev
10.43.6.87 budget.bstein.dev
192.168.22.9 call.live.bstein.dev
192.168.22.9 cd.bstein.dev
192.168.22.9 chat.ai.bstein.dev
192.168.22.9 ci.bstein.dev
192.168.22.9 cloud.bstein.dev
192.168.22.9 health.bstein.dev
192.168.22.9 kit.live.bstein.dev
192.168.22.9 live.bstein.dev
192.168.22.9 logs.bstein.dev
192.168.22.9 longhorn.bstein.dev
192.168.22.4 mail.bstein.dev
192.168.22.9 matrix.live.bstein.dev
192.168.22.9 metrics.bstein.dev
192.168.22.9 monero.bstein.dev
10.43.6.87 money.bstein.dev
192.168.22.9 notes.bstein.dev
192.168.22.9 office.bstein.dev
192.168.22.9 pegasus.bstein.dev
3.136.224.193 pm-bounces.bstein.dev
3.150.68.49 pm-bounces.bstein.dev
18.189.137.81 pm-bounces.bstein.dev
192.168.22.9 registry.bstein.dev
192.168.22.9 scm.bstein.dev
192.168.22.9 secret.bstein.dev
192.168.22.9 sso.bstein.dev
192.168.22.9 stream.bstein.dev
192.168.22.9 tasks.bstein.dev
192.168.22.9 vault.bstein.dev
fallthrough
}
}

View File

@ -1,141 +0,0 @@
# infrastructure/core/coredns-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: coredns
namespace: kube-system
labels:
k8s-app: kube-dns
kubernetes.io/name: CoreDNS
spec:
progressDeadlineSeconds: 600
replicas: 2
revisionHistoryLimit: 0
selector:
matchLabels:
k8s-app: kube-dns
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 25%
maxUnavailable: 1
template:
metadata:
labels:
k8s-app: kube-dns
spec:
containers:
- name: coredns
image: registry.bstein.dev/infra/coredns:1.12.1
imagePullPolicy: IfNotPresent
args:
- -conf
- /etc/coredns/Corefile
ports:
- containerPort: 53
name: dns
protocol: UDP
- containerPort: 53
name: dns-tcp
protocol: TCP
- containerPort: 9153
name: metrics
protocol: TCP
livenessProbe:
httpGet:
path: /health
port: 8080
scheme: HTTP
initialDelaySeconds: 60
periodSeconds: 10
timeoutSeconds: 1
successThreshold: 1
failureThreshold: 3
readinessProbe:
httpGet:
path: /ready
port: 8181
scheme: HTTP
periodSeconds: 2
timeoutSeconds: 1
successThreshold: 1
failureThreshold: 3
resources:
limits:
memory: 170Mi
requests:
cpu: 100m
memory: 70Mi
securityContext:
allowPrivilegeEscalation: false
capabilities:
add:
- NET_BIND_SERVICE
drop:
- all
readOnlyRootFilesystem: true
volumeMounts:
- name: config-volume
mountPath: /etc/coredns
readOnly: true
- name: custom-config-volume
mountPath: /etc/coredns/custom
readOnly: true
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: hardware
operator: In
values:
- rpi5
- rpi4
- key: node-role.kubernetes.io/worker
operator: In
values:
- "true"
dnsPolicy: Default
nodeSelector:
kubernetes.io/os: linux
priorityClassName: system-cluster-critical
restartPolicy: Always
schedulerName: default-scheduler
serviceAccountName: coredns
tolerations:
- key: CriticalAddonsOnly
operator: Exists
- key: node-role.kubernetes.io/control-plane
operator: Exists
effect: NoSchedule
- key: node-role.kubernetes.io/master
operator: Exists
effect: NoSchedule
topologySpreadConstraints:
- maxSkew: 1
topologyKey: kubernetes.io/hostname
whenUnsatisfiable: DoNotSchedule
labelSelector:
matchLabels:
k8s-app: kube-dns
- maxSkew: 1
topologyKey: topology.kubernetes.io/zone
whenUnsatisfiable: ScheduleAnyway
labelSelector:
matchLabels:
k8s-app: kube-dns
volumes:
- name: config-volume
configMap:
name: coredns
defaultMode: 420
items:
- key: Corefile
path: Corefile
- key: NodeHosts
path: NodeHosts
- name: custom-config-volume
configMap:
name: coredns-custom
optional: true
defaultMode: 420

View File

@ -4,8 +4,4 @@ kind: Kustomization
resources:
- ../modules/base
- ../modules/profiles/atlas-ha
- coredns-custom.yaml
- coredns-deployment.yaml
- ntp-sync-daemonset.yaml
- ../sources/cert-manager/letsencrypt.yaml
- ../sources/cert-manager/letsencrypt-prod.yaml

View File

@ -1,50 +0,0 @@
# infrastructure/core/ntp-sync-daemonset.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: ntp-sync
namespace: kube-system
labels:
app: ntp-sync
spec:
selector:
matchLabels:
app: ntp-sync
template:
metadata:
labels:
app: ntp-sync
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: node-role.kubernetes.io/control-plane
operator: DoesNotExist
- key: node-role.kubernetes.io/master
operator: DoesNotExist
containers:
- name: ntp-sync
image: public.ecr.aws/docker/library/busybox:1.36.1
imagePullPolicy: IfNotPresent
command: ["/bin/sh", "-c"]
args:
- |
set -eu
while true; do
ntpd -q -p pool.ntp.org || true
sleep 300
done
securityContext:
capabilities:
add: ["SYS_TIME"]
runAsUser: 0
runAsGroup: 0
resources:
requests:
cpu: 10m
memory: 16Mi
limits:
cpu: 50m
memory: 64Mi

View File

@ -1,15 +0,0 @@
# infrastructure/longhorn/adopt/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- namespace.yaml
- longhorn-adopt-rbac.yaml
- longhorn-helm-adopt-job.yaml
configMapGenerator:
- name: longhorn-helm-adopt-script
namespace: longhorn-system
files:
- longhorn_helm_adopt.sh=scripts/longhorn_helm_adopt.sh
options:
disableNameSuffixHash: true

View File

@ -1,56 +0,0 @@
# infrastructure/longhorn/adopt/longhorn-adopt-rbac.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: longhorn-helm-adopt
namespace: longhorn-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: longhorn-helm-adopt
rules:
- apiGroups: [""]
resources:
- configmaps
- services
- serviceaccounts
- secrets
verbs: ["get", "list", "watch", "patch", "update"]
- apiGroups: ["apps"]
resources:
- deployments
- daemonsets
verbs: ["get", "list", "watch", "patch", "update"]
- apiGroups: ["batch"]
resources:
- jobs
verbs: ["get", "list", "watch", "patch", "update"]
- apiGroups: ["rbac.authorization.k8s.io"]
resources:
- roles
- rolebindings
- clusterroles
- clusterrolebindings
verbs: ["get", "list", "watch", "patch", "update"]
- apiGroups: ["apiextensions.k8s.io"]
resources:
- customresourcedefinitions
verbs: ["get", "list", "watch", "patch", "update"]
- apiGroups: ["scheduling.k8s.io"]
resources:
- priorityclasses
verbs: ["get", "list", "watch", "patch", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: longhorn-helm-adopt
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: longhorn-helm-adopt
subjects:
- kind: ServiceAccount
name: longhorn-helm-adopt
namespace: longhorn-system

View File

@ -1,40 +0,0 @@
# infrastructure/longhorn/adopt/longhorn-helm-adopt-job.yaml
apiVersion: batch/v1
kind: Job
metadata:
name: longhorn-helm-adopt-2
namespace: longhorn-system
spec:
backoffLimit: 1
template:
spec:
serviceAccountName: longhorn-helm-adopt
restartPolicy: Never
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: node-role.kubernetes.io/worker
operator: Exists
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
preference:
matchExpressions:
- key: kubernetes.io/arch
operator: In
values: ["arm64"]
containers:
- name: adopt
image: bitnami/kubectl@sha256:554ab88b1858e8424c55de37ad417b16f2a0e65d1607aa0f3fe3ce9b9f10b131
command: ["/usr/bin/env", "bash"]
args: ["/scripts/longhorn_helm_adopt.sh"]
volumeMounts:
- name: script
mountPath: /scripts
readOnly: true
volumes:
- name: script
configMap:
name: longhorn-helm-adopt-script
defaultMode: 0555

View File

@ -1,5 +0,0 @@
# infrastructure/longhorn/adopt/namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
name: longhorn-system

View File

@ -1,52 +0,0 @@
#!/usr/bin/env bash
set -euo pipefail
release_name="longhorn"
release_namespace="longhorn-system"
selector="app.kubernetes.io/instance=${release_name}"
annotate_and_label() {
local scope="$1"
local kind="$2"
if [ "${scope}" = "namespaced" ]; then
kubectl -n "${release_namespace}" annotate "${kind}" -l "${selector}" \
meta.helm.sh/release-name="${release_name}" \
meta.helm.sh/release-namespace="${release_namespace}" \
--overwrite >/dev/null 2>&1 || true
kubectl -n "${release_namespace}" label "${kind}" -l "${selector}" \
app.kubernetes.io/managed-by=Helm --overwrite >/dev/null 2>&1 || true
else
kubectl annotate "${kind}" -l "${selector}" \
meta.helm.sh/release-name="${release_name}" \
meta.helm.sh/release-namespace="${release_namespace}" \
--overwrite >/dev/null 2>&1 || true
kubectl label "${kind}" -l "${selector}" \
app.kubernetes.io/managed-by=Helm --overwrite >/dev/null 2>&1 || true
fi
}
namespaced_kinds=(
configmap
service
serviceaccount
deployment
daemonset
job
role
rolebinding
)
cluster_kinds=(
clusterrole
clusterrolebinding
customresourcedefinition
priorityclass
)
for kind in "${namespaced_kinds[@]}"; do
annotate_and_label "namespaced" "${kind}"
done
for kind in "${cluster_kinds[@]}"; do
annotate_and_label "cluster" "${kind}"
done

View File

@ -1,80 +0,0 @@
# infrastructure/longhorn/core/helmrelease.yaml
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
name: longhorn
namespace: longhorn-system
spec:
interval: 30m
chart:
spec:
chart: longhorn
version: 1.8.2
sourceRef:
kind: HelmRepository
name: longhorn
namespace: flux-system
install:
crds: Skip
remediation: { retries: 3 }
timeout: 15m
upgrade:
crds: Skip
remediation:
retries: 3
remediateLastFailure: true
cleanupOnFail: true
timeout: 15m
values:
service:
ui:
type: NodePort
nodePort: 30824
privateRegistry:
createSecret: false
registrySecret: longhorn-registry
image:
pullPolicy: Always
longhorn:
engine:
repository: registry.bstein.dev/infra/longhorn-engine
tag: v1.8.2
manager:
repository: registry.bstein.dev/infra/longhorn-manager
tag: v1.8.2
ui:
repository: registry.bstein.dev/infra/longhorn-ui
tag: v1.8.2
instanceManager:
repository: registry.bstein.dev/infra/longhorn-instance-manager
tag: v1.8.2
shareManager:
repository: registry.bstein.dev/infra/longhorn-share-manager
tag: v1.8.2
backingImageManager:
repository: registry.bstein.dev/infra/longhorn-backing-image-manager
tag: v1.8.2
supportBundleKit:
repository: registry.bstein.dev/infra/longhorn-support-bundle-kit
tag: v0.0.56
csi:
attacher:
repository: registry.bstein.dev/infra/longhorn-csi-attacher
tag: v4.9.0
provisioner:
repository: registry.bstein.dev/infra/longhorn-csi-provisioner
tag: v5.3.0
nodeDriverRegistrar:
repository: registry.bstein.dev/infra/longhorn-csi-node-driver-registrar
tag: v2.14.0
resizer:
repository: registry.bstein.dev/infra/longhorn-csi-resizer
tag: v1.13.2
snapshotter:
repository: registry.bstein.dev/infra/longhorn-csi-snapshotter
tag: v8.2.0
livenessProbe:
repository: registry.bstein.dev/infra/longhorn-livenessprobe
tag: v2.16.0
defaultSettings:
systemManagedPodsImagePullPolicy: Always

View File

@ -1,18 +0,0 @@
# infrastructure/longhorn/core/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- namespace.yaml
- vault-serviceaccount.yaml
- secretproviderclass.yaml
- vault-sync-deployment.yaml
- helmrelease.yaml
- longhorn-settings-ensure-job.yaml
configMapGenerator:
- name: longhorn-settings-ensure-script
files:
- longhorn_settings_ensure.sh=scripts/longhorn_settings_ensure.sh
generatorOptions:
disableNameSuffixHash: true

View File

@ -1,36 +0,0 @@
# infrastructure/longhorn/core/longhorn-settings-ensure-job.yaml
apiVersion: batch/v1
kind: Job
metadata:
name: longhorn-settings-ensure-4
namespace: longhorn-system
spec:
backoffLimit: 0
ttlSecondsAfterFinished: 3600
template:
spec:
serviceAccountName: longhorn-service-account
restartPolicy: Never
volumes:
- name: longhorn-settings-ensure-script
configMap:
name: longhorn-settings-ensure-script
defaultMode: 0555
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/arch
operator: In
values: ["arm64"]
- key: node-role.kubernetes.io/worker
operator: Exists
containers:
- name: apply
image: bitnami/kubectl@sha256:554ab88b1858e8424c55de37ad417b16f2a0e65d1607aa0f3fe3ce9b9f10b131
command: ["/scripts/longhorn_settings_ensure.sh"]
volumeMounts:
- name: longhorn-settings-ensure-script
mountPath: /scripts
readOnly: true

View File

@ -1,5 +0,0 @@
# infrastructure/longhorn/core/namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
name: longhorn-system

View File

@ -1,42 +0,0 @@
#!/usr/bin/env sh
set -eu
# Longhorn blocks direct CR patches for some settings; use the internal API instead.
api_base="http://longhorn-backend.longhorn-system.svc:9500/v1/settings"
wait_for_api() {
attempts=30
while [ "${attempts}" -gt 0 ]; do
if curl -fsS "${api_base}" >/dev/null 2>&1; then
return 0
fi
attempts=$((attempts - 1))
sleep 2
done
echo "Longhorn API not ready after retries." >&2
return 1
}
update_setting() {
name="$1"
value="$2"
current="$(curl -fsS "${api_base}/${name}" || true)"
if echo "${current}" | grep -Fq "\"value\":\"${value}\""; then
echo "Setting ${name} already set."
return 0
fi
echo "Setting ${name} -> ${value}"
curl -fsS -X PUT \
-H "Content-Type: application/json" \
-d "{\"value\":\"${value}\"}" \
"${api_base}/${name}" >/dev/null
}
wait_for_api
update_setting default-engine-image "registry.bstein.dev/infra/longhorn-engine:v1.8.2"
update_setting default-instance-manager-image "registry.bstein.dev/infra/longhorn-instance-manager:v1.8.2"
update_setting default-backing-image-manager-image "registry.bstein.dev/infra/longhorn-backing-image-manager:v1.8.2"
update_setting support-bundle-manager-image "registry.bstein.dev/infra/longhorn-support-bundle-kit:v0.0.56"

@ -1,21 +0,0 @@
# infrastructure/longhorn/core/secretproviderclass.yaml
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
name: longhorn-vault
namespace: longhorn-system
spec:
provider: vault
parameters:
vaultAddress: "http://vault.vault.svc.cluster.local:8200"
roleName: "longhorn"
objects: |
- objectName: "harbor-pull__dockerconfigjson"
secretPath: "kv/data/atlas/shared/harbor-pull"
secretKey: "dockerconfigjson"
secretObjects:
- secretName: longhorn-registry
type: kubernetes.io/dockerconfigjson
data:
- objectName: harbor-pull__dockerconfigjson
key: .dockerconfigjson
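The secrets-store CSI driver only creates the longhorn-registry pull secret once some pod mounts a volume referencing this SecretProviderClass (that is the job of the vault-sync Deployment further down). A quick check that the sync happened, assuming the names above:

# Should print kubernetes.io/dockerconfigjson once a mounting pod is running.
kubectl -n longhorn-system get secret longhorn-registry -o jsonpath='{.type}'; echo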

@ -1,6 +0,0 @@
# infrastructure/longhorn/core/vault-serviceaccount.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: longhorn-vault-sync
namespace: longhorn-system

@ -1,45 +0,0 @@
# infrastructure/longhorn/core/vault-sync-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: longhorn-vault-sync
namespace: longhorn-system
spec:
replicas: 1
selector:
matchLabels:
app: longhorn-vault-sync
template:
metadata:
labels:
app: longhorn-vault-sync
spec:
serviceAccountName: longhorn-vault-sync
nodeSelector:
node-role.kubernetes.io/worker: "true"
affinity:
nodeAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 80
preference:
matchExpressions:
- key: hardware
operator: In
values: ["rpi5", "rpi4"]
containers:
- name: sync
image: alpine:3.20
command: ["/bin/sh", "-c"]
args:
- "sleep infinity"
volumeMounts:
- name: vault-secrets
mountPath: /vault/secrets
readOnly: true
volumes:
- name: vault-secrets
csi:
driver: secrets-store.csi.k8s.io
readOnly: true
volumeAttributes:
secretProviderClass: longhorn-vault
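This Deployment exists only to keep the CSI volume mounted so the secretObjects sync stays active. A mount check, assuming the pod is up:

# The Vault-rendered file being present is what keeps the longhorn-registry secret synced.
kubectl -n longhorn-system exec deploy/longhorn-vault-sync -- ls /vault/secrets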

@ -7,7 +7,7 @@ metadata:
annotations:
traefik.ingress.kubernetes.io/router.entrypoints: websecure
traefik.ingress.kubernetes.io/router.tls: "true"
traefik.ingress.kubernetes.io/router.middlewares: ""
traefik.ingress.kubernetes.io/router.middlewares: longhorn-system-longhorn-basicauth@kubernetescrd,longhorn-system-longhorn-headers@kubernetescrd
spec:
ingressClassName: traefik
tls:
@ -21,6 +21,6 @@ spec:
pathType: Prefix
backend:
service:
name: oauth2-proxy-longhorn
name: longhorn-frontend
port:
number: 80
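The ingress now fronts longhorn-frontend directly, gated by the basicauth and headers middlewares instead of the removed oauth2-proxy hop. A hedged smoke test from outside the cluster (user:pass is a placeholder for whatever the basicauth middleware's secret holds):

# Expect 401 without credentials...
curl -sS -o /dev/null -w '%{http_code}\n' https://longhorn.bstein.dev/
# ...and 200 with valid ones (placeholder credentials).
curl -sS -o /dev/null -w '%{http_code}\n' -u 'user:pass' https://longhorn.bstein.dev/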

@ -2,7 +2,5 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- serviceaccount.yaml
- oauth2-proxy-longhorn.yaml
- middleware.yaml
- ingress.yaml

@ -20,20 +20,3 @@ spec:
headers:
customRequestHeaders:
X-Forwarded-Proto: "https"
---
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
name: longhorn-forward-auth
namespace: longhorn-system
spec:
forwardAuth:
address: https://auth.bstein.dev/oauth2/auth
trustForwardHeader: true
authResponseHeaders:
- Authorization
- X-Auth-Request-Email
- X-Auth-Request-User
- X-Auth-Request-Groups
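After the forward-auth block is dropped, only the middlewares the ingress references should remain in the namespace. A short check, assuming Traefik's CRDs under the traefik.io group:

# List what is left; longhorn-basicauth and longhorn-headers are expected.
kubectl -n longhorn-system get middlewares.traefik.io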

@ -1,98 +0,0 @@
# infrastructure/longhorn/ui-ingress/oauth2-proxy-longhorn.yaml
apiVersion: v1
kind: Service
metadata:
name: oauth2-proxy-longhorn
namespace: longhorn-system
labels:
app: oauth2-proxy-longhorn
spec:
ports:
- name: http
port: 80
targetPort: 4180
selector:
app: oauth2-proxy-longhorn
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: oauth2-proxy-longhorn
namespace: longhorn-system
labels:
app: oauth2-proxy-longhorn
spec:
replicas: 2
selector:
matchLabels:
app: oauth2-proxy-longhorn
template:
metadata:
labels:
app: oauth2-proxy-longhorn
annotations:
vault.hashicorp.com/agent-inject: "true"
vault.hashicorp.com/role: "longhorn"
vault.hashicorp.com/agent-inject-secret-oidc-config: "kv/data/atlas/longhorn/oauth2-proxy"
vault.hashicorp.com/agent-inject-template-oidc-config: |
{{- with secret "kv/data/atlas/longhorn/oauth2-proxy" -}}
client_id = "{{ .Data.data.client_id }}"
client_secret = "{{ .Data.data.client_secret }}"
cookie_secret = "{{ .Data.data.cookie_secret }}"
{{- end -}}
spec:
serviceAccountName: longhorn-vault
nodeSelector:
node-role.kubernetes.io/worker: "true"
affinity:
nodeAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 90
preference:
matchExpressions:
- key: hardware
operator: In
values: ["rpi5","rpi4"]
containers:
- name: oauth2-proxy
image: quay.io/oauth2-proxy/oauth2-proxy:v7.6.0
imagePullPolicy: IfNotPresent
args:
- --provider=oidc
- --config=/vault/secrets/oidc-config
- --redirect-url=https://longhorn.bstein.dev/oauth2/callback
- --oidc-issuer-url=https://sso.bstein.dev/realms/atlas
- --scope=openid profile email groups
- --email-domain=*
- --allowed-group=admin
- --set-xauthrequest=true
- --pass-access-token=true
- --set-authorization-header=true
- --cookie-secure=true
- --cookie-samesite=lax
- --cookie-refresh=20m
- --cookie-expire=168h
- --insecure-oidc-allow-unverified-email=true
- --upstream=http://longhorn-frontend.longhorn-system.svc.cluster.local
- --http-address=0.0.0.0:4180
- --skip-provider-button=true
- --skip-jwt-bearer-tokens=true
- --oidc-groups-claim=groups
- --cookie-domain=longhorn.bstein.dev
ports:
- containerPort: 4180
name: http
readinessProbe:
httpGet:
path: /ping
port: 4180
initialDelaySeconds: 5
periodSeconds: 10
livenessProbe:
httpGet:
path: /ping
port: 4180
initialDelaySeconds: 20
periodSeconds: 20
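Once these manifests leave the kustomization, Flux should remove the Deployment and Service on the next reconcile, assuming pruning is enabled on the owning Flux Kustomization. Expected to come back NotFound afterwards:

# Both objects should be gone after the ui-ingress kustomization reconciles.
kubectl -n longhorn-system get deploy/oauth2-proxy-longhorn svc/oauth2-proxy-longhorn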

@ -1,6 +0,0 @@
# infrastructure/longhorn/ui-ingress/serviceaccount.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: longhorn-vault
namespace: longhorn-system

@ -1,47 +0,0 @@
# infrastructure/metallb/helmrelease.yaml
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
name: metallb
namespace: metallb-system
spec:
interval: 30m
chart:
spec:
chart: metallb
version: 0.15.3
sourceRef:
kind: HelmRepository
name: metallb
namespace: flux-system
install:
crds: CreateReplace
remediation: { retries: 3 }
timeout: 10m
upgrade:
crds: CreateReplace
remediation:
retries: 3
remediateLastFailure: true
cleanupOnFail: true
timeout: 10m
values:
loadBalancerClass: metallb
prometheus:
metricsPort: 7472
controller:
logLevel: info
webhookMode: enabled
tlsMinVersion: VersionTLS12
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: hardware
operator: In
values:
- rpi4
- rpi5
speaker:
logLevel: info
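The affinity in the controller block steers MetalLB onto rpi4/rpi5 hardware, while the speaker runs per node as a DaemonSet. A placement check after the HelmRelease reconciles, as a sketch:

# Show which nodes the controller and speakers landed on, plus each node's hardware label.
kubectl -n metallb-system get pods -o wide
kubectl get nodes -L hardware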

@ -1,20 +0,0 @@
# infrastructure/metallb/ippool.yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
name: communication-pool
namespace: metallb-system
spec:
addresses:
- 192.168.22.4-192.168.22.6
- 192.168.22.9-192.168.22.9
autoAssign: true
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
name: communication-adv
namespace: metallb-system
spec:
ipAddressPools:
- communication-pool
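With autoAssign: true, LoadBalancer Services pick up addresses from 192.168.22.4-6 and .9 without extra annotations. A quick inventory of what MetalLB has handed out, as a sketch:

# List every LoadBalancer Service with the external IP it received.
kubectl get svc -A \
  -o jsonpath='{range .items[?(@.spec.type=="LoadBalancer")]}{.metadata.namespace}{"/"}{.metadata.name}{"\t"}{.status.loadBalancer.ingress[*].ip}{"\n"}{end}'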

@ -1,7 +0,0 @@
# infrastructure/metallb/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- namespace.yaml
- helmrelease.yaml
- ippool.yaml

@ -1,5 +0,0 @@
# infrastructure/metallb/namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
name: metallb-system

@ -1,24 +0,0 @@
# infrastructure/modules/base/storageclass/asteria-encrypted.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: asteria-encrypted
parameters:
diskSelector: asteria
fromBackup: ""
numberOfReplicas: "2"
staleReplicaTimeout: "30"
fsType: "ext4"
replicaAutoBalance: "least-effort"
dataLocality: "disabled"
encrypted: "true"
csi.storage.k8s.io/provisioner-secret-name: ${pvc.name}
csi.storage.k8s.io/provisioner-secret-namespace: ${pvc.namespace}
csi.storage.k8s.io/node-publish-secret-name: ${pvc.name}
csi.storage.k8s.io/node-publish-secret-namespace: ${pvc.namespace}
csi.storage.k8s.io/node-stage-secret-name: ${pvc.name}
csi.storage.k8s.io/node-stage-secret-namespace: ${pvc.namespace}
provisioner: driver.longhorn.io
reclaimPolicy: Retain
allowVolumeExpansion: true
volumeBindingMode: Immediate
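The ${pvc.name}/${pvc.namespace} parameters mean every claim on asteria-encrypted needs a secret with the same name as the PVC, in the PVC's namespace, holding the encryption passphrase. A hedged sketch (namespace, PVC name, and passphrase are placeholders; the key names follow Longhorn's documented encryption secret format):

# Pre-create the secret the StorageClass template resolves for a PVC named "data" in namespace "myapp".
kubectl -n myapp create secret generic data \
  --from-literal=CRYPTO_KEY_VALUE='example-passphrase-change-me' \
  --from-literal=CRYPTO_KEY_PROVIDER='secret'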

Some files were not shown because too many files have changed in this diff.