Repository Guidelines
Project Structure & Module Organization
- `infrastructure/`: cluster-scoped building blocks (core, flux-system, traefik, longhorn). Add new platform features by mirroring this layout.
- `services/`: workload manifests per app (`services/gitea/`, etc.) with `kustomization.yaml` plus one file per kind; keep diffs small and focused.
- `dockerfiles/` hosts bespoke images, while `scripts/` stores operational Fish/Bash helpers; extend these directories instead of relying on ad-hoc commands.
Build, Test, and Development Commands
- `kustomize build services/<app>` (or `kubectl kustomize ...`) renders manifests exactly as Flux will.
- `kubectl apply --dry-run=client -k services/<app>` checks schema compatibility without changing the cluster. kubectl rejects `--server-side` combined with `--dry-run=client`; use `--server-side --dry-run=server` when you want the API server to validate the request.
- `flux reconcile kustomization <name> --namespace flux-system --with-source` pulls the latest Git state after merges or hotfixes.
- `fish scripts/flux_hammer.fish --help` explains the recovery tool; read it before running against production workloads.
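The render and dry-run steps above can be chained into one helper. This is a minimal sketch, not an existing repo script: `validate_service` is a hypothetical name, and it assumes `kustomize` and `kubectl` are on `PATH`, that it runs from the repository root, and that a plain client-side dry run is sufficient.

```shell
# Sketch only: validate_service is a hypothetical helper, not part of scripts/.
# Assumes kustomize and kubectl are on PATH and the cwd is the repo root.
validate_service() {
  local app="${1:-}"
  local dir="services/${app}"

  if [ -z "$app" ]; then
    echo "usage: validate_service <app>" >&2
    return 2
  fi

  # Render the manifests; stop early if the kustomization does not build.
  if ! kustomize build "$dir" > /dev/null; then
    echo "kustomize build failed for ${dir}" >&2
    return 1
  fi

  # Client-side dry run: validates the rendered objects without persisting them.
  if ! kubectl apply --dry-run=client -k "$dir" > /dev/null; then
    echo "dry-run apply failed for ${dir}" >&2
    return 1
  fi

  echo "ok: ${dir}"
}
```

For example, `validate_service gitea` prints `ok: services/gitea` when both steps succeed and returns a non-zero status otherwise.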
Coding Style & Naming Conventions
- YAML uses two-space indents; retain the leading path comment (e.g. `# services/gitea/deployment.yaml`) to speed code review.
- Keep resource names lowercase kebab-case, align labels/selectors, and mirror namespaces with directory names.
- List resources in `kustomization.yaml` from namespace/config, through storage, then workloads and networking for predictable diffs.
- Scripts start with `#!/usr/bin/env fish` or bash, stay executable, and follow snake_case names such as `flux_hammer.fish`.
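Two of these conventions are mechanical enough to check before review. The helpers below are a sketch under stated assumptions: `check_path_comment` and `is_kebab_case` are hypothetical names, the path comment is assumed to be the first line of the file and to match the path relative to the repository root, and "kebab-case" is read as lowercase letters and digits separated by single hyphens.

```shell
# Sketch only: both helper names are hypothetical, not existing repo scripts.

# check_path_comment FILE: succeed if FILE's first line is exactly "# FILE".
# Run from the repo root with a relative path, e.g. services/gitea/deployment.yaml.
check_path_comment() {
  local file="${1:-}"
  local first_line=""
  # Read only the first line; a missing or empty file leaves it empty.
  IFS= read -r first_line < "$file" || true
  [ "$first_line" = "# ${file}" ]
}

# is_kebab_case NAME: succeed if NAME is lowercase kebab-case.
# Assumption: digits are allowed, consecutive or trailing hyphens are not.
is_kebab_case() {
  printf '%s\n' "${1:-}" | LC_ALL=C grep -Eq '^[a-z0-9]+(-[a-z0-9]+)*$'
}
```

Both functions communicate only through their exit status, so they compose with `find` loops or pre-commit hooks without extra parsing.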
Testing Guidelines
- Run `kustomize build` and the dry-run apply for every service you touch; capture failures before opening a PR.
- `flux diff kustomization <name> --path services/<app>` previews reconciliations; link notable output when behavior shifts.
- Docker edits: `docker build -f dockerfiles/Dockerfile.monerod .` (swap the file you changed) to verify image builds.
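To find "every service you touch", the list of changed paths can be reduced to service directories. This is a minimal sketch: `changed_services` is a hypothetical helper, and the `origin/main` base branch in the usage comment is an assumption about this repository's default branch.

```shell
# Sketch only: changed_services is a hypothetical helper name.
# Reads file paths on stdin and prints each touched services/<app> directory once.
changed_services() {
  # Keep paths of the form services/<app>/..., trim to two components, de-duplicate.
  awk -F/ '$1 == "services" && NF >= 3 { print $1 "/" $2 }' | sort -u
}

# Possible usage (base branch name is an assumption):
#   git diff --name-only origin/main...HEAD | changed_services | while read -r dir; do
#     kustomize build "$dir" > /dev/null || echo "build failed: $dir" >&2
#   done
```

Paths outside `services/`, such as `infrastructure/` or top-level files, are ignored by design; they would need their own validation step.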
Commit & Pull Request Guidelines
- Keep commit subjects short, present-tense, and optionally scoped (`gpu(titan-24): add RuntimeClass`); squash fixups before review.
- Describe linked issues, affected services, and required operator steps (e.g. `flux reconcile kustomization services-gitea`) in the PR body.
- Focus each PR on one kustomization or service and update `infrastructure/flux-system` when Flux must track new folders.
- Record the validation you ran (dry-runs, diffs, builds) and add screenshots only when ingress or UI behavior changes.
Security & Configuration Tips
- Never commit credentials; use Vault workflows (`services/vault/`) or SOPS-encrypted manifests wired through `infrastructure/flux-system`.
- Node selectors and tolerations gate workloads to hardware like `hardware: rpi4`; confirm labels before scaling or renaming nodes.
- Pin external images by digest or rely on Flux image automation to follow approved tags and avoid drift.
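The image-pinning tip can be turned into a quick review aid. The sketch below is an assumption-laden heuristic, not a policy check: `find_unpinned_images` is a hypothetical name, it only looks at `image:` keys in `*.yaml` files, and it treats any reference without an `@sha256:` digest as unpinned, so tags managed by Flux image automation will still be listed.

```shell
# Sketch only: find_unpinned_images is a hypothetical helper name.
# Prints "file:line:text" for every image reference without an @sha256 digest.
find_unpinned_images() {
  local dir="${1:-services}"
  # Match YAML "image:" keys with a value, then drop digest-pinned references.
  # "|| true" keeps the exit status at zero when nothing is left to report.
  find "$dir" -type f -name '*.yaml' \
    -exec grep -nHE '^[[:space:]-]*image:[[:space:]]*[^[:space:]]' {} + \
    | grep -v '@sha256:' || true
}
```

Treat the output as a list to review rather than a list of violations: an entry is acceptable when the tag is one that image automation is expected to follow.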
Dashboard roadmap / context (2025-12-02)
- Atlas dashboards are generated via `scripts/dashboards_render_atlas.py --build`, which writes JSON under `services/monitoring/dashboards/` and ConfigMaps under `services/monitoring/`. Keep the Grafana manifests in sync by regenerating after edits.
- Atlas Overview panels are paired with internal dashboards (pods, nodes, storage, network, GPU). A new `atlas-gpu` internal dashboard holds the detailed GPU metrics that feed the overview share pie.
- Old Grafana folders (`Atlas Storage`, `Atlas SRE`, `Atlas Public`, `Atlas Nodes`) should be removed in Grafana UI when convenient; only `Atlas Overview` and `Atlas Internal` should remain provisioned.
- Future work: add a separate generator (e.g., `dashboards_render_oceanus.py`) for SUI/oceanus validation dashboards, mirroring the atlas pattern of internal dashboards feeding a public overview.
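Whether the generated dashboards are in sync can be checked by regenerating and asking Git if anything changed. This is a sketch under assumptions: `dashboards_in_sync` is a hypothetical helper, the generator is assumed to run under `python3` from the repository root, and all generated output is assumed to live under `services/monitoring/` as described above.

```shell
# Sketch only: dashboards_in_sync is a hypothetical helper name.
# Returns 0 when regenerating leaves services/monitoring unchanged,
# 1 when files changed, 2 when the generator itself fails.
dashboards_in_sync() {
  local changes=""

  # Regenerate dashboard JSON and ConfigMaps (python3 is an assumption).
  python3 scripts/dashboards_render_atlas.py --build || return 2

  # Any modified or untracked file under services/monitoring means drift.
  changes="$(git status --porcelain -- services/monitoring)"
  if [ -n "$changes" ]; then
    echo "services/monitoring has uncommitted changes after regenerating:" >&2
    echo "$changes" >&2
    return 1
  fi
}
```

Run it on a clean working tree; otherwise unrelated local edits under `services/monitoring/` are reported as drift too.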