From 09d80a63f619d215f1d1f4f320286450d1369482 Mon Sep 17 00:00:00 2001 From: s8n Date: Wed, 6 May 2026 10:02:28 +0100 Subject: [PATCH] init: nullstone deploys + runbooks + audits MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Sourced from previous audits + agent-wave outputs (2026-05-05): AUDIT-2026-05-05.md — 5-agent stack synthesis forgejo/DEPLOY.md — git.s8n.ru deploy runbook forgejo/forgejo-compose.yml — production compose forgejo/runner-compose.yml — forgejo-runner forgejo/migration-report-... — GH→Forgejo migration audit (6/6 green) runbooks/MIGRATION-... — nullstone→cobblestone runbook runbooks/DE-DECISION-... — keep-vs-strip DE on cobblestone repos/REPO-AUDIT-2026-05-05.md — repo trees + ownership --- AUDIT-2026-05-05.md | 370 ++++++++++ README.md | 23 + forgejo/DEPLOY.md | 176 +++++ forgejo/forgejo-compose.yml | 68 ++ forgejo/migration-report-2026-05-05.md | 59 ++ forgejo/runner-compose.yml | 63 ++ repos/REPO-AUDIT-2026-05-05.md | 486 ++++++++++++++ runbooks/DE-DECISION-cobblestone.md | 170 +++++ .../MIGRATION-nullstone-to-cobblestone.md | 630 ++++++++++++++++++ 9 files changed, 2045 insertions(+) create mode 100644 AUDIT-2026-05-05.md create mode 100644 README.md create mode 100644 forgejo/DEPLOY.md create mode 100644 forgejo/forgejo-compose.yml create mode 100644 forgejo/migration-report-2026-05-05.md create mode 100644 forgejo/runner-compose.yml create mode 100644 repos/REPO-AUDIT-2026-05-05.md create mode 100644 runbooks/DE-DECISION-cobblestone.md create mode 100644 runbooks/MIGRATION-nullstone-to-cobblestone.md diff --git a/AUDIT-2026-05-05.md b/AUDIT-2026-05-05.md new file mode 100644 index 0000000..09d04b8 --- /dev/null +++ b/AUDIT-2026-05-05.md @@ -0,0 +1,370 @@ +# 5-Agent Audit Report — 2026-05-05 + +Synthesis of 5 parallel agents covering: GitHub→Forgejo migration, +ai-lab structure, nullstone services, stack rating, recommended +additions. + +Source agent outputs: +1. Migration agent → `nullstone-server/forgejo/migration-report-2026-05-05.md` +2. ai-lab structural audit +3. nullstone services + deployment audit +4. Stack rating (10 axes) +5. Recommended service additions + +--- + +## TL;DR + +- **GH → Forgejo migration: complete.** 6/6 repos mirrored + (5× s8n-ru/* + veilor-org/veilor-os). All HEADs match, branches + match, tags match, push-mirrors back to GH all green. Repaired one + default-branch metadata drift on `s8n-ru/x`. Zero failures. +- **Stack rating: 7/10.** Above-average self-hosted setup. Audit + discipline + identity/CA story unusually strong. Fragile on + monitoring + offsite backup + single-host. +- **Top 5 weaknesses (severity-ordered):** F4 no LUKS on nullstone + (regression), no monitoring/alerting, backups local-only with + silently broken script, `:latest` floats on most stacks, single + point of failure (nullstone + home WAN). +- **Top 5 services to add (priority):** Restic+autorestic, Vaultwarden, + Gatus, CrowdSec, Beszel. +- **Top 4 anti-recommendations:** Nextcloud, full LGTM stack, Mastodon, + HashiCorp Vault. + +--- + +## 1 — GitHub repo migration + +**Status: complete.** Per migration agent's report. + +- 6 repos enumerated under `s8n-ru` user + admin'd orgs. +- 6 mirrored to `git.s8n.ru` (Forgejo); 5 fresh, 1 already pre-migrated + (`veilor-org/veilor-os`). +- HEADs / branches / tags match GH for all 6. +- Push-mirrors Forgejo → GH configured (8h interval + sync-on-commit), + all green. +- One repair: `s8n-ru/x` default branch was stuck on + `KisaragiEffective-patch-1` from Misskey upstream; PATCHed to + `master`. + +Detail: `nullstone-server/forgejo/migration-report-2026-05-05.md`. + +--- + +## 2 — ai-lab structural audit + +### Devices + +| codename | type | OS | role | +|---|---|---|---| +| onyx | laptop | Fedora 43 KDE | Dev workstation (DHCP `.28`, registry says `.6` — drift) | +| nullstone | server | Debian 13 | Infra host — Docker stack, mesh, Matrix/Misskey/RC | +| office | workstation | Fedora 43 KDE (pending install since 2026-04-19) | Office/sales (.5) | + +External: friend PC `100.64.0.3` (RTX 4080, vLLM in WSL2). + +### Active projects (`_github/`) + +| repo | purpose | status | +|---|---|---| +| veilor-os | Hardened Fedora 43 KDE remix | actively iterating, BlueBuild + kickstart | +| auth-limbo | Paper plugin (racked.ru AuthMe fix) | active, released jars | +| minecraft-launcher | Custom MC launcher (PrismLauncher fork) | active, v1 build script | +| minecraft-server | Purpur MC at `mc.racked.ru:25565` | live in prod | +| minecraft-client | racked.ru MC client (FO 11.3.2 fork) | active | + +### Per-device security audit cadence + +| device | last audit | folder | +|---|---|---| +| nullstone | 2026-05-05 (ACL hardening); full 2026-05-02 | `security/nullstone-server/` (9 reports) | +| onyx | 2026-04-15 | `security/onyx-laptop/` (2 reports) | +| office | never | `security/office-workstation/` (empty) | + +### Memory record (31 files, 1 index) + +- 2 user, 7 feedback, 1 reference, 21 project memos. +- Top-active: matrix_veilor, txt_cinny, x_misskey_fork, tailscale_mesh, + friend_gpu, org_charter, brand_separation, simplex_org_chat. + +### What this lab is + +The operator runs a small home-lab/3-member CTO-style org +(`P M=CTO, nullstone=Runtime Owner, onyx-ai=Research/Review`) split +cleanly across **two brands** (per `project_brand_separation.md`): + +1. **racked.ru** — privacy-first Minecraft platform (MC server + + client + custom launcher + AuthLimbo plugin) +2. **veilor** — security company stack (veilor-os hardened Fedora + ISO, veilor-server-bootstrap Debian preseed, Matrix at veilor.uk, + Misskey-fork at x.veilor) + +All self-hosted on nullstone behind Traefik+Headscale+Pi-hole. Mesh +includes friend's RTX 4080 for remote LLM inference via Tailscale. + +### Drift / gaps + +- `office-workstation/` registered in CLAUDE.md but install pending + since 2026-04-19; no audit folder populated. +- README onyx IP `.6` vs actual DHCP `.28`. +- README folder tree doesn't match real repo (lists `_project_code/` + + `scripts/`; reality has `_github/`, `_projects/`, `_archive/`, + `archive/`, `github/`, several `.sync-conflict-*` files, 30 MB + binary `re` at root). +- Two parallel `nullstone-server/` and `server/` device folders — + drift candidate. +- `MEMORY.md` index missing entry for `project_forgejo_nullstone.md` + (file present, index not updated). +- Sync-conflict files for CLAUDE.md / README.md / SYSTEM.md from + Syncthing merge never resolved. +- SYSTEM.md still mentions Jitsi/coturn / MAS Element X test + retired per project_matrix_veilor.md — TODO list not pruned. + +--- + +## 3 — nullstone services + deployment audit + +### Hardware + +- **CPU:** AMD Ryzen 5 2600X (6c/12t) +- **RAM:** 32 GiB (15 used, 15 free, 24 GiB swap, 256 KiB used) +- **GPU:** GTX 1660 Ti 6 GB (Ollama) +- **Disk:** 477 GiB NVMe, LVM (`keystone-vg`): + - root 30 G (35% used) + - var 12 G (15%) + - **home 399 G (60%, 227 G used / 153 G free)** — watch growth + - tmp 2.7 G, swap 24 G +- **OS:** Debian 13, kernel 6.12.85+deb13 +- **Docker:** v29.4.2, overlay2, **userns-remap=default**, + live-restore=true, icc=false, no-new-privileges=true. Data root + symlinked `/var/lib/docker → /home/user/docker-data`. + +### Active services (28 containers) + +Including: traefik, socket-proxy, authentik (server+worker+pg+redis), +forgejo + forgejo-runner, misskey + db + redis, x-source nginx, +rocketchat + mongodb, tuwunel + tuwunel-txt, cinny-txt, commet-web, +signup-page + signup-txt, livekit + lk-jwt-service, dl-veilor, pihole, +headscale, n8n + postgres, step-ca, filebrowser-mc, minecraft-mc, +anythingllm, plus 2 stale `alpine:3` shells from userns-host bypass. + +### Domain → service map (all on `*.s8n.ru` or `*.veilor[.uk]`) + +`sys.s8n.ru` (traefik dash), `git.s8n.ru` (forgejo, NEW), `auth.s8n.ru` +(authentik), `pihole.s8n.ru`, `signup.txt.s8n.ru`, `hs.s8n.ru` +(headscale), `rc.s8n.ru` (rocketchat), `n8n.s8n.ru`, `txt.s8n.ru` +(cinny), `mx.s8n.ru` (tuwunel-txt), `x.veilor` (misskey), +`matrix.veilor.uk`, `chat.veilor.uk` (commet), `livekit.veilor.uk`, +`signup.veilor.uk`, `dl.veilor.org`. + +### Deployment patterns + +- Compose: `/opt/docker//docker-compose.yml` +- Data: named docker volumes under + `/home/user/docker-data/100000.100000/volumes/` + per-service + bind mounts. Newer services (forgejo, forgejo-runner, minecraft) + on `/home/docker//` to dodge 30 G root. +- userns-remap quirk: container UIDs shifted +100000. + Workaround: alpine root container or chown to 101000. +- Docker socket exposure: traefik does NOT mount docker.sock; goes + via tecnativa/docker-socket-proxy on socket-proxy-net. +- Networks: `proxy` + `socket-proxy-net` + `misskey-frontend` + + per-stack internals (authentik-internal, misskey-internal, etc.). +- Middleware chain: `trusted-only@file → security-headers@file + → rate-limit@file → ` with `no-guest@file` + for routers needing tailnet+LAN but blocking public. + +### Auth patterns + +- **Authentik (auth.s8n.ru)** — central OIDC, all 4 components healthy. + **Currently mostly unwired.** Forgejo runs native auth (no OAUTH + section in app.ini). RC, n8n, anythingllm, filebrowser likely + local-auth too. Authentik present but underused. +- **Forgejo** — local users + PAT, admin `s8n-ru`, SSH 222. +- **Headscale** — preauthkey enrollment + `headscale-deny-leaks@file`. +- **Traefik dashboard** — basicauth + trusted-only@file. + +### Backup state + +- `/etc/cron.d/docker-backup` runs `/opt/docker/backup.sh` at 02:00 + daily, 7-day rotation to `/opt/backups/`. +- **Script silently broken (HIGH):** matrix-postgres container is + gone (Synapse retired); rocketchat-mongodb name mismatch (script + expects `mongodb`); Mongo password reads literal + `CHANGE_ME_MONGO_ADMIN_PASSWORD`. So Rocket.Chat + (former) Matrix + dumps **not happening**. Misskey side-script works. +- **No off-host replication.** Single NVMe = total loss on disk + failure. + +### Drift / risk register + +- 🔴 Backup script broken (RC + ex-Matrix not dumping) +- 🔴 `anythingllm` listens 0.0.0.0:3001 with no traefik label, + bypasses entire L7 trust model. Either bind LAN-only or front via + traefik. +- 🟠 Resource limits: only minecraft-mc has memory/CPU limits. + 30 other containers unbounded — runaway can OOM-kill neighbours. +- 🟠 No service-level health checks on ~half the containers. +- 🔴 `no-guest@file` IPAllowList stub: declares only + `sourceRange: ["127.0.0.0/8"]`. Routers chained with `no-guest` + reject everything except loopback unless XFF restores client IP. + **Verify** entryPoint forwardedHeaders.trustedIPs + middleware + ipStrategy.depth — misconfig either 403s real users or accepts + spoofed XFF. +- 🟡 office (100.64.0.4) not in `trusted-only@file` despite + `tag:infra` per SYSTEM.md. +- 🟠 RocketChat: first-admin setup still pending — wizard endpoint + takeover risk until claimed. +- 🟡 Stale `alpine:3` shell containers (userns-host bypass leftovers). + `docker rm -f` after each one-shot. +- 🟡 Archived compose dirs (`pocket-id.archived-*`, `matrix-old`) + contain secrets — move off docker tree. +- 🟡 `/home` 60% with growing volumes (Ollama, mongo, postgres ×3). + No quotas. + +### Mem pressure: none right now + +Top consumer minecraft 9.35 / 18 GiB cap (52% of cap, ~30% host). +All others < 2.2%. Headroom good. + +--- + +## 4 — Stack rating (10 axes) + +| Axis | Score | Top weakness | +|---|---|---| +| Architectural coherence | 8 | Drift artifacts (sync-conflict files, parallel `_archive`/`archive`) | +| Security posture | 7 | F4 no LUKS on server (regression); F30 ip_forward=1; F12 partial revert | +| Reproducibility | 6 | Most stacks on `:latest`; no IaC; admin bootstrap uncoded | +| Operational maturity | **4** | **No metrics/alerts; backups untested; on-call="user reads logs"** | +| Cost discipline | 9 | Single residential ISP + single home server is "cheap because fragile" | +| Threat model clarity | 6 | No written THREAT_MODEL.md; AGPL §13 source endpoint deferred | +| Update hygiene | 5 | `:latest` floats; no staged rollout; recovery = "edit compose, restart" | +| Documentation quality | 8 | SYSTEM.md is 979 lines; CV + team-msg.txt + sync-conflicts in repo root | +| Network resilience | 5 | Single residential WAN; control + data plane same box; no Tor/SOCKS fallback | +| Branding/product discipline | 9 | "X" rebrand close to veilor — easy to confuse in logs/docs | + +### Overall: **7/10** + +Above-average self-hosted stack. Better-documented than 90% of +homelabs, with audit discipline most small SaaS shops don't reach, +and a coherent identity/CA story (own root CA via step-ca, own VPN +control plane via Headscale, own Matrix homeserver). Loses points on +operational maturity (no monitoring, no offsite/tested backups, no +rollback), one critical regression (no LUKS on nullstone), and +inherent fragility from single-host single-ISP design. + +The gap between **known weaknesses** and **fixed weaknesses** is the +limiting factor — operator clearly *can* fix these (audit closes 27/35 +findings in 3 days), they just haven't yet. + +### Comparison + +- vs **Stock Fedora desktop + GitHub:** wins decisively (8 vs 3) on + network/identity/AGPL discipline. +- vs **secureblue + GH Actions:** stronger on server-side sovereignty; + weaker on client posture and CI. Roughly tied overall, different axes. +- vs **Hetzner-VPS hobbyist stack:** loses on resilience + update + hygiene, wins on cost + GPU inference + identity depth. This stack + more ambitious; Hetzner more boring-and-reliable. +- vs **Cloudflare/Workers managed:** wins on sovereignty + GPU + Matrix + ability. Loses on uptime + DDoS + zero-patching. This stack's whole + reason to exist is the inverse tradeoff — and it makes that tradeoff + coherently. + +--- + +## 5 — Recommended service additions + +### Top 5 priority (deploy in this order) + +| # | Service | Why now | Effort | Maintenance | +|---|---|---|---|---| +| 1 | **Restic + autorestic** | Single biggest gap. nullstone NVMe failure = total loss right now. Encrypted incremental to B2/Wasabi or to onyx. | M | S | +| 2 | **Vaultwarden** | N services with N storage methods for secrets. Centralize before count grows. | S | S | +| 3 | **Gatus** | Otherwise you find out about a downed service from a friend on Matrix. Cert-expiry alone catches the silent killer. Alerts via Tuwunel webhook or ntfy. | S | S | +| 4 | **CrowdSec** | Pi-hole only sees DNS layer. Public Matrix fed candidates + RC + Misskey + signup pages = HTTP attack surface. Bouncer plugin blocks at Traefik. | M | S | +| 5 | **Beszel** | Once Restic is filling disk + CrowdSec flagging IPs, you want one dashboard. | S | S | + +### Anti-recommendations + +| Service | Why NOT | +|---|---| +| **Nextcloud** | Heavy (1.5 GB+ RAM idle), notorious upgrade pain. Use Seafile if you need files. | +| **Full LGTM stack** (Grafana+Prom+Loki+Alertmanager) | Five services to do what Beszel+Gatus do for solo-op. | +| **Mastodon** | You already run Misskey-fork. Federating two ActivityPub silos doubles moderation. | +| **HashiCorp Vault** | Complexity-to-benefit ratio terrible for one operator. Infisical or pass-with-git enough. | +| **Authelia** | Duplicates Authentik. Pick one. | + +### Consolidation suggestions + +- **Cinny + various Element/Commet forks:** pick **one** web client + per Matrix instance. Each fork = separate audit + CSP + branding burden. +- **n8n:** if only used for 2-3 simple flows, replace with shell + scripts in Forgejo Actions cron. n8n's value is the GUI for + non-coders; you're a coder. +- **Step-CA + Let's Encrypt:** confirm zero overlap. If step-ca only + issues one cert, kill it. +- **dl-veilor + signup pages:** if static, fold into single Caddy + container behind Traefik. Two containers for static HTML is two + too many. + +### Other notable picks (lower priority) + +- **Seafile CE** — file sync (much lighter than Nextcloud) +- **Karakeep** (formerly Hoarder) — bookmarks/RSS/read-later, AI tags + via your local Ollama / friend RTX 4080 +- **ntfy** — formalize the push-notification target you're already + using ad-hoc +- **Forgejo Packages** — already implicit, just enable for container + registry + npm/cargo/maven/generic + +--- + +## 6 — Action items (severity-ordered) + +### Ship-blocking (do this week) + +1. **Fix `/opt/docker/backup.sh`** — remove dead matrix-postgres, + correct rocketchat-mongodb container name, replace literal + `CHANGE_ME_MONGO_ADMIN_PASSWORD`. Verify next 02:00 run produces + non-zero RC + Mongo dumps. +2. **Bind anythingllm to LAN-only** OR add traefik front with + `no-guest@file`. Currently public on :3001. +3. **Verify `no-guest@file` ACL** — confirm `sourceRange` covers + LAN + tailnet, not just loopback. Verify XFF chain restores + real client IP. +4. **Claim RocketChat first-admin** — takeover risk until then. +5. **Enable LUKS2 on nullstone** (F4 regression) — schedule reinstall + window with TPM2 unlock; or until then, LUKS-on-file loopback + for step-ca root key + acme.json + Mongo keyfile. + +### High-value next (do this month) + +6. Deploy **Restic + autorestic** with B2/Wasabi target + restore drill. +7. Deploy **Vaultwarden** + migrate secrets out of `.env` files. +8. Deploy **Gatus** with cert-expiry checks + Matrix/ntfy alerts. +9. Resolve **sync-conflict files** at ai-lab repo root. +10. **Pin docker images by digest** for critical stacks (already done + for Misskey; do tuwunel/livekit/cinny/pihole/RC/Traefik next). + +### Defer / planned + +- Office workstation install + first audit +- Fold dl-veilor + signup pages into single Caddy +- Replace n8n with Forgejo Actions cron (if usage <5 flows) +- Move Headscale + step-ca to $4/mo VPS for SPOF mitigation + +--- + +## 7 — File index + +| Output | Path | +|---|---| +| This synthesis | `~/ai-lab/nullstone-server/audit-report-2026-05-05.md` | +| Migration detail | `~/ai-lab/nullstone-server/forgejo/migration-report-2026-05-05.md` | +| Forgejo runbook | `~/ai-lab/nullstone-server/forgejo/deploy-runbook.md` | +| Forgejo memory | `~/.claude/projects/-home-admin-ai-lab/memory/project_forgejo_nullstone.md` | +| veilor-os strategy | `~/ai-lab/_github/veilor-os/docs/STRATEGY.md` | +| veilor-os roadmap | `~/ai-lab/_github/veilor-os/docs/ROADMAP.md` | +| veilor-os threat model | `~/ai-lab/_github/veilor-os/docs/THREAT-MODEL.md` | diff --git a/README.md b/README.md new file mode 100644 index 0000000..e2d68be --- /dev/null +++ b/README.md @@ -0,0 +1,23 @@ +# infra + +nullstone + cobblestone deploys, runbooks, audits. + +## Layout + +``` +forgejo/ Forgejo + runner deploy artifacts (live on nullstone) +runbooks/ Migration + decision docs + ├─ MIGRATION-nullstone-to-cobblestone.md + └─ DE-DECISION-cobblestone.md +repos/ Repo audits (cross-host inventory) + └─ REPO-AUDIT-2026-05-05.md +AUDIT-2026-05-05.md 5-agent stack audit (synthesis) +``` + +## Conventions + +- Per-service deploy at `/` mirrors `/opt/docker//` + on nullstone/cobblestone host. +- Runbooks dated; do not silently update — append a new dated entry + if procedure changes. +- Memory record: `~/.claude/projects/-home-admin-ai-lab/memory/project_forgejo_nullstone.md` diff --git a/forgejo/DEPLOY.md b/forgejo/DEPLOY.md new file mode 100644 index 0000000..655bd6e --- /dev/null +++ b/forgejo/DEPLOY.md @@ -0,0 +1,176 @@ +# Forgejo deploy runbook — nullstone + +Self-host plan: replace GH Actions free-tier (quota-bound) with +Forgejo + forgejo-runner running on nullstone. Same `build-iso.yml` +workflow, no GH dependency. + +## Pre-flight + +- nullstone reachable at 192.168.0.100 (LAN) and via tailscale (mesh) +- Traefik running, `proxy` docker network exists +- Gandi API token configured in traefik env (LiveDNS scope, s8n.ru only) + → letsencrypt resolver works for new hostnames automatically +- DNS for `git.s8n.ru` must point at nullstone's public IP (Gandi + manual web UI; API can't add new records per memory + reference_gandi_api.md) + +## Step 1 — DNS + +Add A record `git.s8n.ru → ` via Gandi web UI. +Wait ~2min for propagation. Verify: + +```bash +dig +short git.s8n.ru @1.1.1.1 +``` + +## Step 2 — copy compose files to nullstone + +```bash +scp /home/admin/ai-lab/nullstone-server/forgejo/docker-compose.yml \ + nullstone:/tmp/forgejo-compose.yml +scp /home/admin/ai-lab/nullstone-server/forgejo/runner-compose.yml \ + nullstone:/tmp/forgejo-runner-compose.yml + +ssh nullstone bash <<'EOF' +sudo mkdir -p /opt/docker/forgejo/{data,config} +sudo mkdir -p /opt/docker/forgejo-runner/{data,cache} +sudo chown -R 1000:1000 /opt/docker/forgejo +sudo mv /tmp/forgejo-compose.yml /opt/docker/forgejo/docker-compose.yml +sudo mv /tmp/forgejo-runner-compose.yml /opt/docker/forgejo-runner/docker-compose.yml +EOF +``` + +## Step 3 — first-start Forgejo + +```bash +ssh nullstone 'cd /opt/docker/forgejo && docker compose up -d' +ssh nullstone 'docker logs -f forgejo' & # watch first-start +``` + +When you see `Listen: http://0.0.0.0:3000`, Forgejo is up. Hit + in your browser. Traefik gets the LE cert +automatically. + +## Step 4 — initial admin user + +The first-time wizard at `/install` is *disabled* by env (we set +`FORGEJO__service__DISABLE_REGISTRATION=true`). Create the admin via +CLI inside the container: + +```bash +ssh nullstone 'docker exec -u 1000 forgejo \ + forgejo admin user create \ + --admin \ + --username admin \ + --email \ + --random-password \ + --must-change-password=false' +``` + +The random password gets printed once — save it somewhere safe. +Login at git.s8n.ru with `admin` + that password, change it via the +web UI's user settings. + +## Step 5 — generate runner registration token + +```bash +ssh nullstone 'docker exec -u 1000 forgejo \ + forgejo actions generate-runner-token' +``` + +Output is a single line — copy it into `.env` next to the runner +compose: + +```bash +echo "RUNNER_TOKEN=" | ssh nullstone 'sudo tee /opt/docker/forgejo-runner/.env' +ssh nullstone 'sudo chmod 600 /opt/docker/forgejo-runner/.env' +``` + +## Step 6 — start runner + +```bash +ssh nullstone 'cd /opt/docker/forgejo-runner && docker compose up -d' +ssh nullstone 'docker logs -f forgejo-runner' +``` + +Look for `Runner registered successfully`. Verify in Forgejo web UI: +Site Administration → Actions → Runners — should list `nullstone`. + +## Step 7 — mirror veilor-os repo + +In the Forgejo web UI: +1. Create org `veilor-org` (matches GH org name). +2. Click + → Migrate Repository. +3. Type: GitHub. URL: `https://github.com/veilor-org/veilor-os`. +4. Mirror = ON. Description: "self-hosted mirror; primary dev here". +5. Click Migrate. + +Forgejo pulls the repo + all branches + tags + actions config. Once +done, push from local will go to BOTH (set as second remote): + +```bash +cd ~/ai-lab/_github/veilor-os +git remote add nullstone https://git.s8n.ru/veilor-org/veilor-os +git push nullstone main v0.7-bluebuild-spike +``` + +## Step 8 — flip workflow to nullstone runner + +Change `build-iso.yml`: + +```yaml +runs-on: ubuntu-24.04 # before +runs-on: nullstone # after — picks up our forgejo runner +``` + +Push to nullstone remote. Watch Forgejo Actions tab. Same workflow, +runs on our hardware, no GH minutes. + +## Step 9 — close the loop + +Mirror Forgejo → GitHub for public visibility. Forgejo settings on +the repo → Mirror → Push mirror → `https://github.com/veilor-org/veilor-os` +with a GH PAT that has write access. Forgejo pushes on every commit. + +End state: +- `git push origin` → GH (public mirror) +- `git push nullstone` → Forgejo (primary; runs CI) +- Forgejo auto-pushes to GH for visibility +- ISO builds run unlimited on nullstone hardware +- 0 GH Actions minutes consumed + +## Disk needs + +- Forgejo data: ~1GB initial, grows ~100MB/yr per repo +- Runner workspace: ~80GB free recommended for ISO builds (squashfs + + downloaded RPMs + xorriso staging) +- Runner cache: ~20GB for `actions/cache`-style hits across builds + +Confirm with `df -h /` on nullstone before kickoff. + +## Resource cost + +- Forgejo: ~200MB RAM idle, ~500MB during build queues +- Runner: idle 50MB, ~4GB during ISO build (depsolve + squashfs) +- Network: ~2GB/build (Fedora package download) + +Should fit alongside existing nullstone services without contention. + +## Rollback + +If anything breaks: + +```bash +ssh nullstone 'cd /opt/docker/forgejo && docker compose down' +ssh nullstone 'cd /opt/docker/forgejo-runner && docker compose down' +``` + +Local repo `origin` still points at GH; nothing on the dev side +changes. ISO builds fall back to GH Actions until quota cycles. + +## See also + +- veilor-os roadmap: `_github/veilor-os/docs/ROADMAP.md` +- nullstone service inventory: `~/ai-lab/SYSTEM.md` +- Existing service patterns: `/opt/docker/headscale/`, + `/opt/docker/authentik/` diff --git a/forgejo/forgejo-compose.yml b/forgejo/forgejo-compose.yml new file mode 100644 index 0000000..adeec13 --- /dev/null +++ b/forgejo/forgejo-compose.yml @@ -0,0 +1,68 @@ +# Forgejo — self-hosted git + CI for veilor-org +# Deploy path on nullstone: /opt/docker/forgejo/ +# Domain: git.s8n.ru +# +# Why: GH Actions free-tier minute quota was hammering veilor-os builds +# (5+ ISO builds = 150min, repeatable runner-shortage failures). Forgejo +# Actions takes the same `build-iso.yml` workflow unmodified and runs it +# on hardware we own. Bonus: full git host independence. +# +# Design notes: +# - Image pinned by tag, not digest, until we automate pinning. Forgejo +# releases roughly every 1-2 months; bump in this file. +# - SSH on host port 222 (host 22 is sshd for nullstone admin). Forgejo's +# internal ssh server reads /var/lib/forgejo/.ssh/authorized_keys, no +# pam, no sudo. +# - HTTP-only inside the proxy network; Traefik terminates TLS at the edge +# via the existing letsencrypt resolver (Gandi LiveDNS DNS-01). +# - `userns: host` matches the nullstone Docker convention so volume +# ownership maps cleanly to host UID 1000 (memory: project_nullstone_docker_userns.md). + +services: + forgejo: + image: codeberg.org/forgejo/forgejo:9-rootless + container_name: forgejo + restart: unless-stopped + user: "1000:1000" + environment: + - USER_UID=1000 + - USER_GID=1000 + - FORGEJO__database__DB_TYPE=sqlite3 + - FORGEJO__server__DOMAIN=git.s8n.ru + - FORGEJO__server__ROOT_URL=https://git.s8n.ru/ + - FORGEJO__server__SSH_DOMAIN=git.s8n.ru + - FORGEJO__server__SSH_PORT=222 # public-facing SSH on host:222 + - FORGEJO__server__SSH_LISTEN_PORT=2222 # container-internal listen port + - FORGEJO__server__START_SSH_SERVER=true + - FORGEJO__server__OFFLINE_MODE=false + - FORGEJO__security__INSTALL_LOCK=true # skip /install wizard; use envs above + - FORGEJO__service__DISABLE_REGISTRATION=true # invite-only; no public signup + - FORGEJO__service__REQUIRE_SIGNIN_VIEW=false # public repos viewable + - FORGEJO__actions__ENABLED=true # turns on Forgejo Actions + - FORGEJO__actions__DEFAULT_ACTIONS_URL=github # so `uses: actions/checkout@v4` resolves + - FORGEJO__webhook__ALLOWED_HOST_LIST=* # for GH mirroring webhooks + - FORGEJO__log__LEVEL=Info + # Email (optional; SMTP via authentik or local relay) + # - FORGEJO__mailer__ENABLED=false + volumes: + - /home/docker/forgejo/data:/var/lib/gitea + - /home/docker/forgejo/config:/etc/gitea + - /etc/timezone:/etc/timezone:ro + - /etc/localtime:/etc/localtime:ro + ports: + - "0.0.0.0:222:2222/tcp" # public SSH for git-over-ssh (port 22 is host sshd) + networks: + - proxy + labels: + - "traefik.enable=true" + - "traefik.docker.network=proxy" + - "traefik.http.routers.forgejo.rule=Host(`git.s8n.ru`)" + - "traefik.http.routers.forgejo.entrypoints=websecure" + - "traefik.http.routers.forgejo.tls=true" + - "traefik.http.routers.forgejo.tls.certresolver=letsencrypt" + - "traefik.http.routers.forgejo.middlewares=security-headers@file,rate-limit@file,no-guest@file" + - "traefik.http.services.forgejo.loadbalancer.server.port=3000" + +networks: + proxy: + external: true diff --git a/forgejo/migration-report-2026-05-05.md b/forgejo/migration-report-2026-05-05.md new file mode 100644 index 0000000..f81a52f --- /dev/null +++ b/forgejo/migration-report-2026-05-05.md @@ -0,0 +1,59 @@ +# Forgejo Migration Report — 2026-05-05 + +**Summary:** All 6 GitHub repos owned/admined by `s8n-ru` are mirrored to `git.s8n.ru` with healthy push-mirrors GH→Forgejo→GH; only fix this run was correcting the default branch on `s8n-ru/x` from `KisaragiEffective-patch-1` back to `master`. + +## Scope + +- GitHub auth verified: `gh auth status` → `s8n-ru` (token scopes incl. `repo`, `admin:org`, `delete_repo`). +- Forgejo auth verified: `~/.config/veilor-forgejo-pat.txt` → API user `s8n-ru` (id=1, is_admin=true). +- Inventory taken via `gh repo list` for user `s8n-ru` and org `veilor-org` (only org user belongs to). No archived repos and no forks were returned. +- All API calls to Forgejo went via the internal-via-alpine route (`docker run --rm --network proxy alpine:3 ... http://forgejo:3000`) since `https://git.s8n.ru/` is locked by the `no-guest@file` ACL. + +## State file + +`/tmp/migrate-state.tsv` was used as the resume-tracker so a re-run wouldn't redo work. Final contents: + +| owner | name | status | notes | +|-------------|-------------------|---------|--------------------------------| +| s8n-ru | x | done | default-branch-fixed-this-run | +| s8n-ru | minecraft-launcher| done | already-mirrored | +| s8n-ru | auth-limbo | done | already-mirrored | +| s8n-ru | minecraft-server | done | already-mirrored | +| s8n-ru | 8bit-icons | done | already-mirrored | +| veilor-org | veilor-os | skipped | already-migrated (per spec) | + +## Audit + +GH HEAD (default branch) compared against Forgejo HEAD on the same branch name; branch and tag counts compared with full pagination; push-mirror existence verified with `last_error == ""`. + +| Owner | Name | Default | GH HEAD | FJ HEAD | Branches GH/FJ | Tags GH/FJ | Push-mirror | Last sync (UTC+1) | +|-------------|-------------------|---------|-------------|-------------|----------------|------------|-------------|---------------------| +| s8n-ru | x | master | a2c1ed23 | a2c1ed23 | 84 / 84 | 1310 / 1310| yes | 2026-05-06 02:17:27 | +| s8n-ru | minecraft-launcher| main | ae760edd | ae760edd | 1 / 1 | 1 / 1 | yes | 2026-05-06 02:14:24 | +| s8n-ru | auth-limbo | main | b6863806 | b6863806 | 1 / 1 | 0 / 0 | yes | 2026-05-06 02:14:26 | +| s8n-ru | minecraft-server | main | ede60294 | ede60294 | 1 / 1 | 0 / 0 | yes | 2026-05-06 02:14:26 | +| s8n-ru | 8bit-icons | main | 42a3252d | 42a3252d | 1 / 1 | 0 / 0 | yes | 2026-05-06 02:14:26 | +| veilor-org | veilor-os | main | b40e89a3 | b40e89a3 | 22 / 22 | 2 / 2 | yes | (pre-existing) | + +All push-mirrors target `https://github.com//.git` with `sync_on_commit: true`. + +## Findings on this run + +- Previous attempt's API timeout left every repo intact and content-correct, but on `s8n-ru/x` the default-branch metadata had been set to `KisaragiEffective-patch-1` (an in-flight feature branch from upstream `KisaragiEffective`, presumably the last branch processed when the timeout hit). Fixed via `PATCH /api/v1/repos/s8n-ru/x { "default_branch": "master" }`. All 84 branches and 1310 tags were already present, so no re-mirror was needed. +- All five s8n-ru push-mirrors and the veilor-org/veilor-os mirror reported `last_error: ""` and recent successful syncs, confirming the GH→Forgejo→GH bidirectional path was healthy before this run started. + +## Failures + +None. + +## Skipped + +| Owner | Name | Reason | +|------------|-----------|-----------------------------------------| +| veilor-org | veilor-os | Already migrated before this task began | + +No archived repos and no forks (where user is not source author) were encountered. No repo exceeded 1 GB (largest is `s8n-ru/x` at ~268 MB). + +## Cleanup + +`/tmp/migrate/` removed. `/tmp/migrate-state.tsv` retained for the next re-run. diff --git a/forgejo/runner-compose.yml b/forgejo/runner-compose.yml new file mode 100644 index 0000000..76756ca --- /dev/null +++ b/forgejo/runner-compose.yml @@ -0,0 +1,63 @@ +# Forgejo Runner — CI executor for veilor-org repos +# Deploy path on nullstone: /opt/docker/forgejo-runner/ +# +# act_runner is Forgejo's drop-in GH Actions runner. Reads workflow +# YAML, spawns container per job, reports results back to Forgejo. +# +# Design notes: +# - Privileged + host networking + Docker socket access. Required for the +# veilor-os ISO build because livecd-creator needs loop devices and +# --privileged. This is the same trust model as our existing GH Actions +# workflow which uses `--privileged` inside `addnab/docker-run-action@v3`. +# - Single runner with label `nullstone` so workflows can opt in via +# `runs-on: nullstone`. Existing `runs-on: ubuntu-24.04` will not be +# picked up — that's intentional, lets us flip workflows one at a time. +# - Cache + workdir on host SSD, persistent across container restarts. +# - act_runner config gets generated on first start; registration token +# must be set in `.env` (see deploy-runbook.md). + +services: + forgejo-runner: + image: code.forgejo.org/forgejo/runner:6.4.0 + container_name: forgejo-runner + restart: unless-stopped + user: "0:0" # runner needs root to dind + privileged: true + userns_mode: "host" # privileged ⊥ userns-remap default + environment: + # Internal hostname — runner reaches forgejo container directly on + # the proxy net, bypasses traefik + no-guest@file ACL. Cleaner + + # faster than going out the public path. + - INSTANCE_URL=http://forgejo:3000 + - REGISTRATION_TOKEN=${RUNNER_TOKEN} + - RUNNER_NAME=nullstone + # Labels map `runs-on:` keys in workflow YAML to docker images. + # ubuntu-24.04 → catthehacker/ubuntu (widely-used GH Actions image). + # Add `nullstone` label resolving to privileged Fedora 43 so our + # build-iso.yml can opt in selectively (`runs-on: nullstone`). + - RUNNER_LABELS=ubuntu-24.04:docker://ghcr.io/catthehacker/ubuntu:act-24.04,nullstone:docker://registry.fedoraproject.org/fedora:43 + entrypoint: ["/bin/sh", "-c"] + command: + - | + set -e + # Register only on first start; subsequent restarts read /data/.runner. + # $$VAR escapes compose interpolation so vars resolve in the container. + if [ ! -f /data/.runner ]; then + /bin/forgejo-runner register \ + --no-interactive \ + --instance "$$INSTANCE_URL" \ + --token "$$REGISTRATION_TOKEN" \ + --name "$$RUNNER_NAME" \ + --labels "$$RUNNER_LABELS" + fi + exec /bin/forgejo-runner daemon + volumes: + - /home/docker/forgejo-runner/data:/data + - /var/run/docker.sock:/var/run/docker.sock # docker-out-of-docker + - /home/docker/forgejo-runner/cache:/cache + networks: + - proxy + +networks: + proxy: + external: true diff --git a/repos/REPO-AUDIT-2026-05-05.md b/repos/REPO-AUDIT-2026-05-05.md new file mode 100644 index 0000000..c3f2db0 --- /dev/null +++ b/repos/REPO-AUDIT-2026-05-05.md @@ -0,0 +1,486 @@ + + +# Repo Audit — 2026-05-05 + +Combined audit of every repo on `git.s8n.ru` (Forgejo, primary host) and +`github.com` (mirror destination for migrated repos), plus per-repo file +trees and an ownership/anomaly summary. + +- **Scopes covered:** `s8n-ru` (user, both hosts), `veilor-org` (org, both hosts). +- **`racked-team`:** does **not** exist on github.com (`gh` returns + "owner handle was not recognized"). User memory references the brand + but no GH org with that handle is registered. See Anomalies section. +- **Forgejo access:** internal-only via `alpine:3 + curl` on the + `proxy` docker network (`http://forgejo:3000`). `https://git.s8n.ru/` + is locked by the `no-guest@file` ACL. + +--- + +## 1. Summary + +| Metric | Count | +|---|---| +| Total distinct repos | 8 | +| Mirrored (Forgejo ↔ GitHub) | 6 | +| Forgejo-only | 2 (`veilor-org/veilor-server`, `veilor-org/infra`) | +| GitHub-only | 0 | +| Empty repos | 1 (`veilor-org/infra` — initial bare on Forgejo) | +| Archived | 0 | +| Forks | 0 (per migration report; `s8n-ru/x` is detached fork-of-Misskey, treated as standalone) | + +**By owner:** + +| Owner | Forgejo repos | GitHub repos | +|---|---|---| +| `s8n-ru` (user) | 5 | 5 | +| `veilor-org` (org) | 3 | 1 | + +**By primary host:** + +| Primary host | Repos | +|---|---| +| `git.s8n.ru` | 8 (all repos that exist on Forgejo) | +| `github.com` only | 0 | + +`git.s8n.ru` is the canonical write side for every migrated repo; GitHub +receives sync-on-commit + 8h interval pushes (`sync_on_commit: true`, +`interval: "8h0m0s"`). The two Forgejo-only repos (`veilor-server`, +`infra`) currently have no GH counterpart — see Anomalies. + +--- + +## 2. Ownership Table + +Last-commit timestamps are taken from the default branch tip on Forgejo +(equal to GitHub for all mirrored repos — SHAs verified identical on +2026-05-06). Sizes are Forgejo-reported KiB unless the repo is GH-only. + +| Repo | Owner | Visibility | Default | Stars (GH) | Last commit | Size (KiB) | License | Mirror status | Primary host | +|---|---|---|---|---|---|---|---|---|---| +| `x` | `s8n-ru` | private | `master` | 0 | 2026-05-05 13:46 | 283 674 | AGPL-3.0 | git.s8n.ru → github.com (push, 8h, sync_on_commit) | git.s8n.ru | +| `minecraft-launcher` | `s8n-ru` | public | `main` | 0 | 2026-05-05 05:26 | 3 644 | GPL-3.0 | git.s8n.ru → github.com (push, 8h, sync_on_commit) | git.s8n.ru | +| `auth-limbo` | `s8n-ru` | public | `main` | 0 | 2026-05-05 05:09 | 79 | AGPL-3.0 | git.s8n.ru → github.com (push, 8h, sync_on_commit) | git.s8n.ru | +| `minecraft-server` | `s8n-ru` | private | `main` | 0 | 2026-05-04 18:37 | 383 | AGPL-3.0 | git.s8n.ru → github.com (push, 8h, sync_on_commit) | git.s8n.ru | +| `8bit-icons` | `s8n-ru` | private | `main` | 0 | 2026-05-01 21:31 | 554 | AGPL-3.0 | git.s8n.ru → github.com (push, 8h, sync_on_commit) | git.s8n.ru | +| `veilor-os` | `veilor-org` | private | `main` | 0 | 2026-05-06 02:01 | 376 | MIT | git.s8n.ru → github.com (push, 8h, sync_on_commit) | git.s8n.ru | +| `veilor-server` | `veilor-org` | public | `main` | n/a | 2026-05-06 04:00 | 54 | none in tree | **no mirror** — Forgejo-only | git.s8n.ru | +| `infra` | `veilor-org` | private | `main` | n/a | (empty) | 22 | n/a | **no mirror** — Forgejo-only, currently bare | git.s8n.ru | + +Notes: +- All push-mirrors target `https://github.com//.git`. +- Last sync per push-mirror reported `last_error: ""` on both 2026-05-05 (migration run) and 2026-05-06 (this audit run). +- "Stars (GH)" is 0 for every repo — these are personal/org-internal projects, not public-marketing items. +- License column reflects what GitHub's API picked up; private repos may have a `LICENSE` file in tree without a GH-detected SPDX id (see `s8n-ru/x` and `s8n-ru/8bit-icons` — verified by file presence in tree). + +--- + +## 3. Per-Repo File Trees + +Two-level trees (root + first level under each top-level dir). Repos +with > 500 entries are summarised: top-dirs ranked by entry count, with +per-dir totals shown instead of expanding. Listed alphabetically by +`/`. + +### `s8n-ru/8bit-icons` + +131 entries. AMOLED pixel-art icon pack for Android (24×24 monochrome). + +``` +. +├── .github/ +│ ├── ISSUE_TEMPLATE/ +│ ├── workflows/ +│ └── PULL_REQUEST_TEMPLATE.md +├── android-app/ +│ ├── app/ +│ ├── gradle/ +│ ├── .gitignore +│ ├── build.gradle +│ ├── gradle.properties +│ ├── gradlew +│ ├── gradlew.bat +│ └── settings.gradle +├── assets/ +│ ├── png/ +│ ├── previews/ +│ └── svg/ +├── docs/ +│ └── .gitkeep +├── mappings/ +│ ├── aliases.json +│ ├── appfilter.xml +│ └── requests.json +├── scripts/ +│ ├── build-appfilter.py +│ ├── lint-icons.py +│ ├── png-to-svg.py +│ ├── svg2png.sh +│ └── sync-android.py +├── .gitignore +├── CHANGELOG.md +├── CODE_OF_CONDUCT.md +├── CONTRIBUTING.md +├── LICENSE +├── README.md +├── ROADMAP.md +└── STYLE_GUIDE.md +``` + +### `s8n-ru/auth-limbo` + +32 entries. Paper plugin fixing AuthMe post-login teleport race. + +``` +. +├── .github/ +│ ├── ISSUE_TEMPLATE/ +│ └── workflows/ +├── docs/ +│ ├── compatibility.md +│ ├── configuration.md +│ ├── how-it-works.md +│ └── installation.md +├── lib/ +│ ├── .gitkeep +│ └── README.md +├── src/ +│ └── main/ +├── .gitignore +├── CHANGELOG.md +├── LICENSE +├── README.md +└── pom.xml +``` + +### `s8n-ru/minecraft-launcher` + +_Large repo — 1796 entries._ +_Top dirs by size:_ `app/` (1529), `libraries/` (154), `program_info/` (32), `cmake/` (22), `docs/` (8), `scripts/` (6), `buildconfig/` (3), `.github/` (2) + +``` +. +├── .github/ (2 entries) +├── app/ (1529 entries) +├── buildconfig/ (3 entries) +├── cmake/ (22 entries) +├── docs/ (8 entries) +├── libraries/ (154 entries) +├── program_info/ (32 entries) +├── scripts/ (6 entries) +├── tools/ (1 entries) +├── .clang-format +├── .clang-tidy +├── .editorconfig +├── .envrc +├── .git-blame-ignore-revs +├── .gitattributes +├── .gitignore +├── .gitmodules +├── .markdownlint.yaml +├── .markdownlintignore +├── BUILD_AND_DEPLOY_V1.sh +├── BUILD_GUIDE.md +├── CHANGELOG.md +├── CMakeLists.txt +├── CMakePresets.json +├── CODE_OF_CONDUCT.md +├── CONTRIBUTING.md +├── COPYING.md +├── Containerfile +├── INSTALL_DEPS.sh +├── LICENSE +├── PROJECT_SUMMARY.md +├── README.md +├── README_RELEASE.md +├── RELEASE_CHECKLIST.md +├── default.nix +├── renovate.json +├── shell.nix +├── vcpkg-configuration.json +└── vcpkg.json +``` + +### `s8n-ru/minecraft-server` + +224 entries. racked.ru Minecraft server config + custom plugin (Purpur 1.21.11). + +``` +. +├── docs/ +│ ├── migrations/ +│ ├── plugins/ +│ ├── BACKUP.md +│ ├── DEPLOY.md +│ ├── PERMISSIONS.md +│ ├── PLUGINS.md +│ ├── PLUGIN_ALTERNATIVES.md +│ ├── RACKED_BRAND.md +│ ├── REBRAND_2026-04-30.md +│ └── ROADMAP.md +├── live-server/ +│ ├── plugins/ +│ ├── .modrinth-manifest.json +│ ├── .rcon-cli.env +│ ├── .rcon-cli.yaml +│ ├── bukkit.yml +│ ├── commands.yml +│ ├── docker-compose.yml +│ ├── eula.txt +│ ├── help.yml +│ ├── log4j2.xml +│ ├── ops.json +│ ├── permissions.yml +│ ├── pufferfish.yml +│ ├── purpur.yml +│ ├── server.properties +│ ├── spigot.yml +│ ├── wepif.yml +│ └── whitelist.json +├── scripts/ +│ └── backup.sh +├── .gitignore +├── LICENSE +├── MISSION.md +├── README.md +├── RULES.md +├── TELEMETRY_AUDIT.md +├── THANKS.md +├── VIBE.md +└── docker-compose.yml +``` + +### `s8n-ru/x` + +_Large repo — 3249 entries._ +_Top dirs by size:_ `packages/` (3062), `locales/` (41), `.github/` (35), `scripts/` (19), `cypress/` (11), `assets/` (9), `chart/` (9), `.devcontainer/` (5) + +Private fork of Misskey, rebranded as Twitter/X for the `x.veilor` silo. Default branch `master` (mid-migration metadata bug fixed 2026-05-05; see migration report). + +``` +. +├── .config/ (4 entries) +├── .devcontainer/ (5 entries) +├── .github/ (35 entries) +├── .okteto/ (1 entries) +├── .vscode/ (2 entries) +├── assets/ (9 entries) +├── chart/ (9 entries) +├── cypress/ (11 entries) +├── fluent-emojis/ (0 entries) +├── idea/ (4 entries) +├── locales/ (41 entries) +├── packages/ (3062 entries) +├── patches/ (1 entries) +├── scripts/ (19 entries) +├── .dockerignore +├── .dockleignore +├── .editorconfig +├── .gitattributes +├── .gitignore +├── .gitmodules +├── .node-version +├── .npmrc +├── .vsls.json +├── CHANGELOG-X.md +├── CHANGELOG.md +├── CODE_OF_CONDUCT.md +├── CONTRIBUTING.md +├── COPYING +├── Dockerfile +├── LICENSE +├── NOTICE.md +├── Procfile +├── README.md +├── ROADMAP-X.md +├── ROADMAP.md +├── SECURITY.md +├── codecov.yml +├── compose.local-db.yml +├── compose_example.yml +├── crowdin.yml +├── cypress.config.ts +├── healthcheck.sh +├── package.json +├── pnpm-lock.yaml +├── pnpm-workspace.yaml +└── renovate.json5 +``` + +### `veilor-org/infra` + +Empty repo (no commits). Created 2026-05-06 09:51 BST as the canonical +home for nullstone+cobblestone deploys, runbooks, and audits. Local +working clone is at `~/ai-lab/_github/infra/` but has no commits and no +remote wired up yet. No tree to render. + +### `veilor-org/veilor-os` + +162 entries. Hardened Fedora KDE remix; primary on Forgejo since 2026-05-06 02:01. + +``` +. +├── .github/ +│ ├── workflows/ +│ ├── CODEOWNERS +│ └── PULL_REQUEST_TEMPLATE.md +├── assets/ +│ ├── branding/ +│ ├── fonts/ +│ ├── installer/ +│ ├── kde/ +│ ├── konsole/ +│ ├── plymouth/ +│ ├── sddm/ +│ └── wallpapers/ +├── build/ +│ ├── Containerfile +│ └── build-iso.sh +├── docs/ +│ ├── research/ +│ ├── BUILD.md +│ ├── CLI.md +│ ├── HARDENING.md +│ ├── INSTALL.md +│ ├── INSTALLER.md +│ ├── POWER.md +│ ├── ROADMAP.md +│ ├── STRATEGY.md +│ └── THREAT-MODEL.md +├── kickstart/ +│ └── veilor-os.ks +├── overlay/ +│ ├── etc/ +│ └── usr/ +├── scripts/ +│ ├── apparmor/ +│ ├── selinux/ +│ ├── 10-harden-base.sh +│ ├── 20-harden-kernel.sh +│ ├── 30-apply-v03-theme.sh +│ ├── firstboot.sh +│ └── kde-theme-apply.sh +├── test/ +│ ├── test-runs/ +│ ├── METHOD-CHANGELOG.md +│ ├── README.md +│ ├── TESTING.md +│ ├── auto-install-keymap.sh +│ ├── auto-install.sh +│ ├── boot-checklist.md +│ └── run-vm.sh +├── upstream/ +│ ├── fedora-kde-common.ks +│ ├── fedora-live-base.ks +│ ├── fedora-live-kde-base.ks +│ ├── fedora-live-kde.ks +│ ├── fedora-live-minimization.ks +│ ├── fedora-repo-not-rawhide.ks +│ └── fedora-repo.ks +├── .gitignore +├── CHANGELOG.md +├── CONTRIBUTING.md +├── LICENSE +└── README.md +``` + +### `veilor-org/veilor-server` + +7 entries. Hardened-by-default Debian server-install ISO builder. + +``` +. +├── .gitignore +├── LICENSE +├── README.md +├── build.sh +├── flash.sh +├── post-install.sh +└── preseed.cfg.tpl +``` + +--- + +## 4. Anomalies + +Items the operator should review. + +### Forgejo-only repos with no GitHub mirror + +These will diverge from GH if/when a GH counterpart is created later, and they currently have no off-site mirror at all. Both live in `veilor-org`. + +| Repo | State | Recommendation | +|---|---|---| +| `veilor-org/veilor-server` | Public on Forgejo, 7 files (Debian preseed bootstrap), no push-mirror | Mirror to `github.com/veilor-org/veilor-server` to match the `veilor-os` policy (Forgejo primary, GH read-only). Memory entry `project_veilor_server_bootstrap` confirms this is intended sibling to veilor-os. | +| `veilor-org/infra` | Private on Forgejo, **empty repo** (no commits, no default branch tip), no push-mirror | Either initial-commit and add a push-mirror to GH, or delete if the canonical infra repo is intended to live elsewhere. Local `~/ai-lab/_github/infra/` has uncommitted content but no `.git/config` remote pointing at this repo. | + +### Off-site backup gap + +All five `s8n-ru/*` repos and `veilor-org/veilor-os` have a healthy +push-mirror to GH; the two anomalies above do not. If `git.s8n.ru` (on +nullstone) goes down, the two `veilor-org` Forgejo-only repos have **no +remote copy**. This is the only off-site-backup gap in the inventory. + +### Missing GitHub org `racked-team` + +CLAUDE.md / user memory references a `racked-team` GH org (separate +from `veilor-org`, for the racked.ru Minecraft brand). `gh api` confirms +the handle is **not registered** on github.com: "the owner handle +'racked-team' was not recognized as either a GitHub user or an +organization". The racked-related repos (`minecraft-launcher`, +`minecraft-server`, `auth-limbo`) all live under `s8n-ru/`, not under a +brand-scoped org. + +Either: +- the org was never created (memory drift — should be reconciled), or +- the org has a different handle (e.g. `racked-ru`, `rackedteam`). + +`gh api /user/orgs` returns only `veilor-org` for the active token; no +other org membership exists for `s8n-ru`. + +### Missing/undetected licenses + +| Repo | Tree has `LICENSE`? | GH SPDX detected | Note | +|---|---|---|---| +| `s8n-ru/x` | yes (`LICENSE` + `COPYING`) | AGPL-3.0 | OK | +| `s8n-ru/8bit-icons` | yes | AGPL-3.0 | OK | +| `s8n-ru/minecraft-server` | yes | AGPL-3.0 | OK | +| `s8n-ru/auth-limbo` | yes | AGPL-3.0 | OK | +| `s8n-ru/minecraft-launcher` | yes | GPL-3.0 | OK | +| `veilor-org/veilor-os` | yes | MIT | OK | +| `veilor-org/veilor-server` | yes | (no GH copy yet) | Will need GH detection once mirrored. | +| `veilor-org/infra` | n/a | n/a | Empty repo. | + +No undetected-license repos. (All public repos surface the correct SPDX id on GitHub.) + +### Default-branch hygiene + +All repos: default branch matches the active dev branch. The +`s8n-ru/x` `master`-vs-`KisaragiEffective-patch-1` drift caught in the +previous migration run is still resolved (Forgejo & GH both report +`master` at SHA `a2c1ed2…`). + +### Archived / dormant + +No archived repos on either host. No repos with > 30 days since last +commit. Latest activity per repo (default branch tip): + +``` +veilor-org/veilor-server 2026-05-06 04:00 +veilor-org/veilor-os 2026-05-06 02:01 +s8n-ru/x 2026-05-05 13:46 +s8n-ru/minecraft-launcher 2026-05-05 05:26 +s8n-ru/auth-limbo 2026-05-05 05:09 +s8n-ru/minecraft-server 2026-05-04 18:37 +s8n-ru/8bit-icons 2026-05-01 21:31 +veilor-org/infra (empty) +``` + +--- + +_Generated: 2026-05-05. Verifies state of git.s8n.ru and github.com as of the API responses captured during this run. Push-mirror SHAs verified equal between hosts (`s8n-ru/x` `a2c1ed23…`, `s8n-ru/minecraft-launcher` `ae760edd…`, `s8n-ru/auth-limbo` `b6863806…`, `s8n-ru/minecraft-server` `ede60294…`, `s8n-ru/8bit-icons` `42a3252d…`, `veilor-org/veilor-os` `b40e89a3…`)._ diff --git a/runbooks/DE-DECISION-cobblestone.md b/runbooks/DE-DECISION-cobblestone.md new file mode 100644 index 0000000..00c2303 --- /dev/null +++ b/runbooks/DE-DECISION-cobblestone.md @@ -0,0 +1,170 @@ +# Cobblestone Desktop Environment: Keep or Strip + +**Status:** Decision pending operator confirmation of which DE shipped. +**Date:** 2026-05-06 +**Scope:** cobblestone (Debian server, fresh install with DE present). + +--- + +## TL;DR + +Cobblestone is a service host, not a workstation. The operator already has a Fedora 43 KDE laptop (onyx) for daily driving and a precedent (nullstone) for headless servers. A desktop environment on cobblestone costs ~500 MB RAM, 5–8 GB disk, and an attack surface dominated by Xorg/Wayland plus the DE session manager — none of which earns its keep once the box is in steady state. The honest counter-argument is bring-up convenience: during the first few weeks of migrating Traefik, Forgejo, Authentik, Headscale, step-ca, Matrix (Tuwunel + LiveKit), Misskey, Pi-hole, n8n, and Minecraft, an operator who needs to debug TLS chains or federation handshakes may want a local browser. Recommendation: **strip after a 30-day soak (target 2026-06-05)**, install `cockpit` behind Authentik OIDC at `cobblestone.s8n.ru` for occasional GUI-feeling admin, and treat the bare console (HDMI + USB keyboard) as the recovery path. Strip-now is also defensible if the operator is comfortable doing all bring-up via SSH from onyx — that is genuinely how nullstone runs today. + +--- + +## Side-by-side comparison + +| Axis | Keep DE | Strip DE | +|---|---|---| +| RAM idle | ~500 MB | ~50 MB | +| Disk | ~5–8 GB | ~400 MB | +| Attack surface | Xorg/Wayland + DM (sddm/gdm3/lightdm) + ~200 GUI deps + plymouth | sshd + cron + journalctl + dockerd | +| Recovery (network down) | Plug monitor + kbd, GUI login, debug | Plug monitor + kbd, console login, debug | +| Update cadence | Track DE CVEs (KDE Plasma is frequent; GNOME less so; XFCE quiet) | Kernel + sshd + dockerd only | +| Useful when | First 24h bring-up; Firefox to hit internal CA pages; rare on-box troubleshooting | Almost always after week 1 | + +**Key insight on recovery:** the GUI login does *not* save you when the network is down. A console login on `tty1` lets you run the same `journalctl`, `ip a`, `systemctl status` commands. The DE adds polish, not capability. + +--- + +## Decision matrix + +``` + Cobblestone has DE installed + | + +-----------+----------+ + | | + Operator works Cobblestone is + mainly on onyx? daily-driver too? + | | + YES NO + | | + +------+------+ KEEP DE + | | + Mid-migration? Settled? + | | + KEEP (soak) STRIP NOW + 30-day flip +``` + +Operator works mainly on onyx (yes), cobblestone is not a daily driver (no). We are mid-migration (services not yet moved). **Path: KEEP for soak, flip on 2026-06-05.** + +--- + +## Recommendation: strip after 30-day soak + +1. Leave the DE in place during the migration of the listed services. +2. Calendar a reminder for **2026-06-05** to revisit. +3. On that date, if no service troubleshooting still depends on a local browser/GUI editor, run the strip procedure below. +4. Install `cockpit` immediately (today) regardless — it is useful with or without the DE and gives a soft landing for "I just want to see disk usage". + +Why not strip now: Tuwunel federation debugging, Misskey AGPL endpoint validation, and step-ca chain inspection sometimes benefit from a browser pointed at `localhost`. SSH port-forwarding from onyx covers 95% of that, but the first migration of each service is the worst time to discover the 5%. + +Why not keep forever: cobblestone is not a workstation. Every Plasma/GNOME CVE becomes a patch obligation for zero return. + +--- + +## Install instead of DE (do this today) + +- **cockpit + cockpit-machines + cockpit-podman** — web admin on port 9090. Front it with a Traefik vhost `cobblestone.s8n.ru` behind Authentik OIDC. Drop-in for "show me disk/CPU/services in a UI". +- **lazydocker** — TUI for docker. Faster than `docker ps -a` for daily ops. +- **dive** — image-layer inspector. Useful when an image is 2 GB and you want to know why. +- **glances** — htop with optional web UI on port 61208 (firewall it; cockpit covers most cases). +- **mc** (midnight commander) — file manager replacement for the no-GUI case. +- **Claude Code on cobblestone** — separate decision; not blocking. Running it on cobblestone enables ssh-less ops and lets cron/agent jobs operate on the box natively. If installed, gate it behind the same SSO posture as cockpit. + +--- + +## Strip commands per DE flavour + +The operator has not confirmed which DE shipped. Run `ls /usr/bin/*session* 2>/dev/null; dpkg -l | grep -E 'task-(xfce|gnome|kde|mate|cinnamon)-desktop'` first to identify it. + +**Important:** `task-*-desktop` is a meta-package. Removing it alone does NOT remove the desktop — you must remove the actual package set too, then `apt autoremove --purge`. Always run `apt autoremove --purge` with caution: review the list before pressing `y`. It can sweep packages you wanted to keep if a DE dependency was the only reverse-dep. + +### XFCE +``` +sudo apt remove --purge \ + task-xfce-desktop xfce4 xfce4-* \ + lightdm lightdm-gtk-greeter \ + xorg xserver-xorg* \ + plymouth plymouth-themes +sudo apt autoremove --purge +``` + +### GNOME +``` +sudo apt remove --purge \ + task-gnome-desktop gnome-shell gnome-session gnome-* \ + gdm3 \ + xorg xserver-xorg* xwayland \ + plymouth plymouth-themes +sudo apt autoremove --purge +``` + +### KDE Plasma +``` +sudo apt remove --purge \ + task-kde-desktop kde-plasma-desktop plasma-* kde-* \ + sddm sddm-theme-* \ + xorg xserver-xorg* xwayland \ + plymouth plymouth-themes +sudo apt autoremove --purge +``` + +### MATE +``` +sudo apt remove --purge \ + task-mate-desktop mate-desktop-environment mate-* \ + lightdm lightdm-gtk-greeter \ + xorg xserver-xorg* \ + plymouth plymouth-themes +sudo apt autoremove --purge +``` + +### Cinnamon +``` +sudo apt remove --purge \ + task-cinnamon-desktop cinnamon cinnamon-* \ + lightdm lightdm-gtk-greeter \ + xorg xserver-xorg* \ + plymouth plymouth-themes +sudo apt autoremove --purge +``` + +### After any of the above +``` +sudo systemctl set-default multi-user.target +sudo systemctl disable --now sddm gdm3 lightdm 2>/dev/null +sudo apt install --no-install-recommends cockpit cockpit-podman lazydocker mc glances +sudo reboot +``` + +Confirm `systemctl get-default` returns `multi-user.target` and `who` shows only ssh/console sessions after reboot. + +--- + +## What breaks when you strip + +| Lost capability | Replacement | +|---|---| +| Browser to test internal CA pages | `curl --cacert /etc/step-ca/certs/root_ca.crt https://...` or SSH port-forward from onyx | +| GUI text editor | vim / nano (already installed) | +| File manager | `mc` or shell | +| LightDM/SDDM/GDM autostart | `multi-user.target` (pure systemd) | +| Plymouth boot splash | Plain text scroll (better for debugging boot issues) | +| Local Firefox for OIDC login flows | Port-forward `ssh -L 9090:localhost:9090 cobblestone` from onyx, then hit `http://localhost:9090` in onyx Firefox | + +None of these are losses for a service host. The text-scroll boot is arguably an upgrade — Plymouth hides the systemd unit that hung on boot, which is exactly the moment you need to see it. + +--- + +## Open questions for the operator + +1. Which DE actually shipped on cobblestone? (XFCE / GNOME / KDE / MATE / Cinnamon) +2. Strip-now or 30-day soak? Default recommendation is soak. +3. Install Claude Code on cobblestone? Out of scope for this doc, but related. +4. Cockpit vhost name confirmed as `cobblestone.s8n.ru`? + +--- + +**Path:** `/home/admin/ai-lab/_github/infra/runbooks/DE-DECISION-cobblestone.md` diff --git a/runbooks/MIGRATION-nullstone-to-cobblestone.md b/runbooks/MIGRATION-nullstone-to-cobblestone.md new file mode 100644 index 0000000..33468e2 --- /dev/null +++ b/runbooks/MIGRATION-nullstone-to-cobblestone.md @@ -0,0 +1,630 @@ + + +# Migration runbook — nullstone → cobblestone + +Goal: relocate the Docker stack (~28 containers, ~227 GiB state) from +**nullstone** (Debian 13, 192.168.0.100, AMD Ryzen 5 2600X / 32 GiB / +477 GiB NVMe, no LUKS) to **cobblestone** (Debian, fresh, LAN, hardware +TBD by operator), and close audit regression **F4 (no LUKS at rest)** +in the same window. + +This runbook is read-only on both hosts until cutover (section 4). +Sections 1–3 are inventory + planning; section 4 is the destructive +cutover; sections 5–7 are follow-through. + +## Things we don't know about cobblestone yet — operator to fill in + +| Question | Why it matters | Default if unset | +|---|---|---| +| CPU model / cores / threads | Sizing for parallel postgres + Ollama + MC | Assume ≥ Ryzen 5 2600X parity | +| RAM | 32 GiB nullstone runs 50 % util peak; less = trim MC + Ollama | Require ≥ 32 GiB | +| Storage layout (LVM? ZFS? plain?) | Decides LUKS strategy in 3a | Assume single NVMe, plain ext4 | +| GPU present (any) | Ollama / vLLM / Misskey thumb GPU helpers | Assume none, leave Ollama on friend RTX 4080 | +| LUKS already enabled at install? | If no → reinstall window or LUKS-on-file fallback | Assume **no** (act accordingly) | +| Static IP allocated? | Cutover plan needs a parking IP | Assume DHCP, target `.101` for cutover | +| DE installed? | Strip vs keep debate | Confirmed installed; default = strip | +| User account name + uid | Bind-mount permissions on /home/docker | Assume `user`, uid 1000 (mirror nullstone) | + +Update this table before running section 3. + +--- + +## 1 — Pre-migration audit (run on nullstone) + +All commands read-only. SSH as `user@192.168.0.100` +(per `feedback_nullstone_ssh_user.md` — `admin@` is rejected). + +### 1.1 Container inventory + +```bash +ssh user@192.168.0.100 'docker ps -a --format "{{json .}}"' \ + > nullstone-containers-$(date +%F).jsonl +ssh user@192.168.0.100 'docker inspect $(docker ps -aq)' \ + > nullstone-inspect-$(date +%F).json +``` + +Parse for `Names`, `Image`, `Mounts[].Source`, `NetworkSettings.Networks`, +`HostConfig.RestartPolicy`, `Config.Labels` (Traefik routers). + +### 1.2 Volumes (size estimate) + +```bash +ssh user@192.168.0.100 'docker volume ls --format "{{.Name}}"' \ + | xargs -I {} ssh user@192.168.0.100 \ + "docker run --rm -v {}:/v alpine du -sh /v 2>/dev/null | sed 's|/v|{}|'" +``` + +Cross-reference with `/home/user/docker-data/100000.100000/volumes/` +(userns-remapped path) for per-volume bytes. + +### 1.3 Network + +```bash +ssh user@192.168.0.100 'docker network ls; \ + ss -tlnp 2>/dev/null | grep LISTEN; \ + iptables-save 2>/dev/null; nft list ruleset 2>/dev/null' +``` + +Capture Traefik vhosts: + +```bash +ssh user@192.168.0.100 'cd /opt/docker/traefik && \ + ls dynamic/; cat dynamic/*.yml | grep -E "rule:|sourceRange:"' +``` + +### 1.4 Cron + scheduled tasks + +```bash +ssh user@192.168.0.100 'sudo cat /etc/crontab /etc/cron.d/* 2>/dev/null; \ + for u in $(cut -d: -f1 /etc/passwd); do \ + crontab -u $u -l 2>/dev/null && echo "(user $u)"; done' +``` + +Known: `/etc/cron.d/docker-backup` runs `/opt/docker/backup.sh` daily at +02:00 — **broken** (F-backup-1, fix in section 5). + +### 1.5 Systemd + +```bash +ssh user@192.168.0.100 'systemctl list-unit-files \ + --state=enabled --type=service --no-pager' +``` + +Watch for: `docker.service`, `tailscaled.service`, `ollama.service` +(Ollama runs on host, not in Docker), `chrony.service`, `ssh.service`. + +### 1.6 Disk + memory + cpu baseline + +```bash +ssh user@192.168.0.100 'df -hT; \ + sudo du -sh /home/docker/* /opt/docker/* /opt/backups 2>/dev/null; \ + free -h; lscpu | head -20; nproc' +``` + +Reference (2026-05-06 spot check): +`/` 30 G (37 %) · `/var` 12 G (17 %) · `/home` 399 G (60 %, 226 G used). +Most state is on `/home`. + +### 1.7 Daemon config + +```bash +ssh user@192.168.0.100 'cat /etc/docker/daemon.json /etc/subuid /etc/subgid; \ + sudo cat /etc/systemd/system/docker.service.d/override.conf 2>/dev/null' +``` + +Known good (carry forward except possibly userns-remap, see 3c): + +```json +{ + "log-driver": "json-file", + "log-opts": {"max-size": "10m", "max-file": "3"}, + "live-restore": true, + "icc": false, + "userns-remap": "default", + "default-address-pools": [{"base": "172.20.0.0/16", "size": 24}], + "storage-driver": "overlay2", + "no-new-privileges": true +} +``` + +--- + +## 2 — Secret + state catalog + +Anything in this table that is **lost** or **corrupted** during transfer +forces re-issuance / re-pinning / re-handshake. Group by criticality. + +### Tier 0 — irreplaceable (lose this and external systems break) + +| Path | Bytes (est.) | Restore cost if lost | +|---|---|---| +| `/opt/docker/step-ca/data/secrets/` + `/opt/docker/step-ca/.env` | < 1 MiB | Re-issue every internal cert; reinstall `veilor-root.crt` on every device that uses `*.veilor` / internal-CA chains. Hard. | +| `/opt/docker/traefik/data/acme.json` (LE prod) | < 1 MiB | Hits LE rate-limit (5 dupe certs/wk per FQDN, 50 certs/wk per registered domain). Could lock cert issuance for a full week. | +| `/opt/docker/traefik/data/acme-internal.json` (step-ca chain) | < 1 MiB | Step-ca re-issues fast, but every leaf reissue invalidates pinned trust anchors. | +| `/opt/docker/headscale/config/private.key` + `/opt/docker/headscale/data/db.sqlite` | < 50 MiB | Loss = every node re-enrolls; preauthkeys, routes, ACLs reset. Friend GPU node identity churn. | +| `/etc/ssh/ssh_host_*` | < 1 MiB | Either copy → keep TOFU pinning intact, OR rotate → all clients hit "key changed" warning (acceptable but noisy). | + +### Tier 1 — application secrets (loss → password reset cascade) + +| Path | Bytes (est.) | Notes | +|---|---|---| +| `/opt/docker/forgejo/data/gitea/conf/app.ini` (note: file is `app.ini` under `gitea/conf/` even on Forgejo) | ~10 KiB | `SECRET_KEY`, `INTERNAL_TOKEN`, `JWT_SECRET`, `LFS_JWT_SECRET`, OAuth client secrets. | +| `/opt/docker/authentik/.env` + authentik PG dump | tens of MiB | `AUTHENTIK_SECRET_KEY`, `PG_PASS`. Any service trusting Authentik OIDC needs `client_secret` re-handover. | +| `/opt/docker/misskey/.env` + misskey PG dump | < 1 MiB env | `id`, `db.user/pass`, `redis.pass`, master key. | +| `/opt/docker/n8n/.env` + n8n PG dump | < 1 MiB env | Encryption key for credentials at rest — **lose this and stored creds inside n8n flows are unrecoverable**. | +| `/opt/docker/rocketchat/.env` + Mongo dump (currently stopped — see 4.1) | < 1 MiB env | First-admin still unclaimed (audit risk item). | +| `/opt/docker/tuwunel*/etc/tuwunel.toml` | < 1 MiB | Server signing key seed; lose = federation re-onboard from zero. | +| `/opt/docker/livekit/livekit.yaml` | < 1 KiB | `keys:` map (api-key→secret); JWT minter (`lk-jwt-service`) shares this. | +| `/opt/docker/pihole/etc-pihole/` | ~50 MiB | Adlists + custom DNS; rebuildable in 30 min if lost. | +| Gandi PAT (`GANDIV5_PERSONAL_ACCESS_TOKEN` in `/opt/docker/traefik/.env`) | <1 KiB | Re-issuable from Gandi UI; LiveDNS-only scope (per `reference_gandi_api.md`). | +| Tailscale auth keys (Headscale) | regenerate via `headscale preauthkeys create` | OK to regenerate. | + +### Tier 2 — bulk data (large, but reproducible OR low-stakes) + +| Path | Bytes (est.) | Notes | +|---|---|---| +| Misskey `/files/` (S3-style local) | tens of GiB | User uploads — irreplaceable to users. Dedup-friendly. | +| Forgejo `/home/docker/forgejo/data/git/` | ~5 GiB now | Git repos; also mirrored to GH per `project_forgejo_nullstone.md`, so partial DR exists. | +| `dl-veilor` static files | ~1 GiB | Public ISO downloads; rebuildable from veilor-os pipeline. | +| n8n flows (in `n8n_n8n_data`) | < 1 GiB | Encrypted with key from Tier 1; export JSON via UI as belt-and-braces. | +| Minecraft world (`/home/docker/minecraft/data/`) | ~10–30 GiB | Players will riot if lost. | +| Ollama models (`/home/user/models/ollama/`) | ~17 GiB | Re-downloadable from registry; not blocking. | +| Postgres dumps (authentik, misskey-db, n8n-postgres) | covered by `pg_dumpall` in 4.1 | | +| MongoDB dump (rocketchat-mongodb) | covered by `mongodump` in 4.1 | Container is **stopped** today — start, dump, stop. | + +### Tier 3 — config-as-code (safely re-deployable from `~/ai-lab/_github/`) + +- All `/opt/docker/*/docker-compose.yml` — committed under + `~/ai-lab/_github/infra/repos/` and `~/ai-lab/nullstone-server/`. +- Traefik `dynamic/*.yml` middleware files. +- Treat as authoritative in repo; copy from repo to cobblestone, not + from nullstone. Diff old-compose vs repo-compose during section 3d to + catch any uncommitted drift. + +--- + +## 3 — Cobblestone install plan + +### 3a — OS layer + +Verify base: + +```bash +ssh user@cobblestone 'cat /etc/debian_version; uname -r; lsb_release -a' +``` + +**LUKS2 (mandatory — closes F4):** + +- **Path A (preferred):** reinstall with full-disk LUKS2 from the + Debian installer (`/`, `/home`, swap all on encrypted PVs). Set up + TPM2 unattended unlock post-install: + ```bash + systemd-cryptenroll --tpm2-device=auto --tpm2-pcrs=0+7 /dev/nvmeXnYpZ + ``` + PCR 0+7 binds to firmware + secure-boot state; bricks if firmware + is updated → fall back to passphrase. +- **Path B (fallback if reinstall blocked):** LUKS-on-file loopback + for the high-value subset only: + - `/opt/docker/step-ca/` + - `/opt/docker/traefik/data/acme*.json` + - `/opt/docker/headscale/` + - postgres data dirs + - Mongo keyfile volume + This is **strictly worse** than Path A (rest of disk still + cleartext, including misskey uploads and forgejo repos), but it + closes the highest-value subset. Document as accepted risk. + +Hostname + base packages: + +```bash +sudo hostnamectl set-hostname cobblestone +sudo apt update && sudo apt install -y \ + curl ca-certificates gnupg jq ufw fail2ban chrony \ + rsync restic tmux htop iotop ncdu +``` + +**DE strip vs keep — recommendation: STRIP.** + +Cost of keeping: ~500 MiB RAM, ~5 GiB disk, larger attack surface +(CUPS, avahi, polkit, GUI daemons on localhost). Benefit: local +browser for vhost testing, on-keyboard recovery if SSH wedges. + +- **Default (strip):** `sudo apt purge '*-desktop' '*xorg*' lightdm + sddm gdm3 'plymouth*' libreoffice-* && sudo apt autoremove --purge`. + Install Cockpit for web admin behind Traefik + `no-guest@file`. +- **Keep:** lock SDDM/GDM local-only via PAM, disable XDMCP, mask + `cups-browsed`. No auto-login. + +Operator picks; document choice in SYSTEM.md. + +### 3b — Network + +**IP allocation during cutover** — use `192.168.0.101` for +cobblestone while nullstone stays on `.100`. Flip DNS / port-forwards +last (section 4.6). Avoids ARP collisions and keeps rollback trivial. + +**nftables ruleset** (mirror nullstone pattern — read live ruleset off +nullstone in 1.3, replay on cobblestone): + +```bash +sudo systemctl enable --now nftables +# Drop in /etc/nftables.conf with: +# - default policy drop on input +# - accept established/related +# - accept lo +# - accept 22 (SSH) from LAN + tailnet +# - accept 80/443 (Traefik) from anywhere +# - accept 222 (Forgejo SSH) from LAN + tailnet +# - accept 25565 (Minecraft) from anywhere +# - log+drop everything else +``` + +**IPv6:** audit reports nullstone has `net.ipv4.ip_forward=1` (F30). +That was an *unintended carryover* from a Tailscale subnet-router +experiment. **Do NOT** copy `/etc/sysctl.d/` from nullstone wholesale. +Instead, set explicitly: + +```bash +sudo tee /etc/sysctl.d/99-cobblestone.conf <<'EOF' +net.ipv4.ip_forward = 0 +net.ipv6.conf.all.forwarding = 0 +net.ipv4.conf.all.rp_filter = 1 +net.ipv4.conf.all.send_redirects = 0 +net.ipv4.conf.all.accept_redirects = 0 +EOF +sudo sysctl --system +``` + +If Headscale or Tailscale subnet-router is wired later, re-enable +`ip_forward` with explicit comment + audit note. + +**Tailscale + Headscale node identity:** +- Cleanest path: re-enroll cobblestone from scratch. New node, new + node-key, list `cobblestone` separately from `nullstone` in + Headscale during cutover week. +- Alternative: copy `/var/lib/tailscale/` from nullstone → cobblestone + to inherit the existing identity. Saves one ACL update but + conflates audit history. Not recommended. + +### 3c — Docker + +Install via official repo: + +```bash +curl -fsSL https://download.docker.com/linux/debian/gpg | \ + sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg +echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] \ + https://download.docker.com/linux/debian $(lsb_release -cs) stable" | \ + sudo tee /etc/apt/sources.list.d/docker.list +sudo apt update && sudo apt install -y \ + docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin +``` + +**`/etc/docker/daemon.json` — userns-remap decision.** + +Two paths; operator decides. Document choice in SYSTEM.md. + +**Path 1 — DROP userns-remap (recommended):** same JSON as nullstone +minus the `userns-remap` line. + +- Pros: no more `chown 101000` dance; nsenter trick + (`feedback_docker_sudo_bypass.md`) drops the `--userns=host` flag; + Mongo keyfile pattern from `project_nullstone_docker_userns.md` + becomes unnecessary; `docker exec` UIDs match host 1:1. +- Cons: container root → host uid 0. Compensated by + `no-new-privileges`, `icc=false`, per-compose CAP drops, read-only + root FS where compatible. Net: small regression in defense-in-depth, + large workflow simplification. + +**Path 2 — KEEP userns-remap:** carry `/etc/subuid` + `/etc/subgid` +identically (`user:100000:65536`). Existing on-disk ownership at uid +`101000` transfers without rechown. Cost: persisting the daily +friction the operator has been hitting for months. + +**Default: Path 1.** If chosen, after rsync: +```bash +sudo chown -R user:user /home/docker /opt/docker +# Then per-service to the container uid (forgejo 1000, postgres 999, +# mongo 999, traefik 0). +``` + +Networks (must exist before Traefik comes up): + +```bash +docker network create proxy +docker network create socket-proxy-net +docker network create misskey-frontend +``` + +### 3d — Service redeploy order + +Topological. Each step depends only on its predecessors. Verification +command and rollback at each stage. + +| # | Stack | Depends on | Verify | Rollback | +|---|---|---|---|---| +| 1 | networks (`proxy`, `socket-proxy-net`, `misskey-frontend`) | docker daemon | `docker network ls` | `docker network rm` | +| 2 | `socket-proxy` | network `socket-proxy-net` | `docker logs socket-proxy` shows API filter active | down compose | +| 3 | `traefik` | socket-proxy + acme.json/acme-internal.json carryover + Gandi PAT in .env | `curl -k https://sys.s8n.ru` returns dashboard auth challenge; `docker logs traefik` shows resolver init OK; cert files repopulate without LE call (acme.json reuse) | down compose; acme.json restore from backup | +| 4 | `step-ca` | traefik (for ACME-back) | `docker exec step-ca step ca health`; Traefik internal-CA resolver issues a cert against `https://step-ca:9000/acme/acme/directory` | down compose; revert traefik resolver config | +| 5 | `headscale` | traefik | `curl https://hs.s8n.ru/health`; `docker exec headscale headscale nodes list` shows existing nodes (db.sqlite carryover) | down compose; restore db.sqlite snapshot | +| 6 | authentik (`postgres → redis → server → worker`) | traefik | `curl https://auth.s8n.ru/-/health/ready/`; OIDC discovery doc loads | per-component down | +| 7 | `forgejo` | traefik (+ optional authentik, currently unwired) | `curl https://git.s8n.ru/api/v1/version`; `git clone ssh://git@cobblestone:222/...` | down compose; data dir tar-revert | +| 8 | misskey (`db → redis → misskey → x-source`) | traefik, network `misskey-frontend` | `curl https://x.veilor/api/meta` returns JSON; signup page renders | down compose; pg dump restore | +| 9 | `tuwunel` + `tuwunel-txt` | traefik | `curl https://matrix.veilor.uk/_matrix/federation/v1/version` and `https://mx.s8n.ru/_matrix/federation/v1/version` | down compose; data tar-revert | +| 10 | `cinny-txt` + `commet-web` + `signup-page` + `signup-txt` | tuwunel reachable, traefik | `curl -I https://txt.s8n.ru` 200; static assets 200 | down compose | +| 11 | `livekit-server` + `lk-jwt-service` | traefik (TURN over HTTPS) | `wscat wss://livekit.veilor.uk/`; jwt service `/healthz` | down compose | +| 12 | n8n (`postgres → n8n`) | traefik, restored encryption key | `curl https://n8n.s8n.ru/healthz`; UI loads with existing flows | pg dump restore | +| 13 | `pihole` | traefik | `dig @cobblestone | head`; admin UI auth | down compose | +| 14 | `forgejo-runner` | forgejo (#7) reachable on internal name | `docker logs forgejo-runner` shows `Runner registered successfully` | down compose; regenerate token via `forgejo actions generate-runner-token` | +| 15 | `minecraft-mc` | traefik (only for filebrowser-mc), router port-forward 25565 | `mcstatus mc.racked.ru` (or `nc -zv cobblestone 25565`) | down compose; world tar-revert | +| 16 | `dl-veilor` + `filebrowser-mc` | traefik | `curl https://dl.veilor.org/v0.2.0/veilor-root.crt` | down compose | +| 17 | `anythingllm` | traefik **with `no-guest@file` middleware applied** OR LAN-only bind — must NOT bring up like nullstone (port 3001 publicly exposed, audit F-anythingllm-1) | `curl -I -H 'Host: ai.s8n.ru' https://cobblestone` from off-LAN must 403 | down compose | +| 18 | RocketChat (`mongodb → rocketchat`) | **operator decision** — currently stopped on nullstone; if not retired, restore from mongodump produced in 4.1 | `curl https://rc.s8n.ru/api/info`; first-admin claim if still pending | leave stopped (matches today's state) | + +--- + +## 4 — Cutover sequence + +### 4.1 — Snapshot state on nullstone + +```bash +NS=user@192.168.0.100 +TS=$(date +%F-%H%M) +DEST=/opt/snap/$TS +ssh $NS "sudo mkdir -p $DEST && sudo chown user:user $DEST" + +# Postgres dumps +for pg in authentik-postgres misskey-db n8n-postgres-1; do + ssh $NS "docker exec $pg pg_dumpall -U postgres" \ + | gzip > $DEST/$pg.sql.gz +done + +# Mongo (start, dump, stop again — currently stopped per audit) +ssh $NS 'cd /opt/docker/rocketchat && docker compose up -d rocketchat-mongodb && sleep 15' +ssh $NS 'docker exec rocketchat-mongodb mongodump \ + --username root \ + --password "$(grep MONGO_INITDB_ROOT_PASSWORD /opt/docker/rocketchat/.env | cut -d= -f2)" \ + --authenticationDatabase admin --archive' \ + | gzip > $DEST/rocketchat.archive.gz +ssh $NS 'cd /opt/docker/rocketchat && docker compose stop rocketchat-mongodb' + +# Forgejo full dump (covers DB + repos + LFS + attachments) +ssh $NS 'docker exec -u 1000 forgejo \ + forgejo dump --type tar.zst --file /tmp/forgejo-dump.tar.zst' +ssh $NS 'docker cp forgejo:/tmp/forgejo-dump.tar.zst -' \ + > $DEST/forgejo-dump.tar.zst + +# Stop everything before tar (consistency) +ssh $NS 'for d in /opt/docker/*/; do \ + [ -f "$d/docker-compose.yml" ] && \ + (cd "$d" && docker compose down) ; \ +done' + +# Bulk state tar +ssh $NS "sudo tar --acls --xattrs -cpf - /opt/docker /home/docker /opt/backups" \ + | zstd -T0 -19 > $DEST.tar.zst + +# Manifest +ssh $NS "find /opt/docker /home/docker -type f -print0 \ + | xargs -0 sha256sum" > $DEST.sha256 +``` + +Hold the tarball plus dumps in two places: cobblestone target host +and an offline USB. `acme.json` and step-ca secrets get an +*additional* armored copy to the password manager. + +### 4.2 — rsync to cobblestone + +After the tarball lands, repopulate cobblestone: + +```bash +COBB=user@192.168.0.101 +scp $DEST.tar.zst $COBB:/tmp/ +ssh $COBB 'sudo mkdir -p /opt/docker /home/docker /opt/backups && \ + sudo zstd -d /tmp/snap.tar.zst -o /tmp/snap.tar && \ + sudo tar --acls --xattrs -xpf /tmp/snap.tar -C /' +# If userns-remap dropped (Path 1 in 3c): +ssh $COBB 'sudo chown -R user:user /opt/docker /home/docker' +``` + +### 4.3 — Bring up services on cobblestone + +Walk section 3d table top to bottom. **Stop and verify** at each row +before the next. Don't batch — one bad startup cascades. + +For services that store internal hostnames (Tuwunel `server_name`, +Headscale `server_url`, Forgejo `ROOT_URL`), the values stay the same +because public DNS still resolves to the WAN IP — only the internal LAN +target changes. No app config edits needed for cutover. + +### 4.4 — Verify per vhost + +```bash +for host in sys.s8n.ru git.s8n.ru auth.s8n.ru pihole.s8n.ru \ + signup.txt.s8n.ru hs.s8n.ru rc.s8n.ru n8n.s8n.ru \ + txt.s8n.ru mx.s8n.ru x.veilor matrix.veilor.uk \ + chat.veilor.uk livekit.veilor.uk signup.veilor.uk \ + dl.veilor.org; do + echo -n "$host: " + curl --resolve $host:443:192.168.0.101 -sI https://$host | head -1 +done +``` + +Then push key flows: +- `git push nullstone-remote` (alias still works because DNS is + unchanged) — Forgejo CI runs. +- Matrix federation: `curl https://federationtester.matrix.org/api/report?server_name=veilor.uk`. +- Misskey signup: hit invite-gated form, complete signup, federation + test post. + +### 4.5 — Cutover network + +Two paths; operator picks based on appetite. + +**Path A — DNS swing** (lower risk, slower propagation): +1. Lower `*.s8n.ru` and `*.veilor*` A-record TTLs to 60 s **a week + before** cutover (Gandi UI; can't be done via API per + `reference_gandi_api.md`). +2. Day-of: change A records from `82.31.156.86` (assumed unchanged + public IP) only if the WAN NAT target has changed (e.g. router + port-forwards now point at `.101`). If WAN IP and port-forwards + stay the same and you swap LAN IPs (`.100` → `.101`), no public + DNS edit needed — only edit `/etc/hosts` on internal clients (per + `feedback_s8n_hosts_override.md`). + +**Path B — IP takeover** (faster, higher rollback friction): +- Bring nullstone down on `.100`, change cobblestone from `.101` → + `.100`, restart networking. Public DNS + router port-forwards + unchanged. Rollback = swap IPs back. + +Update onyx `/etc/hosts` long pin line **last**: +``` +192.168.0. rc.s8n.ru n8n.s8n.ru pihole.s8n.ru sys.s8n.ru \ + mx.s8n.ru txt.s8n.ru signup.txt.s8n.ru git.s8n.ru x.veilor \ + dl.veilor.org +``` + +### 4.6 — Update memory + ai-lab docs + +- `~/ai-lab/CLAUDE.md` — Device Registry: add `cobblestone` row, mark + `nullstone` as `decom 2026-MM-DD`. +- `~/ai-lab/SYSTEM.md` — replace nullstone hardware/network blocks + with cobblestone equivalents; keep nullstone as "cold spare" until + wipe. +- `~/ai-lab/README.md` — device table one-liner. +- `~/ai-lab/security/` — create `cobblestone-server/` folder; first + audit due within 7 days of cutover. +- Memory files to update: `project_nullstone_docker_userns.md` + (mark **superseded** if userns-remap dropped), + `project_forgejo_nullstone.md`, + `project_rocketchat_nullstone.md`, `project_tailscale_mesh.md`, + `feedback_nullstone_ssh_user.md`, `feedback_s8n_hosts_override.md` + (new IP). + +### 4.7 — Cold spare + wipe + +- Hold nullstone powered-off but cabled, 7 days minimum. +- If no rollback triggered, wipe: full LUKS reformat (or `nvme + format -s1` for crypto-erase if drive supports it), then either + donate or repurpose as cobblestone backup target (Restic + destination — closes audit recommendation #6). + +--- + +## 5 — Post-migration immediate fixes + +Carried over from `nullstone-server/audit-report-2026-05-05.md`: + +- **F-backup-1 — fix `/opt/docker/backup.sh`:** remove dead + `matrix-postgres` block (Synapse retired); correct + `rocketchat-mongodb` container name; replace literal + `CHANGE_ME_MONGO_ADMIN_PASSWORD` with read from + `/opt/docker/rocketchat/.env`. Verify next 02:00 run produces + non-zero RC + Mongo dumps. +- **no-guest@file ACL:** populate `sourceRange` to cover LAN + (`192.168.0.0/24`) + tailnet (`100.64.0.0/10`) + IPv6 equivalents. + Verify XFF chain restores client IP at the entryPoint level + (`forwardedHeaders.trustedIPs`). +- **anythingllm:** front via Traefik with `no-guest@file` OR bind + LAN-only. Must not repeat the 0.0.0.0:3001 nullstone state. +- **LUKS:** done at install (3a). Verify via `cryptsetup status` + + `systemd-cryptenroll --tpm2-device=list` post-cutover. +- **Restic + autorestic** to B2/Wasabi or to nullstone-as-spare, + with restore drill scheduled. +- **Vaultwarden** to centralize the secrets currently sprayed across + `.env` files. +- **Gatus** with cert-expiry checks + ntfy/Matrix alerts. +- **CrowdSec** with bouncer plugin at Traefik for the public + HTTP attack surface. +- **Beszel** for one-pane host metrics. + +--- + +## 6 — Open questions (operator decisions) + +| Question | Default if undecided | +|---|---| +| Strip DE on cobblestone? | **Strip + Cockpit.** Easier to defend; remote admin via web UI through Traefik + no-guest@file. | +| userns-remap on cobblestone? | **Off (Path 1 in 3c).** Operator pain outweighs the marginal isolation. Document tradeoff. | +| Move Headscale + step-ca to a $4 VPS? | **Defer (phase 2).** Keep on cobblestone for now; revisit once Restic + Gatus are running. SPOF mitigation is real but adds attack surface; do it once monitoring is in place. | +| RocketChat: bring back up or retire? | **Retire if not used in 30 days.** Currently stopped; first-admin still unclaimed. Mongo dump captured in 4.1, then drop the stack from cobblestone redeploy. Keeps `rc.s8n.ru` DNS for future revival. | +| Tailscale identity copy vs re-enroll for cobblestone? | **Re-enroll** (cleaner audit trail; Headscale ACLs need a one-line edit). | +| SSH host keys copy vs rotate? | **Copy.** TOFU pinning intact; one less "is this MITM?" prompt for clients. Add rotation to a follow-up cron. | +| Authentik wiring during cutover or after? | **After.** Authentik is currently mostly unwired (audit). Cutover is not the time to add new auth dependencies. | + +--- + +## 7 — Risks (severity-tagged) + +- 🔴 **acme.json mishandling = LE rate-limit.** Mitigation: copy + `acme.json` + `acme-internal.json` BEFORE bringing up Traefik on + cobblestone. Never let cobblestone Traefik issue a fresh batch of + certs. Hold a backup of both files in two locations. +- 🔴 **step-ca root key loss = full re-issuance.** Mitigation: + triple-copy `/opt/docker/step-ca/.env` + `data/secrets/` + (cobblestone, USB, password manager). Test that the encrypted root + key decrypts on cobblestone before tearing down nullstone. +- 🔴 **anythingllm reintroduces public 0.0.0.0:3001.** Mitigation: do + NOT bring it up before middleware is in place. Test from off-LAN + IP. +- 🟠 **PostgreSQL major-version skew.** Mitigation: pin same major on + cobblestone (`postgres:16-alpine` already pinned; do NOT use + `:latest`). If a major upgrade is desired, do it as a separate + step *after* cutover settles, with a fresh pg_dumpall as safety + net. +- 🟠 **Headscale node identity churn** if `db.sqlite` not copied. All + nodes (onyx, friend RTX 4080 PC, office) re-enroll. Mitigation: + copy `db.sqlite` + `private.key`; verify `headscale nodes list` + matches pre-cutover before flipping DNS. +- 🟡 **chrony NTS peers** may need re-trust on new host (NTS-KE binds + to hostname). Mitigation: chrony config copy verbatim; first + `chronyc tracking` should show stratum within 5 minutes. +- 🟡 **Authentik OIDC `client_secret`s.** Today: mostly unwired + (audit). Risk small. If Forgejo/RC/n8n were wired through + Authentik, each `client_secret` would need re-handover. Defer + Authentik wiring until post-cutover. +- 🟡 **Misskey AGPL §13 source endpoint** (`x-source`). Per + `project_x_misskey_fork.md`, the AGPL link must keep serving + source — and per the same memo, mute is acceptable for short + windows. Cutover downtime budget: **≤ 2 h**. If exceeded, post a + banner on `x.veilor` linking to `https://git.s8n.ru/s8n-ru/x` for + the duration. +- 🟡 **Backup script broken on copy.** Audit F-backup-1 still applies + if you copy `/opt/docker/backup.sh` verbatim. Fix during section 5, + not before — but do not let it run on cobblestone before fix + (disable the cron entry until corrected). + +--- + +## Appendix — quick reference + +- nullstone: `user@192.168.0.100`, Debian 13, 32 GiB / 477 GiB, ~28 + containers, no LUKS (F4). +- cobblestone: `user@192.168.0.101` during cutover, swing to `.100` + post-validation. +- LE wildcard `*.s8n.ru` + `*.veilor.uk` via Gandi DNS-01. Internal CA + via step-ca, Traefik resolver `internal-ca`. +- Out of scope: office workstation install, friend GPU re-enrollment, + veilor-os ISO build pipeline. + +--- + +**Path:** `/home/admin/ai-lab/_github/infra/runbooks/MIGRATION-nullstone-to-cobblestone.md` + +Two-line summary: pre-migration audit + secret catalog + cobblestone +install plan (LUKS2, optional userns-remap drop, 18-step topological +service redeploy) + cutover script + post-migration fixes carried over +from the 2026-05-05 audit. Operator must fill the "things we don't know +about cobblestone" table and pick on userns-remap / DE / RC retirement +before section 3 runs.