init: nullstone deploys + runbooks + audits

Sourced from previous audits + agent-wave outputs (2026-05-05):
  AUDIT-2026-05-05.md           — 5-agent stack synthesis
  forgejo/DEPLOY.md             — git.s8n.ru deploy runbook
  forgejo/forgejo-compose.yml   — production compose
  forgejo/runner-compose.yml    — forgejo-runner
  forgejo/migration-report-...  — GH→Forgejo migration audit (6/6 green)
  runbooks/MIGRATION-...        — nullstone→cobblestone runbook
  runbooks/DE-DECISION-...      — keep-vs-strip DE on cobblestone
  repos/REPO-AUDIT-2026-05-05.md — repo trees + ownership
commit 09d80a63f6 by s8n, 2026-05-06 10:02:28 +01:00
9 changed files with 2045 additions and 0 deletions

AUDIT-2026-05-05.md (new file, +370 lines)
# 5-Agent Audit Report — 2026-05-05
Synthesis of 5 parallel agents covering: GitHub→Forgejo migration,
ai-lab structure, nullstone services, stack rating, recommended
additions.
Source agent outputs:
1. Migration agent → `nullstone-server/forgejo/migration-report-2026-05-05.md`
2. ai-lab structural audit
3. nullstone services + deployment audit
4. Stack rating (10 axes)
5. Recommended service additions
---
## TL;DR
- **GH → Forgejo migration: complete.** 6/6 repos mirrored
(5× s8n-ru/* + veilor-org/veilor-os). All HEADs match, branches
match, tags match, push-mirrors back to GH all green. Repaired one
default-branch metadata drift on `s8n-ru/x`. Zero failures.
- **Stack rating: 7/10.** Above-average self-hosted setup. Audit
discipline + identity/CA story unusually strong. Fragile on
monitoring + offsite backup + single-host.
- **Top 5 weaknesses (severity-ordered):** F4 no LUKS on nullstone
(regression), no monitoring/alerting, backups local-only with
silently broken script, `:latest` floats on most stacks, single
point of failure (nullstone + home WAN).
- **Top 5 services to add (priority):** Restic+autorestic, Vaultwarden,
Gatus, CrowdSec, Beszel.
- **Top 4 anti-recommendations:** Nextcloud, full LGTM stack, Mastodon,
HashiCorp Vault.
---
## 1 — GitHub repo migration
**Status: complete.** Per migration agent's report.
- 6 repos enumerated under `s8n-ru` user + admin'd orgs.
- 6 mirrored to `git.s8n.ru` (Forgejo); 5 fresh, 1 already pre-migrated
(`veilor-org/veilor-os`).
- HEADs / branches / tags match GH for all 6.
- Push-mirrors Forgejo → GH configured (8h interval + sync-on-commit),
all green.
- One repair: `s8n-ru/x` default branch was stuck on
`KisaragiEffective-patch-1` from Misskey upstream; PATCHed to
`master`.
Detail: `nullstone-server/forgejo/migration-report-2026-05-05.md`.
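The push-mirror settings above (8h interval, sync-on-commit) can also be (re)created non-interactively. A hedged sketch via the Gitea/Forgejo REST API (`POST /repos/{owner}/{repo}/push_mirrors`); `FORGEJO_PAT` and `GH_PAT` are assumed env vars, and field names should be checked against the running Forgejo version:

```bash
create_push_mirror() {
  # Usage: create_push_mirror s8n-ru x
  # Builds the push-mirror request body and POSTs it to the Forgejo API.
  printf '{"remote_address":"https://github.com/%s/%s.git",
           "remote_username":"%s","remote_password":"%s",
           "interval":"8h0m0s","sync_on_commit":true}' \
    "$1" "$2" "$1" "$GH_PAT" |
  curl -sf -X POST \
    -H "Authorization: token $FORGEJO_PAT" \
    -H "Content-Type: application/json" -d @- \
    "https://git.s8n.ru/api/v1/repos/$1/$2/push_mirrors"
}
```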
---
## 2 — ai-lab structural audit
### Devices
| codename | type | OS | role |
|---|---|---|---|
| onyx | laptop | Fedora 43 KDE | Dev workstation (DHCP `.28`, registry says `.6` — drift) |
| nullstone | server | Debian 13 | Infra host — Docker stack, mesh, Matrix/Misskey/RC |
| office | workstation | Fedora 43 KDE (pending install since 2026-04-19) | Office/sales (.5) |
External: friend PC `100.64.0.3` (RTX 4080, vLLM in WSL2).
### Active projects (`_github/`)
| repo | purpose | status |
|---|---|---|
| veilor-os | Hardened Fedora 43 KDE remix | actively iterating, BlueBuild + kickstart |
| auth-limbo | Paper plugin (racked.ru AuthMe fix) | active, released jars |
| minecraft-launcher | Custom MC launcher (PrismLauncher fork) | active, v1 build script |
| minecraft-server | Purpur MC at `mc.racked.ru:25565` | live in prod |
| minecraft-client | racked.ru MC client (FO 11.3.2 fork) | active |
### Per-device security audit cadence
| device | last audit | folder |
|---|---|---|
| nullstone | 2026-05-05 (ACL hardening); full 2026-05-02 | `security/nullstone-server/` (9 reports) |
| onyx | 2026-04-15 | `security/onyx-laptop/` (2 reports) |
| office | never | `security/office-workstation/` (empty) |
### Memory record (31 files, 1 index)
- 2 user, 7 feedback, 1 reference, 21 project memos.
- Top-active: matrix_veilor, txt_cinny, x_misskey_fork, tailscale_mesh,
friend_gpu, org_charter, brand_separation, simplex_org_chat.
### What this lab is
The operator runs a small home-lab/3-member CTO-style org
(`P M=CTO, nullstone=Runtime Owner, onyx-ai=Research/Review`) split
cleanly across **two brands** (per `project_brand_separation.md`):
1. **racked.ru** — privacy-first Minecraft platform (MC server +
client + custom launcher + AuthLimbo plugin)
2. **veilor** — security company stack (veilor-os hardened Fedora
ISO, veilor-server-bootstrap Debian preseed, Matrix at veilor.uk,
Misskey-fork at x.veilor)
All self-hosted on nullstone behind Traefik+Headscale+Pi-hole. Mesh
includes friend's RTX 4080 for remote LLM inference via Tailscale.
### Drift / gaps
- `office-workstation/` registered in CLAUDE.md but install pending
since 2026-04-19; no audit folder populated.
- README onyx IP `.6` vs actual DHCP `.28`.
- README folder tree doesn't match real repo (lists `_project_code/`
+ `scripts/`; reality has `_github/`, `_projects/`, `_archive/`,
`archive/`, `github/`, several `.sync-conflict-*` files, 30 MB
binary `re` at root).
- Two parallel `nullstone-server/` and `server/` device folders —
drift candidate.
- `MEMORY.md` index missing entry for `project_forgejo_nullstone.md`
(file present, index not updated).
- Sync-conflict files for CLAUDE.md / README.md / SYSTEM.md from
Syncthing merge never resolved.
- SYSTEM.md still mentions Jitsi/coturn and MAS / Element X test items
  already retired per project_matrix_veilor.md — TODO list not pruned.
---
## 3 — nullstone services + deployment audit
### Hardware
- **CPU:** AMD Ryzen 5 2600X (6c/12t)
- **RAM:** 32 GiB (15 GiB used, 15 GiB free); swap 24 GiB (256 KiB used)
- **GPU:** GTX 1660 Ti 6 GB (Ollama)
- **Disk:** 477 GiB NVMe, LVM (`keystone-vg`):
- root 30 G (35% used)
- var 12 G (15%)
- **home 399 G (60%, 227 G used / 153 G free)** — watch growth
- tmp 2.7 G, swap 24 G
- **OS:** Debian 13, kernel 6.12.85+deb13
- **Docker:** v29.4.2, overlay2, **userns-remap=default**,
live-restore=true, icc=false, no-new-privileges=true. Data root
symlinked `/var/lib/docker → /home/user/docker-data`.
### Active services (28 containers)
Including: traefik, socket-proxy, authentik (server+worker+pg+redis),
forgejo + forgejo-runner, misskey + db + redis, x-source nginx,
rocketchat + mongodb, tuwunel + tuwunel-txt, cinny-txt, commet-web,
signup-page + signup-txt, livekit + lk-jwt-service, dl-veilor, pihole,
headscale, n8n + postgres, step-ca, filebrowser-mc, minecraft-mc,
anythingllm, plus 2 stale `alpine:3` shells from userns-host bypass.
### Domain → service map (all on `*.s8n.ru` or `*.veilor[.uk]`)
`sys.s8n.ru` (traefik dash), `git.s8n.ru` (forgejo, NEW), `auth.s8n.ru`
(authentik), `pihole.s8n.ru`, `signup.txt.s8n.ru`, `hs.s8n.ru`
(headscale), `rc.s8n.ru` (rocketchat), `n8n.s8n.ru`, `txt.s8n.ru`
(cinny), `mx.s8n.ru` (tuwunel-txt), `x.veilor` (misskey),
`matrix.veilor.uk`, `chat.veilor.uk` (commet), `livekit.veilor.uk`,
`signup.veilor.uk`, `dl.veilor.org`.
### Deployment patterns
- Compose: `/opt/docker/<svc>/docker-compose.yml`
- Data: named docker volumes under
`/home/user/docker-data/100000.100000/volumes/` + per-service
bind mounts. Newer services (forgejo, forgejo-runner, minecraft)
on `/home/docker/<svc>/` to dodge 30 G root.
- userns-remap quirk: container UIDs shifted +100000.
Workaround: alpine root container or chown to 101000.
- Docker socket exposure: traefik does NOT mount docker.sock; goes
via tecnativa/docker-socket-proxy on socket-proxy-net.
- Networks: `proxy` + `socket-proxy-net` + `misskey-frontend` +
per-stack internals (authentik-internal, misskey-internal, etc.).
- Middleware chain: `trusted-only@file → security-headers@file
→ rate-limit@file → <service-specific>` with `no-guest@file`
for routers needing tailnet+LAN but blocking public.
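The +100000 UID shift above means a bind mount owned by container UID 1000 must appear as host UID 101000. A minimal sketch of both workarounds (paths are illustrative):

```bash
fix_bind_mount_owner() {
  # Workaround A: chown on the host to the shifted UID
  # (container uid 1000 + userns offset 100000 = host uid 101000).
  sudo chown -R 101000:101000 "$1"
}
shell_as_container_root() {
  # Workaround B: throwaway root shell that sees the same userns mapping.
  # --rm matters: stale alpine shells are already in the risk register.
  docker run --rm -it -v "$1":/mnt alpine:3 sh
}
```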
### Auth patterns
- **Authentik (auth.s8n.ru)** — central OIDC, all 4 components healthy.
**Currently mostly unwired.** Forgejo runs native auth (no OAUTH
section in app.ini). RC, n8n, anythingllm, filebrowser likely
local-auth too. Authentik present but underused.
- **Forgejo** — local users + PAT, admin `s8n-ru`, SSH 222.
- **Headscale** — preauthkey enrollment + `headscale-deny-leaks@file`.
- **Traefik dashboard** — basicauth + trusted-only@file.
### Backup state
- `/etc/cron.d/docker-backup` runs `/opt/docker/backup.sh` at 02:00
daily, 7-day rotation to `/opt/backups/`.
- **Script silently broken (HIGH):** matrix-postgres container is
gone (Synapse retired); rocketchat-mongodb name mismatch (script
expects `mongodb`); Mongo password reads literal
`CHANGE_ME_MONGO_ADMIN_PASSWORD`. So Rocket.Chat + (former) Matrix
dumps **not happening**. Misskey side-script works.
- **No off-host replication.** Single NVMe = total loss on disk
failure.
### Drift / risk register
- 🔴 Backup script broken (RC + ex-Matrix not dumping)
- 🔴 `anythingllm` listens 0.0.0.0:3001 with no traefik label,
bypasses entire L7 trust model. Either bind LAN-only or front via
traefik.
- 🟠 Resource limits: only minecraft-mc has memory/CPU limits; every
  other container is unbounded — a runaway can OOM-kill neighbours.
- 🟠 No service-level health checks on ~half the containers.
- 🔴 `no-guest@file` IPAllowList stub: declares only
`sourceRange: ["127.0.0.0/8"]`. Routers chained with `no-guest`
reject everything except loopback unless XFF restores client IP.
**Verify** entryPoint forwardedHeaders.trustedIPs + middleware
ipStrategy.depth — misconfig either 403s real users or accepts
spoofed XFF.
- 🟡 office (100.64.0.4) not in `trusted-only@file` despite
`tag:infra` per SYSTEM.md.
- 🟠 RocketChat: first-admin setup still pending — wizard endpoint
takeover risk until claimed.
- 🟡 Stale `alpine:3` shell containers (userns-host bypass leftovers).
`docker rm -f` after each one-shot.
- 🟡 Archived compose dirs (`pocket-id.archived-*`, `matrix-old`)
contain secrets — move off docker tree.
- 🟡 `/home` 60% with growing volumes (Ollama, mongo, postgres ×3).
No quotas.
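For the `no-guest@file` stub above, a hedged sketch of what a correct file-provider block might look like; the CIDRs, `depth`, and `trustedIPs` are placeholders that must be checked against the real LAN/tailnet ranges and proxy topology:

```yaml
# dynamic/no-guest.yml — sketch, not the live config (Traefik v3 naming).
http:
  middlewares:
    no-guest:
      ipAllowList:
        sourceRange:
          - "127.0.0.0/8"       # loopback
          - "192.168.0.0/24"    # LAN (assumed range)
          - "100.64.0.0/10"     # tailnet CGNAT range
        ipStrategy:
          depth: 0              # 0 = socket peer IP; raise only if a
                                # trusted proxy sits in front of Traefik
```

If `depth > 0`, the entryPoint must also declare `forwardedHeaders.trustedIPs` listing only known proxies; otherwise a spoofed XFF header bypasses the ACL, which is exactly the failure mode flagged above.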
### Mem pressure: none right now
Top consumer: minecraft-mc at 9.35 GiB of its 18 GiB cap (52% of cap,
~30% of host RAM). All others < 2.2%. Headroom is good.
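The unbounded-container finding above is cheap to close per service; a compose sketch with illustrative, untuned numbers (service name is an example):

```yaml
# Sketch: per-service caps, compose (non-swarm) top-level keys.
services:
  rocketchat:
    mem_limit: 2g        # hard RAM cap
    memswap_limit: 2g    # equal to mem_limit = no extra swap
    cpus: 2.0            # fractional CPUs allowed
    pids_limit: 512      # fork-bomb guard
```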
---
## 4 — Stack rating (10 axes)
| Axis | Score | Top weakness |
|---|---|---|
| Architectural coherence | 8 | Drift artifacts (sync-conflict files, parallel `_archive`/`archive`) |
| Security posture | 7 | F4 no LUKS on server (regression); F30 ip_forward=1; F12 partial revert |
| Reproducibility | 6 | Most stacks on `:latest`; no IaC; admin bootstrap uncoded |
| Operational maturity | **4** | **No metrics/alerts; backups untested; on-call="user reads logs"** |
| Cost discipline | 9 | Single residential ISP + single home server is "cheap because fragile" |
| Threat model clarity | 6 | No written THREAT_MODEL.md; AGPL §13 source endpoint deferred |
| Update hygiene | 5 | `:latest` floats; no staged rollout; recovery = "edit compose, restart" |
| Documentation quality | 8 | SYSTEM.md is 979 lines; CV + team-msg.txt + sync-conflicts in repo root |
| Network resilience | 5 | Single residential WAN; control + data plane same box; no Tor/SOCKS fallback |
| Branding/product discipline | 9 | "X" rebrand close to veilor — easy to confuse in logs/docs |
### Overall: **7/10**
Above-average self-hosted stack. Better-documented than 90% of
homelabs, with audit discipline most small SaaS shops don't reach,
and a coherent identity/CA story (own root CA via step-ca, own VPN
control plane via Headscale, own Matrix homeserver). Loses points on
operational maturity (no monitoring, no offsite/tested backups, no
rollback), one critical regression (no LUKS on nullstone), and
inherent fragility from single-host single-ISP design.
The gap between **known weaknesses** and **fixed weaknesses** is the
limiting factor — operator clearly *can* fix these (audit closes 27/35
findings in 3 days), they just haven't yet.
### Comparison
- vs **Stock Fedora desktop + GitHub:** wins decisively (8 vs 3) on
network/identity/AGPL discipline.
- vs **secureblue + GH Actions:** stronger on server-side sovereignty;
weaker on client posture and CI. Roughly tied overall, different axes.
- vs **Hetzner-VPS hobbyist stack:** loses on resilience + update
hygiene, wins on cost + GPU inference + identity depth. This stack
more ambitious; Hetzner more boring-and-reliable.
- vs **Cloudflare/Workers managed:** wins on sovereignty + GPU + Matrix
ability. Loses on uptime + DDoS + zero-patching. This stack's whole
reason to exist is the inverse tradeoff — and it makes that tradeoff
coherently.
---
## 5 — Recommended service additions
### Top 5 priority (deploy in this order)
| # | Service | Why now | Effort | Maintenance |
|---|---|---|---|---|
| 1 | **Restic + autorestic** | Single biggest gap. nullstone NVMe failure = total loss right now. Encrypted incremental to B2/Wasabi or to onyx. | M | S |
| 2 | **Vaultwarden** | N services with N storage methods for secrets. Centralize before count grows. | S | S |
| 3 | **Gatus** | Otherwise you find out about a downed service from a friend on Matrix. Cert-expiry alone catches the silent killer. Alerts via Tuwunel webhook or ntfy. | S | S |
| 4 | **CrowdSec** | Pi-hole only sees DNS layer. Public Matrix fed candidates + RC + Misskey + signup pages = HTTP attack surface. Bouncer plugin blocks at Traefik. | M | S |
| 5 | **Beszel** | Once Restic is filling disk + CrowdSec flagging IPs, you want one dashboard. | S | S |
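For priority 1, a minimal `.autorestic.yml` sketch; the bucket path, cron time, and env-var names are placeholders, and credentials belong in the environment, never in this file:

```yaml
# ~/.autorestic.yml — sketch. Secrets via env
# (e.g. AUTORESTIC_B2_RESTIC_PASSWORD), not inline.
version: 2
locations:
  docker-data:
    from: /home/user/docker-data
    to:
      - b2
    cron: '0 3 * * *'        # after the 02:00 dumps finish
backends:
  b2:
    type: b2
    path: 'bucket-name:nullstone'   # placeholder bucket
```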
### Anti-recommendations
| Service | Why NOT |
|---|---|
| **Nextcloud** | Heavy (1.5 GB+ RAM idle), notorious upgrade pain. Use Seafile if you need files. |
| **Full LGTM stack** (Grafana+Prom+Loki+Alertmanager) | Five services to do what Beszel+Gatus do for solo-op. |
| **Mastodon** | You already run Misskey-fork. Federating two ActivityPub silos doubles moderation. |
| **HashiCorp Vault** | Complexity-to-benefit ratio terrible for one operator. Infisical or pass-with-git enough. |
| **Authelia** | Duplicates Authentik. Pick one. |
### Consolidation suggestions
- **Cinny + various Element/Commet forks:** pick **one** web client
per Matrix instance. Each fork = separate audit + CSP + branding burden.
- **n8n:** if only used for 2-3 simple flows, replace with shell
scripts in Forgejo Actions cron. n8n's value is the GUI for
non-coders; you're a coder.
- **Step-CA + Let's Encrypt:** confirm zero overlap. If step-ca only
issues one cert, kill it.
- **dl-veilor + signup pages:** if static, fold into single Caddy
container behind Traefik. Two containers for static HTML is two
too many.
### Other notable picks (lower priority)
- **Seafile CE** — file sync (much lighter than Nextcloud)
- **Karakeep** (formerly Hoarder) — bookmarks/RSS/read-later, AI tags
via your local Ollama / friend RTX 4080
- **ntfy** — formalize the push-notification target you're already
using ad-hoc
- **Forgejo Packages** — already implicit, just enable for container
registry + npm/cargo/maven/generic
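The Gatus pick above needs only a small config; a sketch using hostnames from the domain map (the alerting target is an assumption, pending an ntfy deploy):

```yaml
# gatus config.yaml — sketch
endpoints:
  - name: forgejo
    url: "https://git.s8n.ru/"
    interval: 5m
    conditions:
      - "[STATUS] == 200"
      - "[CERTIFICATE_EXPIRATION] > 240h"   # the silent killer
    alerts:
      - type: ntfy
  - name: rocketchat
    url: "https://rc.s8n.ru/"
    interval: 5m
    conditions:
      - "[STATUS] == 200"
      - "[CERTIFICATE_EXPIRATION] > 240h"
    alerts:
      - type: ntfy
alerting:
  ntfy:
    url: "https://ntfy.example"   # placeholder; point at self-hosted ntfy
    topic: nullstone-alerts
```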
---
## 6 — Action items (severity-ordered)
### Ship-blocking (do this week)
1. **Fix `/opt/docker/backup.sh`** — remove dead matrix-postgres,
correct rocketchat-mongodb container name, replace literal
`CHANGE_ME_MONGO_ADMIN_PASSWORD`. Verify next 02:00 run produces
non-zero RC + Mongo dumps.
2. **Bind anythingllm to LAN-only** OR add traefik front with
`no-guest@file`. Currently public on :3001.
3. **Verify `no-guest@file` ACL** — confirm `sourceRange` covers
LAN + tailnet, not just loopback. Verify XFF chain restores
real client IP.
4. **Claim RocketChat first-admin** — takeover risk until then.
5. **Enable LUKS2 on nullstone** (F4 regression) — schedule reinstall
window with TPM2 unlock; or until then, LUKS-on-file loopback
for step-ca root key + acme.json + Mongo keyfile.
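Item 1 amounts to deleting the dead matrix-postgres block and fixing the Mongo dump; a hedged sketch of the repaired steps. Container name, Mongo user, and env-var names are assumptions to be checked against `docker ps` and the real compose files:

```bash
dump_rocketchat() {
  # Target the real container name (rocketchat-mongodb, not mongodb)
  # and a real password, not the CHANGE_ME_MONGO_ADMIN_PASSWORD literal.
  dest=$1
  docker exec rocketchat-mongodb mongodump \
    --archive --gzip \
    -u admin -p "$MONGO_ADMIN_PASSWORD" --authenticationDatabase admin \
    > "$dest/rocketchat-mongo-$(date +%F).archive.gz"
}
# The matrix-postgres block should simply be deleted: Synapse is
# retired and Tuwunel does not use that container.
verify_dump() {
  # Non-zero size is the minimum bar; a restore drill is the real test.
  [ -s "$1" ] || echo "EMPTY DUMP: $1" >&2
}
```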
### High-value next (do this month)
6. Deploy **Restic + autorestic** with B2/Wasabi target + restore drill.
7. Deploy **Vaultwarden** + migrate secrets out of `.env` files.
8. Deploy **Gatus** with cert-expiry checks + Matrix/ntfy alerts.
9. Resolve **sync-conflict files** at ai-lab repo root.
10. **Pin docker images by digest** for critical stacks (already done
for Misskey; do tuwunel/livekit/cinny/pihole/RC/Traefik next).
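Item 10 in practice: resolve a running image to its digest, then pin the compose `image:` line to it (the helper name is ours, not a docker subcommand):

```bash
pin_digest() {
  # Print the repo@sha256:... digest of a locally-pulled image, so the
  # compose file can pin it instead of a floating tag.
  docker inspect --format '{{index .RepoDigests 0}}' "$1"
}
# Usage: pin_digest traefik:latest
# Then in docker-compose.yml:
#   image: traefik@sha256:<digest from above>
```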
### Defer / planned
- Office workstation install + first audit
- Fold dl-veilor + signup pages into single Caddy
- Replace n8n with Forgejo Actions cron (if usage <5 flows)
- Move Headscale + step-ca to $4/mo VPS for SPOF mitigation
---
## 7 — File index
| Output | Path |
|---|---|
| This synthesis | `~/ai-lab/nullstone-server/audit-report-2026-05-05.md` |
| Migration detail | `~/ai-lab/nullstone-server/forgejo/migration-report-2026-05-05.md` |
| Forgejo runbook | `~/ai-lab/nullstone-server/forgejo/deploy-runbook.md` |
| Forgejo memory | `~/.claude/projects/-home-admin-ai-lab/memory/project_forgejo_nullstone.md` |
| veilor-os strategy | `~/ai-lab/_github/veilor-os/docs/STRATEGY.md` |
| veilor-os roadmap | `~/ai-lab/_github/veilor-os/docs/ROADMAP.md` |
| veilor-os threat model | `~/ai-lab/_github/veilor-os/docs/THREAT-MODEL.md` |

README.md (new file, +23 lines)
# infra
nullstone + cobblestone deploys, runbooks, audits.
## Layout
```
forgejo/ Forgejo + runner deploy artifacts (live on nullstone)
runbooks/ Migration + decision docs
├─ MIGRATION-nullstone-to-cobblestone.md
└─ DE-DECISION-cobblestone.md
repos/ Repo audits (cross-host inventory)
└─ REPO-AUDIT-2026-05-05.md
AUDIT-2026-05-05.md 5-agent stack audit (synthesis)
```
## Conventions
- Per-service deploy at `<service>/<file>` mirrors `/opt/docker/<service>/`
on nullstone/cobblestone host.
- Runbooks dated; do not silently update — append a new dated entry
if procedure changes.
- Memory record: `~/.claude/projects/-home-admin-ai-lab/memory/project_forgejo_nullstone.md`

forgejo/DEPLOY.md (new file, +176 lines)
# Forgejo deploy runbook — nullstone
Self-host plan: replace GH Actions free-tier (quota-bound) with
Forgejo + forgejo-runner running on nullstone. Same `build-iso.yml`
workflow, no GH dependency.
## Pre-flight
- nullstone reachable at 192.168.0.100 (LAN) and via tailscale (mesh)
- Traefik running, `proxy` docker network exists
- Gandi API token configured in traefik env (LiveDNS scope, s8n.ru only)
→ letsencrypt resolver works for new hostnames automatically
- DNS for `git.s8n.ru` must point at nullstone's public IP (Gandi
manual web UI; API can't add new records per memory
reference_gandi_api.md)
## Step 1 — DNS
Add A record `git.s8n.ru → <nullstone public IP>` via Gandi web UI.
Wait ~2min for propagation. Verify:
```bash
dig +short git.s8n.ru @1.1.1.1
```
## Step 2 — copy compose files to nullstone
```bash
scp /home/admin/ai-lab/nullstone-server/forgejo/docker-compose.yml \
nullstone:/tmp/forgejo-compose.yml
scp /home/admin/ai-lab/nullstone-server/forgejo/runner-compose.yml \
nullstone:/tmp/forgejo-runner-compose.yml
ssh nullstone bash <<'EOF'
sudo mkdir -p /opt/docker/forgejo /opt/docker/forgejo-runner
sudo mkdir -p /home/docker/forgejo/{data,config}
sudo mkdir -p /home/docker/forgejo-runner/{data,cache}
sudo chown -R 1000:1000 /home/docker/forgejo
sudo mv /tmp/forgejo-compose.yml /opt/docker/forgejo/docker-compose.yml
sudo mv /tmp/forgejo-runner-compose.yml /opt/docker/forgejo-runner/docker-compose.yml
EOF
```
## Step 3 — first-start Forgejo
```bash
ssh nullstone 'cd /opt/docker/forgejo && docker compose up -d'
ssh nullstone 'docker logs -f forgejo' & # watch first-start
```
When you see `Listen: http://0.0.0.0:3000`, Forgejo is up. Hit
<https://git.s8n.ru/> in your browser. Traefik gets the LE cert
automatically.
## Step 4 — initial admin user
The first-time wizard at `/install` is *disabled* by env (we set
`FORGEJO__security__INSTALL_LOCK=true`; public signup is separately
blocked by `DISABLE_REGISTRATION=true`). Create the admin via
CLI inside the container:
```bash
ssh nullstone 'docker exec -u 1000 forgejo \
forgejo admin user create \
--admin \
--username admin \
--email <your-email> \
--random-password \
--must-change-password=false'
```
The random password gets printed once — save it somewhere safe.
Login at git.s8n.ru with `admin` + that password, change it via the
web UI's user settings.
## Step 5 — generate runner registration token
```bash
ssh nullstone 'docker exec -u 1000 forgejo \
forgejo actions generate-runner-token'
```
Output is a single line — copy it into `.env` next to the runner
compose:
```bash
echo "RUNNER_TOKEN=<token>" | ssh nullstone 'sudo tee /opt/docker/forgejo-runner/.env'
ssh nullstone 'sudo chmod 600 /opt/docker/forgejo-runner/.env'
```
## Step 6 — start runner
```bash
ssh nullstone 'cd /opt/docker/forgejo-runner && docker compose up -d'
ssh nullstone 'docker logs -f forgejo-runner'
```
Look for `Runner registered successfully`. Verify in Forgejo web UI:
Site Administration → Actions → Runners — should list `nullstone`.
## Step 7 — mirror veilor-os repo
In the Forgejo web UI:
1. Create org `veilor-org` (matches GH org name).
2. Click + → Migrate Repository.
3. Type: GitHub. URL: `https://github.com/veilor-org/veilor-os`.
4. Mirror = ON. Description: "self-hosted mirror; primary dev here".
5. Click Migrate.
Forgejo pulls the repo + all branches + tags + actions config. Once
done, push from local will go to BOTH (set as second remote):
```bash
cd ~/ai-lab/_github/veilor-os
git remote add nullstone https://git.s8n.ru/veilor-org/veilor-os
git push nullstone main v0.7-bluebuild-spike
```
## Step 8 — flip workflow to nullstone runner
Change `build-iso.yml`:
```yaml
runs-on: ubuntu-24.04 # before
runs-on: nullstone # after — picks up our forgejo runner
```
Push to nullstone remote. Watch Forgejo Actions tab. Same workflow,
runs on our hardware, no GH minutes.
## Step 9 — close the loop
Mirror Forgejo → GitHub for public visibility. Forgejo settings on
the repo → Mirror → Push mirror → `https://github.com/veilor-org/veilor-os`
with a GH PAT that has write access. Forgejo pushes on every commit.
End state:
- `git push origin` → GH (public mirror)
- `git push nullstone` → Forgejo (primary; runs CI)
- Forgejo auto-pushes to GH for visibility
- ISO builds run unlimited on nullstone hardware
- 0 GH Actions minutes consumed
## Disk needs
- Forgejo data: ~1GB initial, grows ~100MB/yr per repo
- Runner workspace: ~80GB free recommended for ISO builds (squashfs
+ downloaded RPMs + xorriso staging)
- Runner cache: ~20GB for `actions/cache`-style hits across builds
Confirm with `df -h /` on nullstone before kickoff.
## Resource cost
- Forgejo: ~200MB RAM idle, ~500MB during build queues
- Runner: idle 50MB, ~4GB during ISO build (depsolve + squashfs)
- Network: ~2GB/build (Fedora package download)
Should fit alongside existing nullstone services without contention.
## Rollback
If anything breaks:
```bash
ssh nullstone 'cd /opt/docker/forgejo && docker compose down'
ssh nullstone 'cd /opt/docker/forgejo-runner && docker compose down'
```
Local repo `origin` still points at GH; nothing on the dev side
changes. ISO builds fall back to GH Actions until quota cycles.
## See also
- veilor-os roadmap: `_github/veilor-os/docs/ROADMAP.md`
- nullstone service inventory: `~/ai-lab/SYSTEM.md`
- Existing service patterns: `/opt/docker/headscale/`,
`/opt/docker/authentik/`

forgejo/forgejo-compose.yml (new file, +68 lines)
# Forgejo — self-hosted git + CI for veilor-org
# Deploy path on nullstone: /opt/docker/forgejo/
# Domain: git.s8n.ru
#
# Why: GH Actions free-tier minute quota was hammering veilor-os builds
# (5+ ISO builds = 150min, repeatable runner-shortage failures). Forgejo
# Actions takes the same `build-iso.yml` workflow unmodified and runs it
# on hardware we own. Bonus: full git host independence.
#
# Design notes:
# - Image pinned by tag, not digest, until we automate pinning. Forgejo
# releases roughly every 1-2 months; bump in this file.
# - SSH on host port 222 (host 22 is sshd for nullstone admin). Forgejo's
# internal ssh server reads /var/lib/forgejo/.ssh/authorized_keys, no
# pam, no sudo.
# - HTTP-only inside the proxy network; Traefik terminates TLS at the edge
# via the existing letsencrypt resolver (Gandi LiveDNS DNS-01).
# - `userns: host` matches the nullstone Docker convention so volume
# ownership maps cleanly to host UID 1000 (memory: project_nullstone_docker_userns.md).
services:
  forgejo:
    image: codeberg.org/forgejo/forgejo:9-rootless
    container_name: forgejo
    restart: unless-stopped
    user: "1000:1000"
    environment:
      - USER_UID=1000
      - USER_GID=1000
      - FORGEJO__database__DB_TYPE=sqlite3
      - FORGEJO__server__DOMAIN=git.s8n.ru
      - FORGEJO__server__ROOT_URL=https://git.s8n.ru/
      - FORGEJO__server__SSH_DOMAIN=git.s8n.ru
      - FORGEJO__server__SSH_PORT=222            # public-facing SSH on host:222
      - FORGEJO__server__SSH_LISTEN_PORT=2222    # container-internal listen port
      - FORGEJO__server__START_SSH_SERVER=true
      - FORGEJO__server__OFFLINE_MODE=false
      - FORGEJO__security__INSTALL_LOCK=true     # skip /install wizard; use envs above
      - FORGEJO__service__DISABLE_REGISTRATION=true   # invite-only; no public signup
      - FORGEJO__service__REQUIRE_SIGNIN_VIEW=false   # public repos viewable
      - FORGEJO__actions__ENABLED=true           # turns on Forgejo Actions
      - FORGEJO__actions__DEFAULT_ACTIONS_URL=github  # so `uses: actions/checkout@v4` resolves
      - FORGEJO__webhook__ALLOWED_HOST_LIST=*    # for GH mirroring webhooks
      - FORGEJO__log__LEVEL=Info
      # Email (optional; SMTP via authentik or local relay)
      # - FORGEJO__mailer__ENABLED=false
    volumes:
      - /home/docker/forgejo/data:/var/lib/gitea
      - /home/docker/forgejo/config:/etc/gitea
      - /etc/timezone:/etc/timezone:ro
      - /etc/localtime:/etc/localtime:ro
    ports:
      - "0.0.0.0:222:2222/tcp"   # public SSH for git-over-ssh (port 22 is host sshd)
    networks:
      - proxy
    labels:
      - "traefik.enable=true"
      - "traefik.docker.network=proxy"
      - "traefik.http.routers.forgejo.rule=Host(`git.s8n.ru`)"
      - "traefik.http.routers.forgejo.entrypoints=websecure"
      - "traefik.http.routers.forgejo.tls=true"
      - "traefik.http.routers.forgejo.tls.certresolver=letsencrypt"
      - "traefik.http.routers.forgejo.middlewares=security-headers@file,rate-limit@file,no-guest@file"
      - "traefik.http.services.forgejo.loadbalancer.server.port=3000"

networks:
  proxy:
    external: true

forgejo/migration-report-2026-05-05.md (new file, +59 lines)
# Forgejo Migration Report — 2026-05-05
**Summary:** All 6 GitHub repos owned/admined by `s8n-ru` are mirrored to `git.s8n.ru`, with healthy Forgejo→GH push-mirrors closing the GH→Forgejo→GH loop; the only fix this run was correcting the default branch on `s8n-ru/x` from `KisaragiEffective-patch-1` back to `master`.
## Scope
- GitHub auth verified: `gh auth status``s8n-ru` (token scopes incl. `repo`, `admin:org`, `delete_repo`).
- Forgejo auth verified: `~/.config/veilor-forgejo-pat.txt` → API user `s8n-ru` (id=1, is_admin=true).
- Inventory taken via `gh repo list` for user `s8n-ru` and org `veilor-org` (only org user belongs to). No archived repos and no forks were returned.
- All API calls to Forgejo went via the internal-via-alpine route (`docker run --rm --network proxy alpine:3 ... http://forgejo:3000`) since `https://git.s8n.ru/` is locked by the `no-guest@file` ACL.
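The internal-via-alpine route used for those API calls looks roughly like this; `FORGEJO_PAT` handling is an assumption (the report reads the token from `~/.config/veilor-forgejo-pat.txt`):

```bash
forgejo_api() {
  # Usage: forgejo_api /version   (FORGEJO_PAT assumed exported)
  # Throwaway container on the proxy network reaches forgejo:3000
  # directly, bypassing the no-guest@file ACL on git.s8n.ru.
  docker run --rm --network proxy -e FORGEJO_PAT -e API_PATH="$1" \
    alpine:3 sh -c 'apk add -q curl && curl -sf \
      -H "Authorization: token $FORGEJO_PAT" \
      "http://forgejo:3000/api/v1$API_PATH"'
}
```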
## State file
`/tmp/migrate-state.tsv` was used as the resume-tracker so a re-run wouldn't redo work. Final contents:
| owner | name | status | notes |
|-------------|-------------------|---------|--------------------------------|
| s8n-ru | x | done | default-branch-fixed-this-run |
| s8n-ru | minecraft-launcher| done | already-mirrored |
| s8n-ru | auth-limbo | done | already-mirrored |
| s8n-ru | minecraft-server | done | already-mirrored |
| s8n-ru | 8bit-icons | done | already-mirrored |
| veilor-org | veilor-os | skipped | already-migrated (per spec) |
## Audit
GH HEAD (default branch) compared against Forgejo HEAD on the same branch name; branch and tag counts compared with full pagination; push-mirror existence verified with `last_error == ""`.
| Owner | Name | Default | GH HEAD | FJ HEAD | Branches GH/FJ | Tags GH/FJ | Push-mirror | Last sync (UTC+1) |
|-------------|-------------------|---------|-------------|-------------|----------------|------------|-------------|---------------------|
| s8n-ru | x | master | a2c1ed23 | a2c1ed23 | 84 / 84 | 1310 / 1310| yes | 2026-05-06 02:17:27 |
| s8n-ru | minecraft-launcher| main | ae760edd | ae760edd | 1 / 1 | 1 / 1 | yes | 2026-05-06 02:14:24 |
| s8n-ru | auth-limbo | main | b6863806 | b6863806 | 1 / 1 | 0 / 0 | yes | 2026-05-06 02:14:26 |
| s8n-ru | minecraft-server | main | ede60294 | ede60294 | 1 / 1 | 0 / 0 | yes | 2026-05-06 02:14:26 |
| s8n-ru | 8bit-icons | main | 42a3252d | 42a3252d | 1 / 1 | 0 / 0 | yes | 2026-05-06 02:14:26 |
| veilor-org | veilor-os | main | b40e89a3 | b40e89a3 | 22 / 22 | 2 / 2 | yes | (pre-existing) |
All push-mirrors target `https://github.com/<owner>/<name>.git` with `sync_on_commit: true`.
## Findings on this run
- Previous attempt's API timeout left every repo intact and content-correct, but on `s8n-ru/x` the default-branch metadata had been set to `KisaragiEffective-patch-1` (an in-flight feature branch from upstream `KisaragiEffective`, presumably the last branch processed when the timeout hit). Fixed via `PATCH /api/v1/repos/s8n-ru/x { "default_branch": "master" }`. All 84 branches and 1310 tags were already present, so no re-mirror was needed.
- All five s8n-ru push-mirrors and the veilor-org/veilor-os mirror reported `last_error: ""` and recent successful syncs, confirming the GH→Forgejo→GH bidirectional path was healthy before this run started.
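The push-mirror health checks above reduce to one API call per repo; a sketch against the internal endpoint (`jq` assumed available, field names per the Gitea/Forgejo push-mirror API):

```bash
check_push_mirror() {
  # Usage: check_push_mirror s8n-ru x   (FORGEJO_PAT assumed exported)
  # Healthy = last_error empty and last_update recent.
  curl -sf -H "Authorization: token $FORGEJO_PAT" \
    "http://forgejo:3000/api/v1/repos/$1/$2/push_mirrors" |
    jq -r '.[] | "\(.remote_address) last_error=[\(.last_error)] synced=\(.last_update)"'
}
```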
## Failures
None.
## Skipped
| Owner | Name | Reason |
|------------|-----------|-----------------------------------------|
| veilor-org | veilor-os | Already migrated before this task began |
No archived repos and no forks (where user is not source author) were encountered. No repo exceeded 1 GB (largest is `s8n-ru/x` at ~268 MB).
## Cleanup
`/tmp/migrate/` removed. `/tmp/migrate-state.tsv` retained for the next re-run.

forgejo/runner-compose.yml (new file, +63 lines)
# Forgejo Runner — CI executor for veilor-org repos
# Deploy path on nullstone: /opt/docker/forgejo-runner/
#
# act_runner is Forgejo's drop-in GH Actions runner. Reads workflow
# YAML, spawns container per job, reports results back to Forgejo.
#
# Design notes:
# - Privileged + host networking + Docker socket access. Required for the
# veilor-os ISO build because livecd-creator needs loop devices and
# --privileged. This is the same trust model as our existing GH Actions
# workflow which uses `--privileged` inside `addnab/docker-run-action@v3`.
# - Single runner with label `nullstone` so workflows can opt in via
# `runs-on: nullstone`. Existing `runs-on: ubuntu-24.04` will not be
# picked up — that's intentional, lets us flip workflows one at a time.
# - Cache + workdir on host SSD, persistent across container restarts.
# - act_runner config gets generated on first start; registration token
# must be set in `.env` (see deploy-runbook.md).
services:
  forgejo-runner:
    image: code.forgejo.org/forgejo/runner:6.4.0
    container_name: forgejo-runner
    restart: unless-stopped
    user: "0:0"            # runner needs root for the mounted docker socket
    privileged: true
    userns_mode: "host"    # privileged is incompatible with the userns-remap default
    environment:
      # Internal hostname — runner reaches the forgejo container directly on
      # the proxy net, bypassing traefik + the no-guest@file ACL. Cleaner and
      # faster than going out the public path.
      - INSTANCE_URL=http://forgejo:3000
      - REGISTRATION_TOKEN=${RUNNER_TOKEN}
      - RUNNER_NAME=nullstone
      # Labels map `runs-on:` keys in workflow YAML to docker images.
      # ubuntu-24.04 → catthehacker/ubuntu (widely-used GH Actions image).
      # The `nullstone` label resolves to privileged Fedora 43 so our
      # build-iso.yml can opt in selectively (`runs-on: nullstone`).
      - RUNNER_LABELS=ubuntu-24.04:docker://ghcr.io/catthehacker/ubuntu:act-24.04,nullstone:docker://registry.fedoraproject.org/fedora:43
    entrypoint: ["/bin/sh", "-c"]
    command:
      - |
        set -e
        # Register only on first start; subsequent restarts read /data/.runner.
        # $$VAR escapes compose interpolation so vars resolve in the container.
        if [ ! -f /data/.runner ]; then
          /bin/forgejo-runner register \
            --no-interactive \
            --instance "$$INSTANCE_URL" \
            --token "$$REGISTRATION_TOKEN" \
            --name "$$RUNNER_NAME" \
            --labels "$$RUNNER_LABELS"
        fi
        exec /bin/forgejo-runner daemon
    volumes:
      - /home/docker/forgejo-runner/data:/data
      - /var/run/docker.sock:/var/run/docker.sock   # docker-out-of-docker
      - /home/docker/forgejo-runner/cache:/cache
    networks:
      - proxy

networks:
  proxy:
    external: true

repos/REPO-AUDIT-2026-05-05.md (new file, +486 lines)
<!--
Repo audit — Forgejo (git.s8n.ru) + GitHub (github.com).
Generated: 2026-05-05. Sources: gh CLI for github.com, internal Forgejo
REST API via alpine on the proxy network (no-guest@file blocks the
public host). Source-of-truth for what is migrated:
~/ai-lab/nullstone-server/forgejo/migration-report-2026-05-05.md.
-->
# Repo Audit — 2026-05-05
Combined audit of every repo on `git.s8n.ru` (Forgejo, primary host) and
`github.com` (mirror destination for migrated repos), plus per-repo file
trees and an ownership/anomaly summary.
- **Scopes covered:** `s8n-ru` (user, both hosts), `veilor-org` (org, both hosts).
- **`racked-team`:** does **not** exist on github.com (`gh` returns
"owner handle was not recognized"). User memory references the brand
but no GH org with that handle is registered. See Anomalies section.
- **Forgejo access:** internal-only via `alpine:3 + curl` on the
`proxy` docker network (`http://forgejo:3000`). `https://git.s8n.ru/`
is locked by the `no-guest@file` ACL.
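
The queries behind this audit follow the pattern below — a sketch, assuming a read-scoped token in `FORGEJO_TOKEN` (the variable name is an assumption; the `/api/v1/repos/search` endpoint is stock Forgejo/Gitea API):

```
# Throwaway container on the proxy net reaches forgejo:3000 directly,
# bypassing the public no-guest@file ACL.
docker run --rm --network proxy -e FORGEJO_TOKEN alpine:3 sh -c '
  apk add --no-cache curl >/dev/null &&
  curl -fsS -H "Authorization: token $FORGEJO_TOKEN" \
    "http://forgejo:3000/api/v1/repos/search?limit=50"'
```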
---
## 1. Summary
| Metric | Count |
|---|---|
| Total distinct repos | 8 |
| Mirrored (Forgejo ↔ GitHub) | 6 |
| Forgejo-only | 2 (`veilor-org/veilor-server`, `veilor-org/infra`) |
| GitHub-only | 0 |
| Empty repos | 1 (`veilor-org/infra` — initial bare on Forgejo) |
| Archived | 0 |
| Forks | 0 (per migration report; `s8n-ru/x` is detached fork-of-Misskey, treated as standalone) |
**By owner:**
| Owner | Forgejo repos | GitHub repos |
|---|---|---|
| `s8n-ru` (user) | 5 | 5 |
| `veilor-org` (org) | 3 | 1 |
**By primary host:**
| Primary host | Repos |
|---|---|
| `git.s8n.ru` | 8 (all repos that exist on Forgejo) |
| `github.com` only | 0 |
`git.s8n.ru` is the canonical write side for every migrated repo; GitHub
receives sync-on-commit + 8h interval pushes (`sync_on_commit: true`,
`interval: "8h0m0s"`). The two Forgejo-only repos (`veilor-server`,
`infra`) currently have no GH counterpart — see Anomalies.
---
## 2. Ownership Table
Last-commit timestamps are taken from the default branch tip on Forgejo
(equal to GitHub for all mirrored repos — SHAs verified identical on
2026-05-06). Sizes are Forgejo-reported KiB unless the repo is GH-only.
| Repo | Owner | Visibility | Default | Stars (GH) | Last commit | Size (KiB) | License | Mirror status | Primary host |
|---|---|---|---|---|---|---|---|---|---|
| `x` | `s8n-ru` | private | `master` | 0 | 2026-05-05 13:46 | 283 674 | AGPL-3.0 | git.s8n.ru → github.com (push, 8h, sync_on_commit) | git.s8n.ru |
| `minecraft-launcher` | `s8n-ru` | public | `main` | 0 | 2026-05-05 05:26 | 3 644 | GPL-3.0 | git.s8n.ru → github.com (push, 8h, sync_on_commit) | git.s8n.ru |
| `auth-limbo` | `s8n-ru` | public | `main` | 0 | 2026-05-05 05:09 | 79 | AGPL-3.0 | git.s8n.ru → github.com (push, 8h, sync_on_commit) | git.s8n.ru |
| `minecraft-server` | `s8n-ru` | private | `main` | 0 | 2026-05-04 18:37 | 383 | AGPL-3.0 | git.s8n.ru → github.com (push, 8h, sync_on_commit) | git.s8n.ru |
| `8bit-icons` | `s8n-ru` | private | `main` | 0 | 2026-05-01 21:31 | 554 | AGPL-3.0 | git.s8n.ru → github.com (push, 8h, sync_on_commit) | git.s8n.ru |
| `veilor-os` | `veilor-org` | private | `main` | 0 | 2026-05-06 02:01 | 376 | MIT | git.s8n.ru → github.com (push, 8h, sync_on_commit) | git.s8n.ru |
| `veilor-server` | `veilor-org` | public | `main` | n/a | 2026-05-06 04:00 | 54 | none in tree | **no mirror** — Forgejo-only | git.s8n.ru |
| `infra` | `veilor-org` | private | `main` | n/a | (empty) | 22 | n/a | **no mirror** — Forgejo-only, currently bare | git.s8n.ru |
Notes:
- All push-mirrors target `https://github.com/<owner>/<name>.git`.
- Last sync per push-mirror reported `last_error: ""` on both 2026-05-05 (migration run) and 2026-05-06 (this audit run).
- "Stars (GH)" is 0 for every repo — these are personal/org-internal projects, not public-marketing items.
- License column reflects what GitHub's API picked up; private repos may have a `LICENSE` file in tree without a GH-detected SPDX id (see `s8n-ru/x` and `s8n-ru/8bit-icons` — verified by file presence in tree).
---
## 3. Per-Repo File Trees
Two-level trees (root + first level under each top-level dir). Repos
with > 500 entries are summarised: top-dirs ranked by entry count, with
per-dir totals shown instead of expanding. Listed alphabetically by
`<owner>/<name>`.
### `s8n-ru/8bit-icons`
131 entries. AMOLED pixel-art icon pack for Android (24×24 monochrome).
```
.
├── .github/
│ ├── ISSUE_TEMPLATE/
│ ├── workflows/
│ └── PULL_REQUEST_TEMPLATE.md
├── android-app/
│ ├── app/
│ ├── gradle/
│ ├── .gitignore
│ ├── build.gradle
│ ├── gradle.properties
│ ├── gradlew
│ ├── gradlew.bat
│ └── settings.gradle
├── assets/
│ ├── png/
│ ├── previews/
│ └── svg/
├── docs/
│ └── .gitkeep
├── mappings/
│ ├── aliases.json
│ ├── appfilter.xml
│ └── requests.json
├── scripts/
│ ├── build-appfilter.py
│ ├── lint-icons.py
│ ├── png-to-svg.py
│ ├── svg2png.sh
│ └── sync-android.py
├── .gitignore
├── CHANGELOG.md
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── LICENSE
├── README.md
├── ROADMAP.md
└── STYLE_GUIDE.md
```
### `s8n-ru/auth-limbo`
32 entries. Paper plugin fixing AuthMe post-login teleport race.
```
.
├── .github/
│ ├── ISSUE_TEMPLATE/
│ └── workflows/
├── docs/
│ ├── compatibility.md
│ ├── configuration.md
│ ├── how-it-works.md
│ └── installation.md
├── lib/
│ ├── .gitkeep
│ └── README.md
├── src/
│ └── main/
├── .gitignore
├── CHANGELOG.md
├── LICENSE
├── README.md
└── pom.xml
```
### `s8n-ru/minecraft-launcher`
_Large repo — 1796 entries._
_Top dirs by size:_ `app/` (1529), `libraries/` (154), `program_info/` (32), `cmake/` (22), `docs/` (8), `scripts/` (6), `buildconfig/` (3), `.github/` (2)
```
.
├── .github/ (2 entries)
├── app/ (1529 entries)
├── buildconfig/ (3 entries)
├── cmake/ (22 entries)
├── docs/ (8 entries)
├── libraries/ (154 entries)
├── program_info/ (32 entries)
├── scripts/ (6 entries)
├── tools/ (1 entries)
├── .clang-format
├── .clang-tidy
├── .editorconfig
├── .envrc
├── .git-blame-ignore-revs
├── .gitattributes
├── .gitignore
├── .gitmodules
├── .markdownlint.yaml
├── .markdownlintignore
├── BUILD_AND_DEPLOY_V1.sh
├── BUILD_GUIDE.md
├── CHANGELOG.md
├── CMakeLists.txt
├── CMakePresets.json
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── COPYING.md
├── Containerfile
├── INSTALL_DEPS.sh
├── LICENSE
├── PROJECT_SUMMARY.md
├── README.md
├── README_RELEASE.md
├── RELEASE_CHECKLIST.md
├── default.nix
├── renovate.json
├── shell.nix
├── vcpkg-configuration.json
└── vcpkg.json
```
### `s8n-ru/minecraft-server`
224 entries. racked.ru Minecraft server config + custom plugin (Purpur 1.21.11).
```
.
├── docs/
│ ├── migrations/
│ ├── plugins/
│ ├── BACKUP.md
│ ├── DEPLOY.md
│ ├── PERMISSIONS.md
│ ├── PLUGINS.md
│ ├── PLUGIN_ALTERNATIVES.md
│ ├── RACKED_BRAND.md
│ ├── REBRAND_2026-04-30.md
│ └── ROADMAP.md
├── live-server/
│ ├── plugins/
│ ├── .modrinth-manifest.json
│ ├── .rcon-cli.env
│ ├── .rcon-cli.yaml
│ ├── bukkit.yml
│ ├── commands.yml
│ ├── docker-compose.yml
│ ├── eula.txt
│ ├── help.yml
│ ├── log4j2.xml
│ ├── ops.json
│ ├── permissions.yml
│ ├── pufferfish.yml
│ ├── purpur.yml
│ ├── server.properties
│ ├── spigot.yml
│ ├── wepif.yml
│ └── whitelist.json
├── scripts/
│ └── backup.sh
├── .gitignore
├── LICENSE
├── MISSION.md
├── README.md
├── RULES.md
├── TELEMETRY_AUDIT.md
├── THANKS.md
├── VIBE.md
└── docker-compose.yml
```
### `s8n-ru/x`
_Large repo — 3249 entries._
_Top dirs by size:_ `packages/` (3062), `locales/` (41), `.github/` (35), `scripts/` (19), `cypress/` (11), `assets/` (9), `chart/` (9), `.devcontainer/` (5)
Private fork of Misskey, rebranded as Twitter/X for the `x.veilor` silo. Default branch `master` (mid-migration metadata bug fixed 2026-05-05; see migration report).
```
.
├── .config/ (4 entries)
├── .devcontainer/ (5 entries)
├── .github/ (35 entries)
├── .okteto/ (1 entries)
├── .vscode/ (2 entries)
├── assets/ (9 entries)
├── chart/ (9 entries)
├── cypress/ (11 entries)
├── fluent-emojis/ (0 entries)
├── idea/ (4 entries)
├── locales/ (41 entries)
├── packages/ (3062 entries)
├── patches/ (1 entries)
├── scripts/ (19 entries)
├── .dockerignore
├── .dockleignore
├── .editorconfig
├── .gitattributes
├── .gitignore
├── .gitmodules
├── .node-version
├── .npmrc
├── .vsls.json
├── CHANGELOG-X.md
├── CHANGELOG.md
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── COPYING
├── Dockerfile
├── LICENSE
├── NOTICE.md
├── Procfile
├── README.md
├── ROADMAP-X.md
├── ROADMAP.md
├── SECURITY.md
├── codecov.yml
├── compose.local-db.yml
├── compose_example.yml
├── crowdin.yml
├── cypress.config.ts
├── healthcheck.sh
├── package.json
├── pnpm-lock.yaml
├── pnpm-workspace.yaml
└── renovate.json5
```
### `veilor-org/infra`
Empty repo (no commits). Created 2026-05-06 09:51 BST as the canonical
home for nullstone+cobblestone deploys, runbooks, and audits. Local
working clone is at `~/ai-lab/_github/infra/` but has no commits and no
remote wired up yet. No tree to render.
### `veilor-org/veilor-os`
162 entries. Hardened Fedora KDE remix; primary on Forgejo since 2026-05-06 02:01.
```
.
├── .github/
│ ├── workflows/
│ ├── CODEOWNERS
│ └── PULL_REQUEST_TEMPLATE.md
├── assets/
│ ├── branding/
│ ├── fonts/
│ ├── installer/
│ ├── kde/
│ ├── konsole/
│ ├── plymouth/
│ ├── sddm/
│ └── wallpapers/
├── build/
│ ├── Containerfile
│ └── build-iso.sh
├── docs/
│ ├── research/
│ ├── BUILD.md
│ ├── CLI.md
│ ├── HARDENING.md
│ ├── INSTALL.md
│ ├── INSTALLER.md
│ ├── POWER.md
│ ├── ROADMAP.md
│ ├── STRATEGY.md
│ └── THREAT-MODEL.md
├── kickstart/
│ └── veilor-os.ks
├── overlay/
│ ├── etc/
│ └── usr/
├── scripts/
│ ├── apparmor/
│ ├── selinux/
│ ├── 10-harden-base.sh
│ ├── 20-harden-kernel.sh
│ ├── 30-apply-v03-theme.sh
│ ├── firstboot.sh
│ └── kde-theme-apply.sh
├── test/
│ ├── test-runs/
│ ├── METHOD-CHANGELOG.md
│ ├── README.md
│ ├── TESTING.md
│ ├── auto-install-keymap.sh
│ ├── auto-install.sh
│ ├── boot-checklist.md
│ └── run-vm.sh
├── upstream/
│ ├── fedora-kde-common.ks
│ ├── fedora-live-base.ks
│ ├── fedora-live-kde-base.ks
│ ├── fedora-live-kde.ks
│ ├── fedora-live-minimization.ks
│ ├── fedora-repo-not-rawhide.ks
│ └── fedora-repo.ks
├── .gitignore
├── CHANGELOG.md
├── CONTRIBUTING.md
├── LICENSE
└── README.md
```
### `veilor-org/veilor-server`
7 entries. Hardened-by-default Debian server-install ISO builder.
```
.
├── .gitignore
├── LICENSE
├── README.md
├── build.sh
├── flash.sh
├── post-install.sh
└── preseed.cfg.tpl
```
---
## 4. Anomalies
Items the operator should review.
### Forgejo-only repos with no GitHub mirror
These currently have no off-site mirror at all, and will need a back-fill push if/when a GH counterpart is created later. Both live in `veilor-org`.
| Repo | State | Recommendation |
|---|---|---|
| `veilor-org/veilor-server` | Public on Forgejo, 7 files (Debian preseed bootstrap), no push-mirror | Mirror to `github.com/veilor-org/veilor-server` to match the `veilor-os` policy (Forgejo primary, GH read-only). Memory entry `project_veilor_server_bootstrap` confirms this is intended sibling to veilor-os. |
| `veilor-org/infra` | Private on Forgejo, **empty repo** (no commits, no default branch tip), no push-mirror | Either initial-commit and add a push-mirror to GH, or delete if the canonical infra repo is intended to live elsewhere. Local `~/ai-lab/_github/infra/` has uncommitted content but no `.git/config` remote pointing at this repo. |
### Off-site backup gap
All five `s8n-ru/*` repos and `veilor-org/veilor-os` have a healthy
push-mirror to GH; the two anomalies above do not. If `git.s8n.ru` (on
nullstone) goes down, the two `veilor-org` Forgejo-only repos have **no
remote copy**. This is the only off-site-backup gap in the inventory.
### Missing GitHub org `racked-team`
CLAUDE.md / user memory references a `racked-team` GH org (separate
from `veilor-org`, for the racked.ru Minecraft brand). `gh api` confirms
the handle is **not registered** on github.com: "the owner handle
'racked-team' was not recognized as either a GitHub user or an
organization". The racked-related repos (`minecraft-launcher`,
`minecraft-server`, `auth-limbo`) all live under `s8n-ru/`, not under a
brand-scoped org.
Either:
- the org was never created (memory drift — should be reconciled), or
- the org has a different handle (e.g. `racked-ru`, `rackedteam`).
`gh api /user/orgs` returns only `veilor-org` for the active token; no
other org membership exists for `s8n-ru`.
### Missing/undetected licenses
| Repo | Tree has `LICENSE`? | GH SPDX detected | Note |
|---|---|---|---|
| `s8n-ru/x` | yes (`LICENSE` + `COPYING`) | AGPL-3.0 | OK |
| `s8n-ru/8bit-icons` | yes | AGPL-3.0 | OK |
| `s8n-ru/minecraft-server` | yes | AGPL-3.0 | OK |
| `s8n-ru/auth-limbo` | yes | AGPL-3.0 | OK |
| `s8n-ru/minecraft-launcher` | yes | GPL-3.0 | OK |
| `veilor-org/veilor-os` | yes | MIT | OK |
| `veilor-org/veilor-server` | yes | (no GH copy yet) | Will need GH detection once mirrored. |
| `veilor-org/infra` | n/a | n/a | Empty repo. |
No undetected-license repos. (All public repos surface the correct SPDX id on GitHub.)
### Default-branch hygiene
All repos: default branch matches the active dev branch. The
`s8n-ru/x` `master`-vs-`KisaragiEffective-patch-1` drift caught in the
previous migration run is still resolved (Forgejo & GH both report
`master` at SHA `a2c1ed2…`).
### Archived / dormant
No archived repos on either host. No repos with > 30 days since last
commit. Latest activity per repo (default branch tip):
```
veilor-org/veilor-server 2026-05-06 04:00
veilor-org/veilor-os 2026-05-06 02:01
s8n-ru/x 2026-05-05 13:46
s8n-ru/minecraft-launcher 2026-05-05 05:26
s8n-ru/auth-limbo 2026-05-05 05:09
s8n-ru/minecraft-server 2026-05-04 18:37
s8n-ru/8bit-icons 2026-05-01 21:31
veilor-org/infra (empty)
```
---
_Generated: 2026-05-05. Verifies state of git.s8n.ru and github.com as of the API responses captured during this run. Push-mirror SHAs verified equal between hosts (`s8n-ru/x` `a2c1ed23…`, `s8n-ru/minecraft-launcher` `ae760edd…`, `s8n-ru/auth-limbo` `b6863806…`, `s8n-ru/minecraft-server` `ede60294…`, `s8n-ru/8bit-icons` `42a3252d…`, `veilor-org/veilor-os` `b40e89a3…`)._
@ -0,0 +1,170 @@
# Cobblestone Desktop Environment: Keep or Strip
**Status:** Decision pending operator confirmation of which DE shipped.
**Date:** 2026-05-06
**Scope:** cobblestone (Debian server, fresh install with DE present).
---
## TL;DR
Cobblestone is a service host, not a workstation. The operator already has a Fedora 43 KDE laptop (onyx) for daily driving and a precedent (nullstone) for headless servers. A desktop environment on cobblestone costs ~500 MB RAM, 5–8 GB disk, and an attack surface dominated by Xorg/Wayland plus the DE session manager — none of which earns its keep once the box is in steady state. The honest counter-argument is bring-up convenience: during the first few weeks of migrating Traefik, Forgejo, Authentik, Headscale, step-ca, Matrix (Tuwunel + LiveKit), Misskey, Pi-hole, n8n, and Minecraft, an operator who needs to debug TLS chains or federation handshakes may want a local browser. Recommendation: **strip after a 30-day soak (target 2026-06-05)**, install `cockpit` behind Authentik OIDC at `cobblestone.s8n.ru` for occasional GUI-feeling admin, and treat the bare console (HDMI + USB keyboard) as the recovery path. Strip-now is also defensible if the operator is comfortable doing all bring-up via SSH from onyx — that is genuinely how nullstone runs today.
---
## Side-by-side comparison
| Axis | Keep DE | Strip DE |
|---|---|---|
| RAM idle | ~500 MB | ~50 MB |
| Disk | ~5–8 GB | ~400 MB |
| Attack surface | Xorg/Wayland + DM (sddm/gdm3/lightdm) + ~200 GUI deps + plymouth | sshd + cron + journalctl + dockerd |
| Recovery (network down) | Plug monitor + kbd, GUI login, debug | Plug monitor + kbd, console login, debug |
| Update cadence | Track DE CVEs (KDE Plasma is frequent; GNOME less so; XFCE quiet) | Kernel + sshd + dockerd only |
| Useful when | First 24h bring-up; Firefox to hit internal CA pages; rare on-box troubleshooting | Almost always after week 1 |
**Key insight on recovery:** the GUI login does *not* save you when the network is down. A console login on `tty1` lets you run the same `journalctl`, `ip a`, `systemctl status` commands. The DE adds polish, not capability.
---
## Decision matrix
```
Cobblestone has DE installed
|
+-----------+----------+
| |
Operator works Cobblestone is
mainly on onyx? daily-driver too?
| |
YES NO
| |
+------+------+ KEEP DE
| |
Mid-migration? Settled?
| |
KEEP (soak) STRIP NOW
30-day flip
```
Operator works mainly on onyx (yes), cobblestone is not a daily driver (no). We are mid-migration (services not yet moved). **Path: KEEP for soak, flip on 2026-06-05.**
---
## Recommendation: strip after 30-day soak
1. Leave the DE in place during the migration of the listed services.
2. Calendar a reminder for **2026-06-05** to revisit.
3. On that date, if no service troubleshooting still depends on a local browser/GUI editor, run the strip procedure below.
4. Install `cockpit` immediately (today) regardless — it is useful with or without the DE and gives a soft landing for "I just want to see disk usage".
Why not strip now: Tuwunel federation debugging, Misskey AGPL endpoint validation, and step-ca chain inspection sometimes benefit from a browser pointed at `localhost`. SSH port-forwarding from onyx covers 95% of that, but the first migration of each service is the worst time to discover the 5%.
Why not keep forever: cobblestone is not a workstation. Every Plasma/GNOME CVE becomes a patch obligation for zero return.
---
## Install instead of DE (do this today)
- **cockpit + cockpit-machines + cockpit-podman** — web admin on port 9090. Front it with a Traefik vhost `cobblestone.s8n.ru` behind Authentik OIDC. Drop-in for "show me disk/CPU/services in a UI".
- **lazydocker** — TUI for docker. Faster than `docker ps -a` for daily ops.
- **dive** — image-layer inspector. Useful when an image is 2 GB and you want to know why.
- **glances** — htop with optional web UI on port 61208 (firewall it; cockpit covers most cases).
- **mc** (midnight commander) — file manager replacement for the no-GUI case.
- **Claude Code on cobblestone** — separate decision; not blocking. Running it on cobblestone enables ssh-less ops and lets cron/agent jobs operate on the box natively. If installed, gate it behind the same SSO posture as cockpit.
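
A sketch of the Traefik dynamic config for that cockpit vhost — the middleware and resolver names are assumptions to align with the existing `dynamic/*.yml` files, and cockpit itself must allow the proxied origin via `Origins=` under `[WebService]` in `/etc/cockpit/cockpit.conf`:

```
# dynamic/cockpit.yml (names assumed; match the existing middleware files)
http:
  routers:
    cockpit:
      rule: "Host(`cobblestone.s8n.ru`)"
      entryPoints: [websecure]
      middlewares: [authentik@file]
      service: cockpit
      tls: {certResolver: internal}
  services:
    cockpit:
      loadBalancer:
        servers:
          - url: "http://cobblestone.lan:9090"
```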
---
## Strip commands per DE flavour
The operator has not confirmed which DE shipped. Run `ls /usr/bin/*session* 2>/dev/null; dpkg -l | grep -E 'task-(xfce|gnome|kde|mate|cinnamon)-desktop'` first to identify it.
**Important:** `task-*-desktop` is a meta-package. Removing it alone does NOT remove the desktop — you must remove the actual package set too, then `apt autoremove --purge`. Always run `apt autoremove --purge` with caution: review the list before pressing `y`. It can sweep packages you wanted to keep if a DE dependency was the only reverse-dep.
### XFCE
```
sudo apt remove --purge \
task-xfce-desktop xfce4 xfce4-* \
lightdm lightdm-gtk-greeter \
xorg xserver-xorg* \
plymouth plymouth-themes
sudo apt autoremove --purge
```
### GNOME
```
sudo apt remove --purge \
task-gnome-desktop gnome-shell gnome-session gnome-* \
gdm3 \
xorg xserver-xorg* xwayland \
plymouth plymouth-themes
sudo apt autoremove --purge
```
### KDE Plasma
```
sudo apt remove --purge \
task-kde-desktop kde-plasma-desktop plasma-* kde-* \
sddm sddm-theme-* \
xorg xserver-xorg* xwayland \
plymouth plymouth-themes
sudo apt autoremove --purge
```
### MATE
```
sudo apt remove --purge \
task-mate-desktop mate-desktop-environment mate-* \
lightdm lightdm-gtk-greeter \
xorg xserver-xorg* \
plymouth plymouth-themes
sudo apt autoremove --purge
```
### Cinnamon
```
sudo apt remove --purge \
task-cinnamon-desktop cinnamon cinnamon-* \
lightdm lightdm-gtk-greeter \
xorg xserver-xorg* \
plymouth plymouth-themes
sudo apt autoremove --purge
```
### After any of the above
```
sudo systemctl set-default multi-user.target
sudo systemctl disable --now sddm gdm3 lightdm 2>/dev/null
sudo apt install --no-install-recommends cockpit cockpit-podman mc glances
# lazydocker is not packaged in Debian; install the upstream release binary
sudo reboot
```
Confirm `systemctl get-default` returns `multi-user.target` and `who` shows only ssh/console sessions after reboot.
---
## What breaks when you strip
| Lost capability | Replacement |
|---|---|
| Browser to test internal CA pages | `curl --cacert /etc/step-ca/certs/root_ca.crt https://...` or SSH port-forward from onyx |
| GUI text editor | vim / nano (already installed) |
| File manager | `mc` or shell |
| LightDM/SDDM/GDM autostart | `multi-user.target` (pure systemd) |
| Plymouth boot splash | Plain text scroll (better for debugging boot issues) |
| Local Firefox for OIDC login flows | Port-forward `ssh -L 9090:localhost:9090 cobblestone` from onyx, then hit `http://localhost:9090` in onyx Firefox |
None of these are losses for a service host. The text-scroll boot is arguably an upgrade — Plymouth hides the systemd unit that hung on boot, which is exactly the moment you need to see it.
---
## Open questions for the operator
1. Which DE actually shipped on cobblestone? (XFCE / GNOME / KDE / MATE / Cinnamon)
2. Strip-now or 30-day soak? Default recommendation is soak.
3. Install Claude Code on cobblestone? Out of scope for this doc, but related.
4. Cockpit vhost name confirmed as `cobblestone.s8n.ru`?
---
**Path:** `/home/admin/ai-lab/_github/infra/runbooks/DE-DECISION-cobblestone.md`
@ -0,0 +1,630 @@
<!--
Migration runbook: nullstone → cobblestone
Audience: P M (operator), nullstone Runtime Owner.
Status: DRAFT — pre-cutover. Read sections 1–3 first; sections 4–7 are
executed only on cutover day.
Source-of-truth audits referenced:
- ~/ai-lab/SYSTEM.md
- ~/ai-lab/nullstone-server/audit-report-2026-05-05.md
- ~/ai-lab/nullstone-server/forgejo/deploy-runbook.md
Last updated: 2026-05-06
-->
# Migration runbook — nullstone → cobblestone
Goal: relocate the Docker stack (~28 containers, ~227 GiB state) from
**nullstone** (Debian 13, 192.168.0.100, AMD Ryzen 5 2600X / 32 GiB /
477 GiB NVMe, no LUKS) to **cobblestone** (Debian, fresh, LAN, hardware
TBD by operator), and close audit regression **F4 (no LUKS at rest)**
in the same window.
This runbook is read-only on both hosts until cutover (section 4).
Sections 1–3 are inventory + planning; section 4 is the destructive
cutover; sections 5–7 are follow-through.
## Things we don't know about cobblestone yet — operator to fill in
| Question | Why it matters | Default if unset |
|---|---|---|
| CPU model / cores / threads | Sizing for parallel postgres + Ollama + MC | Assume ≥ Ryzen 5 2600X parity |
| RAM | nullstone's 32 GiB peaks at ~50 % utilisation; less = trim MC + Ollama | Require ≥ 32 GiB |
| Storage layout (LVM? ZFS? plain?) | Decides LUKS strategy in 3a | Assume single NVMe, plain ext4 |
| GPU present (any) | Ollama / vLLM / Misskey thumb GPU helpers | Assume none, leave Ollama on friend RTX 4080 |
| LUKS already enabled at install? | If no → reinstall window or LUKS-on-file fallback | Assume **no** (act accordingly) |
| Static IP allocated? | Cutover plan needs a parking IP | Assume DHCP, target `.101` for cutover |
| DE installed? | Strip vs keep debate | Confirmed installed; default = strip |
| User account name + uid | Bind-mount permissions on /home/docker | Assume `user`, uid 1000 (mirror nullstone) |
Update this table before running section 3.
---
## 1 — Pre-migration audit (run on nullstone)
All commands read-only. SSH as `user@192.168.0.100`
(per `feedback_nullstone_ssh_user.md`; `admin@` is rejected).
### 1.1 Container inventory
```bash
ssh user@192.168.0.100 'docker ps -a --format "{{json .}}"' \
> nullstone-containers-$(date +%F).jsonl
ssh user@192.168.0.100 'docker inspect $(docker ps -aq)' \
> nullstone-inspect-$(date +%F).json
```
Parse for `Names`, `Image`, `Mounts[].Source`, `NetworkSettings.Networks`,
`HostConfig.RestartPolicy`, `Config.Labels` (Traefik routers).
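A jq-free sketch for pulling single fields out of the JSONL capture — the field names match the `docker ps` format keys; the sample line is made up:

```shell
# extract <key> <json-line>: naive flat-JSON field grab, fine for ps output
extract() {
  printf '%s\n' "$2" | sed -n "s/.*\"$1\":\"\([^\"]*\)\".*/\1/p"
}
line='{"Names":"forgejo","Image":"codeberg.org/forgejo/forgejo:14","State":"running"}'
extract Names "$line"   # → forgejo
extract State "$line"   # → running
```

Good enough for inventory triage; fall back to `jq` for the nested `Mounts[]` and `Networks` fields in the full `docker inspect` dump.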
### 1.2 Volumes (size estimate)
```bash
ssh user@192.168.0.100 'docker volume ls --format "{{.Name}}"' \
| xargs -I {} ssh user@192.168.0.100 \
"docker run --rm -v {}:/v alpine du -sh /v 2>/dev/null | sed 's|/v|{}|'"
```
Cross-reference with `/home/user/docker-data/100000.100000/volumes/`
(userns-remapped path) for per-volume bytes.
### 1.3 Network
```bash
ssh user@192.168.0.100 'docker network ls; \
ss -tlnp 2>/dev/null | grep LISTEN; \
iptables-save 2>/dev/null; nft list ruleset 2>/dev/null'
```
Capture Traefik vhosts:
```bash
ssh user@192.168.0.100 'cd /opt/docker/traefik && \
ls dynamic/; cat dynamic/*.yml | grep -E "rule:|sourceRange:"'
```
### 1.4 Cron + scheduled tasks
```bash
ssh user@192.168.0.100 'sudo cat /etc/crontab /etc/cron.d/* 2>/dev/null; \
for u in $(cut -d: -f1 /etc/passwd); do \
crontab -u $u -l 2>/dev/null && echo "(user $u)"; done'
```
Known: `/etc/cron.d/docker-backup` runs `/opt/docker/backup.sh` daily at
02:00 — **broken** (F-backup-1, fix in section 5).
### 1.5 Systemd
```bash
ssh user@192.168.0.100 'systemctl list-unit-files \
--state=enabled --type=service --no-pager'
```
Watch for: `docker.service`, `tailscaled.service`, `ollama.service`
(Ollama runs on host, not in Docker), `chrony.service`, `ssh.service`.
### 1.6 Disk + memory + cpu baseline
```bash
ssh user@192.168.0.100 'df -hT; \
sudo du -sh /home/docker/* /opt/docker/* /opt/backups 2>/dev/null; \
free -h; lscpu | head -20; nproc'
```
Reference (2026-05-06 spot check):
`/` 30 G (37 %) · `/var` 12 G (17 %) · `/home` 399 G (60 %, 226 G used).
Most state is on `/home`.
### 1.7 Daemon config
```bash
ssh user@192.168.0.100 'cat /etc/docker/daemon.json /etc/subuid /etc/subgid; \
sudo cat /etc/systemd/system/docker.service.d/override.conf 2>/dev/null'
```
Known good (carry forward except possibly userns-remap, see 3c):
```json
{
"log-driver": "json-file",
"log-opts": {"max-size": "10m", "max-file": "3"},
"live-restore": true,
"icc": false,
"userns-remap": "default",
"default-address-pools": [{"base": "172.20.0.0/16", "size": 24}],
"storage-driver": "overlay2",
"no-new-privileges": true
}
```
---
## 2 — Secret + state catalog
Anything in this table that is **lost** or **corrupted** during transfer
forces re-issuance / re-pinning / re-handshake. Group by criticality.
### Tier 0 — irreplaceable (lose this and external systems break)
| Path | Bytes (est.) | Restore cost if lost |
|---|---|---|
| `/opt/docker/step-ca/data/secrets/` + `/opt/docker/step-ca/.env` | < 1 MiB | Re-issue every internal cert; reinstall `veilor-root.crt` on every device that uses `*.veilor` / internal-CA chains. Hard. |
| `/opt/docker/traefik/data/acme.json` (LE prod) | < 1 MiB | Hits LE rate-limit (5 dupe certs/wk per FQDN, 50 certs/wk per registered domain). Could lock cert issuance for a full week. |
| `/opt/docker/traefik/data/acme-internal.json` (step-ca chain) | < 1 MiB | Step-ca re-issues fast, but every leaf reissue invalidates pinned trust anchors. |
| `/opt/docker/headscale/config/private.key` + `/opt/docker/headscale/data/db.sqlite` | < 50 MiB | Loss = every node re-enrolls; preauthkeys, routes, ACLs reset. Friend GPU node identity churn. |
| `/etc/ssh/ssh_host_*` | < 1 MiB | Either copy → TOFU pinning stays intact, OR rotate → all clients hit "key changed" warning (acceptable but noisy). |
### Tier 1 — application secrets (loss → password reset cascade)
| Path | Bytes (est.) | Notes |
|---|---|---|
| `/opt/docker/forgejo/data/gitea/conf/app.ini` (note: file is `app.ini` under `gitea/conf/` even on Forgejo) | ~10 KiB | `SECRET_KEY`, `INTERNAL_TOKEN`, `JWT_SECRET`, `LFS_JWT_SECRET`, OAuth client secrets. |
| `/opt/docker/authentik/.env` + authentik PG dump | tens of MiB | `AUTHENTIK_SECRET_KEY`, `PG_PASS`. Any service trusting Authentik OIDC needs `client_secret` re-handover. |
| `/opt/docker/misskey/.env` + misskey PG dump | < 1 MiB env | `id`, `db.user/pass`, `redis.pass`, master key. |
| `/opt/docker/n8n/.env` + n8n PG dump | < 1 MiB env | Encryption key for credentials at rest — **lose this and stored creds inside n8n flows are unrecoverable**. |
| `/opt/docker/rocketchat/.env` + Mongo dump (currently stopped — see 4.1) | < 1 MiB env | First-admin still unclaimed (audit risk item). |
| `/opt/docker/tuwunel*/etc/tuwunel.toml` | < 1 MiB | Server signing key seed; lose = federation re-onboard from zero. |
| `/opt/docker/livekit/livekit.yaml` | < 1 KiB | `keys:` map (api-key → secret); JWT minter (`lk-jwt-service`) shares this. |
| `/opt/docker/pihole/etc-pihole/` | ~50 MiB | Adlists + custom DNS; rebuildable in 30 min if lost. |
| Gandi PAT (`GANDIV5_PERSONAL_ACCESS_TOKEN` in `/opt/docker/traefik/.env`) | <1 KiB | Re-issuable from Gandi UI; LiveDNS-only scope (per `reference_gandi_api.md`). |
| Tailscale auth keys (Headscale) | regenerate via `headscale preauthkeys create` | OK to regenerate. |
### Tier 2 — bulk data (large, but reproducible OR low-stakes)
| Path | Bytes (est.) | Notes |
|---|---|---|
| Misskey `/files/` (S3-style local) | tens of GiB | User uploads — irreplaceable to users. Dedup-friendly. |
| Forgejo `/home/docker/forgejo/data/git/` | ~5 GiB now | Git repos; also mirrored to GH per `project_forgejo_nullstone.md`, so partial DR exists. |
| `dl-veilor` static files | ~1 GiB | Public ISO downloads; rebuildable from veilor-os pipeline. |
| n8n flows (in `n8n_n8n_data`) | < 1 GiB | Encrypted with key from Tier 1; export JSON via UI as belt-and-braces. |
| Minecraft world (`/home/docker/minecraft/data/`) | ~10–30 GiB | Players will riot if lost. |
| Ollama models (`/home/user/models/ollama/`) | ~17 GiB | Re-downloadable from registry; not blocking. |
| Postgres dumps (authentik, misskey-db, n8n-postgres) | covered by `pg_dumpall` in 4.1 | |
| MongoDB dump (rocketchat-mongodb) | covered by `mongodump` in 4.1 | Container is **stopped** today — start, dump, stop. |
### Tier 3 — config-as-code (safely re-deployable from `~/ai-lab/_github/`)
- All `/opt/docker/*/docker-compose.yml` — committed under
`~/ai-lab/_github/infra/repos/` and `~/ai-lab/nullstone-server/`.
- Traefik `dynamic/*.yml` middleware files.
- Treat as authoritative in repo; copy from repo to cobblestone, not
from nullstone. Diff old-compose vs repo-compose during section 3d to
catch any uncommitted drift.
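
The 3d drift check can be scripted per stack. A sketch; the real-use paths in the comments are hypothetical and must match the actual repo layout:

```shell
# drift_check <live-file> <repo-file>: prints "clean" or "DRIFT"
drift_check() {
  if diff -u "$2" "$1" >/dev/null 2>&1; then echo clean; else echo DRIFT; fi
}
# Real use (paths hypothetical; adjust to the repo layout):
#   ssh user@192.168.0.100 'cat /opt/docker/traefik/docker-compose.yml' >/tmp/live.yml
#   drift_check /tmp/live.yml "$HOME/ai-lab/_github/infra/repos/traefik/docker-compose.yml"
```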
---
## 3 — Cobblestone install plan
### 3a — OS layer
Verify base:
```bash
ssh user@cobblestone 'cat /etc/debian_version; uname -r; lsb_release -a'
```
**LUKS2 (mandatory — closes F4):**
- **Path A (preferred):** reinstall with full-disk LUKS2 from the
Debian installer (`/`, `/home`, swap all on encrypted PVs). Set up
TPM2 unattended unlock post-install:
```bash
systemd-cryptenroll --tpm2-device=auto --tpm2-pcrs=0+7 /dev/nvmeXnYpZ
```
PCR 0+7 binds to firmware + secure-boot state; bricks if firmware
is updated → fall back to passphrase.
- **Path B (fallback if reinstall blocked):** LUKS-on-file loopback
for the high-value subset only:
- `/opt/docker/step-ca/`
- `/opt/docker/traefik/data/acme*.json`
- `/opt/docker/headscale/`
- postgres data dirs
- Mongo keyfile volume
This is **strictly worse** than Path A (rest of disk still
cleartext, including misskey uploads and forgejo repos), but it
closes the highest-value subset. Document as accepted risk.
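
Path B's loopback pattern, sketched — the file path and 10G size are placeholders, it all runs as root, and the bind-mounts must be in place before `docker compose up`:

```bash
fallocate -l 10G /opt/luks/highvalue.img
cryptsetup luksFormat --type luks2 /opt/luks/highvalue.img
cryptsetup open /opt/luks/highvalue.img highvalue   # prompts for passphrase
mkfs.ext4 /dev/mapper/highvalue
mkdir -p /mnt/highvalue
mount /dev/mapper/highvalue /mnt/highvalue
# Relocate a high-value dir onto the encrypted mount, then bind it back:
mv /opt/docker/step-ca /mnt/highvalue/step-ca
mkdir /opt/docker/step-ca
mount --bind /mnt/highvalue/step-ca /opt/docker/step-ca
```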
Hostname + base packages:
```bash
sudo hostnamectl set-hostname cobblestone
sudo apt update && sudo apt install -y \
curl ca-certificates gnupg jq ufw fail2ban chrony \
rsync restic tmux htop iotop ncdu
```
**DE strip vs keep — recommendation: STRIP.**
Cost of keeping: ~500 MiB RAM, ~5 GiB disk, larger attack surface
(CUPS, avahi, polkit, GUI daemons on localhost). Benefit: local
browser for vhost testing, on-keyboard recovery if SSH wedges.
- **Default (strip):** `sudo apt purge '*-desktop' '*xorg*' lightdm
sddm gdm3 'plymouth*' 'libreoffice-*' && sudo apt autoremove --purge`.
Install Cockpit for web admin behind Traefik + `no-guest@file`.
- **Keep:** lock SDDM/GDM local-only via PAM, disable XDMCP, mask
`cups-browsed`. No auto-login.
Operator picks; document choice in SYSTEM.md.
### 3b — Network
**IP allocation during cutover** — use `192.168.0.101` for
cobblestone while nullstone stays on `.100`. Flip DNS / port-forwards
last (section 4.6). Avoids ARP collisions and keeps rollback trivial.
**nftables ruleset** (mirror nullstone pattern — read live ruleset off
nullstone in 1.3, replay on cobblestone):
```bash
sudo systemctl enable --now nftables
# Drop in /etc/nftables.conf with:
# - default policy drop on input
# - accept established/related
# - accept lo
# - accept 22 (SSH) from LAN + tailnet
# - accept 80/443 (Traefik) from anywhere
# - accept 222 (Forgejo SSH) from LAN + tailnet
# - accept 25565 (Minecraft) from anywhere
# - log+drop everything else
```
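Spelled out, the commented policy above maps to roughly this
`/etc/nftables.conf` (LAN and tailnet CIDRs are assumed from elsewhere
in this runbook; diff against the live nullstone ruleset captured in
1.3 before enabling):

```
#!/usr/sbin/nft -f
flush ruleset
table inet filter {
  chain input {
    type filter hook input priority 0; policy drop;
    ct state established,related accept
    ct state invalid drop
    iif "lo" accept
    ip saddr { 192.168.0.0/24, 100.64.0.0/10 } tcp dport { 22, 222 } accept
    tcp dport { 80, 443 } accept   # Traefik
    tcp dport 25565 accept         # Minecraft
    log prefix "nft-drop: " drop
  }
  chain forward { type filter hook forward priority 0; policy drop; }
}
```

`sudo nft -c -f /etc/nftables.conf` syntax-checks without loading.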
**Forwarding sysctls:** the audit reports nullstone has `net.ipv4.ip_forward=1` (F30).
That was an *unintended carryover* from a Tailscale subnet-router
experiment. **Do NOT** copy `/etc/sysctl.d/` from nullstone wholesale.
Instead, set explicitly:
```bash
sudo tee /etc/sysctl.d/99-cobblestone.conf <<'EOF'
net.ipv4.ip_forward = 0
net.ipv6.conf.all.forwarding = 0
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.all.accept_redirects = 0
EOF
sudo sysctl --system
```
If Headscale or Tailscale subnet-router is wired later, re-enable
`ip_forward` with explicit comment + audit note.
**Tailscale + Headscale node identity:**
- Cleanest path: re-enroll cobblestone from scratch. New node, new
node-key, list `cobblestone` separately from `nullstone` in
Headscale during cutover week.
- Alternative: copy `/var/lib/tailscale/` from nullstone → cobblestone
to inherit the existing identity. Saves one ACL update but
conflates audit history. Not recommended.
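Re-enrollment is two commands. The pre-auth-key flags below are an
assumption that varies across Headscale releases (check
`headscale preauthkeys create --help`), and the user name is a
placeholder:

```bash
# On nullstone (Headscale still lives there until cutover):
ssh user@192.168.0.100 \
  'docker exec headscale headscale preauthkeys create --user default --expiration 1h'
# On cobblestone, with the key from above:
sudo tailscale up --login-server https://hs.s8n.ru \
  --hostname cobblestone --authkey <key>
```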
### 3c — Docker
Install via official repo:
```bash
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/debian/gpg | \
  sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] \
https://download.docker.com/linux/debian $(lsb_release -cs) stable" | \
sudo tee /etc/apt/sources.list.d/docker.list
sudo apt update && sudo apt install -y \
docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
```
**`/etc/docker/daemon.json` — userns-remap decision.**
Two paths; operator decides. Document choice in SYSTEM.md.
**Path 1 — DROP userns-remap (recommended):** same JSON as nullstone
minus the `userns-remap` line.
- Pros: no more `chown 101000` dance; nsenter trick
(`feedback_docker_sudo_bypass.md`) drops the `--userns=host` flag;
Mongo keyfile pattern from `project_nullstone_docker_userns.md`
becomes unnecessary; `docker exec` UIDs match host 1:1.
- Cons: container root → host uid 0. Compensated by
`no-new-privileges`, `icc=false`, per-compose CAP drops, read-only
root FS where compatible. Net: small regression in defense-in-depth,
large workflow simplification.
**Path 2 — KEEP userns-remap:** carry `/etc/subuid` + `/etc/subgid`
identically (`user:100000:65536`). Existing on-disk ownership at uid
`101000` transfers without rechown. Cost: persisting the daily
friction the operator has been hitting for months.
**Default: Path 1.** If chosen, after rsync:
```bash
sudo chown -R user:user /home/docker /opt/docker
# Then per-service to the container uid (forgejo 1000, postgres 999,
# mongo 999, traefik 0).
```
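Path 1's `daemon.json` would then look roughly like this. The
log-rotation keys are an assumption, not read from nullstone; diff
against the file copied over before writing it:

```json
{
  "icc": false,
  "no-new-privileges": true,
  "log-driver": "json-file",
  "log-opts": { "max-size": "10m", "max-file": "3" }
}
```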
Networks (must exist before Traefik comes up):
```bash
docker network create proxy
docker network create socket-proxy-net
docker network create misskey-frontend
```
### 3d — Service redeploy order
Topological. Each step depends only on its predecessors. Verification
command and rollback at each stage.
| # | Stack | Depends on | Verify | Rollback |
|---|---|---|---|---|
| 1 | networks (`proxy`, `socket-proxy-net`, `misskey-frontend`) | docker daemon | `docker network ls` | `docker network rm` |
| 2 | `socket-proxy` | network `socket-proxy-net` | `docker logs socket-proxy` shows API filter active | down compose |
| 3 | `traefik` | socket-proxy + acme.json/acme-internal.json carryover + Gandi PAT in .env | `curl -k https://sys.s8n.ru` returns dashboard auth challenge; `docker logs traefik` shows resolver init OK; cert files repopulate without LE call (acme.json reuse) | down compose; acme.json restore from backup |
| 4 | `step-ca` | traefik (for ACME-back) | `docker exec step-ca step ca health`; Traefik internal-CA resolver issues a cert against `https://step-ca:9000/acme/acme/directory` | down compose; revert traefik resolver config |
| 5 | `headscale` | traefik | `curl https://hs.s8n.ru/health`; `docker exec headscale headscale nodes list` shows existing nodes (db.sqlite carryover) | down compose; restore db.sqlite snapshot |
| 6 | authentik (`postgres → redis → server → worker`) | traefik | `curl https://auth.s8n.ru/-/health/ready/`; OIDC discovery doc loads | per-component down |
| 7 | `forgejo` | traefik (+ optional authentik, currently unwired) | `curl https://git.s8n.ru/api/v1/version`; `git clone ssh://git@cobblestone:222/...` | down compose; data dir tar-revert |
| 8 | misskey (`db → redis → misskey → x-source`) | traefik, network `misskey-frontend` | `curl https://x.veilor/api/meta` returns JSON; signup page renders | down compose; pg dump restore |
| 9 | `tuwunel` + `tuwunel-txt` | traefik | `curl https://matrix.veilor.uk/_matrix/federation/v1/version` and `https://mx.s8n.ru/_matrix/federation/v1/version` | down compose; data tar-revert |
| 10 | `cinny-txt` + `commet-web` + `signup-page` + `signup-txt` | tuwunel reachable, traefik | `curl -I https://txt.s8n.ru` 200; static assets 200 | down compose |
| 11 | `livekit-server` + `lk-jwt-service` | traefik (TURN over HTTPS) | `wscat -c wss://livekit.veilor.uk`; jwt service `/healthz` | down compose |
| 12 | n8n (`postgres → n8n`) | traefik, restored encryption key | `curl https://n8n.s8n.ru/healthz`; UI loads with existing flows | pg dump restore |
| 13 | `pihole` | traefik | `dig @cobblestone s8n.ru +short` resolves; admin UI auth | down compose |
| 14 | `forgejo-runner` | forgejo (#7) reachable on internal name | `docker logs forgejo-runner` shows `Runner registered successfully` | down compose; regenerate token via `forgejo actions generate-runner-token` |
| 15 | `minecraft-mc` | traefik (only for filebrowser-mc), router port-forward 25565 | `mcstatus mc.racked.ru` (or `nc -zv cobblestone 25565`) | down compose; world tar-revert |
| 16 | `dl-veilor` + `filebrowser-mc` | traefik | `curl https://dl.veilor.org/v0.2.0/veilor-root.crt` | down compose |
| 17 | `anythingllm` | traefik **with `no-guest@file` middleware applied** OR LAN-only bind — must NOT bring up like nullstone (port 3001 publicly exposed, audit F-anythingllm-1) | `curl -I -H 'Host: ai.s8n.ru' https://cobblestone` from off-LAN must 403 | down compose |
| 18 | RocketChat (`mongodb → rocketchat`) | **operator decision** — currently stopped on nullstone; if not retired, restore from mongodump produced in 4.1 | `curl https://rc.s8n.ru/api/info`; first-admin claim if still pending | leave stopped (matches today's state) |
---
## 4 — Cutover sequence
### 4.1 — Snapshot state on nullstone
```bash
NS=user@192.168.0.100
TS=$(date +%F-%H%M)
DEST=/opt/snap/$TS
sudo mkdir -p $DEST && sudo chown "$USER" $DEST  # local path: dumps land on the workstation, not nullstone
# Postgres dumps
for pg in authentik-postgres misskey-db n8n-postgres-1; do
ssh $NS "docker exec $pg pg_dumpall -U postgres" \
| gzip > $DEST/$pg.sql.gz
done
# Mongo (start, dump, stop again — currently stopped per audit)
ssh $NS 'cd /opt/docker/rocketchat && docker compose up -d rocketchat-mongodb && sleep 15'
ssh $NS 'docker exec rocketchat-mongodb mongodump \
--username root \
--password "$(grep MONGO_INITDB_ROOT_PASSWORD /opt/docker/rocketchat/.env | cut -d= -f2)" \
--authenticationDatabase admin --archive' \
| gzip > $DEST/rocketchat.archive.gz
ssh $NS 'cd /opt/docker/rocketchat && docker compose stop rocketchat-mongodb'
# Forgejo full dump (covers DB + repos + LFS + attachments)
ssh $NS 'docker exec -u 1000 forgejo \
forgejo dump --type tar.zst --file /tmp/forgejo-dump.tar.zst'
# docker cp to stdout wraps the file in a tar stream; unwrap with tar -xO
ssh $NS 'docker cp forgejo:/tmp/forgejo-dump.tar.zst - | tar -xO' \
  > $DEST/forgejo-dump.tar.zst
# Stop everything before tar (consistency)
ssh $NS 'for d in /opt/docker/*/; do \
[ -f "$d/docker-compose.yml" ] && \
(cd "$d" && docker compose down) ; \
done'
# Bulk state tar
ssh $NS "sudo tar --acls --xattrs -cpf - /opt/docker /home/docker /opt/backups" \
| zstd -T0 -19 > $DEST.tar.zst
# Manifest
ssh $NS "sudo find /opt/docker /home/docker -type f -print0 \
  | sudo xargs -0 sha256sum" > $DEST.sha256   # sudo: uid-101000 files
```
Hold the tarball plus dumps in two places: cobblestone target host
and an offline USB. `acme.json` and step-ca secrets get an
*additional* armored copy to the password manager.
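For the armored copies, something like the following (output filenames
are arbitrary; gpg prompts for a symmetric passphrase):

```bash
gpg --symmetric --armor --output acme.json.asc \
  /opt/docker/traefik/data/acme.json
sudo tar -C /opt/docker/step-ca -c .env data/secrets \
  | gpg --symmetric --armor --output step-ca-secrets.tar.asc
```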
### 4.2 — rsync to cobblestone
After the tarball lands, repopulate cobblestone:
```bash
COBB=user@192.168.0.101
rsync -aP $DEST.tar.zst $COBB:/tmp/snap.tar.zst
ssh $COBB 'sudo mkdir -p /opt/docker /home/docker /opt/backups && \
sudo zstd -d /tmp/snap.tar.zst -o /tmp/snap.tar && \
sudo tar --acls --xattrs -xpf /tmp/snap.tar -C /'
# If userns-remap dropped (Path 1 in 3c):
ssh $COBB 'sudo chown -R user:user /opt/docker /home/docker'
```
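An optional integrity gate before bringing anything up: replay the 4.1
manifest on cobblestone (slow over the misskey/forgejo trees, but it
catches truncated transfers):

```bash
scp $DEST.sha256 $COBB:/tmp/snap.sha256
ssh $COBB 'sudo sha256sum --check --quiet /tmp/snap.sha256 && echo restore-verified'
```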
### 4.3 — Bring up services on cobblestone
Walk section 3d table top to bottom. **Stop and verify** at each row
before the next. Don't batch — one bad startup cascades.
For services that store internal hostnames (Tuwunel `server_name`,
Headscale `server_url`, Forgejo `ROOT_URL`), the values stay the same
because public DNS still resolves to the WAN IP — only the internal LAN
target changes. No app config edits needed for cutover.
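The per-row discipline can be scripted as a gate helper. A sketch, not
the canonical procedure: the verify commands abbreviate the first three
table rows (add one `gate` line per row), and `DRYRUN` defaults to
printing the plan (export `DRYRUN=0` to execute):

```bash
: "${DRYRUN:=1}"   # default: print the plan; DRYRUN=0 executes
run() { echo "+ $*"; [ "${DRYRUN:-0}" = 1 ] || "$@"; }
gate() {
  local dir=$1; shift
  run bash -c "cd /opt/docker/$dir && docker compose up -d" || return 1
  run sleep 15
  run "$@" || { echo "GATE FAILED: $dir; stop, roll back per table" >&2; return 1; }
  echo "OK: $dir"
}
gate socket-proxy docker logs socket-proxy
gate traefik      curl -ksfo /dev/null https://sys.s8n.ru
gate step-ca      docker exec step-ca step ca health
```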
### 4.4 — Verify per vhost
```bash
for host in sys.s8n.ru git.s8n.ru auth.s8n.ru pihole.s8n.ru \
signup.txt.s8n.ru hs.s8n.ru rc.s8n.ru n8n.s8n.ru \
txt.s8n.ru mx.s8n.ru x.veilor matrix.veilor.uk \
chat.veilor.uk livekit.veilor.uk signup.veilor.uk \
dl.veilor.org; do
echo -n "$host: "
curl --resolve $host:443:192.168.0.101 -sI https://$host | head -1
done
```
Then push key flows:
- `git push nullstone-remote` (alias still works because DNS is
unchanged) — Forgejo CI runs.
- Matrix federation: `curl https://federationtester.matrix.org/api/report?server_name=veilor.uk`.
- Misskey signup: hit invite-gated form, complete signup, federation
test post.
### 4.5 — Cutover network
Two paths; operator picks based on appetite.
**Path A — DNS swing** (lower risk, slower propagation):
1. Lower `*.s8n.ru` and `*.veilor*` A-record TTLs to 60 s **a week
before** cutover (Gandi UI; can't be done via API per
`reference_gandi_api.md`).
2. Day-of: change A records from `82.31.156.86` (assumed unchanged
public IP) only if the WAN NAT target has changed (e.g. router
port-forwards now point at `.101`). If WAN IP and port-forwards
stay the same and you swap LAN IPs (`.100` → `.101`), no public
DNS edit needed — only edit `/etc/hosts` on internal clients (per
`feedback_s8n_hosts_override.md`).
**Path B — IP takeover** (faster, higher rollback friction):
- Bring nullstone down on `.100`, change cobblestone from `.101` to
  `.100`, restart networking. Public DNS + router port-forwards stay
  unchanged. Rollback = swap IPs back.
Update onyx `/etc/hosts` long pin line **last** (`/etc/hosts` has no
line-continuation syntax; keep it one physical line):
```
192.168.0.<new> rc.s8n.ru n8n.s8n.ru pihole.s8n.ru sys.s8n.ru mx.s8n.ru txt.s8n.ru signup.txt.s8n.ru git.s8n.ru x.veilor dl.veilor.org
```
### 4.6 — Update memory + ai-lab docs
- `~/ai-lab/CLAUDE.md` — Device Registry: add `cobblestone` row, mark
`nullstone` as `decom 2026-MM-DD`.
- `~/ai-lab/SYSTEM.md` — replace nullstone hardware/network blocks
with cobblestone equivalents; keep nullstone as "cold spare" until
wipe.
- `~/ai-lab/README.md` — device table one-liner.
- `~/ai-lab/security/` — create `cobblestone-server/` folder; first
audit due within 7 days of cutover.
- Memory files to update: `project_nullstone_docker_userns.md`
(mark **superseded** if userns-remap dropped),
`project_forgejo_nullstone.md`,
`project_rocketchat_nullstone.md`, `project_tailscale_mesh.md`,
`feedback_nullstone_ssh_user.md`, `feedback_s8n_hosts_override.md`
(new IP).
### 4.7 — Cold spare + wipe
- Hold nullstone powered-off but cabled, 7 days minimum.
- If no rollback triggered, wipe: full LUKS reformat (or
  `nvme format --ses=1 /dev/nvmeXnY` for crypto-erase if the drive
  supports it), then either donate or repurpose as cobblestone backup
  target (Restic destination — closes audit recommendation #6).
---
## 5 — Post-migration immediate fixes
Carried over from `nullstone-server/audit-report-2026-05-05.md`:
- **F-backup-1 — fix `/opt/docker/backup.sh`:** remove dead
`matrix-postgres` block (Synapse retired); correct
`rocketchat-mongodb` container name; replace literal
`CHANGE_ME_MONGO_ADMIN_PASSWORD` with read from
`/opt/docker/rocketchat/.env`. Verify next 02:00 run produces
non-zero RC + Mongo dumps.
- **no-guest@file ACL:** populate `sourceRange` to cover LAN
(`192.168.0.0/24`) + tailnet (`100.64.0.0/10`) + IPv6 equivalents.
Verify XFF chain restores client IP at the entryPoint level
(`forwardedHeaders.trustedIPs`).
- **anythingllm:** front via Traefik with `no-guest@file` OR bind
LAN-only. Must not repeat the 0.0.0.0:3001 nullstone state.
- **LUKS:** done at install (3a). Verify via `cryptsetup status` +
`systemd-cryptenroll --tpm2-device=list` post-cutover.
- **Restic + autorestic** to B2/Wasabi or to nullstone-as-spare,
with restore drill scheduled.
- **Vaultwarden** to centralize the secrets currently sprayed across
`.env` files.
- **Gatus** with cert-expiry checks + ntfy/Matrix alerts.
- **CrowdSec** with bouncer plugin at Traefik for the public
HTTP attack surface.
- **Beszel** for one-pane host metrics.
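For the Restic item, a starting-point `.autorestic.yml` (backend type,
bucket path, and schedule are placeholders; wire real credentials via
environment variables, never inline):

```yaml
version: 2
backends:
  offsite:
    type: b2
    path: 'bucket-name:cobblestone'
locations:
  docker-state:
    from: /opt/docker
    to: offsite
    cron: '0 3 * * *'
    hooks:
      before: 'docker exec misskey-db pg_dumpall -U postgres > /opt/docker/pg-pre-backup.sql'
```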
---
## 6 — Open questions (operator decisions)
| Question | Default if undecided |
|---|---|
| Strip DE on cobblestone? | **Strip + Cockpit.** Easier to defend; remote admin via web UI through Traefik + no-guest@file. |
| userns-remap on cobblestone? | **Off (Path 1 in 3c).** Operator pain outweighs the marginal isolation. Document tradeoff. |
| Move Headscale + step-ca to a $4 VPS? | **Defer (phase 2).** Keep on cobblestone for now; revisit once Restic + Gatus are running. SPOF mitigation is real but adds attack surface; do it once monitoring is in place. |
| RocketChat: bring back up or retire? | **Retire if not used in 30 days.** Currently stopped; first-admin still unclaimed. Mongo dump captured in 4.1, then drop the stack from cobblestone redeploy. Keeps `rc.s8n.ru` DNS for future revival. |
| Tailscale identity copy vs re-enroll for cobblestone? | **Re-enroll** (cleaner audit trail; Headscale ACLs need a one-line edit). |
| SSH host keys copy vs rotate? | **Copy.** TOFU pinning intact; one less "is this MITM?" prompt for clients. Add rotation to a follow-up cron. |
| Authentik wiring during cutover or after? | **After.** Authentik is currently mostly unwired (audit). Cutover is not the time to add new auth dependencies. |
---
## 7 — Risks (severity-tagged)
- 🔴 **acme.json mishandling = LE rate-limit.** Mitigation: copy
`acme.json` + `acme-internal.json` BEFORE bringing up Traefik on
cobblestone. Never let cobblestone Traefik issue a fresh batch of
certs. Hold a backup of both files in two locations.
- 🔴 **step-ca root key loss = full re-issuance.** Mitigation:
triple-copy `/opt/docker/step-ca/.env` + `data/secrets/`
(cobblestone, USB, password manager). Test that the encrypted root
key decrypts on cobblestone before tearing down nullstone.
- 🔴 **anythingllm reintroduces public 0.0.0.0:3001.** Mitigation: do
NOT bring it up before middleware is in place. Test from off-LAN
IP.
- 🟠 **PostgreSQL major-version skew.** Mitigation: pin same major on
cobblestone (`postgres:16-alpine` already pinned; do NOT use
`:latest`). If a major upgrade is desired, do it as a separate
step *after* cutover settles, with a fresh pg_dumpall as safety
net.
- 🟠 **Headscale node identity churn** if `db.sqlite` not copied. All
nodes (onyx, friend RTX 4080 PC, office) re-enroll. Mitigation:
copy `db.sqlite` + `private.key`; verify `headscale nodes list`
matches pre-cutover before flipping DNS.
- 🟡 **chrony NTS peers** may need re-trust on new host (NTS-KE binds
to hostname). Mitigation: chrony config copy verbatim; first
`chronyc tracking` should show stratum within 5 minutes.
- 🟡 **Authentik OIDC `client_secret`s.** Today: mostly unwired
(audit). Risk small. If Forgejo/RC/n8n were wired through
Authentik, each `client_secret` would need re-handover. Defer
Authentik wiring until post-cutover.
- 🟡 **Misskey AGPL §13 source endpoint** (`x-source`). Per
`project_x_misskey_fork.md`, the AGPL link must keep serving
source — and per the same memo, mute is acceptable for short
windows. Cutover downtime budget: **≤ 2 h**. If exceeded, post a
banner on `x.veilor` linking to `https://git.s8n.ru/s8n-ru/x` for
the duration.
- 🟡 **Backup script broken on copy.** Audit F-backup-1 still applies
if you copy `/opt/docker/backup.sh` verbatim. Fix during section 5,
not before — but do not let it run on cobblestone before fix
(disable the cron entry until corrected).
---
## Appendix — quick reference
- nullstone: `user@192.168.0.100`, Debian 13, 32 GiB / 477 GiB, ~28
containers, no LUKS (F4).
- cobblestone: `user@192.168.0.101` during cutover, swing to `.100`
post-validation.
- LE wildcard `*.s8n.ru` + `*.veilor.uk` via Gandi DNS-01. Internal CA
via step-ca, Traefik resolver `internal-ca`.
- Out of scope: office workstation install, friend GPU re-enrollment,
veilor-os ISO build pipeline.
---
**Path:** `/home/admin/ai-lab/_github/infra/runbooks/MIGRATION-nullstone-to-cobblestone.md`
Two-line summary: pre-migration audit + secret catalog + cobblestone
install plan (LUKS2, optional userns-remap drop, 18-step topological
service redeploy) + cutover script + post-migration fixes carried over
from the 2026-05-05 audit. Operator must fill the "things we don't know
about cobblestone" table and decide on userns-remap / DE / RC retirement
before section 3 runs.