# Infra state — 2026-05-06
Source of truth for **what is true now** + **what is pending**.
When state changes, add an entry at the top of "Changelog" and edit the
relevant table/section. Don't rewrite history.
## Forge
**Primary git host: <https://git.s8n.ru/> (Forgejo).** Forgejo is the
ONLY source of truth. When the operator says "my git", they mean
Forgejo.
**Push-mirror to GitHub is OFF by default** (changed 2026-05-06).
Operator works privately on Forgejo. Push to GitHub happens only when
explicitly requested for a specific repo. GitHub copies that exist
right now are point-in-time snapshots from before the mirror was
disabled — they will go stale.
- Forgejo: <https://git.s8n.ru/> (LE cert, `no-guest@file` ACL)
- Forgejo SSH: `ssh://git@192.168.0.100:222/<owner>/<repo>.git`
(LAN only; router port-forward 222 not yet configured)
- Admin user: `s8n-ru` (NOT `admin` — reserved by Forgejo)
- Push-mirror to GH: disabled for all repos since 2026-05-06 (was: every
commit + 8h interval)
- Forgejo runner: registered on nullstone, labels `ubuntu-24.04` and
`nullstone` (privileged Fedora 43 for ISO builds)
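
Since the mirrors are off, a push to GitHub is a deliberate per-repo step. A minimal sketch of such a one-off push, assuming GitHub access over SSH with a key already loaded. The helper names `github_url` and `push_once` and the `DRY_RUN` variable are hypothetical and not part of this repo:

```shell
# Hypothetical helpers for a one-off, manual push of one branch to GitHub.
# Assumption: GitHub is reached over SSH and the key is already loaded.

# Build the GitHub SSH URL for <owner>/<repo>.
github_url() {
    printf 'git@github.com:%s/%s.git\n' "$1" "$2"
}

# Push branch $3 of the current checkout to GitHub repo $1/$2.
# Prints the command by default; set DRY_RUN=0 to actually run it.
push_once() {
    url=$(github_url "$1" "$2")
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf 'git push %s %s\n' "$url" "$3"
    else
        git push "$url" "$3"
    fi
}
```

`push_once veilor-org veilor-os main` prints the command it would run. Pushing to a URL rather than a named remote leaves no standing GitHub remote in the checkout, which fits the "only when explicitly requested" policy.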
## Hosts
| codename | role | LAN IP | OS | LUKS | Status |
|---|---|---|---|---|---|
| onyx | dev workstation | 192.168.0.28 (DHCP, registry says .6 — drift) | Fedora 43 KDE | yes | active |
| nullstone | infra (migrating off) | 192.168.0.100 | Debian 13 | **NO** ⚠️ | active until cutover |
| office | workstation | 192.168.0.5 | Fedora 43 KDE (pending install since 2026-04-19) | tbd | not yet on net |
| **cobblestone** | **infra (target)** | **TBD** | **Debian, has DE** | **TBD — install with LUKS** | **fresh, awaiting access details** |

Mesh:
- Tailscale + Headscale (`hs.s8n.ru` on nullstone) — control plane
moves to cobblestone with the migration. Identity continuity =
carry `/var/lib/tailscale/state` OR re-enroll.
- Friend PC (`100.64.0.3`, RTX 4080) — vLLM in WSL2 over tailnet
for remote LLM inference.
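
The LUKS column in the host table can be verified on each host rather than recorded from memory. A minimal sketch, assuming util-linux `lsblk` is available (it reports LUKS containers with the filesystem type `crypto_LUKS`); the function name `has_luks` is hypothetical:

```shell
# Hypothetical check: does this host have at least one LUKS container?
# Assumption: util-linux lsblk is installed; LUKS volumes show up with
# FSTYPE "crypto_LUKS".
has_luks() {
    lsblk -n -o FSTYPE | grep -q 'crypto_LUKS'
}
```

Run on a host, `has_luks && echo "LUKS: yes" || echo "LUKS: NO"` gives a value that can be pasted into the table.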
## Repos (9 total)
| Repo | Owner | Forgejo | GH mirror | Notes |
|---|---|---|---|---|
| veilor-os | veilor-org | ✅ primary | snapshot 2026-05-06 (stale from now) | hardened Fedora KDE remix |
| veilor-server | veilor-org | ✅ primary | **DELETED from GH 2026-05-06** | Debian preseed bootstrap |
| infra | veilor-org | ✅ primary | **DELETED from GH 2026-05-06** | this repo |
| x | s8n-ru | ✅ primary | **DELETED from GH 2026-05-06** | private Misskey fork |
| minecraft-launcher | s8n-ru | ✅ primary | snapshot 2026-05-06 (stale) | racked.ru launcher |
| minecraft-server | s8n-ru | ✅ primary | snapshot 2026-05-06 (stale) | racked.ru MC server |
| minecraft-client | s8n-ru | ✅ primary | snapshot 2026-05-06 (stale) | racked.ru MC client config |
| auth-limbo | s8n-ru | ✅ primary | snapshot 2026-05-06 (stale) | Paper plugin (AuthMe fix) |
| 8bit-icons | s8n-ru | ✅ primary | snapshot 2026-05-06 (stale) | AGPLv3 AMOLED 24×24 pixel-art Android pack |

**Every repo on GH originated as a Forgejo mirror; nothing lives only on GH.**

⚠️ **`racked-team` GH org does NOT exist** per `gh api`. Memory says
it's the Minecraft brand org — drift to reconcile. Either:
- Move all `s8n-ru/minecraft-*` repos under `racked-team` org (create
it, transfer)
- OR drop the `racked-team` mention from memory (it was aspirational)
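
The `gh api` check mentioned above can be repeated before either option is chosen. A minimal sketch, assuming the `gh` CLI is installed and authenticated; the function name `org_exists` is hypothetical:

```shell
# Hypothetical helper: succeed only if the GitHub org $1 exists.
# Assumption: gh is installed and authenticated. `gh api` exits non-zero
# when the API answers with an HTTP error such as 404.
org_exists() {
    gh api "orgs/$1" --silent >/dev/null 2>&1
}
```

For example, `org_exists racked-team || echo "racked-team: no such org"`. A non-zero result can also mean a network or auth failure, so a negative answer is worth confirming by hand.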
## Service inventory (nullstone, current)
28 active containers. Categorized:
```
MESH headscale, pihole
GIT forgejo, forgejo-runner
IDENTITY authentik-server, -worker, -postgres, -redis, step-ca
CHAT tuwunel (matrix.veilor.uk), tuwunel-txt (mx.s8n.ru),
cinny-txt, commet-web, signup-page, signup-txt,
livekit-server, lk-jwt-service
SOCIAL misskey, misskey-db, misskey-redis, x-source nginx
ADMIN traefik, socket-proxy
AUTOMATION n8n-n8n-1, n8n-postgres
HOST APPS minecraft-mc, anythingllm, dl-veilor, filebrowser-mc
DOWN rocketchat, rocketchat-mongodb (volumes preserved)
EPHEMERAL alpine:3 shells (userns-host bypass leftovers — clean up)
```
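
The container count and the EPHEMERAL row can be re-derived from Docker itself when this inventory is refreshed. A minimal sketch, assuming it runs on nullstone with access to the Docker socket; the function names are hypothetical:

```shell
# Count running containers, to compare against the figure recorded above.
count_running() {
    docker ps --format '{{.Names}}' | wc -l | tr -d '[:space:]'
}

# List containers, running or exited, whose image is alpine:3
# (the ancestor filter also matches images built on top of it).
list_alpine_leftovers() {
    docker ps -a --filter 'ancestor=alpine:3' \
        --format '{{.ID}} {{.Names}} {{.Status}}'
}
```

Reviewing the output of `list_alpine_leftovers` before removing anything keeps the cleanup a manual decision.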
## Pending decisions (waiting on operator)
| Decision | Recommendation | Status |
|---|---|---|
| Cobblestone IP + SSH access | hand over from operator | ⏳ blocked |
| Cobblestone hardware specs | hand over from operator | ⏳ blocked |
| LUKS on cobblestone | **mandatory** (fixes F4) | ⏳ blocked on access |
| DE on cobblestone | **30-day soak then strip**; install cockpit today | ⏳ runbook drafted |
| userns-remap on cobblestone | **drop** (simpler bind-mounts; lose 1 layer defense) | ⏳ runbook drafted |
| Headscale + step-ca SPOF mitigation | phase-2: move to $4/mo VPS | ⏳ deferred |
| RocketChat revive or retire | 30-day timer; if unused, retire and free volumes | ⏳ stopped 2026-05-06 |
| anythingllm public binding | bind LAN-only or front via traefik+no-guest | ⏳ open issue |
| /opt/docker/backup.sh fixes | matrix-postgres + rocketchat-mongodb + literal CHANGE_ME pw | ⏳ open issue |
| `no-guest@file` ACL config | populate sourceRange beyond loopback; verify XFF chain | ⏳ open issue |
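
Two of the open issues in the table lend themselves to a mechanical check: the anythingllm binding and the literal `CHANGE_ME` password in `backup.sh`. A minimal sketch, assuming Docker CLI access on nullstone; the function names are hypothetical:

```shell
# Print host bindings of container $1 that listen on all interfaces.
# Exit status is 0 when at least one such binding exists.
public_bindings() {
    docker port "$1" | grep -E -e '-> (0\.0\.0\.0|\[::\]):'
}

# Print the numbered lines of file $1 that still contain the
# placeholder password.
find_placeholders() {
    grep -n 'CHANGE_ME' "$1"
}
```

`public_bindings anythingllm` should print nothing once the service is bound LAN-only or fronted by traefik, and `find_placeholders /opt/docker/backup.sh` should print nothing once the password is replaced.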
## Pending audits / ratings (from 5-agent wave)
Stack rating: **7/10** ([AUDIT-2026-05-05.md](./AUDIT-2026-05-05.md)).
Top 5 weaknesses (severity):
1. 🔴 No LUKS on nullstone (regression)
2. 🔴 backup.sh broken silently (RC + ex-Matrix not dumping)
3. 🔴 no-guest@file stub (loopback-only sourceRange)
4. 🔴 anythingllm public on 0.0.0.0:3001
5. 🟠 No off-host backup replication (single-NVMe SPOF)

Top 5 services to add (priority order):
1. Restic + autorestic → B2/Wasabi (encrypted, dedup, incremental)
2. Vaultwarden (centralize secrets out of `.env` files)
3. Gatus (uptime + cert-expiry; alerts via Tuwunel/ntfy)
4. CrowdSec (HTTP/SSH layer block at Traefik)
5. Beszel (lightweight observability)
## Pending tracked work
### v0.5.32 ship (veilor-os)
Per `_github/veilor-os/docs/ROADMAP.md`. The last CI attempt failed on a
GH runner shortage; flip the workflow to `runs-on: nullstone` to use the
Forgejo runner instead.
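
Flipping the runner is a one-line change per job. A minimal sketch, assuming the workflow uses plain `runs-on: <label>` lines; the function name `flip_runs_on` is hypothetical and the workflow path is left to the caller:

```shell
# Hypothetical helper: rewrite every `runs-on:` line in workflow file $1
# to use runner label $2, keeping the original indentation.
# Assumption: the label contains no "/" or "&" (true for the labels here).
flip_runs_on() {
    sed "s/^\([[:space:]]*runs-on:\).*/\1 $2/" "$1" > "$1.tmp" &&
        mv "$1.tmp" "$1"
}
```

Calling `flip_runs_on <workflow-file> nullstone` and then reviewing `git diff` before committing keeps the change visible. Matrix-style `runs-on` values spread over several lines are not handled by this sketch.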
### v0.7 BlueBuild spike (veilor-os)
Branch: `v0.7-bluebuild-spike` on Forgejo. Recipe ready, kickstart
ready, GH Actions wired (won't trigger now since main host moved).
Adapt to Forgejo Actions; this should be a drop-in change with
`runs-on: ubuntu-24.04`, since the runner has that label.
## Changelog
### 2026-05-06
- **Deleted 3 repos from GitHub:** `s8n-ru/x`, `veilor-org/infra`,
`veilor-org/veilor-server`. Forgejo copies untouched. GH 404s
confirmed.
- **Disabled all Forgejo→GH push-mirrors** (8 repos). Forge is now
the only auto-pushed-to host. Operator works privately. Push to GH
is a manual operator step for specific repos when wanted.
- Created `veilor-org/infra` Forgejo repo (mirror initially set, then
removed same day per the policy change above)
- Stopped RocketChat (`docker compose stop`); volumes preserved
- 5-agent stack audit shipped (`AUDIT-2026-05-05.md`)
- Cobblestone deployed (fresh Debian + DE) — awaiting access details
- This STATE.md created
### 2026-05-05
- Forgejo + forgejo-runner deployed on nullstone at git.s8n.ru
- 6 GH repos migrated to Forgejo with push-mirrors back to GH
- Admin pw rotated; SSH key for s8n-ru added; PAT generated
- veilor-os v0.5.31 four-bug fix shipped
- 9-agent research wave on veilor-os v0.5.32 blockers
- secureblue layering strategy locked (`STRATEGY.md`)
- THREAT-MODEL.md drafted
### 2026-05-04 (and earlier)
- See `_github/veilor-os/docs/ROADMAP.md` "Lessons learned" section
- See `~/.claude/projects/-home-admin-ai-lab/memory/MEMORY.md` for
per-project memos