Single document that surfaces the depth of work behind veilor-os: metrics, distros studied, every tool traversed in the build chain, all 35+ failure classes hit and beaten, key engineering decisions and why, what's in the repo beyond the kickstart, and the self-hosted nullstone CI infrastructure built to support it. Receipts not narrative — every claim links back to a file path, commit, error, or config. Useful as portfolio anchor and as a single read-this-first for anyone returning to the project after a gap.
veilor-os — Proof of Work
What this file is: a single document that summarises the depth of work, tooling traversed, and engineering decisions behind veilor-os. Receipts not narrative — every claim links back to a commit, an error, or a config.
Author: P M (s8n-ru on Forgejo) · Last updated: 2026-05-06
At a glance
| Metric | Number |
|---|---|
| Git commits on main | 134+ |
| Distinct release versions iterated | 32 (v0.1 → v0.5.32) |
| Pull requests reviewed and merged | 11 |
| Documented build failure classes hit and fixed | 35+ (live ISO build, Forgejo CI, OCI signing) |
| Lines of operator-authored kickstart | 400+ (kickstart/veilor-os.ks) |
| Lines of overlay shell hardening scripts | ~1500 across scripts/*.sh |
| Lines of TUI installer (overlay/usr/local/bin/veilor-installer) | ~950 bash, gum + whiptail fallback |
| Self-hosted infra services touched | 28 Docker containers on nullstone |
| Concurrent dev agents orchestrated in single waves | up to 9 |
Distros / projects studied or layered on
| Project | Role in veilor-os |
|---|---|
| Fedora 43 KDE | Base OS for v0.5.x kickstart-installed flat builds |
| secureblue | Upstream hardened atomic Fedora; v0.7 BlueBuild spike layers our overlay on top of securecore-kinoite-hardened-userns |
| Kicksecure / Whonix | Reference for AppArmor + apt-transport-tor model (we don't ship Tor; we did read their docs) |
| Bluefin / Bazzite (uBlue) | Reference for BlueBuild recipe shape and OCI publishing pattern |
| Tails | Reference for live-only install model — explicitly not veilor's path |
| Qubes OS | Reference for hardware partitioning model — explicitly out of scope |
| Trivalent (secureblue) | Hardened Chromium — adopted at v0.6+ |
| Mullvad Browser | Tor-Browser-fork without Tor — adopted at v0.6+ |
veilor-os is not a fork of any of the above. It's a composition: Fedora kickstart for v0.5.x, secureblue OCI for v0.7+, with our own brand, installer (gum TUI), 3-mode power CLI, and Forgejo CI/release.
Tooling traversed
| Tool / system | Where it lives in the build | Notable issues hit |
|---|---|---|
| Anaconda (Fedora installer) | drives kickstart install in chroot | RPM-6.0 cmdline-mode scriptlet error propagation regression — patched transaction_progress.py in CI |
| livecd-creator (livecd-tools) | builds the live ISO image | EFI dracut stanza bug: LABEL= instead of CDLABEL= → patched imgcreate/live.py in CI run |
| livemedia-creator (lorax) | dropped after 17 attempts (EFI/BOOT not built) | Switched to livecd-creator entirely |
| dracut | builds initramfs in chroot | LUKS module not pulled in by default → --regenerate-all in chroot %post |
| GRUB2 | bootloader install + cmdline | gen_grub_cfgstub failures, manual reinstall grub2-install + grub2-mkconfig in install %post |
| Plymouth | boot splash | Disabled (plymouth.enable=0) so LUKS prompt is visible; theme details for v0.7+ |
| SDDM | KDE display manager | livecd-creator skips the display-manager.service symlink — stub fixfiles + setenforce in firstboot |
| PAM | login auth | nullok on SDDM, blank-pw + chage -d 0 to force password set on first boot |
| gum (charm.sh) | TTY1 TUI installer | bubbletea cursor render glitch on linux fbcon — replaced password input with bash read -srp |
| whiptail | TUI fallback when gum missing | one-line fallback path |
| systemd | unit ordering, presets | system-systemdx2dcryptsetup.slice doesn't exist — non-fatal preset warning, suppressed |
| firewalld | default-drop zone, ssh allow | kept (PackageKit/avahi/cups runtime-disabled, not depsolve-removed) |
| USBGuard | default-block USB | id-based rules.conf, hash-based broke on dock replug |
| fail2ban + auditd | runtime IDS + audit log | full ruleset on passwd/shadow/sudoers/ssh/cron/sysctl/kernel modules |
| chrony | NTS-authenticated NTP | Cloudflare + NETNOD pool |
| systemd-resolved | DNS-over-TLS | Cloudflare + Quad9 fallback, LLMNR off |
| SELinux | targeted policy + custom veilor-systemd module | PCRE2 10.46 vs 10.47 host-vs-chroot regex mismatch; solved with selinux --permissive at build, enforcing on first boot |
| AppArmor | deferred — not in Fedora 43 base | v0.7 secureblue OCI ships its own LSM stack |
| zram-generator | zram swap (no disk swap) | works |
| btrfs | / + /home subvols inside LUKS2 | works |
| LUKS2 | aes-xts-plain64 + argon2id | mem=1GB, time=9, threads=4 — manually tuned |
| xorriso | ISO wrap + graft | extract original boot stanza via -report_el_torito as_mkisofs, replay flags via eval to handle word-splitting |
| Sigstore / cosign | keyless OIDC signing | doesn't work on Forgejo (no Fulcio-trusted issuer) — gated to GitHub-only, key-pair signing planned |
| anchore/sbom-action | SBOM SPDX | pinned to v0.17.2 (last node20-shipping release) |
| actions/attest-build-provenance | SLSA L3 build provenance | pinned to v2.2.3 |
| BlueBuild | OCI image build for v0.7 spike | recipe ready, ostreecontainer kickstart directive validated |
| bootc | atomic upgrades for v1.0 | target tooling, bootc upgrade instead of dnf upgrade |
| Forgejo + act_runner | self-hosted git + CI | runner inside container with userns-remap host caused 13-step debug chain |
| Tailscale + Headscale | private mesh | for friend-PC GPU offload + admin SSH |
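The xorriso row mentions replaying the extracted boot flags through eval. A minimal sketch of why that matters, assuming the flags arrive as one shell-quoted string (the sample string and its volume label are made up for illustration, standing in for what `xorriso -indev "$iso" -report_el_torito as_mkisofs` prints):

```shell
#!/usr/bin/env bash
# Stand-in for the shell-quoted flag string reported by xorriso.
# The label and image path are hypothetical values.
boot_flags="-V 'VEILOR OS LIVE' -b images/eltorito.img -no-emul-boot"

count_args() { echo "$#"; }

# Naive unquoted expansion splits on every space and keeps the quote characters.
naive_count=$(count_args $boot_flags)

# eval re-parses the string with shell quoting rules, so the label stays one word.
eval "set -- $boot_flags"
replayed_count=$#
replayed_label=$2

echo "naive=$naive_count replayed=$replayed_count label=$replayed_label"
```

eval executes whatever the string contains, so this only makes sense when the string comes from a trusted tool run against your own ISO.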
Build failure classes encountered (and beaten)
Numbered ledger of every distinct failure mode, in approximate order of discovery. Each row is one bug class — many were hit dozens of times in permutation before the underlying root cause was understood.
Phase A — local + livemedia-creator (v0.1 → v0.2.0)
| # | Symptom | Root cause | Fix |
|---|---|---|---|
| 1 | rootless podman btrfs / loop / sudo cache fights | rootless can't losetup; host CAP_SYS_ADMIN gate | Switched to host-native lorax + NOPASSWD wheel |
| 2 | Kickstart parse: --title, text, multiline part, --hash | livemedia-creator + recent pykickstart deprecations | Rewrote ks |
| 3 | dnf depsolve: KDE hard-deps cups / geoclue2 / ModemManager / PackageKit | KDE Plasma 6 transitively pulls them in | Kept packages, mask daemons at runtime |
| 4 | Anaconda merges all repos, cost/includepkgs ignored | upstream Anaconda repo-merge logic | Local fix-repo at cost=1 to force selection |
| 5 | scriptlet warning RC=5 (selinux/pcre2 regex skew) | host libselinux (PCRE2 10.46) vs chroot's selinux-policy file_contexts.bin built against PCRE2 10.47 | fix-repo provides matched 10.47 pair |
| 6 | dnf transaction RC=5 on non-critical scriptlet | RPM-6.0 cmdline-mode regression | Patched anaconda transaction_progress.py in CI |
| 7 | services config: services --enabled=veilor-firstboot before unit installed | Anaconda services runs before %post overlay copy | Move systemctl enable into %post |
| 8 | overlay copy: %post --nochroot SRC path wrong | livecd-creator vs livemedia-creator differ on INSTALL_ROOT vs /mnt/sysimage | Multi-path detection in %post |
| 9 | ISO wrap: grub2-mkimage missing i386-pc | missing grub2-pc-modules | Added the package |
| 10 | ISO wrap: xorrisofs missing EFI/BOOT | livemedia-creator --make-iso --no-virt template gap | Pivoted to livecd-creator |
| 11 | livecd-creator: Failed to find package 'fontconfig' | livecd-creator repo-discovery differs | Repaired via direct baseurl not mirrorlist |
| 12 | dracut hangs on parse-livenet | livecd-creator EFI stanza writes live:LABEL= instead of live:CDLABEL= | sed-patch imgcreate/live.py in CI |
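Row 12's fix can be sketched as a small idempotent patch step. This is an illustration under assumptions, not the actual CI step: the function name `patch_live_label` is hypothetical, and the demo line is a stand-in for whatever imgcreate/live.py really contains.

```shell
#!/usr/bin/env bash
set -eu

# Rewrite the dracut root= stanza from live:LABEL= to live:CDLABEL=.
# Idempotent: "live:CDLABEL=" never matches the search pattern.
patch_live_label() {
    target=$1
    tmp=$(mktemp)
    sed 's/live:LABEL=/live:CDLABEL=/g' "$target" > "$tmp"
    cat "$tmp" > "$target"   # overwrite in place, keeping the file's mode and owner
    rm -f "$tmp"
}

# Demo on a scratch file (contents are a stand-in, not the real live.py).
demo=$(mktemp)
printf '%s\n' 'args = "root=live:LABEL=%s rd.live.image" % label' > "$demo"
patch_live_label "$demo"
patch_live_label "$demo"   # second run is a no-op
patched=$(cat "$demo")
rm -f "$demo"
echo "$patched"
```

Running the patch twice and getting the same result matters in CI, where the same runner image may be reused across builds.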
Phase B — boot UX + LUKS + theming (v0.2.4 → v0.5.27)
| # | Symptom | Root cause | Fix |
|---|---|---|---|
| 13 | init_on_alloc/free 5x KVM live-boot time | every page zeroed on alloc/free, brutal in vCPU | Drop from live cmdline; firstboot patches GRUB to re-enable for installed system |
| 14 | LUKS prompt invisible | Plymouth swallows TTY | plymouth.enable=0 for live; details theme for installed |
| 15 | Plymouth services not maskable in chroot | systemctl mask N/A under chroot | /dev/null symlinks |
| 16 | LUKS dracut module missing | Default dracut config doesn't pull crypt | --regenerate-all in chroot post |
| 17 | rd.luks.uuid not in cmdline | Anaconda doesn't write it for our partition layout | grubby --update-kernel ALL --args=rd.luks.uuid=... in chroot post |
| 18 | Kernel-install on chroot overwrites cmdline | systemd kernel-install writes its own /etc/kernel/cmdline | Switch to --config /etc/kernel/cmdline flow |
| 19 | rescue glob in firstboot: set -e killed loop | unmatched glob | shopt -s nullglob |
| 20 | fbcon blanks during KMS modeset on real hardware | i915/amdgpu/nvidia driver loads, blanks fb | fbcon=nodefer i915.modeset=1 amdgpu.modeset=1 nvidia-drm.modeset=1 |
| 21 | gum cursor render glitch (duplicate-Install + stray-T) | bubbletea cursor-hide vs linux fbcon terminfo | Replace gum input --password with read -srp |
| 22 | Generated install ks updates repo 404 zchunk | Fedora mid-push window | Strip repo --name=updates from generated ks |
| 23 | Anaconda payload module crash on LANG env | unset env in TTY1 service | export LANG=en_US.UTF-8 before exec |
| 24 | Anaconda --cmdline + XDG_RUNTIME_DIR missing | TTY1 has no XDG runtime dir | Create + export pre-exec |
| 25 | LVM pulled into installer ks unintentionally | default partitioning | Drop LVM, native btrfs-on-LUKS |
| 26 | sshd UseDNS yes 30s banner timeout in NAT/slirp | reverse DNS unreachable in QEMU user-net | UseDNS no in sshd_config.d |
| 27 | os-release branding overrides not visible to login banner | motd not regenerated | update-motd in firstboot |
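Row 19 (unmatched glob under set -e) is easy to reproduce. A minimal sketch, assuming the loop removes rescue images; the file pattern and the scratch directory are hypothetical stand-ins for whatever the firstboot script actually iterates over:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Without nullglob an unmatched pattern stays literal, so the loop body runs
# once on a nonexistent path and the failing rm aborts the script under set -e.
shopt -s nullglob

workdir=$(mktemp -d)          # stand-in for /boot; empty on purpose
removed=0
for img in "$workdir"/initramfs-*rescue*.img; do
    rm -- "$img"
    removed=$((removed + 1))
done
rmdir "$workdir"
echo "removed=$removed"
```

With nullglob set, the pattern expands to nothing, the loop body never runs, and the script continues past it.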
Phase C — Forgejo CI + ISO publishing (v0.5.32, current)
13-step debug chain documented separately: see [docs/CI-PIPELINE-FAILURES.md] (live in conversation log).
Highlights:
- userns-remap=default on host docker daemon collides with privileged + image perms
- Forgejo runner inside container creates docker-in-docker workspace bind path mismatch
- Sigstore Fulcio keyless signing assumes GH OIDC issuer; gated to GH-only
- cosign / sbom / attest actions floating tags now node24, runner is node20 → all pinned
Key engineering decisions (and why)
1. Hybrid kickstart-bootstrap + bootc OCI strategy
Locked at v0.7 spike. Reasons:
- Kickstart (v0.5.x) gives a familiar Anaconda LUKS install flow, single-prompt UX, drop-in replacement for stock Fedora KDE installer.
- OCI image (v0.7+) lets us layer on top of secureblue's already-signed hardened base. We don't re-derive AppArmor / Trivalent / custom SELinux; we inherit. Fedora bumps become image-version: 44 one-line edits, not multi-day debug sprints.
- bootc-only (v1.0) retires kickstart entirely; atomic A/B upgrades, instant rollback, immutable system root.
2. Brand-clean from day one
grep -ri 'onyx\|192\.168\.0\.\|admin@\|fedora\.local\|xynki\.dev' kickstart/ overlay/ scripts/ assets/ returns zero hits. Enforced via .github/workflows/lint.yml brand-leak job. Every audit run, every CI run, every commit.
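The brand-leak check could be wrapped roughly like this. It is a sketch, not the contents of lint.yml: the function name `check_brand_leaks` is hypothetical, the pattern is the one quoted above rewritten as an extended regex, and the demo runs against scratch directories instead of the repo tree.

```shell
#!/usr/bin/env bash
# Pattern copied from the grep above, rewritten as an ERE for grep -E.
LEAK_PATTERN='onyx|192\.168\.0\.|admin@|fedora\.local|xynki\.dev'

# Returns 0 when no leak is found, 1 on a leak, 2 if grep itself failed.
check_brand_leaks() {
    grep -riE -- "$LEAK_PATTERN" "$@"
    case $? in
        0) echo "brand leak found" >&2; return 1 ;;
        1) return 0 ;;
        *) echo "grep failed" >&2; return 2 ;;
    esac
}

# Demo on scratch directories rather than the real repo tree.
clean_dir=$(mktemp -d); dirty_dir=$(mktemp -d)
echo 'hostname=veilor' > "$clean_dir/a.conf"
echo 'server=192.168.0.10' > "$dirty_dir/b.conf"

check_brand_leaks "$clean_dir" && clean_rc=0 || clean_rc=$?
check_brand_leaks "$dirty_dir" >/dev/null 2>&1 && dirty_rc=0 || dirty_rc=$?
rm -rf "$clean_dir" "$dirty_dir"
echo "clean=$clean_rc dirty=$dirty_rc"
```

Treating grep's exit code 2 separately keeps a missing directory from being reported as a clean tree.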
3. Forgejo over GitHub for primary
Decision date: 2026-05-06. Drivers:
- GitHub free tier compute caps were hitting on every ISO build
- Operator wants to work privately by default; GH = always-public
- Self-hosted Forgejo on nullstone gives unlimited build minutes, no third-party dep on the build path
- Push-mirror to GH disabled — operator opts in per-repo when wanting public visibility
4. ssh tightening
AllowUsers user, password auth off, root login locked, X11 forwarding off, MaxAuthTries 3. Operator authenticates with ed25519 key only. Documented in feedback_nullstone_ssh_user.md memory.
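As a rough illustration, the settings listed above map onto standard sshd_config directives as follows. The drop-in file name is hypothetical and the sketch writes to a scratch directory; on a real host the target would be /etc/ssh/sshd_config.d/, followed by a config check and an sshd reload.

```shell
#!/usr/bin/env bash
set -eu

# Scratch location so the sketch is safe to run anywhere.
dropin_dir=$(mktemp -d)
dropin="$dropin_dir/50-veilor-hardening.conf"   # hypothetical file name

cat > "$dropin" <<'EOF'
AllowUsers user
PasswordAuthentication no
PermitRootLogin no
X11Forwarding no
MaxAuthTries 3
EOF

directive_count=$(grep -c . "$dropin")
echo "wrote $directive_count directives to $dropin"
```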
5. Defense-in-depth mesh
Tailscale + Headscale (hs.s8n.ru) is the SSH on-ramp. Every device joins the tailnet; public SSH is firewalled at the router. Friend GPU node (RTX 4080 in WSL2) reachable via tailnet IP — immune to ISP IP rotation.
What's been built that isn't in the kickstart
The repo carries more than just an ISO recipe:
| Path | What it is |
|---|---|
| kickstart/veilor-os.ks (400+ lines) | Live ISO ks, hand-authored, fully branded |
| overlay/etc/systemd/system/veilor-firstboot.service | TTY1 oneshot, prompts admin password on first boot |
| overlay/usr/local/bin/veilor-installer (~950 lines) | TTY1 TUI installer wrapping Anaconda + gum + whiptail fallback |
| overlay/usr/local/bin/veilor-power | 3-mode power CLI: save \| mid \| perf. Wires tuned profiles + EPP + governor + battery threshold + screen-dim policy in one cmd |
| overlay/etc/tuned/profiles/veilor-{powersave,balanced,performance}/ | Custom tuned profiles, not Fedora defaults |
| overlay/etc/udev/rules.d/{90-veilor-ac-switch,91-veilor-battery-threshold}.rules | Auto-switch power profile on AC/battery events |
| overlay/etc/usbguard/rules.conf | id-based default-block USB rules |
| overlay/etc/firewalld/zones/trusted.xml | tailscale0 trust override |
| overlay/etc/skel/.config/{kdeglobals,breezerc,kwinrc,konsolerc} | Pre-applied KDE black theme + Fira Code system font |
| scripts/10-harden-base.sh (~250 lines) | KDE Connect off, DNS-over-TLS, fail2ban + auditd setup |
| scripts/20-harden-kernel.sh (~300 lines) | sysctl, password-quality, NTS chrony, USBGuard, service prune |
| scripts/selinux/veilor-systemd.te | Custom SELinux module (targeted policy gap fixes) |
| scripts/30-apply-v03-theme.sh | Plymouth + SDDM + Konsole + wallpaper apply |
| scripts/40-apparmor.sh (deferred) | AppArmor profile load (complain-mode skeleton, sealed pending Fedora packaging or v0.7 secureblue) |
| bluebuild/recipe.yml | v0.7 OCI recipe (base = secureblue securecore-kinoite-hardened-userns) |
| kickstart/install-ostreecontainer.ks | v0.7 install ks: 10 lines, just ostreecontainer --url=ghcr.io/veilor-org/veilor-os:43 --transport=registry |
| assets/installer/{banner.txt,colors.gum} | Pure-block VEILOR OS wordmark + branded gum colour palette |
| assets/branding/ | Logo, wallpapers, plymouth theme assets |
| docs/STRATEGY.md (336 lines) | Full hybrid strategy + mesh + browser stack + Forgejo decision |
| docs/THREAT-MODEL.md (157 lines) | Threat model, in-scope, out-of-scope, mitigations table |
| docs/HARDENING.md (194 lines) | Full hardening reference |
| docs/ROADMAP.md (332 lines) | v0.5.x → v0.7 → v1.0 phased plan |
| docs/research/2026-05-05-agent-wave/ | 9-agent research wave findings on v0.5.32 blockers |
| test/TESTING.md + test/run-vm.sh + test/test-runs/ | Standardised hybrid VM test method, codified after v0.5.27 surfaced 4 regressions in one session |
| .github/workflows/{build-iso.yml,lint.yml,build-bluebuild.yml} | CI for v0.5.x flat ISO + v0.7 OCI image + brand-leak / shellcheck / kickstart syntax lint |
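The 3-mode shape of veilor-power can be pictured as a simple dispatch onto the three custom tuned profiles. This is a hypothetical sketch, not the shipped script: the function names are invented, the mode-to-profile mapping is inferred from the profile directory names, and the EPP, governor, battery-threshold and screen-dim wiring is left out. It prints the tuned-adm command rather than running it.

```shell
#!/usr/bin/env bash
# Map a veilor-power mode to the matching tuned profile name (assumed mapping).
profile_for_mode() {
    case "$1" in
        save) echo "veilor-powersave" ;;
        mid)  echo "veilor-balanced" ;;
        perf) echo "veilor-performance" ;;
        *)    echo "usage: veilor-power save|mid|perf" >&2; return 2 ;;
    esac
}

# Dry-run wrapper: prints the command instead of executing it, so the sketch
# is safe to run on a machine without tuned installed.
apply_mode() {
    profile=$(profile_for_mode "$1") || return $?
    echo "tuned-adm profile $profile"
}

apply_mode perf
```

Keeping the mapping in one function means the udev AC/battery rules and the CLI can share the same mode names.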
CI infrastructure built on nullstone
Self-hosted from scratch on a single Debian 13 server. All running, all behind Traefik with LE certs via Gandi LiveDNS DNS-01.
| Service | Role | Notes |
|---|---|---|
| Forgejo (git.s8n.ru) | git host + container registry | code 9.0.3 + gitea 1.22 underneath; INSTALL_LOCK=true; admin user s8n-ru (NOT admin, which is reserved) |
| forgejo-runner | act_runner v6.4.0, registered as nullstone label | privileged, userns_mode=host, custom Fedora-with-node image (veilor-build:43) |
| Custom build image | veilor-build:43 = fedora:43 + nodejs + git + sudo + curl | Built locally; act_runner needs node in job container |
| socket-proxy | Tecnativa docker-socket-proxy | Read-only docker API for monitoring |
| Traefik 3.x | Reverse proxy + ACME | Gandi DNS-01 cert; no-guest@file middleware blocks LAN-only services from public |
| Authentik | SSO + LDAP (auth.s8n.ru) | postgres + redis + worker stack |
| step-ca | Internal PKI | Used by all-internal mTLS where it lands |
| Tuwunel (Matrix) matrix.veilor.uk | Rust homeserver | Federation off, telemetry off, registration token-gated |
| Cinny | Matrix web client cinny.txt.s8n.ru | Second isolated instance |
| Misskey | Private Twitter rebrand at x.veilor | Custom theme via DB pg_read_file |
| n8n | Automation runner | Used for CI watchdogs and personal automations |
| Pi-hole | Local DNS sinkhole | DNS-over-TLS upstream |
| Headscale | Tailscale control plane | 4 nodes joined incl friend PC |
| AnythingLLM | Local LLM UI | Layer on Ollama + remote vLLM (friend PC RTX 4080) |
| filebrowser-mc | Static asset server | racked.ru launcher hosting |
Runtime UID layout: userns-remap=default shifted +100000. Backup
script + ACL on docker.sock + group-add patterns documented in
memory/feedback_docker_sudo_bypass.md.
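The +100000 shift means a UID inside a container maps to that UID plus the offset on the host, which is what file ownership on bind mounts and backups has to account for. A small sketch of the arithmetic, assuming the offset stated above (on another host the range start would come from the dockremap entry in /etc/subuid; the function name is hypothetical):

```shell
#!/usr/bin/env bash
# Offset taken from the "+100000" shift noted above.
SUBUID_BASE=100000

# Translate a UID seen inside a container to the UID that owns files on the host.
host_uid_for() {
    echo $((SUBUID_BASE + $1))
}

echo "container root -> host uid $(host_uid_for 0)"
echo "container uid 1000 -> host uid $(host_uid_for 1000)"
```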
Receipts
- Forgejo repo: https://git.s8n.ru/veilor-org/veilor-os
- GitHub mirror snapshot (frozen 2026-05-06): https://github.com/veilor-org/veilor-os
- ci-latest rolling release (live): https://git.s8n.ru/veilor-org/veilor-os/releases/tag/ci-latest
- First green ISO timestamp: 2026-05-06 14:30 UTC, sha256 in release sidecar
- Per-version commit trail: git log --oneline | grep '^[a-f0-9]\{7\} v0\.' shows every v0.x.y: <bug> ship line
- Test method evolution: test/METHOD-CHANGELOG.md
- Strategy lock: docs/STRATEGY.md, 2026-05-05
- 9-agent research wave findings: docs/research/2026-05-05-agent-wave/
- Threat model: docs/THREAT-MODEL.md
- Hardening reference: docs/HARDENING.md
- Roadmap: docs/ROADMAP.md
What this took
This is a single-operator + AI-accelerated project. No team, no funding, no upstream maintainer hat. Most of the work happened across ~6 weeks of evenings and weekends. AI agents (Claude Opus 4.7, mainly) handle the parallel research, log diving, kickstart debug, and multi-file refactors; the operator drives strategy, makes the calls, runs the VM/hardware tests, owns the brand decisions, and pushes every commit.
The result is a hardened Linux distro that boots, installs cleanly, hardens itself, and ships through self-hosted CI — with a forward strategy that retires the legacy Fedora kickstart path in favour of a modern atomic OCI image stack, while crediting and building on top of the upstream secureblue work rather than forking it.
For comparison, a Fedora spin maintainer working part-time normally ships this much in 1–2 weeks of work. We did it once across a longer arc with deeper documentation, more strategy reversals, and zero personal/onyx leaks in the final ship state.