diff --git a/README.md b/README.md index 32d747a..4e51465 100644 --- a/README.md +++ b/README.md @@ -4,31 +4,42 @@ [![Build veilor-os ISO](https://github.com/veilor-org/veilor-os/actions/workflows/build-iso.yml/badge.svg)](https://github.com/veilor-org/veilor-os/actions/workflows/build-iso.yml) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE) -[![Status: pre-release](https://img.shields.io/badge/status-pre--release_v0.2.5-orange)](CHANGELOG.md) veilor-os is a Fedora 43 KDE Plasma remix for operators who want a clean, fast, opinionated desktop with serious hardening already wired in. Boot the ISO, set an admin password, work. No installer wizard. No initial-setup screen. No telemetry. No "would you like to enable X" prompts. +The current install path is an Anaconda kickstart with a custom gum TUI +on top. v0.7+ ships a hybrid path: the kickstart ISO becomes the bootstrap +installer (Anaconda's LUKS UX is mature), but the root filesystem is +populated directly from a cosign-signed bootc OCI image built via BlueBuild +on top of [secureblue](https://github.com/secureblue/secureblue)'s +hardened Kinoite variant. Updates from there flow through `bootc upgrade` +— atomic A/B, instant rollback. v1.0 is bootc-only. + +See [docs/STRATEGY.md](docs/STRATEGY.md) for the full trajectory. + --- ## Status -**Pre-release `v0.2.5`** — first feature-complete ISO that actually applies -the veilor-os overlay to the installed system. The build pipeline is green -on CI; the live ISO boots to KDE on KVM and bare metal. See -[CHANGELOG.md](CHANGELOG.md) for the full v0.2.0 → v0.2.5 story (it is -worth reading — five real bugs caught and documented). +Active development on the install path. 
Three bug classes have been +worked through (LUKS unlock cmdline, anaconda RPM-6.0 cmdline-mode +brittleness, bootloader install via `gen_grub_cfgstub`); current focus +is the v0.5.32 blocker list from the +[2026-05-05 9-agent research wave](docs/research/2026-05-05-agent-wave/README.md). -What is **done**: hardening (SELinux, sysctl, USBGuard, fail2ban, +What is **shipping**: hardening (SELinux, sysctl, USBGuard, fail2ban, firewalld), KDE black theme, Fira Code system font, 3-mode power management, single-prompt LUKS install, first-boot admin password flow, reproducible CI build, EFI+BIOS bootable live ISO. What is **planned** (see [docs/ROADMAP.md](docs/ROADMAP.md)): Plymouth -black theme, SDDM theme, signed ISOs (own MOK + GPG), AppArmor + nftables, -veilor-update / veilor-doctor helpers, public docs site. ++ SDDM polish, signed ISOs (own MOK + GPG, sigstore/cosign on OCI), +AppArmor + nftables stack, `veilor-update` / `veilor-doctor` / +`veilor-postinstall` helpers, public docs site, **bootc OCI hybrid +spike at v0.7**, **bootc-only at v1.0**. --- diff --git a/docs/ROADMAP.md b/docs/ROADMAP.md index 23fa559..8e09f94 100644 --- a/docs/ROADMAP.md +++ b/docs/ROADMAP.md @@ -223,15 +223,17 @@ public, benchmarks come after. After threat model, not before. 5. **Press kit** — wallpapers, logo, screenshots, feature one-liner. -### Hybrid bootc spike — layer on secureblue (REVISED 2026-05-05) +### Hybrid bootc spike — layer on secureblue, install via `ostreecontainer` (REVISED 2026-05-05) The original v0.7 entry called for a Containerfile-from-scratch spike on `quay.io/fedora/fedora-bootc:43`. Research on 2026-05-05 (see `docs/STRATEGY.md` and -`docs/research/2026-05-05-agent-wave/`) found a faster path: -**layer veilor's branding + threat model + UX on top of -secureblue's already-shipping `securecore-kinoite-hardened-userns` -OCI image** via a BlueBuild recipe. 
+`docs/research/2026-05-05-agent-wave/`), then a parent-operator +refinement same day, locked the path: **layer veilor's branding + +threat model + UX on top of secureblue's already-shipping +`securecore-kinoite-hardened-userns` OCI image** via a BlueBuild +recipe, and install it directly during the Anaconda pass via the +`ostreecontainer` kickstart directive (no first-boot rebase). Reasoning: @@ -240,27 +242,55 @@ Reasoning: surface we'd need to build alone (sysctl + kargs + SELinux custom policy + USBGuard + hardened-malloc + Unbound DoT + cosign-signed OCI build pipeline). -- Containerfile-from-scratch spike: 1 week to first ISO. - BlueBuild recipe extending secureblue: ~2 days, ~200 lines - YAML. The hardening review is inherited. +- Containerfile-from-scratch spike: 1 week to first ISO. BlueBuild + recipe extending secureblue: ~2 days. With the `ostreecontainer` + swap (no `veilor-firstboot-rebase.service`, no transition window): + **~1 day**. - secureblue does NOT publish a threat model. Athena OS does (their main differentiator, only public threat model in hardened-Linux 2026). Our `docs/THREAT-MODEL.md` (drafted) gets us ahead of both on the one axis that matters most for a security-branded distro. -Hybrid path locked: kickstart ISO stays as the **bootstrap -installer** (Anaconda's LUKS UX is mature). On first boot, a -one-shot `veilor-firstboot-rebase` service runs `bootc rebase -ghcr.io/veilor/veilor-os:43`. From then on, `bootc upgrade` is -the update channel. v1.0 deprecates the kickstart entirely. +Hybrid path locked: -Overrides we apply over secureblue: replace Trivalent (their -single-maintainer browser fork) with Brave or Mullvad-Browser; -keep sudo (revert `run0`-only); re-enable Xwayland. +- Kickstart ISO stays as the **bootstrap installer** (Anaconda's + LUKS UX is mature). 
+- `%packages` is replaced with `ostreecontainer + --url=ghcr.io/veilor/veilor-os:43 --transport=registry` so the + install pass populates `/` directly from the OCI image — no + first-boot rebase, no second reboot. +- From boot one onward, `bootc upgrade` is the update channel. +- v1.0 deprecates the kickstart entirely. + +Stay on `ostreecontainer` through v0.8. **Do NOT migrate to the new +`bootc` kickstart command until v1.0** — it blocks multi-disk and +authenticated registries (likely needed eventually). **Do NOT use** +`bootc-image-builder anaconda-iso` output — deprecated in +image-builder v44+. Produce OCI image and bootstrap ISO as +**separate artifacts**. + +Overrides over secureblue: keep Trivalent as default (their COPR +tracks upstream M147+ within hours; reverses earlier draft that +treated it as override-and-remove); add Mullvad Browser alongside; +gate Thorium behind `ujust install-thorium` with CVE-lag warning; +restore sudo (revert `run0`-only); re-enable Xwayland. + +Mesh stack baked in: Tailscale (Day 1, daily driver), Yggdrasil-go +(Day 1, idle warm-fallback), Reticulum/RetiNet AGPL fork (opt-in +via `ujust install-reticulum`). See `docs/STRATEGY.md` mesh stack +section for the layer breakdown and threat-floor table. Full plan: `docs/STRATEGY.md`. Spike will land in -`bluebuild/recipe.yml` plus `.github/workflows/build-bluebuild.yml`. +`bluebuild/recipe.yml` plus `.github/workflows/build-bluebuild.yml`, +on a separate branch — does NOT land in v0.5.x main. + +External dependency tracked: Traefik `no-guest@file` ACL on +nullstone is currently an `0.0.0.0/0` allow-all stub. Must be +fixed before veilor-os first-public-ISO ships, otherwise +`tag:guest` provisioning leaks the full vhost surface to every +veilor user. 
**Parent operator owns the fix; not in veilor-os +scope.** --- diff --git a/docs/STRATEGY.md b/docs/STRATEGY.md index b40455c..69be1d9 100644 --- a/docs/STRATEGY.md +++ b/docs/STRATEGY.md @@ -1,6 +1,8 @@ # veilor-os Strategy — Hybrid kickstart bootstrap + bootc OCI -Decision date: **2026-05-05** +Decision date: **2026-05-05** (refined the same day from a parent-operator +handoff; locks the `ostreecontainer` install path, mesh stack +bake-in, browser stack, Iroh seeding roadmap, and threat floor table). Locked at: **v0.5.31 → v0.7 spike → v1.0** ## TL;DR @@ -8,9 +10,10 @@ Locked at: **v0.5.31 → v0.7 spike → v1.0** - Keep the Anaconda-driven kickstart ISO as the **bootstrap installer** (LUKS UX is mature, single passphrase prompt, custom partitioning works). -- On first boot, the installed system is automatically rebased to a - **veilor-os OCI image** built via BlueBuild on top of secureblue's - `securecore-kinoite-hardened-userns`. +- Anaconda's `ostreecontainer` directive populates the root filesystem + directly from a **veilor-os OCI image** (built via BlueBuild on top + of secureblue's `securecore-kinoite-hardened-userns`) **during the + install pass — no first-boot rebase, no mutable→atomic transition**. - All future updates flow through `bootc upgrade` — atomic A/B, instant rollback, cosign-signed. - The kickstart-driven mutable-root path is deprecated at v1.0; kept @@ -20,20 +23,53 @@ Locked at: **v0.5.31 → v0.7 spike → v1.0** Pure pivot to bootc-from-scratch (Agent 3's spike plan) was **1 week to first ISO**. Pure pivot to layering on secureblue is **2 days to -first ISO** because the hardening work is already done. But both -require throwing away the partitioning UX we already have working in -Anaconda. +first ISO** because the hardening work is already done.
The +`ostreecontainer` refinement compresses that to **1 day** by +eliminating the first-boot rebase choreography (no +`veilor-firstboot-rebase.service`, no second reboot, no transition +window where the system is half-mutable, half-atomic). + +Both pure-pivot paths require throwing away the partitioning UX we +already have working in Anaconda. Hybrid keeps it. Hybrid: - **Day-zero install:** Anaconda kickstart + custom partitioning + LUKS prompt (what we have today). User experience = unchanged. -- **First boot, post-LUKS-unlock:** `bootc rebase - ghcr.io/veilor/veilor-os:43` runs once; pulls the OCI image; next - reboot lands in the veilor OCI tree. +- **End of install pass:** `ostreecontainer + --url=ghcr.io/veilor/veilor-os:43 --transport=registry` populates + `/` from the OCI image. Transition is invisible. +- **First boot:** veilor OCI tree, no rebase, no special service. - **Day-2:** `bootc upgrade` cadence for everything from then on. We keep what works, pivot the part that doesn't. +## ostreecontainer directive (refinement, locked) + +Replace the `%packages` block in the install kickstart with: + +``` +ostreecontainer --url=ghcr.io/veilor/veilor-os:43 --transport=registry +``` + +Keep the existing `part`/LUKS encryption block verbatim — Anaconda +partitions before `ostreecontainer` populates root. + +**Stay on `ostreecontainer` through v0.8.** Do NOT migrate to the new +`bootc` kickstart command until v1.0 — `bootc` blocks multi-disk and +authenticated registries, both of which we'll likely need. + +**Do NOT use** `bootc-image-builder anaconda-iso` output — +deprecated in image-builder v44+. Produce the OCI image and the +bootstrap ISO as **separate artifacts**: + +- OCI image: BlueBuild recipe → cosign-signed image at + `ghcr.io/veilor/veilor-os:43` +- Bootstrap ISO: Anaconda kickstart with `ostreecontainer` directive + pointing at the OCI image + +Reference: ; pykickstart +docs for `ostreecontainer`. 
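The `%packages` swap described in this section could be sketched as a small text transform. This is a minimal sketch, assuming the install kickstart holds exactly one `%packages` ... `%end` section. The helper name `swap_packages_for_ostreecontainer` and the sample partition lines are hypothetical stand-ins, not the project's real kickstart. Only the `ostreecontainer` line itself is taken from this document.

```python
import re

# Directive quoted from this document; everything else below is illustrative.
OSTREECONTAINER_LINE = (
    "ostreecontainer --url=ghcr.io/veilor/veilor-os:43 --transport=registry"
)

# Matches from a line starting with %packages through the next line
# starting with %end (including that line's trailing newline, if any).
_PACKAGES_BLOCK = re.compile(
    r"^%packages\b.*?^%end[ \t]*\n?", re.MULTILINE | re.DOTALL
)


def swap_packages_for_ostreecontainer(kickstart_text: str) -> str:
    """Replace the first %packages section with the ostreecontainer directive.

    Lines outside the %packages section (for example the part/LUKS block)
    are returned untouched.
    """
    new_text, count = _PACKAGES_BLOCK.subn(
        lambda _match: OSTREECONTAINER_LINE + "\n", kickstart_text, count=1
    )
    if count == 0:
        raise ValueError("no %packages section found in kickstart text")
    return new_text


# Hypothetical input: these partition lines are placeholders for the
# existing part/LUKS block, which this sketch does not try to reproduce.
SAMPLE_KICKSTART = """\
part /boot --fstype=ext4 --size=1024
part / --fstype=btrfs --grow --encrypted --luks-version=luks2

%packages
@core
%end
"""

CONVERTED = swap_packages_for_ostreecontainer(SAMPLE_KICKSTART)
print(CONVERTED)
```

The transform only touches the `%packages` section, which mirrors the instruction to keep the partitioning and LUKS lines verbatim. Whether the resulting file is accepted by Anaconda would still need to be verified in the spike.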
+ ## Why secureblue underneath | Question | Answer | @@ -47,14 +83,135 @@ We keep what works, pivot the part that doesn't. What we override in our recipe: -- **Browser**: Trivalent (their fork) → Brave / Mullvad-Browser. - Single-maintainer browser fork is unacceptable risk for daily-driver - audience. - **`run0` instead of sudo**: revert. Breaks too many workflows. - **Xwayland disabled**: revert. Some apps still need it. -- **veilor branding**: theme, KDE color scheme, Plymouth, SDDM, font, +- **Veilor branding**: theme, KDE color scheme, Plymouth, SDDM, font, os-release. All `overlay/*` ports verbatim from current repo. +(Browser stack is its own section below — Trivalent is now a *kept* +default, not an override.) + +## Browser stack + +| Role | Pick | Source | +|---|---|---| +| **Default browser** | **Trivalent** (secureblue's hardened Chromium) | Fedora COPR `secureblue/trivalent` — tracks upstream M147+ within hours, ships hardened_malloc + JIT-less + Drumbrake WASM | +| **Anti-fingerprint companion** | **Mullvad Browser** | Clearnet, no Tor, layered alongside Trivalent for pseudonymous browsing | +| **Optional opt-in** | **Thorium** | `ujust install-thorium` only — WARN users of months-long CVE lag (LTS Chromium base, ~9 milestones behind upstream stable as of 2026-05) | + +**DO NOT default to Thorium under any circumstances** — it contradicts +the threat model. Trivalent's COPR keeps patch latency within hours of +upstream; Thorium is multi-month-stale and is a perf/media +profile choice, not a security choice. + +The earlier draft of this doc treated Trivalent as an override-and- +remove. That was wrong: Trivalent is exactly the level of hardening +we want for a default browser. Keep it. Add Mullvad alongside. +Move Thorium behind an explicit opt-in. + +## Mesh stack — three-layer warm-stack + +Day 1 ships layers 1 (Tailscale) and 2 (Yggdrasil idle). Layer 3 +(Reticulum) is opt-in via `ujust`.
+ +### Layer 1 — Tailscale + Headscale (daily driver) + +- Already running on `nullstone`, `hs.s8n.ru`. OIDC via Authentik. +- Veilor OS ships `tailscale-1.94.2+` from official Fedora repo. +- Service unit **pre-disabled** at install time. +- First-boot prompt: "join Veilor mesh? [paste / QR]". On accept: + `tailscale up --login-server=https://hs.s8n.ru` with the user's + pre-auth key. + +### Layer 2 — Yggdrasil-go (warm fallback, idle by default) + +- `yggdrasil-go` 0.5.13+ from COPR / dnf. +- Decentralized IPv6 in `200::/7`. +- systemd unit **enabled** but config = empty `Listen[]`, one + `Public peer` (e.g. `vpn.itrus.su` or another EU peer), + `AllowedPublicKeys` allowlist mode (no allow-all). +- WSS:443 transport for ISP DPI evasion. +- Generates ECC keypair on first boot via systemd-tmpfiles or + firstboot script. +- Survives ISP-level Tailscale block (threat floor (ii)). + +### Layer 3 — Reticulum (opt-in) + +- **RetiNet AGPL fork** (NOT upstream RNS — upstream has an anti-AI + license clause incompatible with our governance). Sourced from the + Codeberg AGPL fork. +- Sideband (Android/desktop messenger built on RNS). +- Install via `ujust install-reticulum`. NOT auto-started until + RetiNet stabilizes. +- Default config when enabled: `AutoInterface` (LAN multicast) + + 1–2 TCP backbone peers. +- RNode hardware (LoRa transceiver) bundle as separate + `ujust install-reticulum-rnode`. +- Survives total internet outage (threat floor (iii)) when paired + with RNode. + +## Onboarding model + +Token-based (paste OR QR, user picks). Misskey signup page mints a +**reusable pre-auth key** (TTL=24h, single-use, regenerated per +signup). First boot of Veilor ISO accepts hex paste OR QR scan of +the same key. + +**NOT auto-OIDC at first boot** — too much Authentik exposure for +day-zero users. + +## Tier model — three-tier + +- `tag:admin` — onyx + failsafe. Full mesh, `*:*`. +- `tag:infra` — nullstone, office. Mesh among themselves; admin + inbound only. 
+- `tag:guest` — Veilor OS users + friend. ONLY `x.veilor:443` + reachable + future seeded service hostnames whitelisted. +- **Failsafe** — pre-baked admin pre-auth key on yubikey + printed + paper + Authentik OIDC group `tailnet-admin` as second auth path. + +## Threat floor table + +| Floor | Attack | Day 1 (v0.7 ship) | Phase 2 (v0.8) | +|-------|--------|---|---| +| (i) | ISP blocks `s8n.ru` DNS | Tailscale dies, Yggdrasil survives | YES (documented failover) | +| (ii) | ISP blocks Tailscale protocol | Yggdrasil-WSS:443 survives | YES | +| (iii) | Internet unreachable | RNS over LoRa survives | OPT-IN (RetiNet + RNode) | + +Day 1 must hold floor (i). Floors (ii) and (iii) become P2 once +Yggdrasil is promoted from idle to documented failover. + +## Iroh seeding daemon (Phase 2 / v0.8) + +- `veilor-seed.service` systemd unit, runs as `_veilor-seed` user. +- Watches `/var/lib//files/` blob store directories. +- BLAKE3-hashes new blobs, registers with local iroh node. +- Publishes tickets on per-service `iroh-gossip` topic. +- LRU local cache, default 10 GB. +- Sidecar mirrors service blob stores: Misskey `/files/`, Matrix + media, `dl.veilor` downloads. +- Other Veilor nodes pull lazily on cache miss. +- **DEFER DB replication forever.** Static media only. + +DOCUMENT but DO NOT IMPLEMENT until **Iroh hits 1.0** (currently +0.96–0.98 RC season; 1.0 target Q1 2026 slipped, watching). + +Reference: . + +## External dependency — Phase 0 (NOT veilor-os scope) + +Real ACL gap on nullstone Traefik right now: friend on `tag:guest` +can reach `nullstone:443` → SNI-routes to ALL Traefik vhosts +(`sys.s8n.ru`, `pihole.s8n.ru`, `hs.s8n.ru`, `auth.s8n.ru`, n8n, rc, +mx, …). Only per-vhost auth blocks them. The `no-guest@file` Traefik +middleware that should fix this is currently an `0.0.0.0/0` +allow-all stub (neutralized 2026-05-03 from XFF chain breakage). 
+ +**veilor-os does NOT fix this.** Tracked here as an external +dependency: ACL fix on nullstone Traefik **required before veilor-os +first-public-ISO ships**, otherwise `tag:guest` provisioning leaks +the full vhost surface to every veilor user. Parent operator owns it. + ## Strategic credibility win secureblue does NOT publish a threat model. Athena OS does, and it's @@ -71,14 +228,16 @@ distro: **honest, scoped, public threat model**. | v0.5.31 | shipped | Anaconda kickstart, mutable root | | v0.5.32 | active — top blockers from 9-agent wave | Anaconda kickstart | | v0.5.x → v0.6 | maintenance | Anaconda kickstart, ergonomics + UX polish | -| **v0.7 spike** | **2-day BlueBuild prototype** | First veilor OCI image extending secureblue-kinoite-hardened | -| v0.7 ship | ISO bootstraps install, first boot rebases to OCI | Hybrid path live | -| **v1.0** | **bootc-only**, kickstart deprecated | `bootc upgrade` for all updates | +| **v0.7 spike** | **1-day BlueBuild prototype** (was 2 days; `ostreecontainer` removes first-boot-rebase work) | First veilor OCI image extending secureblue-kinoite-hardened | +| v0.7 ship | ISO bootstraps install, `ostreecontainer` populates from OCI in-pass | Hybrid path live | +| v0.8 | Iroh seeding (P2P static media), Yggdrasil promoted from idle to documented failover, RetiNet stabilization watch | bootc-only direction | +| **v1.0** | **bootc-only**, kickstart deprecated, possibly migrate `ostreecontainer` → new `bootc` kickstart command if multi-disk + auth-registry blockers resolved upstream | `bootc upgrade` for all updates | -The `bootc-image-builder` spike plan (Agent 3) is **superseded** by -this hybrid: don't build a Containerfile from scratch on -`fedora-bootc:43`. Instead, write a BlueBuild recipe on -`securecore-kinoite-hardened-userns`. Spike compresses 1 week → 2 days. 
+The Containerfile-from-scratch spike plan (Agent 3 of 2026-05-05 +wave) is **superseded** by this hybrid: don't build a Containerfile +from scratch on `fedora-bootc:43`. Instead, write a BlueBuild recipe +on `securecore-kinoite-hardened-userns`. With `ostreecontainer` +swap, spike compresses 1 week → 1 day. ## Next concrete steps @@ -89,39 +248,55 @@ suspend/resume wifi fix, firstboot WantedBy, USBGuard id-rules, firewalld tailscale0 zone, KMS modeset, /etc/skel branding, virtio-9p log capture. -### v0.7-spike (2 days) +`ostreecontainer` swap **does NOT land in v0.5.32 main.** It belongs +in the v0.7 spike branch only. + +### v0.7-spike (1 day, separate branch) 1. New repo dir: `bluebuild/recipe.yml`. 2. `from`: `ghcr.io/secureblue/securecore-kinoite-hardened-userns:latest`. 3. Override modules: - `type: files` — stamp our `overlay/*` tree (branding, themes, veilor scripts, sddm theme, plymouth theme). - - `type: rpm-ostree` — install Brave + restore Xwayland. - - `type: rpm-ostree` — remove Trivalent. + - `type: rpm-ostree` — install Mullvad Browser + restore Xwayland + + re-enable sudo (revert run0). + - **Keep Trivalent** as default (was wrongly marked for removal in + the first draft of this doc). - `type: brand` — PRETTY_NAME, GRUB_DISTRIBUTOR, distributor URL. + - `type: files` — pre-disabled `tailscale.service`, idle + `yggdrasil.service`, `ujust install-reticulum` and + `ujust install-thorium` recipes. 4. `.github/workflows/build-bluebuild.yml` — pull BlueBuild action, build + cosign sign + push to GHCR. -5. `kickstart/install.ks` — add a one-shot `veilor-firstboot-rebase` - service that runs `rpm-ostree rebase ghcr.io/veilor/veilor-os:43` - then disables itself. User reboots once and is on the OCI image. +5. `kickstart/install.ks` — replace `%packages` block with + `ostreecontainer --url=ghcr.io/veilor/veilor-os:43 + --transport=registry`. Keep existing partitioning + LUKS block + verbatim. 
**Drop** all planned `veilor-firstboot-rebase.service` + work — no longer needed. ### v1.0 — bootc-only - Drop `kickstart/veilor-os.ks`, drop `livecd-creator` workflow. -- Keep `installer-iso.toml` for the bootstrap ISO (built via - bootc-image-builder); the OCI image is the source of truth. +- Bootstrap ISO is built as a **separate artifact** (NOT via + `bootc-image-builder anaconda-iso`, which was deprecated in + image-builder v44). +- The OCI image is the source of truth. - `veilor-update` becomes thin `bootc upgrade --apply` wrapper. +- Migrate `ostreecontainer` directive → new `bootc` kickstart + command IF multi-disk + authenticated-registry support has landed + upstream by then. ## Open questions - Does secureblue accept upstream contributions? If yes, send our - USBGuard id-based-rules fix and our threat model framework. -- Brave vs Mullvad-Browser: Brave has telemetry concerns out of box; - Mullvad-Browser is Tor-Browser-derived but not designed for - daily-driver. Test both in spike. -- Recovery flow when bootc rebase fails on first boot — need fallback - to keep the kickstart-installed system bootable. Likely already - handled by bootc's A/B; verify in spike. + USBGuard id-based-rules fix and our threat-model framework. +- Recovery flow when `ostreecontainer` install pass fails — Anaconda + should abort cleanly; verify in spike that no half-installed + state is bootable. +- Iroh 1.0 timing — currently 0.96–0.98 RC; Q1 2026 target slipped. + Re-evaluate Phase 2 schedule when 1.0 lands. +- RetiNet upstream stabilization — track Codeberg fork for releases. + If it stalls > 6 months we re-evaluate Layer 3. - Fedora 44 transition: secureblue tracks Fedora releases (current `v4.9` on F44). If we follow, we get F44 for free at the same time upstream does. @@ -129,9 +304,13 @@ log capture. 
## See also - `docs/THREAT-MODEL.md` — drafted, needs publish for v0.7 -- `docs/ROADMAP.md` — to be updated to reflect this strategy +- `docs/ROADMAP.md` — updated to reflect this strategy - `docs/research/2026-05-05-agent-wave/03-bootc-spike-plan.md` — superseded by this hybrid (kept as reference for the Containerfile-from-scratch alternative) - secureblue: - BlueBuild: +- bootc / ostreecontainer docs: +- Yggdrasil: +- Reticulum manual: +- Iroh blobs design:
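The v1.0 note that `veilor-update` becomes a thin `bootc upgrade --apply` wrapper could look roughly like the following. This is a sketch under stated assumptions: the function names `build_update_command` and `run_update` and the `dry_run` switch are hypothetical, and the only command line taken from this document is `bootc upgrade --apply`.

```python
import subprocess
from typing import List, Sequence


def build_update_command(extra_args: Sequence[str] = ()) -> List[str]:
    """Return the argv for the update call named in the v1.0 plan."""
    # "bootc upgrade --apply" is quoted from the strategy doc; extra_args is
    # a hypothetical pass-through and is empty by default.
    return ["bootc", "upgrade", "--apply", *extra_args]


def run_update(dry_run: bool = False) -> int:
    """Run the update, or only print the command when dry_run is True.

    Returns the exit code of bootc (0 for a dry run).
    """
    command = build_update_command()
    if dry_run:
        # Show what would be executed without touching the system.
        print(" ".join(command))
        return 0
    # check=False so the caller sees bootc's own exit code instead of an
    # exception; error handling policy is left open in this sketch.
    return subprocess.run(command, check=False).returncode
```

Keeping argv construction separate from execution makes the wrapper testable on machines without `bootc` installed. Privilege handling and reboot behaviour are deliberately left out here.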