veilor-os/docs/ROADMAP.md
veilor-org 7060d9aa6b docs: refine strategy — ostreecontainer install + mesh stack + browser stack
Refines docs/STRATEGY.md per parent-operator handoff (2026-05-05).
Locks in five things the original draft didn't cover, and corrects
one mistake.

## Refinement: ostreecontainer install path

The original draft proposed a two-step install: Anaconda partitions
+ kickstart, then on first boot a `veilor-firstboot-rebase.service`
runs `bootc rebase ghcr.io/veilor/veilor-os:43`. This commit drops
that step.

Anaconda's `ostreecontainer --url=... --transport=registry`
directive populates the root filesystem directly from the OCI image
during the install pass. No first-boot rebase, no transition
window, no second reboot. Same end state, simpler path.

Stay on `ostreecontainer` through v0.8. Do NOT migrate to the new
`bootc` kickstart command until v1.0 — it blocks multi-disk and
authenticated registries. Do NOT use `bootc-image-builder
anaconda-iso` output — deprecated in image-builder v44+. Produce
the OCI image and the bootstrap ISO as separate artifacts.

This compresses the v0.7 BlueBuild spike from 2 days → 1 day.

## Correction: keep Trivalent as default

The original `docs/STRATEGY.md` treated Trivalent (secureblue's hardened
Chromium) as an override-and-remove. That was wrong: Trivalent's
COPR tracks upstream M147+ within hours and ships hardened_malloc +
JIT-less + Drumbrake WASM. It is the default browser pick.

Mullvad Browser layered alongside for anti-fingerprint. Thorium
remains opt-in via `ujust install-thorium` only — its CVE lag is
months and contradicts the threat model. Never default.

## Mesh stack baked in

Three-layer warm-stack documented in STRATEGY.md:
- L3a Tailscale + Headscale (Day 1, daily driver)
- L3b Yggdrasil-go (Day 1, idle warm-fallback, AllowedPublicKeys mode)
- L3c Reticulum/RetiNet AGPL fork (opt-in via ujust install-reticulum)

Threat floor table: ISP-DNS-block (i, Day 1), ISP-Tailscale-block
(ii, Phase 2 promote Yggdrasil), internet-down (iii, opt-in RetiNet
+ RNode).

Tier model: tag:admin / tag:infra / tag:guest with failsafe pre-auth
key on yubikey + paper + Authentik OIDC group.

## Onboarding

Token paste / QR (user picks). Misskey signup mints reusable
24h-TTL pre-auth key. NOT auto-OIDC at first boot.

## Iroh seeding daemon stub (v0.8 / Phase 2)

`veilor-seed.service` documented but NOT implemented until Iroh hits
1.0 (current 0.96–0.98 RC, Q1 2026 target slipped). BLAKE3 +
iroh-gossip per-service topic. Static media only — DEFER DB
replication forever.

## External dependency tracked

nullstone Traefik `no-guest@file` ACL is currently 0.0.0.0/0
allow-all (XFF chain breakage 2026-05-03). Must be fixed before
veilor-os first-public-ISO ships, otherwise tag:guest provisioning
leaks the full vhost surface to every veilor user. Parent operator
owns the fix; explicitly out of veilor-os scope.

## Files

- docs/STRATEGY.md — full refinement
- docs/ROADMAP.md — v0.7 spike entry now reflects ostreecontainer
  + mesh stack + 1-day spike target
- README.md — drops the "v0.2.5 pre-release" badge + status box
  (out of date), adds bootc/atomic trajectory paragraph

## What did NOT change

- v0.5.x main branch is untouched. The ostreecontainer swap belongs
  in the v0.7 spike branch, NOT v0.5.32.
- nullstone Traefik config is untouched. Out of scope.
- The kickstart and overlay code is untouched.
2026-05-05 15:15:52 +01:00


# Roadmap
Versioned roadmap for veilor-os. Targets are intentionally short and
testable. No fluff. Items in earlier versions are blockers for later
ones unless explicitly noted.

For the historical record of what landed in each release, see
[`../CHANGELOG.md`](../CHANGELOG.md).

---
## Lessons learned through v0.5.x install grind
Five things v0.5.27–31 changed about how we plan:
1. **Anaconda + RPM-6.0 + `--cmdline` is brittle** — three install
failures, kernel cmdline written to four places before one worked.
`--location=none` skips `CollectKernelArgumentsTask`,
`kernel-install` reads `/etc/kernel/cmdline` not `/proc/cmdline`,
and `transaction_progress.py` masks real failures if patched too
broadly. Justifies promoting the bootc-image-builder spike to v0.7.
2. **Test procedure must gate every tag** — v0.5.27 only surfaced four
bugs in one VM run because the run walked every step in order.
`test/TESTING.md` and `test/test-runs/` are now load-bearing.
3. **Real hardware is not optional** — VM catches install logic, not
KMS / fbcon / firmware. Spare laptop + friend's laptop must run
pre-tag, every time.
4. **Multi-agent debug waves work, but only with a verifier** — the
v0.5.31 four-bug fix came from a 4-agent verification wave on
v0.5.30 outcome. Wave + verifier = signal; wave alone = noise.
5. **"We ask once, with sane defaults" is the distro UX** — every
v0.5 install bug we shipped a workaround for (locale, hostname,
USBGuard policy, drivers) is something `veilor-postinstall` could
ask the user about cleanly on first boot. That promotes
`veilor-postinstall` from v0.6 background item to flagship.
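
Lesson 2 above (the test procedure gates every tag) could be enforced mechanically before tagging. The following is a minimal sketch, not the project's actual tooling: the function names and the `<tag>-*.md` report naming are assumptions, and only the `test/test-runs/` directory comes from this roadmap.

```shell
# Sketch: refuse to tag unless a test-run report exists for that version.
# ASSUMPTION: reports are named <tag>-<anything>.md inside test/test-runs/.

has_test_report() {
    local tag="$1" dir="${2:-test/test-runs}"
    local f
    for f in "$dir/$tag"-*.md; do
        # An unmatched glob stays literal, so confirm the path really exists.
        [ -e "$f" ] && return 0
    done
    return 1
}

gate_tag() {
    local tag="$1" dir="${2:-test/test-runs}"
    if has_test_report "$tag" "$dir"; then
        echo "ok: report found for ${tag}"
    else
        echo "blocked: no report for ${tag} in ${dir}" >&2
        return 1
    fi
}

# Possible use before tagging:
#   gate_tag v0.5.32 && git tag v0.5.32
```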
---
## v0.2 — green ISO + base hardening (DONE)
Reproducible CI build pipeline. UEFI+BIOS bootable live ISO from a single
kickstart. Single-prompt LUKS install. First-boot admin password flow.
Full overlay applied (sysctl, sshd, sudoers, tuned profiles, KDE black
theme, Fira Code, branded `/etc/os-release`). SELinux enforcing.
firewalld drop zone. fail2ban + auditd + USBGuard active. The build
chased five real bugs (DEST hardcoded, `set -eu` killing `cp`,
os-release symlink, missing admin user, `LABEL=` vs `CDLABEL=` in
livecd-tools) before greening.

Released `v0.2.5` on 2026-05-01. CI on every push to `main`.

---
## v0.5.27–v0.5.31 — install path stabilisation (DONE)
The bridge between v0.2 (greens at all) and v0.3 (looks polished). All
install-path bugs surfaced by the formal hybrid-VM test procedure
(`test/TESTING.md`). Five releases, ~hours of debug, three install
failures before greening.
- **v0.5.27 (DONE)** — `rd.luks.uuid` via `grubby --update-kernel=ALL`,
GRUB rebrand, `fbcon=nodefer`, ASCII gum cursor.
- **v0.5.28 (DONE)** — locale locked en_US.UTF-8, dropped updates repo,
patched anaconda `transaction_progress.py` to silence `Configuring
xxx.x86_64` scroll, excluded man-db.
- **v0.5.29 (DONE)** — narrowed anaconda patch (was masking real
failures), LUKS UX, initramfs assertion. Five-fix bundle from 7-agent
research wave.
- **v0.5.30 (DONE)** — broad error suppression, manual bootloader path,
virtio log capture for post-mortem.
- **v0.5.31 (DONE)** — `--location=none` was making anaconda skip
`CollectKernelArgumentsTask`; kernel-install reads
`/etc/kernel/cmdline` as source of truth, veilor never wrote it, so
BLS entries shipped with empty cmdline. Three-path write
(`/etc/kernel/cmdline` + `/etc/default/grub` + grubby) plus explicit
`kernel-install add`.
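
The three-path write from v0.5.31 can be sketched roughly as follows. This is an illustration, not the code that shipped: the function name, the `ROOT` parameter, and the `sed`-based edit of `/etc/default/grub` are assumptions. The file paths and the `grubby` / `kernel-install add` calls are the ones named above.

```shell
# Sketch of the three-path kernel cmdline write (hypothetical helper).
# Usage: write_kernel_cmdline ROOT ARG...
#   ROOT is "" when run on the target system itself (e.g. in the chroot).

write_kernel_cmdline() {
    local root="$1"; shift
    local args="$*"
    local grub="${root}/etc/default/grub"

    # Path 1: kernel-install reads this file when it renders BLS entries.
    mkdir -p "${root}/etc/kernel" "${root}/etc/default"
    printf '%s\n' "$args" > "${root}/etc/kernel/cmdline"

    # Path 2: GRUB defaults. Drop any old line, then append the new one.
    touch "$grub"
    sed -i '/^GRUB_CMDLINE_LINUX=/d' "$grub"
    printf 'GRUB_CMDLINE_LINUX="%s"\n' "$args" >> "$grub"

    # Path 3 plus the explicit kernel-install pass only make sense on the
    # target system, so they are skipped when ROOT points elsewhere.
    if [ -z "$root" ] && command -v grubby >/dev/null 2>&1; then
        grubby --update-kernel=ALL --args="$args"
        local dir
        for dir in /lib/modules/*/; do
            [ -f "${dir}vmlinuz" ] || continue
            kernel-install add "$(basename "$dir")" "${dir}vmlinuz"
        done
    fi
}
```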
## v0.5.32 — next ship (active)
Outstanding from the grind, immediate priority for the next tag:
- **End-to-end VM green run** — v0.5.31 lands the kernel-cmdline fix
but no full hybrid-VM pass has signed it off. Run the procedure in
`test/TESTING.md` to install + reboot + login, file the report in
`test/test-runs/`, then tag.
- **Real-hardware run on the spare laptop** — VM is necessary not
sufficient. The friend's laptop is the mate's test; the spare is ours. KMS,
fbcon, the USB controller, and real-firmware Secure Boot only show up here.
- **gum input render glitch** — duplicate "Install", stray T in
password fields on linux fbcon. Replace `gum input --password` with
bash `read -srp`; cosmetic only but visible on every install.
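
A `read -srp` replacement for `gum input --password` might look like the sketch below. The function names, prompt wording, and confirm-twice behaviour are assumptions for illustration. Only the use of bash's `read -srp` comes from the item above, so the prompt function needs bash rather than plain `sh`.

```shell
# Sketch: password prompt without gum (hypothetical helper names).

# Accept only a non-empty password that was typed identically twice.
passwords_ok() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Prompt twice on the terminal and print the accepted password on stdout.
# read flags: -s no echo, -r keep backslashes literal, -p show a prompt.
prompt_password() {
    local label="$1" pw1 pw2
    while :; do
        read -srp "${label}: " pw1 || return 1
        printf '\n' >&2
        read -srp "Confirm ${label}: " pw2 || return 1
        printf '\n' >&2
        if passwords_ok "$pw1" "$pw2"; then
            printf '%s' "$pw1"
            return 0
        fi
        echo "Empty or mismatched entry, try again." >&2
    done
}

# Possible use in the installer script:
#   luks_pw=$(prompt_password "LUKS passphrase") || exit 1
```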
---
## v0.3 — UX polish (in progress)
The visible polish layer that v0.2 deferred for build velocity.
- **Plymouth black theme** — boot splash matching the desktop. No Fedora
drum, no white flash. `assets/plymouth/veilor/`.
- **SDDM theme** — black login background, single-user prompt with
`admin` pre-filled, no userlist.
- **Konsole profile** — black background, Fira Code, transparent panel
off (no compositor cost on resume).
- **Wallpaper SVG** — flat black with subtle veilor wordmark, 1080p +
4K + ultrawide variants.
- **Re-enable memory hygiene on installed system.** v0.2.5 stripped
`init_on_alloc=1 init_on_free=1` from the *live* cmdline because they
5x'd KVM boot time. Re-add post-install via `veilor-firstboot` so the
installed system gets the protection without the ISO penalty.
- **USBGuard auto-snapshot on first boot.** Currently the operator
runs `usbguard generate-policy` manually. v0.3 wires this into
`veilor-firstboot` after the password step (with a clear
"plug in trusted devices first" prompt).

Target: this month. None of it is a kickstart change; it is pure overlay
work.

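
For the memory-hygiene item, a first-boot step could check which arguments are missing from the running cmdline before touching the bootloader. This is a sketch under assumptions: `missing_kargs` is a hypothetical helper, and the roadmap does not say how `veilor-firstboot` will apply the arguments.

```shell
# Sketch: list wanted kernel arguments that are absent from a cmdline string.
missing_kargs() {
    local cmdline="$1"; shift
    local want
    for want in "$@"; do
        case " ${cmdline} " in
            *" ${want} "*) ;;                 # already present, nothing to do
            *) printf '%s\n' "$want" ;;
        esac
    done
}

# Possible use on the installed system (never on the live ISO):
#   missing=$(missing_kargs "$(cat /proc/cmdline)" init_on_alloc=1 init_on_free=1)
#   [ -n "$missing" ] && grubby --update-kernel=ALL --args="$(echo $missing)"
# Per the v0.5.31 notes, /etc/kernel/cmdline would need the same arguments
# so that later kernel-install runs keep them.
```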
---
## v0.4 — distribution + signing
Get veilor-os to a state where the ISO is downloadable, verifiable, and
trusted by Secure Boot without user shenanigans.
- **GPG-signed releases.** Tag → CI builds → CI signs ISO + sha256 with
veilor.org release key → GitHub Release artifact carries `.iso.asc`.
- **Reproducible builds.** Pin Fedora compose ID, lock package versions
via `dnf snapshot` or equivalent, document how to verify two builds
match.
- **Own MOK (Machine Owner Key) + sbsign for Secure Boot.** Currently
veilor-os relies on Fedora's signed shim chain. v0.4 ships our own
MOK, signs the kernel + initramfs at build time, optionally enrols
the MOK on first boot for users who want a cleaner trust path.
- **ISO download mirror** — static download page on veilor.org with
current + previous release, sha256, gpg signature. **Not** an RPM
mirror — veilor-os does not ship its own packages, only the spin
configuration.
- **Release process documented** — tagging, CI, signing, mirror sync
in `docs/RELEASE.md`.
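
The checksum-and-sign step of the release flow could be sketched as below. The helper names and the `RELEASE_KEY` variable are hypothetical, and the real CI job may differ. The `sha256sum` and `gpg --armor --detach-sign` calls are standard and produce the `.iso.asc` file mentioned above.

```shell
# Sketch: checksum and detach-sign a release ISO (hypothetical helpers).
# ASSUMPTION: RELEASE_KEY holds the ID of the veilor.org release key.

sign_release() {
    local iso="$1" key="${RELEASE_KEY:-}"
    local dir base
    dir=$(dirname "$iso"); base=$(basename "$iso")
    # Record a relative file name so the .sha256 verifies from any directory.
    ( cd "$dir" && sha256sum "$base" > "${base}.sha256" ) || return 1
    if [ -n "$key" ] && command -v gpg >/dev/null 2>&1; then
        gpg --batch --yes --local-user "$key" --armor --detach-sign "$iso"
        gpg --batch --yes --local-user "$key" --armor --detach-sign "${iso}.sha256"
    fi
}

verify_release() {
    local iso="$1"
    local dir base
    dir=$(dirname "$iso"); base=$(basename "$iso")
    ( cd "$dir" && sha256sum -c "${base}.sha256" >/dev/null ) || return 1
    # Check the signature only when one was published next to the ISO.
    if [ -f "${iso}.asc" ]; then
        gpg --verify "${iso}.asc" "$iso"
    fi
}
```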
---
## v0.5 — hardening tier 2
Hardening that builds on v0.2's foundation. Each item is opt-in unless
specified — defaults stay sane for a daily driver.
- **AppArmor profiles in addition to SELinux.** Stack-not-replace.
Targeted at the browser, the mail client, and anything that touches
attacker-controlled data. SELinux remains the primary MAC.
- **systemd-homed** — encrypted-per-user `~`, suspend-aware, key
unlocked at PAM login. Optional, opt-in via post-install helper.
- **nftables ruleset** alongside firewalld defaults. Default firewalld
policy stays drop; nftables provides advanced filtering for users
who want it.
- **Audit log shipping** — opt-in `auditd` -> remote syslog over TLS,
for users running a central log aggregator.
- **Installer kickstart split** — separate `veilor-os-install.ks` for
installer ISO (real LUKS partitioning, not the live-rootfs
simplification used in v0.2). Lets users install veilor-os as the
primary OS without going through the live boot first.
- **Audit baseline** — re-run the security audit (template in
`security/audit-template.md`) and target a lower risk score than v0.2.
---
## v0.6 — ergonomics (PROMOTED — install grind proved we need this)
Smooth the operator experience so day-to-day work doesn't fight the
hardening. `veilor-postinstall` and `veilor-doctor` were v0.6 background
items, promoted to **headline** features after v0.5.27–31 made it
clear that "we ask once, with sane defaults" is what separates a
distro from a kickstart.
- **`veilor-postinstall`** (PROMOTED — flagship of v0.6) — first-login
welcome menu, EndeavourOS-style but cleaner. Single TUI screen:
keyboard layout, locale (deferred from install per v0.5.28),
hostname override, package presets (dev / media / homelab), drivers
(NVIDIA / Intel / AMD), Bluetooth opt-in, USBGuard snapshot, audit
baseline run, `veilor-doctor` first run. Each step skippable, runs
once on first SDDM login, self-deletes the autostart after. This is
the **only** UX feature that ships in v0.6 day one — everything else
builds on it.
- **`veilor-doctor`** (PROMOTED — user-facing, not just dev tool) —
the post-install audit. Walks `getenforce`, `mokutil --sb-state`,
`firewall-cmd`, fail2ban, USBGuard policy, sysctl drift, and reports
drift from baseline. Runs from `veilor-postinstall` on day one, then
weekly via `systemd --user` timer. Plain-English output ("your
firewall is OK", "USBGuard policy has 3 unknown devices"); not a JSON
dump. **Stretch:** machine-readable mode for `veilor-server` later.
- **`veilor-update`** — wraps `dnf upgrade` AND `flatpak update` in
one command. Per `feedback_system_update.md`, partial-update is a
recurring trap; veilor's update tool covers both by default. Adds
pre-check (snapshot available?), auditd pause, post-update SELinux
validation.
- **Opt-in installer ISO** — flip from live-only to live + installer,
user picks at boot menu. Installer uses the v0.5 kickstart with full
LUKS + btrfs subvols + zram.
- **First-boot UX** — replace TTY password prompt with a small
Plymouth-rendered dialog. Less raw.
- **Bluetooth opt-in helper** — single command to enable + bring up
the daemon + add the user to the right group.
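
The plain-English reporting that `veilor-doctor` aims for could be built from one small comparison helper. The sketch below is illustrative only: the function names, the message wording for drift, and the baseline values (including `kernel.kptr_restrict=2`) are assumptions, not the shipped baseline.

```shell
# Sketch: compare an observed value with its baseline and say so plainly.
check() {
    local label="$1" actual="$2" expected="$3"
    if [ "$actual" = "$expected" ]; then
        printf 'your %s is OK\n' "$label"
    else
        printf 'your %s has drifted: expected %s, found %s\n' \
            "$label" "$expected" "$actual"
        return 1
    fi
}

# Run a few checks; the exit status is 1 if anything drifted.
# ASSUMPTION: these baseline values are examples only.
run_doctor() {
    local drift=0
    check "SELinux mode" "$(getenforce 2>/dev/null)" "Enforcing" || drift=1
    check "firewall" "$(firewall-cmd --state 2>/dev/null)" "running" || drift=1
    check "kernel pointer hiding" \
        "$(sysctl -n kernel.kptr_restrict 2>/dev/null)" "2" || drift=1
    return "$drift"
}
```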
---
## v0.7 — public flex + bootc spike
Take veilor-os out of "private repo, contained audience" mode. Order
matters: people demand threat model FIRST when a security distro goes
public, benchmarks come after.
1. **Threat model published** (FIRST — gating item) — what veilor-os
defends against, what it does not. Honest scope. No claim of
anti-state-actor; concrete on lost-laptop, USB-attack, browser
compromise, supply-chain. Reviewers will demand this before reading
anything else.
2. **Public docs site** — Hugo or mdBook on `veilor.org`, generated
from `docs/`. Single source of truth.
3. **Repo public** — flip GitHub visibility, announce.
4. **Comparison + benchmarks** — published numbers vs stock Fedora KDE
on cold boot, idle RAM, idle network egress, suspend/resume time.
After threat model, not before.
5. **Press kit** — wallpapers, logo, screenshots, feature one-liner.
### Hybrid bootc spike — layer on secureblue, install via `ostreecontainer` (REVISED 2026-05-05)
The original v0.7 entry called for a Containerfile-from-scratch
spike on `quay.io/fedora/fedora-bootc:43`. Research on 2026-05-05
(see `docs/STRATEGY.md` and
`docs/research/2026-05-05-agent-wave/`), then a parent-operator
refinement same day, locked the path: **layer veilor's branding +
threat model + UX on top of secureblue's already-shipping
`securecore-kinoite-hardened-userns` OCI image** via a BlueBuild
recipe, and install it directly during the Anaconda pass via the
`ostreecontainer` kickstart directive (no first-boot rebase).

Reasoning:
- secureblue has 30 active contributors, 940 stars, 56 commits
in the last 5 weeks. They've already implemented the hardening
surface we'd need to build alone (sysctl + kargs + SELinux
custom policy + USBGuard + hardened-malloc + Unbound DoT +
cosign-signed OCI build pipeline).
- Containerfile-from-scratch spike: 1 week to first ISO. BlueBuild
recipe extending secureblue: ~2 days. With the `ostreecontainer`
swap (no `veilor-firstboot-rebase.service`, no transition window):
**~1 day**.
- secureblue does NOT publish a threat model. Athena OS does
(their main differentiator, only public threat model in
hardened-Linux 2026). Our `docs/THREAT-MODEL.md` (drafted) gets
us ahead of both on the one axis that matters most for a
security-branded distro.

Hybrid path locked:
- Kickstart ISO stays as the **bootstrap installer** (Anaconda's
LUKS UX is mature).
- `%packages` is replaced with `ostreecontainer
--url=ghcr.io/veilor/veilor-os:43 --transport=registry` so the
install pass populates `/` directly from the OCI image — no
first-boot rebase, no second reboot.
- From boot one onward, `bootc upgrade` is the update channel.
- v1.0 deprecates the kickstart entirely.
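
The kickstart change in the hybrid path can be sketched as a small generator that CI might call when assembling the bootstrap kickstart. The function name and the idea of generating the fragment from a script are assumptions. The directive, its two flags, and the default image reference are the ones quoted above.

```shell
# Sketch: emit the kickstart lines that take the place of %packages.
# ASSUMPTION: emit_ostreecontainer_ks is a hypothetical helper.
emit_ostreecontainer_ks() {
    local image="${1:-ghcr.io/veilor/veilor-os:43}"
    cat <<EOF
# Populate / from the OCI image during the Anaconda pass
ostreecontainer --url=${image} --transport=registry
EOF
}

# Possible use while building the bootstrap ISO:
#   emit_ostreecontainer_ks >> kickstart.ks      # file name is illustrative
# After the first boot, updates come from:
#   bootc upgrade
```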

Stay on `ostreecontainer` through v0.8. **Do NOT migrate to the new
`bootc` kickstart command until v1.0**: it blocks multi-disk and
authenticated registries (likely needed eventually). **Do NOT use**
`bootc-image-builder anaconda-iso` output, which is deprecated in
image-builder v44+. Produce OCI image and bootstrap ISO as
**separate artifacts**.

Overrides over secureblue: keep Trivalent as default (their COPR
tracks upstream M147+ within hours; reverses earlier draft that
treated it as override-and-remove); add Mullvad Browser alongside;
gate Thorium behind `ujust install-thorium` with CVE-lag warning;
restore sudo (revert `run0`-only); re-enable Xwayland.

Mesh stack baked in: Tailscale (Day 1, daily driver), Yggdrasil-go
(Day 1, idle warm-fallback), Reticulum/RetiNet AGPL fork (opt-in
via `ujust install-reticulum`). See `docs/STRATEGY.md` mesh stack
section for the layer breakdown and threat-floor table.

Full plan: `docs/STRATEGY.md`. Spike will land in
`bluebuild/recipe.yml` plus `.github/workflows/build-bluebuild.yml`,
on a separate branch. It does NOT land in v0.5.x main.

External dependency tracked: Traefik `no-guest@file` ACL on
nullstone is currently an `0.0.0.0/0` allow-all stub. Must be
fixed before veilor-os first-public-ISO ships, otherwise
`tag:guest` provisioning leaks the full vhost surface to every
veilor user. **Parent operator owns the fix; not in veilor-os
scope.**

---
## v1.0 — production
The line where veilor-os is recommended for a non-author user as a
daily driver.
- **Multi-arch.** x86_64 today; v1.0 ships aarch64 ISO too (laptops
on ARM are real now). Build matrix in CI.
- **LTS commitment** — major versions tied to Fedora's release cadence,
patch releases for security only, documented EOL per major.
- **Recovery ISO** — minimal rescue image with veilor tools (LUKS
unlock, btrfs scrub, sysctl reset, fail2ban unban) for "I cannot log
in to my system" days.
- **TPM2 integration** — sealed LUKS unlock against TPM2 PCRs (opt-in,
default stays password). Ships as helper script, not silent default.
- **Signed update channel** — beyond GPG-signed ISOs, a signed metadata
repo so `veilor-doctor` can detect available updates without trusting
Fedora's mirrorlists alone.
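
For the opt-in TPM2 item, a helper script could print the enrolment command for review instead of running it silently. This is only a sketch: the function name, the example device path, and the default of PCR 7 are assumptions, since the roadmap does not say which PCRs will be used. `systemd-cryptenroll` and its two flags are real.

```shell
# Sketch: build (not run) the TPM2 enrolment command for a LUKS device.
# ASSUMPTION: PCR 7 (Secure Boot state) as the default binding.
tpm2_enroll_cmd() {
    local device="$1" pcrs="${2:-7}"
    printf 'systemd-cryptenroll --tpm2-device=auto --tpm2-pcrs=%s %s\n' \
        "$pcrs" "$device"
}

# Possible use, with an illustrative device path:
#   tpm2_enroll_cmd /dev/nvme0n1p3        # operator reads it first
#   sudo sh -c "$(tpm2_enroll_cmd /dev/nvme0n1p3)"
```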
---
## Stretch goals — not on the v0.x → v1.0 critical path
These are spin variants that share veilor-os DNA but need their own
kickstart or build tool.
- **`veilor-server`** — no KDE, no GUI, hardened headless Fedora for
homelab / VPS (e.g. nullstone). Same overlay, different package set.
**Not blocked**, but waits on `veilor-doctor` machine-readable mode
(v0.6) so headless installs have a way to report drift without a TUI.
- **`veilor-kiosk`** — single-app Plasma session, locked-down user,
read-only root. **Not blocked.**
- **`veilor-atomic`** — rpm-ostree / bootc-image-builder rebase.
Status now depends on the **v0.7 bootc spike**: if the spike shows
bootc fixes the anaconda-grind class of bugs, `veilor-atomic`
becomes the v1.0+ mainline rather than a stretch variant. If not,
it stays a parallel track.